WO2003065720A1 - Video conferencing and method of operation - Google Patents

Video conferencing and method of operation

Publication number
WO2003065720A1
WO2003065720A1 PCT/EP2002/014337 EP0214337W WO03065720A1 WO 2003065720 A1 WO2003065720 A1 WO 2003065720A1 EP 0214337 W EP0214337 W EP 0214337W WO 03065720 A1 WO03065720 A1 WO 03065720A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
active
video images
multimedia
speakers
Prior art date
Application number
PCT/EP2002/014337
Other languages
French (fr)
Inventor
Arthur Lallet
Original Assignee
Motorola Inc
Priority date
Filing date
Publication date
Application filed by Motorola Inc
Priority to JP2003565169A priority Critical patent/JP2005516557A/en
Priority to KR10-2004-7011846A priority patent/KR20040079973A/en
Publication of WO2003065720A1 publication Critical patent/WO2003065720A1/en
Priority to FI20041039A priority patent/FI20041039A/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234327Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
    • H04N21/2402Monitoring of the downstream path of the transmission network, e.g. bandwidth available
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25808Management of client data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26208Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists the scheduling operation being performed under constraints
    • H04N21/26216Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists the scheduling operation being performed under constraints involving the channel capacity, e.g. network bandwidth
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440227Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by decomposing into layers, e.g. base layer and one or more enhancement layers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/443OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H04N21/4621Controlling the complexity of the content stream or additional data, e.g. lowering the resolution or bit-rate of the video stream for a mobile client with a small screen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/141Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • H04N7/152Multipoint control units therefor

Definitions

  • This invention relates to video conferencing.
  • the invention is applicable to, but not limited to, a video switching mechanism in H.323 and/or SIP based centralised videoconferences, using layered video coding.
  • IP Internet protocol
  • MCU Multipoint control unit
  • the MCU is an endpoint on the network that provides the capability for three or more terminals and/or communication gateways to participate in a multipoint conference.
  • the MCU may also connect two terminals in a point-to-point conference such that they have the ability to evolve into a multipoint conference.
  • In FIG. 1, a known centralised conferencing model 100 is shown.
  • Centralised conferences make use of an MCU-based conference bridge. All terminals (endpoints) 120, 122, 125 send and receive media information 130 in the form of audio, video, and/or data signals, as well as control information streams 140, to/from the MCU 110. These transmissions are done in a point-to-point fashion. This is shown in Figure 1.
  • An MCU 110 consists of a Multipoint Controller (MC) and zero or more Multipoint Processors (MP).
  • the MC handles the call set-up and call signalling negotiations between all terminals, to determine common capabilities for audio and video processing.
  • the MC 110 does not deal directly with any of the media streams. This is left to the MP, which mixes, switches, and processes audio, video, and/or data bits.
  • MCUs provide the ability to host multi-location seminars, sales meetings, group conferences and other 'face-to-face' communications. It is also known that multipoint conferences can be used in various applications, for example:
  • MCU-based systems will play an important role in multimedia communications over IP based networks in the future.
  • Multipoint multimedia conference systems can be set up using various methods, for example as specified by the H.323 and SIP session layer protocol standards. References for SIP can be found at http://www.ietf.org/rfc/rfc2543.txt and http://www.cs.columbia.edu/~hgs/sip.
  • the first frame of a video sequence includes a comprehensive amount of image data, generally referred to as intra coded information.
  • the intra-coded frame as it is the first frame, provides a substantial portion of the image to be displayed.
  • This intra-coded frame is followed by inter-coded (predicted) information, which generally includes data relating to changes in the image that is being transmitted.
  • predicted inter-coded information contains much less information than intra-coded information.
  • a known technique solves this problem by analysing the audio streams and forwarding the name and video stream of the active speaker to all the participants.
  • the MCU often performs this function.
  • the MCU can then send the name of the speaker and the corresponding video and audio stream to all the participants by switching the appropriate input multimedia stream to the output ports/paths.
  • Video switching is a well-known technique that aims at delivering to each endpoint a single video stream, equivalent to arranging multiple point-to-point sessions.
  • the video switching can be: (i) voice activated switching, where the MCU transmits the video of the active speaker; (ii) timed activated switching, where the video of each participant is transmitted one after another at a predetermined time interval; or (iii) individual video selection switching, where each endpoint can request the participant video stream that he/she wishes to receive.
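  • The three switching modes above can be sketched as a single selection function. This is a minimal illustration only: the names `SwitchMode` and `select_video`, the default interval, and the fallback to the first participant are assumptions, not part of the original text.

```python
from enum import Enum
from typing import Optional, Sequence


class SwitchMode(Enum):
    VOICE_ACTIVATED = "voice"    # (i) forward the active speaker
    TIMED = "timed"              # (ii) rotate through participants
    INDIVIDUAL = "individual"    # (iii) endpoint picks a participant


def select_video(
    mode: SwitchMode,
    participants: Sequence[str],
    *,
    active_speaker: Optional[str] = None,
    elapsed_s: float = 0.0,
    interval_s: float = 10.0,   # hypothetical rotation interval
    requested: Optional[str] = None,
) -> str:
    """Return the participant whose single video stream is forwarded."""
    if not participants:
        raise ValueError("a conference needs at least one participant")
    if mode is SwitchMode.VOICE_ACTIVATED:
        choice = active_speaker
    elif mode is SwitchMode.TIMED:
        # Each participant is shown in turn for interval_s seconds.
        slot = int(elapsed_s // interval_s)
        choice = participants[slot % len(participants)]
    else:
        choice = requested
    # Assumed fallback: show the first participant if nothing valid is chosen.
    return choice if choice in participants else participants[0]
```

In each mode exactly one stream is returned, which mirrors the statement that video switching delivers a single video stream to each endpoint.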
  • the MCU 220 for example positioned within an Internet protocol (IP) based network 210, contains a switch 230.
  • the MCU 220 receives the video streams 255, 265, 275, 285 of all the participants (user equipment) 250, 260, 270, 280.
  • the MCU may also receive, separately, a combined (multiplexed) audio stream 290 from the participants who are speaking.
  • the MCU 220 selects one of the video streams and sends this video stream 240 to all the participants 250, 260, 270, 280.
  • the video of each participant can be sent to all the participants.
  • this approach suffers in a wireless based conference due to the bandwidth limitation.
  • video is transmitted as a series of still images/pictures. Since the quality of a video signal can be affected during coding or compression of the video signal, it is known to include additional information 'layers' based on the difference between the video signal and the encoded video bit stream. The inclusion of additional layers enables the quality of the received signal, following decoding and/or decompression, to be enhanced. Hence, a hierarchy of pictures and enhancement pictures partitioned into one or more layers is used to produce a layered video bit stream.
  • enhancements to the video signal may be added to the base layer either by: (i) increasing the resolution of the picture (spatial scalability); (ii) including error information to improve the Signal to Noise Ratio of the picture (SNR scalability); or (iii) including extra pictures to increase the frame rate (temporal scalability).
  • Such enhancements may be applied to the whole picture, or to an arbitrarily shaped object within the picture, which is termed object-based scalability.
  • In order to preserve the disposable nature of the temporal enhancement layer, the H.263+ standard dictates that pictures included in the temporal scalability mode should be bi-directionally predicted (B) pictures, as shown in the video stream of FIG. 3.
  • FIG. 3 shows a schematic illustration of a scalable video arrangement 300 illustrating B picture prediction dependencies, as known in the field of video coding techniques.
  • An initial intra-coded frame (I1) 310 is followed by a bi-directionally predicted frame (B2) 320. This, in turn, is followed by a (uni-directional) predicted frame (P3) 330, and again followed by a second bi-directionally predicted frame (B4) 340. This again, in turn, is followed by a (uni-directional) predicted frame
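  • The disposable nature of B pictures can be illustrated with a small dependency check. This is a sketch under stated assumptions: the frame labels follow the sequence described for FIG. 3, but the fifth frame ("P5") and the exact reference frames of each picture are assumed for illustration, since the source text is cut off at that point.

```python
# Frame name -> frames it is predicted from (assumed for illustration).
DEPENDENCIES = {
    "I1": [],            # intra-coded, needs no reference
    "B2": ["I1", "P3"],  # bi-directional: previous and next reference
    "P3": ["I1"],        # predicted from the previous reference
    "B4": ["P3", "P5"],
    "P5": ["P3"],        # hypothetical fifth frame
}


def decodable(received):
    """Return the subset of received frames whose references all arrived."""
    received = set(received) & set(DEPENDENCIES)
    ok = set()
    changed = True
    while changed:
        changed = False
        for frame in received - ok:
            if all(ref in ok for ref in DEPENDENCIES[frame]):
                ok.add(frame)
                changed = True
    return ok
```

Dropping both B frames leaves I1, P3 and P5 decodable, because no other frame is predicted from a B frame. Dropping P3 instead makes every later frame undecodable.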
  • FIG. 4 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques.
  • a layered video bit stream includes a base layer 405 and one or more enhancement layers 435.
  • the base layer (layer 1) includes one or more intra-coded pictures (I pictures) 410 sampled, coded and/or compressed from the original video signal pictures. Furthermore, the base layer will include a plurality of predicted inter-coded pictures (P pictures) 420, 430 predicted from the intra-coded picture(s) 410.
  • I pictures intra-coded pictures
  • P pictures predicted inter-coded pictures
  • enhancement layers (layers 2 or 3 or more) 435
  • three types of picture may be used:
  • the vertical arrows from the lower layer illustrate that the picture in the enhancement layer is predicted from a reconstructed approximation of that picture in the reference (lower) layer.
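  • Because each enhancement-layer picture is predicted from the reconstructed picture in the layer below, a receiver can only use an enhancement layer if it also has every lower layer. A minimal sketch of that rule follows; the function name `usable_layers` and the integer layer numbering (1 for the base layer) are illustrative assumptions.

```python
def usable_layers(received_layers):
    """Return the layers that can actually be decoded, lowest first.

    Layer 1 is the base layer; layer n+1 is predicted from layer n,
    so decoding stops at the first missing layer.
    """
    received = set(received_layers)
    usable = []
    layer = 1
    while layer in received:
        usable.append(layer)
        layer += 1
    return usable
```

For example, a receiver holding layers 1 and 3 can decode only the base layer, and a receiver holding only enhancement layers can decode nothing.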
  • scalable video coding has been used with multicast multimedia conferences, and only in the context of point-to-point or multicast video communication.
  • wireless networks do not currently support multicasting.
  • each layer is sent in separate multicast sessions, with the receiver deciding itself whether to register to one or more sessions.
  • a method of relaying video images in a multimedia videoconference as claimed in claim 1, a video conferencing arrangement for relaying video images, as claimed in claim 7, a wireless device for participating in a videoconference, as claimed in claim 11, a multipoint processor, as claimed in claim 12, a video communication system, as claimed in claim 16, a media resource function, as claimed in claim 18, a video communication unit, as claimed in claim 19 or claim 20, a storage medium, as claimed in claim 23.
  • Further aspects of the present invention are as claimed in the dependent claims.
  • inventive concepts of the present invention address the disadvantages of prior art arrangements by providing a video switching method to improve the identification of the participants and speakers in a videoconference.
  • This invention makes use of layered video coding, in order to provide a better usage of the bandwidth available for each user.
  • FIG. 1 shows a known centralised conferencing model.
  • FIG. 2 shows a functional diagram of a traditional video switching mechanism.
  • FIG. 3 is a schematic illustration of a video arrangement showing picture prediction dependencies, as known in the field of video coding techniques.
  • FIG. 4 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques.
  • FIG. 5 shows a functional diagram of a video switching mechanism, in accordance with a preferred embodiment of the invention.
  • FIG. 6 shows a functional block diagram/flowchart of a multipoint processing unit, in accordance with a preferred embodiment of the invention.
  • FIG. 7 shows a video display of a wireless device participating in a videoconference using the preferred embodiment of the present invention.
  • FIG. 8 shows a UMTS (3GPP) communication system adapted in accordance with the preferred embodiment of the present invention.
  • the preferred embodiment of the present invention proposes a new video switching mechanism for multimedia conferences that makes use of layered video coding.
  • layered video coding has only been used to partition a video bit stream into more than one layer: a base layer and one or several enhancement layers, as described above with respect to FIG. 4.
  • These known techniques for scalable video communication are described in detail in standards such as H.263 and MPEG-4.
  • the inventor of the present invention has recognised the benefits to be gained by adapting the concept of layered video coding and applying the adapted concepts to multimedia videoconference applications.
  • the present invention defines a different type of scalable video coding focused for use in multimedia conferences, in contrast to point-to-point or multicast video communication.
  • In FIG. 5, a functional block diagram 500 of a video switching mechanism is shown, in accordance with the preferred embodiment of the invention.
  • the MCU 520 for example positioned within an Internet protocol (IP) based network 510, contains a switch 530.
  • the MCU 520 receives 'layered' video streams including a base layer 552, 562, 572, 582 and one or more enhancement layer streams 555, 565, 575, 585 of all the participants (user equipment) 550, 560, 570, 580. Only one enhancement layer video stream per participant is shown, for clarity.
  • the MCU 520 may also receive, separately, a combined (multiplexed) audio stream 590 from the participants.
  • the MCU 520 selects the base layer video streams of a number of active speakers 535 and the enhancement layer 540 of the most active speaker, using switch 530.
  • the MCU 520 then sends these video streams 535, 540 to all the participants 550, 560, 570, 580.
  • the selection process to determine the most active speaker is preferably performed by the MCU 520 analysing the audio streams 590 in order to determine first who all the active speakers are.
  • the most active speaker is then preferably determined in the multipoint processor unit, as described with reference to FIG. 6.
  • the one or more base layers and one enhancement layer are preferably sent to the participants according to a priority level based on the activity of each participant.
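  • The selection step described above (base layers of several active speakers, plus the enhancement layer of the most active one, ordered by priority) can be sketched as follows. The activity scores, the `max_base_layers` cap and all names are assumptions made for illustration; the source does not specify how activity is quantified.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Selection:
    base_layers: Tuple[str, ...]       # participants, highest priority first
    enhancement_layer: Optional[str]   # most active speaker, if any


def select_streams(activity: Dict[str, float], max_base_layers: int = 3) -> Selection:
    """Pick which layered streams the MCU forwards to all participants.

    `activity` maps participant id -> assumed activity score (0 = silent).
    """
    active = [p for p, score in activity.items() if score > 0]
    # Highest activity first; ties broken by name for determinism.
    ranked = sorted(active, key=lambda p: (-activity[p], p))
    base = tuple(ranked[:max_base_layers])
    return Selection(base_layers=base, enhancement_layer=base[0] if base else None)
```

Inactive participants contribute no video at all, while only one enhancement layer is ever forwarded, which is where the bandwidth saving comes from.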
  • the multipoint processing unit (MP) 600 has been adapted to facilitate the new video switching mechanism, in accordance with a preferred embodiment of the invention and as shown in FIG. 6.
  • the MP 600 still receives the audio stream 590 from the participants' video/multimedia communication units, through a packet-filtering module 610, and routes this audio stream to a packet routing module 630. However, the audio stream is now also routed to a speaker identification module 620 that analyses the audio streams 590 in order to determine who the active speakers are.
  • the speaker identification module 620 allocates a priority level based on the activity of each participant and determines:
  • the speaker identification module 620 then forwards the priority level information to the switching module 640 that has been adapted to deal with priority level of speakers, in accordance with the preferred embodiment of the present invention. Furthermore, the switching module 640 has been adapted to receive layered video streams, including video base layer streams 552, 562, 572 and 582 and video enhancement layer streams 555, 565, 575 and 585 from the participants' video communication units through the packet filtering module 610. The switching module 640 uses this speaker information to send the video base layers of the secondary (lesser) active speakers and the most active speaker and only the video enhancement layer of the most active speaker, to all the participants, via the packet routing module 630.
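  • One way a speaker identification module could derive priority levels from audio is sketched below. The use of RMS energy, the threshold value and the three level names are assumptions for illustration only; the source says only that a priority level is allocated based on each participant's activity.

```python
import math
from typing import Dict, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square energy of one audio window."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def priority_levels(audio: Dict[str, Sequence[float]], threshold: float = 0.05) -> Dict[str, str]:
    """Classify each participant as most_active, secondary or inactive.

    `audio` maps participant id -> samples in the current window.
    `threshold` is a hypothetical silence threshold.
    """
    energy = {p: rms(s) for p, s in audio.items()}
    active = sorted(
        (p for p, e in energy.items() if e >= threshold),
        key=lambda p: (-energy[p], p),
    )
    levels = {p: "inactive" for p in audio}
    for rank, participant in enumerate(active):
        levels[participant] = "most_active" if rank == 0 else "secondary"
    return levels
```

A switching module could then forward base and enhancement layers for the "most_active" participant, base layers only for "secondary" participants, and nothing for "inactive" ones.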
  • the one or more receiving ports of the multipoint processor have therefore been adapted to receive layered video streams, including base layer video streams 552,
  • the switching module 640 may only select one base layer video image and corresponding one or more enhancement layers if it is determined that there is only one active speaker. This speaker then automatically is designated as the most active speaker for transmitting to one or more user equipment 550, 560, 570 and 580.
  • When the most active speaker is constantly changing, as can happen in videoconferences, the enhancement layer will be constantly switching.
  • the inventor of the present invention has recognised a potential problem with such constant and rapid switching. Under such circumstances, the first frame may need to be converted into an Intra frame (EI) if it was actually a predicted frame (EP) from a speaker who was previously only a secondary active speaker.
  • EI Intra frame
  • EP predicted frame
  • the video base layer streams 552, 562, 572 and 582 and video enhancement layer streams 555, 565, 575 and 585 from the packet-filtering module 610 are preferably input to a de-packetisation function 680.
  • the de-packetisation function 680 demultiplexes the video streams and provides the demultiplexed video streams to a video decoder and buffer function 670.
  • the video decoder and buffer function 670 receives the indication of the most active speaker 622. After extracting the video stream information for the most active speaker, the video decoder and buffer function 670 provides bi-directionally predicted (BP) 675 and/or predicted (EP) video stream data of the most active speaker 622 to an 'EP frame to EI frame Transcoding Module' 660.
  • BP bi-directionally predicted
  • EP predicted
  • The 'EP frame to EI frame Transcoding Module' 660 processes the input video streams to provide the primary speaker enhancement layer video stream, as an Intra-coded (EI) frame.
  • the primary speaker enhancement layer video stream is then input to a packetisation function 650, where it is packetised and input to the switching module 640.
  • the switching module 640 then combines the primary speaker enhancement layer video stream, with the video base layer streams 552, 562, 572 and 582 of the secondary active speakers and routes the combined multimedia stream to the packet routing module 630.
  • the packet routing module then routes the information to the participants in accordance with the method of FIG. 5.
  • the video switching module 640 uses the output of the 'EP frame to EI frame Transcoding module' 660 when it determines that the primary speaker has changed.
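  • The decision of when to use the transcoder output can be sketched as a small state machine. This is illustrative only: the class name and string frame labels are assumptions, the enhancement-layer Intra frame is written "EI" here, and the actual transcoding (decoding the predicted frame and re-encoding it as an Intra frame) is not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnhancementSwitch:
    """Tracks the primary speaker and decides which frame type to forward."""

    current_primary: Optional[str] = None

    def next_frame(self, primary: str, frame_type: str) -> str:
        """Return the frame type forwarded for this enhancement-layer frame.

        frame_type is "EI" (intra) or "EP" (predicted). The first frame
        after a change of primary speaker must be intra, so a predicted
        frame at that point is replaced by the transcoder output.
        """
        if frame_type not in ("EI", "EP"):
            raise ValueError(f"unknown frame type: {frame_type}")
        changed = primary != self.current_primary
        self.current_primary = primary
        if changed and frame_type == "EP":
            return "EI"  # use the transcoded intra frame
        return frame_type
```

Frames from an unchanged primary speaker pass through untouched, so the transcoder is only on the path for the first frame after each switch.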
  • module 660 could also be included in the MP 600 to perform the same function for the secondary speakers, when they are deemed to have changed.
  • the speaker identification module 620 (or switching module 640) may make a request for a new Intra-frame.
  • the switching module 640 may wait for a new Intra frame of the new secondary active speaker before sending the corresponding video base layer stream to all the participants.
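  • The alternative of waiting for an Intra frame before forwarding a new secondary speaker's base layer can be sketched as a gate. All names are hypothetical, and the returned list of participants merely stands in for whatever Intra-frame request mechanism the system actually uses.

```python
class BaseLayerGate:
    """Holds back a new speaker's base layer until an intra frame arrives."""

    def __init__(self):
        self.forwarding = set()  # speakers whose base layer is being sent
        self.pending = set()     # speakers waiting for their first intra frame

    def update_active(self, speakers):
        """Set the active speakers; return those needing an intra-frame request."""
        speakers = set(speakers)
        self.forwarding &= speakers
        new = speakers - self.forwarding - self.pending
        self.pending = (self.pending & speakers) | new
        return sorted(new)

    def on_frame(self, speaker, is_intra):
        """Return True if this base-layer frame should be forwarded."""
        if speaker in self.pending and is_intra:
            self.pending.discard(speaker)
            self.forwarding.add(speaker)
        return speaker in self.forwarding
```

The trade-off against transcoding is latency: the gate adds no processing at the MP, but a new speaker's video appears only once that speaker's encoder produces an Intra frame.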
  • more classes of speakers can be used. By using more classes of speakers, a finer scalability of the multimedia messages can be attained, as the identification of speakers is improved, especially for large videoconferences.
  • predicted frame to Intra frame conversion could be added for one or more of the base layer streams.
  • the switching module 640 can quickly switch between the base layers without having to wait for a new Intra frame.
  • FIG. 7 shows the video display 710 of a wireless device 700 taking part in a videoconference using the preferred embodiment of the present invention.
  • improved video communication is achieved.
  • the participants are now able to receive better video quality of the most active speaker 720, by lowering the video quality of the lesser (secondary) active speakers 730, and providing no video for the inactive speakers.
  • the video communication device receives the enhancement layer and base layer of the most active speaker 720, the base layers of the secondary active speakers 730 and no video from inactive speakers.
  • a video communication unit can provide a constantly updated video image of the most active speaker in a larger, higher resolution display, whilst smaller displays can display secondary (lesser) active speakers.
  • the wireless device 700 preferably has a primary video display 710 for displaying a higher quality video image of the most active speaker, and one or more second distinct displays for displaying respective lesser active speakers.
  • the manipulation of the respective video images into the respective displays is performed by a processor (not shown) that is operably coupled to the video displays.
  • the processor receives an indication of a most active speaker 720 and lesser active speakers, and determines which video image received should be displayed in the first display and which video image(s) received from the lesser active speakers 730 should be displayed in the second display.
  • the second display may be configured to provide a lower quality video image of the lesser active speakers, thereby saving cost.
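  • The display assignment performed by the device's processor can be sketched as below. The function name, the dictionary layout and the `secondary_slots` limit are illustrative assumptions; the source does not say how many secondary displays a device has.

```python
from typing import Dict, List, Optional, Sequence


def assign_displays(
    most_active: Optional[str],
    lesser_active: Sequence[str],
    secondary_slots: int = 3,  # hypothetical number of small displays
) -> Dict[str, object]:
    """Map received video images onto the primary and secondary displays.

    The most active speaker (base plus enhancement layer) goes to the
    primary display; lesser active speakers (base layer only) fill the
    secondary displays in the order given. Inactive speakers are not shown.
    """
    secondary: List[str] = [p for p in lesser_active if p != most_active]
    return {"primary": most_active, "secondary": secondary[:secondary_slots]}
```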
  • a preferred application of the aforementioned invention is in the Third Generation Partnership Project (3GPP) specification for wide-band code-division multiple access (WCDMA) standard.
  • 3GPP Third Generation Partnership Project
  • WCDMA wide-band code-division multiple access
  • the invention can be applied to the IP Multimedia Domain (described in the 3G TS 25.xxx series of specifications), which is planning to incorporate H.323/SIP MCU into the 3GPP network.
  • the MCU will be hosted by the Media Resource Function (MRF) 890A, see Figure 8.
  • MRF Media Resource Function
  • FIG. 8 shows a 3GPP (UMTS) communication system/network 800, in a hierarchical form, which is capable of being adapted in accordance with the preferred embodiment of the present invention.
  • the communication system 800 is compliant with, and contains network elements capable of operating over, a UMTS and/or a GPRS air-interface.
  • the network is conveniently considered as comprising: (i) a user equipment domain 810, made up of: (a) a USIM domain 820; and (b) a mobile equipment domain 830; and (ii) an infrastructure domain 840, made up of: (c) an access network domain 850; and (d) a core network domain 860, which is, in turn, made up of (at least): (di) a serving network domain 870; (dii) a transit network domain 880; and (diii) an IP multimedia domain 890, with multimedia being provided by SIP (IETF RFC 2543).
  • UE 830A receives data from a user SIM 820A in the USIM domain 820 via the wired Cu interface.
  • the UE 830A communicates data with a Node B 850A in the network access domain 850 via the wireless Uu interface.
  • the Node Bs 850A contain one or more transceiver units and communicate with the rest of the cell-based system infrastructure, for example RNC 850B, via an Iub interface, as defined in the UMTS specification.
  • the RNC 850B communicates with other RNCs (not shown) via the Iur interface.
  • the RNC 850B communicates with a SGSN 870A in the serving network domain 870 via the Iu interface.
  • the SGSN 870A communicates with a GGSN 870B via the Gn interface, and the SGSN 870A communicates with a VLR server 870C via the Gs interface.
  • the SGSN 870A communicates with the MCU (not shown) that resides within the media resource function (890A) in the IP Multimedia domain 890. The communication is performed via the Gi interface.
  • the GGSN 870B (and/or SGSN) is responsible for UMTS (or GPRS) interfacing with a Public Switched Data Network (PSDN) 880A such as the Internet or a Public Switched Telephone Network (PSTN).
  • PSDN Public Switched Data Network
  • PSTN Public Switched Telephone Network
  • the SGSN 870A performs a routing and tunnelling function for traffic within, say, a UMTS core network, whilst a GGSN 870B links to external packet networks, in this case ones accessing the UMTS mode of the system.
  • the RNC 850B is the UTRAN element responsible for the control and allocation of resources for numerous Node Bs 850A; typically 50 to 100 Node Bs may be controlled by one RNC 850B.
  • the RNC 850B also provides reliable delivery of user traffic over the air interfaces. RNCs communicate with each other (via the interface Iur) to support handover and macro diversity.
  • the SGSN 870A is the UMTS Core Network element responsible for Session Control and interface to the Location Registers (HLR and VLR) .
  • the SGSN is a large centralised controller for many RNCs .
  • the GGSN 870B is the UMTS Core Network element responsible for concentrating and tunnelling user data within the core packet network to the ultimate destination (e.g., an internet service provider (ISP) ) .
  • ISP internet service provider
  • user data includes multimedia and related signalling data to/from the IP multimedia domain 890.
  • the MRF is split into a Multimedia Resource Function Controller (MRFC) 892A and a Multimedia Resource Function Processor (MRFP) 891A.
  • MRFC 892A provides the Multimedia Resource Function Controller (MRFC) 892A.
  • MC Multipoint Controller
  • MP Multipoint Processor
  • the protocol used across the Mr reference point/interface 893A is SIP (as defined by RFC 2543) .
  • the call-state control function (CSCF) 895A acts as a call server and handles multimedia call signalling.
  • the elements SGSN 870A, GGSN 870B and all parts within the MRF 890A are adapted to facilitate multimedia messages as herein before described.
  • the UE 830A, Node B 850A and RNC 850B may also be adapted to facilitate improved multimedia messages as hereinbefore described.
  • the adaptation may be implemented in the respective communication units in any suitable manner.
  • new apparatus may be added to a conventional communication unit, or alternatively existing parts of a conventional communication unit may be adapted, for example by reprogramming one or more processors therein.
  • the required adaptation may be implemented in the form of processor-implementable instructions stored on a storage medium, such as a floppy disk, hard disk, PROM, RAM or any combination of these or other storage media. It is also within the contemplation of the invention that such adaptation of multimedia messages may alternatively be controlled, implemented in full or implemented in part by adapting any other suitable part of the communication system 800.
  • processing operations may be performed at any appropriate node such as any other appropriate type of base station, base station controller, mobile switching centre or operational and management controller, etc.
  • any appropriate node such as any other appropriate type of base station, base station controller, mobile switching centre or operational and management controller, etc.
  • the aforementioned steps may be carried out by various components distributed at different locations or entities within any suitable network or system.
  • the video conferencing method using layered video coding, preferably when applied in a centralised videoconference, as described above, provides the following advantages: (i) The identification of the speakers is much improved compared to traditional systems, because the bandwidth is shared to allow one or more enhancement layers and several base layers to be sent instead of only one full quality video stream. (ii) The video switching when the active speaker changes is much smoother using the inventive concepts herein described, because it defines several states: active speaker, second most active speakers and inactive speakers. (iii) The video quality of the most active speaker is improved.
  • Improved video communication units can display a variety of speakers, with each displayed image being dependent upon a priority level associated with the respective video communication unit's transmission.
  • a method of relaying video images in a multimedia videoconference between a plurality of multimedia user equipment includes the steps of transmitting layered video images by a number of the plurality of user equipment wherein the layered video images include a base layer and one or more enhancement layers and receiving the transmitted layered video images at a multipoint control unit.
  • a number of base layer video images of a number of active speakers, and one or more enhancement layers of a most active speaker, are selected.
  • the multipoint control unit transmits the number of base layer video images of a number of active speakers and one or more enhancement layers of the most active speaker to one or more of the plurality of multimedia user equipment.
  • a video conferencing arrangement for relaying video images between a plurality of user equipment has been described.
  • a wireless device for participating in a videoconference has been described, where a number of participants transmit video images.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Graphics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method of relaying video images in a multimedia videoconference between a plurality of multimedia user equipment (550, 560, 570, 580) includes the step of transmitting layered video images by a number of said plurality of user equipment wherein said layered video images include a base layer (552, 562, 572, 582) and one or more enhancement layers (555, 565, 575, 585). The transmitted layered video images are received at a multipoint control unit (520), where a number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of a most active speaker are selected. The multipoint control unit (520) transmits the base layer video images, and one or more enhancement layers (540) of the most active speaker, to one or more of the plurality of multimedia user equipment (550, 560, 570, 580). The identification of the speakers is much improved compared to traditional videoconference systems, as the available bandwidth is shared to allow one enhancement layer and several base layers to be sent, instead of only one full quality video stream.

Description

Video Conferencing System And Method Of Operation
Field of the Invention
This invention relates to video conferencing. The invention is applicable to, but not limited to, a video switching mechanism in H.323 and/or SIP based centralised videoconferences, using layered video coding.
Background of the Invention
As the pace of business accelerates, and relationships spread around the world, the need to bridge communication distances quickly and economically has become a major challenge. Bringing customers and staff together efficiently is critical to being successful in an ever more competitive marketplace. Businesses are looking for flexible solutions that support real-time information sharing across countries and continents using various communication methods, such as voice, video, image data and any combination thereof.
In particular, multi-national organisations have an increasing desire to eliminate costly travel and link multiple locations in order to let groups within the organisation communicate more efficiently and effectively. A multipoint conferencing system operating over an Internet protocol (IP) network seeks to address this need.
In the field of this invention, it is known that terminals exchange audio and video streams in real-time in multipoint videoconferences. The conventional method to set up multipoint conferences over an IP network is to use a Multipoint control unit (MCU). The MCU is an endpoint on the network that provides the capability for three or more terminals and/or communication gateways to participate in a multipoint conference. The MCU may also connect two terminals in a point-to-point conference such that they have the ability to evolve into a multipoint conference.
Referring first to FIG. 1, a known centralised conferencing model 100 is shown. Centralised conferences make use of an MCU-based conference bridge. All terminals (endpoints) 120, 122, 125 send and receive media information 130 in the form of audio, video, and/or data signals, as well as control information streams 140, to/from the MCU 110. These transmissions are done in a point-to-point fashion. This is shown in Figure 1.
An MCU 110 consists of a Multipoint Controller (MC), and zero or more Multipoint Processors (MP). The MC handles the call set-up and call signalling negotiations between all terminals, to determine common capabilities for audio and video processing. The MC 110 does not deal directly with any of the media streams. This is left to the MP, which mixes, switches, and processes audio, video, and/or data bits.
In this manner, MCUs provide the ability to host multi-location seminars, sales meetings, group conferences and other 'face-to-face' communications. It is also known that multipoint conferences can be used in various applications, for example:
(i) Executives and managers at multiple locations can meet 'face-to-face', share real-time information, and make decisions more quickly without any loss of time, expense, and demands of travelling;
(ii) Project teams and knowledge workers can coordinate individual tasks, and view and revise shared documents, presentations, designs, and files in a real time manner; and
(iii) Students, trainees, and employees at remote locations can access shared educational/training resources across any distance or time zones.
Consequently, it is envisaged that MCU-based systems will play an important role in multimedia communications over IP based networks in the future.
Such multimedia communication often employs video transmission. In such transmissions, a sequence of images, often referred to as frames, is transmitted between transmitting and receiving units. Multipoint multimedia conference systems can be set up using various methods, for example as specified by the H.323 and SIP session layer protocol standards. References for SIP can be found at: http://www.ietf.org/rfc/rfc2543.txt, and http://www.cs.columbia.edu/~hgs/sip.
Furthermore, for example in systems using ITU H.263 video compression [ITU-T Recommendation, H.263, 'Video Coding for Low Bit Rate Communication'], the first frame of a video sequence includes a comprehensive amount of image data, generally referred to as intra-coded information. The intra-coded frame, as it is the first frame, provides a substantial portion of the image to be displayed. This intra-coded frame is followed by inter-coded (predicted) information, which generally includes data relating to changes in the image that is being transmitted. Hence, predicted inter-coded information contains much less information than intra-coded information.
In traditional multimedia conferencing systems, the users need to identify themselves when they speak, so that the receiving terminals know who is speaking. Clearly, if the transmitting terminal fails to identify itself, the listening users will have to guess who is speaking.
A known technique solves this problem by analysing the audio streams and forwarding the name and video stream of the active speaker to all the participants. In a centralized conferencing system, the MCU often performs this function. The MCU can then send the name of the speaker and the corresponding video and audio stream to all the participants by switching the appropriate input multimedia stream to the output ports/paths.
Video switching is a well-known technique that aims at delivering to each endpoint a single video stream, equivalent to arranging multiple point-to-point sessions.
The video switching can be:
(i) Voice activated switching, where the MCU transmits the video of the active speaker.
(ii) Timed activated switching, where the video of each participant is transmitted one after another at a predetermined time interval.
(iii) Individual video selection switching, where each endpoint can request the participant video stream that he/she wishes to receive.
Referring now to FIG. 2, a functional diagram of a traditional video switching mechanism 200 is shown. In a traditional centralised conferencing system, the video switching is performed as follows. The MCU 220, for example positioned within an Internet protocol (IP) based network 210, contains a switch 230. The MCU 220 receives the video streams 255, 265, 275, 285 of all the participants (user equipment) 250, 260, 270, 280. The MCU may also receive, separately, a combined (multiplexed) audio stream 290 from the participants who are speaking. The MCU 220 then selects one of the video streams and sends this video stream 240 to all the participants 250, 260, 270, 280.
Such traditional systems have the disadvantage that they only send the video stream of the active speaker. The users might still have a problem in identifying the speaker of the video stream if several speakers are talking at the same time, or if the active speaker is constantly changing. This is particularly the case with large videoconferences.
Alternatively, the video of each participant can be sent to all the participants. However, this approach suffers in a wireless based conference due to the bandwidth limitation.
In the field of video technology, it is known that video is transmitted as a series of still images/pictures. Since the quality of a video signal can be affected during coding or compression of the video signal, it is known to include additional information 'layers' based on the difference between the video signal and the encoded video bit stream. The inclusion of additional layers enables the quality of the received signal, following decoding and/or decompression, to be enhanced. Hence, a hierarchy of pictures and enhancement pictures partitioned into one or more layers is used to produce a layered video bit stream.
In a layered (scalable) video bit stream, enhancements to the video signal may be added to the base layer either by:
(i) Increasing the resolution of the picture (spatial scalability);
(ii) Including error information to improve the Signal to Noise Ratio of the picture (SNR scalability); or
(iii) Including extra pictures to increase the frame rate (temporal scalability).
Such enhancements may be applied to the whole picture, or to an arbitrarily shaped object within the picture, which is termed object-based scalability. In order to preserve the disposable nature of the temporal enhancement layer, the H.263+ standard dictates that pictures included in the temporal scalability mode should be bi-directionally predicted (B) pictures, as shown in the video stream of FIG. 3.
FIG. 3 shows a schematic illustration of a scalable video arrangement 300 illustrating B picture prediction dependencies, as known in the field of video coding techniques. An initial intra-coded frame (I1) 310 is followed by a bi-directionally predicted frame (B2) 320. This, in turn, is followed by a (uni-directional) predicted frame (P3) 330, and again followed by a second bi-directionally predicted frame (B4) 340. This again, in turn, is followed by a (uni-directional) predicted frame (P5) 350, and so on.
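The prediction dependencies of FIG. 3 can be illustrated with a minimal Python sketch. It assumes the simplest reference rule (each P picture refers to the nearest earlier I or P picture, and each B picture to the nearest I or P picture on either side); the function names are hypothetical and are not taken from H.263+.

```python
def prediction_references(picture_types):
    """Map each picture index to the indices of the pictures it is predicted from.

    Assumed rule (illustrative only): P uses the nearest earlier I/P picture,
    B uses the nearest I/P picture on each side, I uses none.
    """
    anchors = [i for i, t in enumerate(picture_types) if t in ("I", "P")]
    refs = {}
    for i, kind in enumerate(picture_types):
        earlier = [a for a in anchors if a < i]
        later = [a for a in anchors if a > i]
        if kind == "I":
            refs[i] = []
        elif kind == "P":
            refs[i] = earlier[-1:]
        elif kind == "B":
            refs[i] = earlier[-1:] + later[:1]
        else:
            raise ValueError(f"unknown picture type: {kind!r}")
    return refs


def disposable_pictures(picture_types):
    """Return indices of pictures that no other picture is predicted from."""
    refs = prediction_references(picture_types)
    used = {r for targets in refs.values() for r in targets}
    return [i for i in range(len(picture_types)) if i not in used]


# The sequence I1, B2, P3, B4, P5 of FIG. 3, using zero-based indices.
sequence = ["I", "B", "P", "B", "P"]
print(prediction_references(sequence))
print(disposable_pictures(sequence))
```

Under this rule only the B pictures are unreferenced, which is why they can be dropped without affecting the decoding of the remaining pictures.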
FIG. 4 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques. A layered video bit stream includes a base layer 405 and one or more enhancement layers 435.
The base layer (layer 1) includes one or more intra-coded pictures (I pictures) 410 sampled, coded and/or compressed from the original video signal pictures. Furthermore, the base layer will include a plurality of predicted inter-coded pictures (P pictures) 420, 430 predicted from the intra-coded picture(s) 410.
In the enhancement layers (layers 2 or 3 or more) 435, three types of picture may be used:
(i) Bi-directionally predicted (B) pictures (not shown);
(ii) Enhanced intra (EI) pictures 440 based on the intra-coded picture(s) 410 of the base layer 405; and
(iii) Enhanced predicted (EP) pictures 450, 460, based on the inter-coded predicted pictures 420, 430 of the base layer 405.
The vertical arrows from the lower layer illustrate that the picture in the enhancement layer is predicted from a reconstructed approximation of that picture in the reference (lower) layer.
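Because each enhancement layer picture is predicted from the layer below it, a receiver can only make use of layer n if it also holds layers 1 to n-1. The following Python sketch illustrates that dependency; the function name and the integer layer numbering are assumptions made for illustration.

```python
def usable_layers(received_layers):
    """Return the layers that can be decoded, lowest first.

    Layer 1 is the base layer; layer n is assumed to be predicted from
    layer n-1, so decoding stops at the first missing layer.
    """
    usable = []
    layer = 1
    while layer in received_layers:
        usable.append(layer)
        layer += 1
    return usable


print(usable_layers({1, 2}))  # base layer plus one enhancement layer
print(usable_layers({1, 3}))  # layer 3 cannot be used without layer 2
print(usable_layers({2, 3}))  # nothing can be decoded without the base layer
```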
In summary, scalable video coding has been used with multicast multimedia conferences, and only in the context of point-to-point or multicast video communication. However, wireless networks do not currently support multicasting. Furthermore, with multicasting, each layer is sent in separate multicast sessions, with the receiver deciding itself whether to register to one or more sessions.
A need therefore exists for an improved video conferencing arrangement and method of operation, wherein the abovementioned disadvantages may be alleviated.
Statement of Invention
In accordance with the present invention there is provided a method of relaying video images in a multimedia videoconference, as claimed in claim 1, a video conferencing arrangement for relaying video images, as claimed in claim 7, a wireless device for participating in a videoconference, as claimed in claim 11, a multipoint processor, as claimed in claim 12, a video communication system, as claimed in claim 16, a media resource function, as claimed in claim 18, a video communication unit, as claimed in claim 19 or claim 20, and a storage medium, as claimed in claim 23. Further aspects of the present invention are as claimed in the dependent claims.
In summary, the inventive concepts of the present invention address the disadvantages of prior art arrangements by providing a video switching method to improve the identification of the participants and speakers in a videoconference. This invention makes use of layered video coding, in order to provide a better usage of the bandwidth available for each user.
Brief Description of the Drawings
FIG. 1 shows a known centralised conferencing model.
FIG. 2 shows a functional diagram of a traditional video switching mechanism.
FIG. 3 is a schematic illustration of a video arrangement showing picture prediction dependencies, as known in the field of video coding techniques.
FIG. 4 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques.
Exemplary embodiments of the present invention will now be described, with reference to the accompanying drawings, in which:
FIG. 5 shows a functional diagram of a video switching mechanism, in accordance with a preferred embodiment of the invention.
FIG. 6 shows a functional block diagram/flowchart of a multipoint processing unit, in accordance with a preferred embodiment of the invention.
FIG. 7 shows a video display of a wireless device participating in a videoconference using the preferred embodiment of the present invention.
FIG. 8 shows a UMTS (3GPP) communication system adapted in accordance with the preferred embodiment of the present invention.
Description of Preferred Embodiments
In summary, the preferred embodiment of the present invention proposes a new video switching mechanism for multimedia conferences that makes use of layered video coding. Previously, layered video coding has only been used to partition a video bit stream into more than one layer: a base layer and one or several enhancement layers, as described above with respect to FIG. 4. These known techniques for scalable video communication are described in detail in standards such as H.263 and MPEG-4.
However, the inventor of the present invention has recognised the benefits to be gained by adapting the concept of layered video coding and applying the adapted concepts to multimedia videoconference applications. In this manner, the present invention defines a different type of scalable video coding focused for use in multimedia conferences, in contrast to point-to-point or multicast video communication.
Referring now to FIG. 5, a functional block diagram 500 of a video switching mechanism is shown, in accordance with the preferred embodiment of the invention. In contrast to a traditional centralised conferencing system, the video switching is performed as follows. The MCU 520, for example positioned within an Internet protocol (IP) based network 510, contains a switch 530.
It is noteworthy that the MCU 520 receives 'layered' video streams including a base layer 552, 562, 572, 582 and one or more enhancement layer streams 555, 565, 575, 585 of all the participants (user equipment) 550, 560, 570, 580. Only one enhancement layer video stream per participant is shown, for clarity.
The MCU 520 may also receive, separately, a combined (multiplexed) audio stream 590 from the participants. The MCU 520 then selects the base layer video streams of a number of active speakers 535 and the enhancement layer 540 of the most active speaker, using switch 530. The MCU 520 then sends these video streams 535, 540 to all the participants 550, 560, 570, 580.
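The selection step described above can be sketched in Python as follows. This is a minimal illustration, not the MCU's actual implementation: the function name, the participant identifiers and the `(participant, layer)` tuple representation are assumptions.

```python
def select_streams(active_speakers, most_active):
    """Choose which layered streams to forward to all participants.

    Forwards the base layer of every active speaker and, in addition,
    the enhancement layer of the most active speaker only.
    """
    if most_active not in active_speakers:
        raise ValueError("the most active speaker must be an active speaker")
    selected = [(participant, "base") for participant in active_speakers]
    selected.append((most_active, "enhancement"))
    return selected


# Hypothetical conference: participants 550, 560 and 570 are speaking,
# 580 is silent, and 560 is the most active speaker.
for participant, layer in select_streams([550, 560, 570], most_active=560):
    print(participant, layer)
```

Inactive participants such as 580 contribute no video to the forwarded set, so the bandwidth they would have used is available to the streams that are sent.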
The selection process to determine the most active speaker is preferably performed by the MCU 520 analysing the audio streams 590 in order to determine first who all the active speakers are. The most active speaker is then preferably determined in the multipoint processor unit, as described with reference to FIG. 6. The one or more base layers and one enhancement layer are preferably sent to the participants according to a priority level based on the activity of each participant.
In order to effect the improved, but more complex, video switching mechanism of FIG. 5, the multipoint processing unit (MP) 600 has been adapted to facilitate the new video switching mechanism, in accordance with a preferred embodiment of the invention and as shown in FIG. 6.
The MP 600 still receives the audio stream 590 from the participants' video/multimedia communication units, through a packet-filtering module 610 and routes this audio stream to a packet routing module 630. However, the audio stream is now also routed to a speaker identification module 620 that analyses the audio streams 590 in order to determine who are the active speakers. The speaker identification module 620 allocates a priority level based on the activity of each participant and determines:
(i) The most active speaker 622,
(ii) Any other active speakers 625, and by default
(iii) Any remaining inactive speakers.
The speaker identification module 620 then forwards the priority level information to the switching module 640 that has been adapted to deal with priority level of speakers, in accordance with the preferred embodiment of the present invention. Furthermore, the switching module 640 has been adapted to receive layered video streams, including video base layer streams 552, 562, 572 and 582 and video enhancement layer streams 555, 565, 575 and 585 from the participants' video communication units through the packet filtering module 610. The switching module 640 uses this speaker information to send the video base layers of the secondary (lesser) active speakers and the most active speaker and only the video enhancement layer of the most active speaker, to all the participants, via the packet routing module 630.
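The three-way classification made by the speaker identification module can be sketched as below. The text does not specify how audio activity is measured, so the numeric activity score, the threshold and the function name are all assumptions made purely for illustration.

```python
def classify_speakers(activity, threshold=0.0):
    """Split participants into most active, other active and inactive.

    `activity` maps a participant to a hypothetical activity score
    (for example, the fraction of a recent window spent speaking).
    Participants at or below `threshold` are treated as inactive.
    """
    active = [p for p, level in activity.items() if level > threshold]
    # Highest activity first; this ordering is the priority level.
    active.sort(key=lambda p: activity[p], reverse=True)
    most_active = active[0] if active else None
    other_active = active[1:]
    inactive = [p for p, level in activity.items() if level <= threshold]
    return most_active, other_active, inactive


scores = {"A": 0.7, "B": 0.0, "C": 0.2, "D": 0.9}
most_active, other_active, inactive = classify_speakers(scores)
print(most_active, other_active, inactive)
```

If only one participant is active, `other_active` is empty and that participant is the most active speaker by default.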
The one or more receiving ports of the multipoint processor have therefore been adapted to receive layered video streams, including base layer video streams 552, 562, 572 and 582 and enhancement layer video streams 555, 565, 575 and 585, from a plurality of user equipment 550, 560, 570 and 580. It is within the contemplation of the invention that the switching module 640 may only select one base layer video image and corresponding one or more enhancement layers if it is determined that there is only one active speaker. This speaker is then automatically designated as the most active speaker for transmitting to one or more user equipment 550, 560, 570 and 580.
When the most active speaker is constantly changing, as can happen in videoconferences, the enhancement layer will be constantly switching. The inventor of the present invention has recognised a potential problem with such constant and rapid switching. Under such circumstances, the first frame may need to be converted into an Intra frame (EI) if it was actually a predicted frame (EP) from a speaker who was previously only a secondary active speaker.
To address this potential problem, the video base layer streams 552, 562, 572 and 582 and video enhancement layer streams 555, 565, 575 and 585 from the packet-filtering module 610 are preferably input to a de-packetisation function 680. The de-packetisation function 680 demultiplexes the video streams and provides the demultiplexed video streams to a video decoder and buffer function 670.
To synchronise and co-ordinate the video decoding, the video decoder and buffer function 670 receives the indication of the most active speaker 622. After extracting the video stream information for the most active speaker, the video decoder and buffer function 670 provides bi-directionally predicted (BP) 675 and/or predicted (EP) video stream data of the most active speaker 622 to an 'EP frame to EI frame Transcoding Module' 660. The 'EP frame to EI frame Transcoding Module' 660 processes the input video streams to provide the primary speaker enhancement layer video stream, as an Intra-coded (EI) frame.
The primary speaker enhancement layer video stream is then input to a packetisation function 650, where it is packetised and input to the switching module 640. The switching module 640 then combines the primary speaker enhancement layer video stream, with the video base layer streams 552, 562, 572 and 582 of the secondary active speakers and routes the combined multimedia stream to the packet routing module 630. The packet routing module then routes the information to the participants in accordance with the method of FIG. 5.
In the preferred embodiment of the present invention, the video switching module 640 uses the output of the 'EP frame to EI frame Transcoding module' 660 when it determines that the primary speaker has changed.
It is within the contemplation of the invention that one or more modules that are similar to module 660 could also be included in the MP 600 to perform the same function for the secondary speakers, when they are deemed to have changed. Otherwise, in the embodiment that uses a single 'EP frame to EI frame Transcoding module' 660 to transcode the video stream of only the primary speaker, when say an inactive speaker becomes a secondary active speaker, the speaker identification module 620 (or switching module 640) may make a request for a new Intra-frame. Alternatively, the switching module 640 may wait for a new Intra frame of the new secondary active speaker before sending the corresponding video base layer stream to all the participants.
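The last alternative, in which the switching module waits for a new Intra frame before forwarding a newly active speaker's base layer, can be sketched as a simple gate. The frame-type strings and the function name are hypothetical; a real implementation would inspect picture headers in the bit stream rather than a list of labels.

```python
def forward_from_first_intra(frame_types):
    """Hold back base layer frames until the first intra-coded frame.

    Predicted (P) frames that arrive before an intra-coded (I) frame cannot
    be decoded by receivers that lack the earlier reference picture, so they
    are dropped; everything from the first I frame onwards is forwarded.
    """
    forwarded = []
    started = False
    for kind in frame_types:
        if kind == "I":
            started = True
        if started:
            forwarded.append(kind)
    return forwarded


# A participant becomes a secondary active speaker mid-sequence.
print(forward_from_first_intra(["P", "P", "I", "P", "P"]))
```

The cost of this approach is a delay of up to one intra-frame period before the new speaker appears, which is the delay that requesting a new Intra frame, or transcoding, is intended to avoid.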
In addition to the preferred embodiment of the present invention, where more than one enhancement layer is available for use, it is within the contemplation of the invention that more classes of speakers can be used. By using more classes of speakers, a finer scalability of the multimedia messages can be attained, as the identification of speakers is improved, especially for large videoconferences.
It is also within the contemplation of the invention that predicted frame to Intra frame conversion could be added for one or more of the base layer streams. In this manner, the switching module 640 can quickly switch between the base layers without having to wait for a new Intra frame.
FIG. 7 shows the video display 710 of a wireless device 700 taking part in a videoconference using the preferred embodiment of the present invention. By implementing the inventive concepts hereinbefore described, improved video communication is achieved. In particular, for a given bandwidth, the participants are now able to receive better video quality of the most active speaker 720, by lowering the video quality of the lesser (secondary) active speakers 730, and providing no video for the inactive speakers. In order to provide such improved video conferencing, the video communication device receives the enhancement layer and base layer of the most active speaker 720, the base layers of the secondary active speakers 730 and no video from inactive speakers.
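The bandwidth trade-off can be made concrete with a short calculation. The bit rates below are hypothetical figures chosen only to make the arithmetic easy to follow; they are not taken from this document or from any standard.

```python
def max_base_layers(budget_kbps, base_kbps, enhancement_kbps):
    """Number of speakers whose base layer fits in a downlink budget.

    One enhancement layer (for the most active speaker) is always reserved,
    together with that speaker's own base layer. Returns 0 if even that
    does not fit.
    """
    if base_kbps <= 0 or enhancement_kbps < 0:
        raise ValueError("bit rates must be positive")
    remaining = budget_kbps - base_kbps - enhancement_kbps
    if remaining < 0:
        return 0
    # The most active speaker plus as many secondary speakers as still fit.
    return 1 + remaining // base_kbps


# Hypothetical rates: 16 kbit/s per base layer, 48 kbit/s for the enhancement layer.
print(max_base_layers(128, 16, 48))
print(max_base_layers(64, 16, 48))
```

With these assumed rates, a 128 kbit/s budget carries the most active speaker at full quality plus four secondary speakers at base quality, whereas sending every speaker at full quality (64 kbit/s each) would fit only two.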
In such a manner, a video communication unit can provide a constantly updated video image of the most active speaker in a larger, higher resolution display, whilst smaller displays can display secondary (lesser) active speakers.
The wireless device 700 preferably has a primary video display 710 for displaying a higher quality video image of the most active speaker, and one or more second distinct displays for displaying respective lesser active speakers. Preferably, the manipulation of the respective video images into the respective displays is performed by a processor (not shown) that is operably coupled to the video displays. The processor receives an indication of a most active speaker 720 and lesser active speakers, and determines which video image received should be displayed in the first display and which video image (s) received from the lesser active speakers 730 should be displayed in the second display. Advantageously, the second display may be configured to provide a lower quality video image of the lesser active speakers, thereby saving cost.
It is anticipated that MCU-based systems will facilitate multimedia communications over IP based networks in the future. Therefore, the inventor of the present invention envisages that the herein described techniques could be incorporated in any H.323/SIP based multipoint multimedia conferences or systems that make use of an MCU.
A preferred application of the aforementioned invention is in the Third Generation Partnership Project (3GPP) specification for the wide-band code-division multiple access (WCDMA) standard. In particular, the invention can be applied to the IP Multimedia Domain (described in the 3G TS 25.xxx series of specifications), which is planning to incorporate H.323/SIP MCU into the 3GPP network. The MCU will be hosted by the Media Resource Function (MRF) 890A, see Figure 8.
FIG. 8 shows a 3GPP (UMTS) communication system/network 800, in a hierarchical form, which is capable of being adapted in accordance with the preferred embodiment of the present invention. The communication system 800 is compliant with, and contains network elements capable of operating over, a UMTS and/or a GPRS air-interface.
The network is conveniently considered as comprising:
(i) A user equipment domain 810, made up of:
(a) A user SIM (USIM) domain 820, and
(b) A mobile equipment domain 830; and
(ii) An infrastructure domain 840, made up of:
(c) An access network domain 850, and
(d) A core network domain 860, which is, in turn, made up of (at least):
(di) a serving network domain 870,
(dii) a transit network domain 880, and
(diii) an IP multimedia domain 890, with multimedia being provided by SIP (IETF RFC 2543).
In the mobile equipment domain 830, UE 830A receives data from a user SIM 820A in the USIM domain 820 via the wired Cu interface. The UE 830A communicates data with a Node B 850A in the network access domain 850 via the wireless Uu interface. Within the network access domain 850, the Node Bs 850A contain one or more transceiver units and communicate with the rest of the cell-based system infrastructure, for example RNC 850B, via an Iub interface, as defined in the UMTS specification.
The RNC 850B communicates with other RNCs (not shown) via the Iur interface. The RNC 850B communicates with an SGSN 870A in the serving network domain 870 via the Iu interface. Within the serving network domain 870, the SGSN 870A communicates with a GGSN 870B via the Gn interface, and the SGSN 870A communicates with a VLR server 870C via the Gs interface. In accordance with the preferred embodiment of the present invention, the SGSN 870A communicates with the MCU (not shown) that resides within the media resource function 890A in the IP multimedia domain 890. The communication is performed via the Gi interface.
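The chain of interfaces described above can be sketched as a small graph. This is an illustrative sketch only: the link table is transcribed from the description of FIG. 8, whereas the function names `build_graph` and `interface_path` are hypothetical. The Iur interface is omitted because the peer RNCs are not shown in the figure.

```python
from collections import deque

# (element, element, interface) triples as stated in the description.
LINKS = [
    ("USIM 820A", "UE 830A", "Cu"),
    ("UE 830A", "Node B 850A", "Uu"),
    ("Node B 850A", "RNC 850B", "Iub"),
    ("RNC 850B", "SGSN 870A", "Iu"),
    ("SGSN 870A", "GGSN 870B", "Gn"),
    ("SGSN 870A", "VLR 870C", "Gs"),
    ("SGSN 870A", "MRF 890A", "Gi"),
]


def build_graph(links):
    """Build an undirected adjacency map: element -> [(neighbour, interface)]."""
    graph = {}
    for a, b, iface in links:
        graph.setdefault(a, []).append((b, iface))
        graph.setdefault(b, []).append((a, iface))
    return graph


def interface_path(links, src, dst):
    """Breadth-first search; returns the interfaces crossed from src to dst."""
    graph = build_graph(links)
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        node, path = queue.popleft()
        if node == dst:
            return path
        for nxt, iface in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [iface]))
    return None  # no route between the two elements


if __name__ == "__main__":
    print(interface_path(LINKS, "UE 830A", "MRF 890A"))
```

Under these assumptions, media from the UE 830A reaches the MCU hosted in the MRF 890A by crossing the Uu, Iub, Iu and Gi interfaces in turn.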
The GGSN 870B (and/or SGSN) is responsible for UMTS (or GPRS) interfacing with a Public Switched Data Network (PSDN) 880A such as the Internet or a Public Switched Telephone Network (PSTN). The SGSN 870A performs a routing and tunnelling function for traffic within, say, a UMTS core network, whilst a GGSN 870B links to external packet networks, in this case ones accessing the UMTS mode of the system.
The RNC 850B is the UTRAN element responsible for the control and allocation of resources for numerous Node Bs 850A; typically 50 to 100 Node Bs may be controlled by one RNC 850B. The RNC 850B also provides reliable delivery of user traffic over the air interfaces. RNCs communicate with each other (via the interface Iur) to support handover and macro diversity.
The SGSN 870A is the UMTS Core Network element responsible for Session Control and interface to the Location Registers (HLR and VLR). The SGSN is a large centralised controller for many RNCs.
The GGSN 870B is the UMTS Core Network element responsible for concentrating and tunnelling user data within the core packet network to the ultimate destination (e.g., an internet service provider (ISP)). Such user data includes multimedia and related signalling data to/from the IP multimedia domain 890. Within the IP multimedia domain 890, the MRF is split into a Multimedia Resource Function Controller (MRFC) 892A and a Multimedia Resource Function Processor (MRFP) 891A. The MRFC 892A provides the Multipoint Controller (MC) functionalities, whereas the MRFP 891A provides the Multipoint Processor (MP) functionalities, as described previously.
The protocol used across the Mr reference point/interface 893A is SIP (as defined by RFC 2543). The call-state control function (CSCF) 895A acts as a call server and handles multimedia call signalling.
Thus, in accordance with the preferred embodiment of the invention, the elements SGSN 870A, GGSN 870B and all parts within the MRF 890A are adapted to facilitate multimedia messages as hereinbefore described. Furthermore, the UE 830A, Node B 850A and RNC 850B may also be adapted to facilitate improved multimedia messages as hereinbefore described.
More generally, the adaptation may be implemented in the respective communication units in any suitable manner. For example, new apparatus may be added to a conventional communication unit, or alternatively existing parts of a conventional communication unit may be adapted, for example by reprogramming one or more processors therein. As such, the required adaptation may be implemented in the form of processor-implementable instructions stored on a storage medium, such as a floppy disk, hard disk, PROM, RAM or any combination of these or other storage media. It is also within the contemplation of the invention that such adaptation of multimedia messages may alternatively be controlled, implemented in full or implemented in part by adapting any other suitable part of the communication system 800.
Although the above elements are typically provided as discrete and separate units (on their own respective software/hardware platforms), divided across the mobile equipment domain 830, access network domain 850 and the serving network domain 870, it is envisaged that other configurations can be applied.
Further, in the case of other network infrastructures, such as a GSM network, implementation of the processing operations may be performed at any appropriate node such as any other appropriate type of base station, base station controller, mobile switching centre or operational and management controller, etc. Alternatively, the aforementioned steps may be carried out by various components distributed at different locations or entities within any suitable network or system.
The video conferencing method using layered video coding, preferably when applied in a centralised videoconference, as described above, provides the following advantages:
(i) The identification of the speakers is much improved compared to traditional systems, because the bandwidth is shared to allow one or more enhancement layers and several base layers to be sent instead of only one full quality video stream.
(ii) The video switching when the active speaker changes is much smoother using the inventive concepts herein described, because the method defines several states: active speaker, second most active speakers and inactive speakers.
(iii) The video quality of the most active speaker is improved.
(iv) Improved video communication units can display a variety of speakers, with each displayed image being dependent upon a priority level associated with the respective video communication unit's transmission.
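The speaker states referred to in advantage (ii) may be illustrated by the following minimal sketch. It assumes that each participant's audio stream has already been reduced to a single activity score; that score, the threshold value and the names `SpeakerState` and `classify_speakers` are assumptions made for illustration, and are not taken from the described embodiment.

```python
from enum import IntEnum


class SpeakerState(IntEnum):
    """Priority levels; a higher value means a higher priority (hypothetical)."""
    INACTIVE = 0
    ACTIVE = 1
    MOST_ACTIVE = 2


def classify_speakers(activity, max_active=3, threshold=0.1):
    """Map each participant to a state from a per-participant activity score.

    activity   -- dict: participant id -> audio activity score (assumed metric)
    max_active -- how many speakers may be treated as active at once
    threshold  -- scores below this value are treated as silence
    """
    # Rank participants whose score clears the threshold, loudest first.
    ranked = sorted(
        (p for p, score in activity.items() if score >= threshold),
        key=lambda p: activity[p],
        reverse=True,
    )
    states = {p: SpeakerState.INACTIVE for p in activity}
    for rank, participant in enumerate(ranked[:max_active]):
        states[participant] = (
            SpeakerState.MOST_ACTIVE if rank == 0 else SpeakerState.ACTIVE
        )
    return states


if __name__ == "__main__":
    scores = {"UE 550": 0.9, "UE 560": 0.4, "UE 570": 0.05, "UE 580": 0.3}
    for ue, state in classify_speakers(scores, max_active=2).items():
        print(ue, state.name)
```

Keeping an intermediate ACTIVE state between MOST_ACTIVE and INACTIVE is what allows a change of speaker to be handled as a change of priority level rather than as an abrupt switch between video streams.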
A method of relaying video images in a multimedia videoconference between a plurality of multimedia user equipment has been described. The method includes the steps of transmitting layered video images by a number of the plurality of user equipment, wherein the layered video images include a base layer and one or more enhancement layers, and receiving the transmitted layered video images at a multipoint control unit. A number of base layer video images of a number of active speakers, and one or more enhancement layers of a most active speaker, are selected. The multipoint control unit transmits the number of base layer video images of a number of active speakers and one or more enhancement layers of the most active speaker to one or more of the plurality of multimedia user equipment.
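The selecting step summarised above may be sketched as follows. This is a minimal illustration under stated assumptions: the type `LayeredStream`, the function `select_layers` and the parameter `max_base` are hypothetical names, the speakers are assumed to have been ranked already, and the most active speaker is assumed to be one of the active speakers whose base layer is forwarded.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LayeredStream:
    """Layered video received from one user equipment (hypothetical type)."""
    sender: str
    base: bytes                     # base layer payload
    enhancements: Tuple[bytes, ...]  # zero or more enhancement layer payloads


def select_layers(streams, ranked_speakers, max_base=3):
    """Choose what the multipoint control unit forwards.

    streams         -- dict: sender id -> LayeredStream
    ranked_speakers -- sender ids ordered from most to least active
    max_base        -- number of active speakers whose base layer is sent
    """
    active = [s for s in ranked_speakers if s in streams][:max_base]
    if not active:
        return {"base": {}, "enhancement": {}}
    most_active = active[0]
    return {
        # Base layers of all active speakers, the most active one included.
        "base": {s: streams[s].base for s in active},
        # Enhancement layers are forwarded for the most active speaker only.
        "enhancement": {most_active: list(streams[most_active].enhancements)},
    }


if __name__ == "__main__":
    received = {
        "UE 550": LayeredStream("UE 550", b"base-550", (b"enh-550",)),
        "UE 560": LayeredStream("UE 560", b"base-560", (b"enh-560a", b"enh-560b")),
        "UE 570": LayeredStream("UE 570", b"base-570", (b"enh-570",)),
    }
    print(select_layers(received, ["UE 560", "UE 550", "UE 570"], max_base=2))
```

The enhancement layers of the remaining speakers are simply not forwarded, which is how the available bandwidth is shared between one higher quality image and several lower quality images.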
In addition, a video conferencing arrangement for relaying video images between a plurality of user equipment has been described. Furthermore, a wireless device for participating in a videoconference has been described, where a number of participants transmit video images.

Claims

1. A method of relaying video images in a multimedia videoconference between a plurality of multimedia user equipment (550, 560, 570, 580), the method comprising the steps of: transmitting layered video images by a number of said plurality of user equipment, wherein said layered video images include a base layer (552, 562, 572, 582) and one or more enhancement layers (555, 565, 575, 585); receiving said transmitted layered video images at a multipoint control unit (520); selecting a number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of a most active speaker; and transmitting, by said multipoint control unit (520), said number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of the most active speaker to one or more of the plurality of multimedia user equipment (550, 560, 570, 580).
2. The method of relaying video images in a multimedia videoconference according to Claim 1, wherein the step of selecting further comprises the step of: analysing a number of audio data streams (590) , transmitted by said plurality of multimedia user equipment (550, 560, 570, 580), in order to determine the number of active speakers and/or said most active speaker.
3. The method of relaying video images in a multimedia videoconference according to Claim 1 or Claim 2, the method further characterised by the step of : assigning a priority level to each layered video image and/or said audio data stream transmitted by a respective user equipment; and selecting a number of base layer video images (535) and one or more enhancement layers (540) for transmitting to said one or more of said plurality of multimedia user equipment (550, 560, 570, 580), based on said assigned priority level .
4. The method of relaying video images in a multimedia videoconference according to any preceding Claim, the method further characterised by the step of : transcoding (660) a first predicted frame of a video image of the most active speaker to an intra-coded frame, for enhancing the video quality of the most active speaker.
5. The method of relaying video images in a multimedia videoconference according to any preceding Claim, the method further characterised by the step of : receiving by said multipoint control unit (520) , when more than one enhancement layer is available, an indication of a class of said one or more speakers with each layered video image transmission, in order to provide a finer scalability of said video images.
6. The method of relaying video images in a multimedia videoconference according to any preceding Claim, the method further characterised by the step of: converting a predicted frame into an Intra-coded frame for one or more base layer video streams.
7. A video conferencing arrangement for relaying video images between a plurality of user equipment (550, 560, 570, 580), the video conferencing arrangement comprising: a multipoint control unit (520), adapted to receive a number of layered video images transmitted by a number of said plurality of user equipment, wherein said layered video images include a base layer (552, 562, 572, 582) and one or more enhancement layers (555, 565, 575, 585); and a video switching module (530), operably coupled to said multipoint control unit (520) and adapted to select a number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of a most active speaker; wherein said multipoint control unit (520) is further adapted to transmit said number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of the most active speaker to one or more of the plurality of user equipment (550, 560, 570, 580).
8. The video conferencing arrangement according to Claim 7, further characterised by: a predicted frame to intra-coded frame transcoding module (660) , operably coupled to said video switching module (530) , to provide a most active speaker enhancement layer video stream, as an Intra-coded frame, if said multipoint control unit (520) received said frame initially as a predicted frame.
9. The video conferencing arrangement according to Claim 7 or Claim 8, further characterised by: a speaker identification module (620) that analyses a number of audio streams (590) in order to determine a number of active speakers and/or said most active speaker.
10. The video conferencing arrangement according to Claim 9, wherein said speaker identification module (620) allocates a priority level based on a determined activity of each participant to determine one or more of: a most active speaker (622) , any other active speakers (625) , and any inactive speakers.
11. A wireless device (700) for participating in a videoconference where a plurality of participants transmit video images, the wireless device (700) comprising: a video display (710) having a first display and one or more second distinct displays for displaying respective participants (720, 730) from the plurality of participants; and a processor, operably coupled to said video display, for receiving an indication of a most active speaker (720) and less active speakers (730) , and determining that said video image received from said most active speaker (720) should be displayed in said first display offering a higher quality video image, and said video images received from said number of said lesser active speakers (730) should be displayed in said one or more second display offering a lower quality video image.
12. A multipoint processor comprising: one or more receiving ports adapted to receive layered video streams, including base layer video streams (552, 562, 572, 582) and enhancement layer video streams (555, 565, 575, 585), from a plurality of user equipment (550, 560, 570, 580); and a switching module (640), operably coupled to said one or more receiving ports, selecting a number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of the most active speaker, for transmitting to one or more user equipment (550, 560, 570, 580).
13. The multipoint processor according to Claim 12, further characterised by: a speaker identification module (620) operably coupled to said one or more receiving ports for analysing a number of audio streams (590) received from a number of said plurality of user equipment, in order to determine a number of active speakers and/or said most active speaker.
14. The multipoint processor according to Claim 12 or Claim 13, wherein said speaker identification module (620) allocates a priority level based on a determined activity of a number of participants to determine one or more of the following: a most active speaker (622), any other active speakers (625), and any inactive speakers.
15. The multipoint processor according to any of preceding Claims 12 to 14, further characterised by: a predicted frame to intra-coded frame transcoding module (660), operably coupled to said switching module (640), to convert the enhancement layer video stream of said most active speaker to an Intra-coded frame if it has been received at a respective port as a predicted frame.
16. A video communication system adapted to perform the method steps of any of claims 1 to 6, or adapted to incorporate the video conferencing arrangement of any of Claims 7 to 10, or adapted to incorporate the multipoint processor of any of Claims 12 to 15.
17. The video communication system according to Claim 16, wherein the video communication system is compatible with the UMTS communication standard (800) having an Internet Protocol multimedia domain (890) to facilitate videoconferencing communication.
18. A media resource function (890A) adapted to perform the method steps of any of claims 1 to 6, or adapted to incorporate the video conferencing arrangement of any of Claims 7 to 10, or adapted to incorporate the multipoint processor of any of Claims 12 to 15.
19. A video communication unit (700) adapted to receive layered videoconference images generated in accordance with the method of claims 1 to 6.
20. A video communication unit adapted to generate layered videoconference images for use in the method of claims 1 to 6, or to transmit layered videoconference images generated in accordance with the method of claims 1 to 6.
21. The video communication unit according to Claim 19, wherein the video communication unit is one of: a Node B (850A) , a RNC (850B) , a SGSN (870A) , a GGSN (870B) , a MRF (890A) .
22. The method of relaying video images in a multimedia videoconference of claims 1 to 6 or the video conferencing arrangement of any of Claims 7 to 10, or multipoint processor of any of Claims 12 to 15 or the video communication system of Claim 16 or 17, or the media resource function (890A) of Claim 18 or the video communication unit of Claim 19, 20, or 21, adapted to facilitate videoconference images based on the H.323 standard or SIP standard.
23. A storage medium storing processor-implementable instructions for controlling a processor to carry out the method of any of claims 1 to 6.
PCT/EP2002/014337 2002-01-30 2002-12-16 Video conferencing and method of operation WO2003065720A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2003565169A JP2005516557A (en) 2002-01-30 2002-12-16 Video conferencing system and operation method
KR10-2004-7011846A KR20040079973A (en) 2002-01-30 2002-12-16 Video conferencing and method of operation
FI20041039A FI20041039A (en) 2002-01-30 2004-07-29 Video conferencing and implementation method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0202101.2 2002-01-30
GB0202101A GB2384932B (en) 2002-01-30 2002-01-30 Video conferencing system and method of operation

Publications (1)

Publication Number Publication Date
WO2003065720A1 true WO2003065720A1 (en) 2003-08-07

Family

ID=9930013

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2002/014337 WO2003065720A1 (en) 2002-01-30 2002-12-16 Video conferencing and method of operation

Country Status (7)

Country Link
JP (1) JP2005516557A (en)
KR (1) KR20040079973A (en)
CN (1) CN1618233A (en)
FI (1) FI20041039A (en)
GB (1) GB2384932B (en)
HK (1) HK1058450A1 (en)
WO (1) WO2003065720A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0711080A2 (en) * 1994-11-01 1996-05-08 AT&T Corp. Picture composition with coded picture data streams for multimedia communications systems
WO1998026592A1 (en) * 1996-12-09 1998-06-18 Siemens Aktiengesellschaft Method and telecommunications system for supporting multimedia services via an interface and a correspondingly configured subscriber terminal
EP0905976A1 (en) * 1997-03-17 1999-03-31 Matsushita Electric Industrial Co., Ltd. Method of processing, transmitting and receiving dynamic image data and apparatus therefor

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0654322A (en) * 1992-07-28 1994-02-25 Fujitsu Ltd System for controlling picture data adaption in tv conference using multi-spot controller
DE69515838T2 (en) * 1995-01-30 2000-10-12 International Business Machines Corp., Armonk Priority-controlled transmission of multimedia data streams via a telecommunication line
US6798838B1 (en) * 2000-03-02 2004-09-28 Koninklijke Philips Electronics N.V. System and method for improving video transmission over a wireless network
US20020093531A1 (en) * 2001-01-17 2002-07-18 John Barile Adaptive display for video conferences

Also Published As

Publication number Publication date
JP2005516557A (en) 2005-06-02
CN1618233A (en) 2005-05-18
FI20041039A (en) 2004-09-29
HK1058450A1 (en) 2004-05-14
GB2384932B (en) 2004-02-25
KR20040079973A (en) 2004-09-16
GB2384932A (en) 2003-08-06
GB0202101D0 (en) 2002-03-13

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
121 EP: the EPO has been informed by WIPO that EP was designated in this application
WWE WIPO information: entry into national phase

Ref document number: 1877/DELNP/2004

Country of ref document: IN

WWE WIPO information: entry into national phase

Ref document number: 2003565169

Country of ref document: JP

WWE WIPO information: entry into national phase

Ref document number: 20041039

Country of ref document: FI

WWE WIPO information: entry into national phase

Ref document number: 20028277430

Country of ref document: CN

Ref document number: 1020047011846

Country of ref document: KR

122 EP: PCT application non-entry in European phase