GB2384932A - Video conferencing system that provides higher quality images for most active speaker

Info

Publication number
GB2384932A
Authority
GB
United Kingdom
Prior art keywords
video
active
video images
speakers
speaker
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB0202101A
Other versions
GB0202101D0 (en)
GB2384932B (en)
Inventor
Arthur Lallet
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Motorola Solutions Inc
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc
Priority to GB0202101A (GB2384932B)
Publication of GB0202101D0
Priority to PCT/EP2002/014337 (WO2003065720A1)
Priority to CNA028277430A (CN1618233A)
Priority to JP2003565169A (JP2005516557A)
Priority to KR10-2004-7011846A (KR20040079973A)
Publication of GB2384932A
Priority to HK04100650A (HK1058450A1)
Application granted
Publication of GB2384932B
Priority to FI20041039A
Anticipated expiration
Status: Expired - Fee Related

Classifications

    • H - ELECTRICITY
        • H04 - ELECTRIC COMMUNICATION TECHNIQUE
            • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
                • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
                    • H04N 21/60 - Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
                        • H04N 21/63 - Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
                    • H04N 21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
                        • H04N 21/23 - Processing of content or additional data; Elementary server operations; Server middleware
                            • H04N 21/234 - Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
                                • H04N 21/2343 - ... involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
                                    • H04N 21/234327 - ... by decomposing into layers, e.g. base layer and one or more enhancement layers
                            • H04N 21/24 - Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
                                • H04N 21/2402 - Monitoring of the downstream path of the transmission network, e.g. bandwidth available
                        • H04N 21/25 - Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
                            • H04N 21/258 - Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
                                • H04N 21/25808 - Management of client data
                            • H04N 21/262 - Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
                                • H04N 21/26208 - ... the scheduling operation being performed under constraints
                                    • H04N 21/26216 - ... involving the channel capacity, e.g. network bandwidth
                            • H04N 21/266 - Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
                                • H04N 21/2662 - Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
                    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
                        • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
                            • H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
                                • H04N 21/4402 - ... involving reformatting operations of video signals for household redistribution, storage or real-time display
                                    • H04N 21/440227 - ... by decomposing into layers, e.g. base layer and one or more enhancement layers
                            • H04N 21/443 - OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
                        • H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
                            • H04N 21/462 - Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
                                • H04N 21/4621 - Controlling the complexity of the content stream or additional data, e.g. lowering the resolution or bit-rate of the video stream for a mobile client with a small screen
                • H04N 7/00 - Television systems
                    • H04N 7/14 - Systems for two-way working
                        • H04N 7/141 - Systems for two-way working between two video terminals, e.g. videophone
                            • H04N 7/147 - Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
                        • H04N 7/15 - Conference systems
                            • H04N 7/152 - Multipoint control units therefor

Abstract

A method of relaying video images in a multimedia videoconference includes the step of transmitting layered video images by a number of user equipment, wherein said layered video images include a base layer (552, 562, 572, 582) and one or more enhancement layers (555, 565, 575, 585). The transmitted layered video images are received at a multipoint control unit (520), where a number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of a most active speaker are selected. The multipoint control unit (520) transmits the base layer video images, and the one or more enhancement layers (540) of the most active speaker, to one or more of the plurality of multimedia user equipment (550, 560, 570, 580). Also disclosed is a telephone including a number of displays, allowing the most active speaker to be displayed on a screen at higher quality while other speakers are displayed on separate screens at lower quality.

Description

Video Conferencing System And Method Of Operation

Field of the Invention

This invention relates to video conferencing. The invention is applicable to, but not limited to, a video switching mechanism in H.323 and/or SIP based centralised videoconferences using layered video coding.
Background of the Invention

As the pace of business accelerates and relationships spread around the world, the need to bridge communication distances quickly and economically has become a major challenge. Bringing customers and staff together efficiently is critical to success in an ever more competitive marketplace. Businesses are looking for flexible solutions that support real-time information sharing across countries and continents, using various communication methods such as voice, video, image data and any combination thereof.
In particular, multi-national organisations have an increasing desire to eliminate costly travel and link multiple locations, in order to let groups within the organisation communicate more efficiently and effectively. A multipoint conferencing system operating over an Internet protocol (IP) network seeks to address this need. In the field of this invention, it is known that terminals exchange audio and video streams in real-time in multipoint videoconferences. The conventional method to set up multipoint conferences over an IP network is to use a multipoint control unit (MCU). The MCU is an endpoint on the network that provides the capability for three or more terminals and/or communication gateways to participate in a multipoint conference. The MCU may also connect two terminals in a point-to-point conference such that they have the ability to evolve into a multipoint conference.
Referring first to FIG. 1, a known centralised conferencing model 100 is shown. Centralised conferences make use of an MCU-based conference bridge. All terminals (endpoints) 120, 122, 125 send and receive media information 130, in the form of audio, video and/or data signals, as well as control information streams 140, to/from the MCU 110. These transmissions are done in a point-to-point fashion.
An MCU 110 consists of a Multipoint Controller (MC) and zero or more Multipoint Processors (MP). The MC handles the call set-up and call signalling negotiations between all terminals, to determine common capabilities for audio and video processing. The MC does not deal directly with any of the media streams; this is left to the MP, which mixes, switches and processes the audio, video and/or data bits.
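As a rough illustration only (none of the following class or method names appear in the patent), the MC/MP division might be modelled as follows:

```python
from dataclasses import dataclass, field

class MultipointController:
    """MC: call set-up and capability negotiation; never touches media."""

    def negotiate_capabilities(self, terminals):
        # Common capabilities are the intersection of what every terminal
        # advertises during call signalling.
        capability_sets = [set(t.capabilities) for t in terminals]
        return set.intersection(*capability_sets) if capability_sets else set()

class MultipointProcessor:
    """MP: mixes, switches and processes the audio/video/data streams."""

    def process(self, media_streams):
        raise NotImplementedError  # mixing/switching logic lives here

@dataclass
class MCU:
    mc: MultipointController
    mps: list = field(default_factory=list)  # zero or more MPs
```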
In this manner, MCUs provide the ability to host multi-location seminars, sales meetings, group conferences and other 'face-to-face' communications. It is also known that multipoint conferences can be used in various applications, for example:
(i) Executives and managers at multiple locations can meet 'face-to-face', share real-time information, and make decisions more quickly, without the loss of time, the expense and the demands of travelling; (ii) Project teams and knowledge workers can coordinate individual tasks, and view and revise shared documents, presentations, designs and files in real time; and (iii) Students, trainees and employees at remote locations can access shared educational/training resources across any distance or time zone.
Consequently, it is envisaged that MCU-based systems will play an important role in multimedia communications over IP based networks in the future.
Such multimedia communication often employs video transmission. In such transmissions, a sequence of images, often referred to as frames, is transmitted between transmitting and receiving units. Multipoint multimedia conference systems can be set up using various methods, for example as specified by the H.323 and SIP session layer protocol standards. References for SIP can be found at http://www.ietf.org/rfc/rfc2543.txt and http://www.cs.columbia.edu/~hgs/sip.
Furthermore, for example in systems using ITU H.263 video compression [ITU-T Recommendation H.263, 'Video Coding for Low Bit Rate Communication'], the first frame of a video sequence includes a comprehensive amount of image data, generally referred to as intra-coded information.
The intra-coded frame, as it is the first frame, provides a substantial portion of the image to be displayed. This intra-coded frame is followed by inter-coded (predicted) information, which generally includes data relating to changes in the image that is being transmitted. Hence, predicted inter-coded information contains much less information than intra-coded information.
In a traditional multimedia conferencing system, the users need to identify themselves when they speak, so that the receiving terminals know who is speaking. Clearly, if the transmitting terminal fails to identify itself, the listening users will have to guess who is speaking.
A known technique solves this problem by analysing the audio streams and forwarding the name and video stream of the active speaker to all the participants. In a centralised conferencing system, the MCU often performs this function. The MCU can then send the name of the speaker, and the corresponding video and audio stream, to all the participants by switching the appropriate input multimedia stream to the output ports/paths.
Video switching is a well-known technique that aims at delivering to each endpoint a single video stream, equivalent to arranging multiple point-to-point sessions.
The video switching can be one of the following (a brief sketch in code follows this list):
(i) Voice activated switching, where the MCU transmits the video of the active speaker;
(ii) Timed activated switching, where the video of each participant is transmitted one after another at a predetermined time interval; and
(iii) Individual video selection switching, where each endpoint can request the participant video stream that he/she wishes to receive.
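A minimal sketch of these three policies, assuming simple inputs (participant ids and per-participant audio energy); the names `SwitchingMode` and `select_video_source` are illustrative, not taken from the patent:

```python
from enum import Enum, auto

class SwitchingMode(Enum):
    VOICE_ACTIVATED = auto()  # (i)   forward the current active speaker
    TIMED = auto()            # (ii)  rotate through participants on a timer
    INDIVIDUAL = auto()       # (iii) honour each endpoint's own request

def select_video_source(mode, participants, audio_levels=None,
                        rotation_index=0, requested_id=None):
    """Return the id of the participant whose video the MCU forwards."""
    if mode is SwitchingMode.VOICE_ACTIVATED:
        # Loudest recent audio wins; audio_levels maps id -> energy.
        return max(participants, key=lambda p: audio_levels.get(p, 0.0))
    if mode is SwitchingMode.TIMED:
        # Advance one participant per predetermined time interval.
        return participants[rotation_index % len(participants)]
    return requested_id  # SwitchingMode.INDIVIDUAL
```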
Referring now to FIG. 2, a functional diagram of a traditional video switching mechanism 200 is shown. In a traditional centralised conferencing system, the video switching is performed as follows. The MCU 220, for example positioned within an Internet protocol (IP) based network 210, contains a switch 230. The MCU 220 receives the video streams 255, 265, 275, 285 of all the participants (user equipment) 250, 260, 270, 280. The MCU may also receive, separately, a combined (multiplexed) audio stream 290 from the participants who are speaking. The MCU 220 then selects one of the video streams and sends this video stream 240 to all the participants 250, 260, 270, 280.
Such traditional systems have the disadvantage that they only send the video stream of the active speaker. The users might still have a problem in identifying the speaker of the video stream if several speakers are talking at the same time, or if the active speaker is constantly changing. This is particularly the case with large videoconferences.
Alternatively, the video of each participant can be sent to all the participants. However, this approach suffers in a wireless-based conference due to the bandwidth limitation.
In the field of video technology, it is known that video is transmitted as a series of still images/pictures.
Since the quality of a video signal can be affected during coding or compression of the video signal, it is known to include additional information 'layers' based on the difference between the video signal and the encoded video bit stream. The inclusion of additional layers enables the quality of the received signal, following decoding and/or decompression, to be enhanced. Hence, a hierarchy of pictures and enhancement pictures, partitioned into one or more layers, is used to produce a layered video bit stream.
In a layered (scalable) video bit stream, enhancements to the video signal may be added to the base layer by: (i) Increasing the resolution of the picture (spatial scalability); (ii) Including error information to improve the signal-to-noise ratio of the picture (SNR scalability); or (iii) Including extra pictures to increase the frame rate (temporal scalability).
Such enhancements may be applied to the whole picture, or to an arbitrarily shaped object within the picture, which is termed object-based scalability. In order to preserve the disposable nature of the temporal enhancement layer, the H.263+ standard dictates that pictures included in the temporal scalability mode should be bi-directionally predicted (B) pictures, as shown in the video stream of FIG. 3.
FIG. 3 shows a schematic illustration of a scalable video arrangement 300 illustrating B picture prediction dependencies, as known in the field of video coding techniques. An initial intra-coded frame (I1) 310 is followed by a bi-directionally predicted frame (B2) 320. This, in turn, is followed by a (uni-directional) predicted frame (P3) 330, followed again by a second bi-directionally predicted frame (B4) 340. This, in turn, is followed by a (uni-directional) predicted frame (P5) 350, and so on.
FIG. 4 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques. A layered video bit stream includes a base layer 405 and one or more enhancement layers 435.
The base layer (layer 1) includes one or more intra-coded pictures (I pictures) 410 sampled, coded and/or compressed from the original video signal pictures.
Furthermore, the base layer will include a plurality of predicted inter-coded pictures (P pictures) 420, 430, predicted from the intra-coded picture(s) 410.
In the enhancement layers (layers 2 or 3 or more) 435, three types of picture may be used: (i) Bi-directionally predicted (B) pictures (not shown); (ii) Enhanced intra (EI) pictures 440, based on the intra-coded picture(s) 410 of the base layer 405; and (iii) Enhanced predicted (EP) pictures 450, 460, based on the inter-coded predicted pictures 420, 430 of the base layer 405.
The vertical arrows from the lower layer illustrate that the picture in the enhancement layer is predicted from a reconstructed approximation of that picture in the reference (lower) layer.
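The layer hierarchy of FIG. 4 can be summarised with a small data model. This is an editorial sketch only; the patent defines no such classes, and the names are illustrative:

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional

class PictureType(Enum):
    I = auto()   # intra-coded, base layer
    P = auto()   # predicted inter-coded, base layer
    B = auto()   # bi-directionally predicted, enhancement layer
    EI = auto()  # enhanced intra, based on the base-layer I picture
    EP = auto()  # enhanced predicted, based on base-layer P pictures

class Scalability(Enum):
    SPATIAL = auto()   # higher picture resolution
    SNR = auto()       # added error information, better signal-to-noise ratio
    TEMPORAL = auto()  # extra pictures, higher frame rate

@dataclass
class Layer:
    layer_id: int                              # 1 = base, 2+ = enhancement
    scalability: Optional[Scalability] = None  # None for the base layer
    pictures: List[PictureType] = field(default_factory=list)

# The two-layer stream of FIG. 4: I, P, P in the base layer, with EI, EP, EP
# above it, each enhancement picture predicted from the layer below.
base = Layer(1, None, [PictureType.I, PictureType.P, PictureType.P])
enhancement = Layer(2, Scalability.SNR,
                    [PictureType.EI, PictureType.EP, PictureType.EP])
```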
In summary, scalable video coding has been used with multicast multimedia conferences, and only in the context of point-to-point or multicast video communication.
However, wireless networks do not currently support multicasting. Furthermore, with multicasting, each layer is sent in separate multicast sessions, with the receiver deciding itself whether to register to one or more sessions.
A need therefore exists for an improved video conferencing arrangement and method of operation, wherein the abovementioned disadvantages may be alleviated.
Statement of Invention

In accordance with a first aspect of the present invention there is provided a method of relaying video images in a multimedia videoconference, as claimed in claim 1.
In accordance with a second aspect of the present invention there is provided a video conferencing arrangement for relaying video images, as claimed in claim 7.
In accordance with a third aspect of the present invention there is provided a wireless device for participating in a videoconference, as claimed in claim 11.
In accordance with a fourth aspect of the present invention there is provided a multipoint processor, as claimed in claim 12.
In accordance with a fifth aspect of the present invention there is provided a video communication system, as claimed in claim 16.
In accordance with a sixth aspect of the present invention there is provided a media resource function, as claimed in claim 18.
In accordance with a seventh aspect of the present invention there is provided a video communication unit, as claimed in claim 19 or claim 20.
In accordance with an eighth aspect of the present invention there is provided a storage medium, as claimed in claim 23.
Further aspects of the present invention are as claimed in the dependent claims.
In summary, the inventive concepts of the present invention address the disadvantages of prior art arrangements by providing a video switching method that improves the identification of the participants and speakers in a videoconference. The invention makes use of layered video coding, in order to make better use of the bandwidth available for each user.
Brief Description of the Drawings

FIG. 1 shows a known centralised conferencing model.
FIG. 2 shows a functional diagram of a traditional video switching mechanism.
FIG. 3 is a schematic illustration of a video arrangement showing picture prediction dependencies, as known in the field of video coding techniques.
FIG. 4 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques.
Exemplary embodiments of the present invention will now be described, with reference to the accompanying drawings, in which:

FIG. 5 shows a functional diagram of a video switching mechanism, in accordance with a preferred embodiment of the invention.
FIG. 6 shows a functional block diagram/flowchart of a multipoint processing unit, in accordance with a preferred embodiment of the invention.
FIG. 7 shows a video display of a wireless device participating in a videoconference using the preferred embodiment of the present invention.
FIG. 8 shows a UMTS (3GPP) communication system adapted in accordance with the preferred embodiment of the present invention.
Description of Preferred Embodiments

In summary, the preferred embodiment of the present invention proposes a new video switching mechanism for multimedia conferences that makes use of layered video coding. Previously, layered video coding has only been used to partition a video bit stream into more than one layer: a base layer and one or several enhancement layers, as described above with respect to FIG. 4. These known techniques for scalable video communication are described in detail in standards such as H.263 and MPEG-4.
However, the inventor of the present invention has recognised the benefits to be gained by adapting the concept of layered video coding and applying the adapted concepts to multimedia videoconference applications. In this manner, the present invention defines a different type of scalable video coding, focused for use in multimedia conferences, in contrast to point-to-point or multicast video communication.
Referring now to FIG. 5, a functional block diagram 500 of a video switching mechanism is shown, in accordance with the preferred embodiment of the invention. In contrast to a traditional centralised conferencing system, the video switching is performed as follows. The MCU 520, for example positioned within an Internet protocol (IP) based network 510, contains a switch 530.
It is noteworthy that the MCU 520 receives 'layered' video streams, including a base layer 552, 562, 572, 582 and one or more enhancement layer streams 555, 565, 575, 585, from all the participants (user equipment) 550, 560, 570, 580. Only one enhancement layer video stream per participant is shown, for clarity purposes only.
The MCU 520 may also receive, separately, a combined (multiplexed) audio stream 590 from the participants.
The MCU 520 then selects the base layer video streams of a number of active speakers 535 and the enhancement layer 540 of the most active speaker, using switch 530. The MCU 520 then sends these video streams 535, 540 to all the participants 550, 560, 570, 580.
The selection process to determine the most active speaker is preferably performed by the MCU 520 analysing the audio streams 590, in order first to determine who all the active speakers are. The most active speaker is then preferably determined in the multipoint processor unit, as described with reference to FIG. 6. The one or more base layers and one enhancement layer are preferably sent to the participants according to a priority level based on the activity of each participant.
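As a hedged sketch of this switching rule (the function and parameter names are the editor's, and the activity scores are assumed to come from the audio analysis just described, not from any interface defined in the patent):

```python
def select_conference_streams(activity, base_layers, enh_layers,
                              speaking_threshold=0.1):
    """Return the streams the MCU forwards to every participant.

    activity      : dict participant_id -> audio activity score
    base_layers   : dict participant_id -> base layer video stream
    enh_layers    : dict participant_id -> enhancement layer stream(s)
    """
    speakers = [p for p, score in activity.items()
                if score >= speaking_threshold]
    if not speakers:
        return {}, None                     # nobody is currently talking
    most_active = max(speakers, key=lambda p: activity[p])
    # Base layers for every active speaker; enhancement layer(s) only
    # for the most active one, so the spare bandwidth favours that image.
    return {p: base_layers[p] for p in speakers}, enh_layers[most_active]
```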
In order to effect the improved, but more complex, video switching mechanism of FIG. 5, the multipoint processing unit (MP) 600 has been adapted to facilitate the new video switching mechanism, in accordance with a preferred embodiment of the invention and as shown in FIG. 6.
The MP 600 still receives the audio stream 590 from the participants' video/multimedia communication units, through a packet-filtering module 610, and routes this audio stream to a packet routing module 630. However, the audio stream is now also routed to a speaker identification module 620 that analyses the audio streams 590 in order to determine who the active speakers are.
The speaker identification module 620 allocates a priority level based on the activity of each participant and determines, as sketched below: (i) The most active speaker 622; (ii) Any other active speakers 625; and, by default, (iii) Any remaining inactive speakers.
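A sketch of that three-way classification (editorial only; the threshold and names are assumptions, not taken from the patent):

```python
def classify_speakers(activity, speaking_threshold=0.1):
    """Return (most_active, other_active, inactive), mirroring the three
    priority classes produced by the speaker identification module (620).

    activity : dict participant_id -> audio activity score
    """
    ranked = sorted(activity, key=lambda p: activity[p], reverse=True)
    speakers = [p for p in ranked if activity[p] >= speaking_threshold]
    most_active = speakers[0] if speakers else None   # (i)  622
    other_active = speakers[1:]                       # (ii) 625
    inactive = [p for p in ranked                     # (iii) by default
                if activity[p] < speaking_threshold]
    return most_active, other_active, inactive
```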
The speaker identification module 620 then forwards the priority level information to the switching module 640, which has been adapted to deal with the priority levels of speakers, in accordance with the preferred embodiment of the present invention. Furthermore, the switching module 640 has been adapted to receive layered video streams, including video base layer streams 552, 562, 572 and 582 and video enhancement layer streams 555, 565, 575 and 585, from the participants' video communication units through the packet filtering module 610. The switching module 640 uses this speaker information to send the video base layers of the secondary (lesser) active speakers and the most active speaker, and only the video enhancement layer of the most active speaker, to all the participants, via the packet routing module 630.
The one or more receiving ports of the multipoint processor have therefore been adapted to receive layered video streams, including base layer video streams 552, 562, 572 and 582 and enhancement layer video streams 555, 565, 575 and 585, from a plurality of user equipment 550, 560, 570 and 580. It is within the contemplation of the invention that the switching module 640 may select only one base layer video image, and the corresponding one or more enhancement layers, if it is determined that there is only one active speaker. This speaker is then automatically designated as the most active speaker for transmitting to one or more user equipment 550, 560, 570 and 580.
When the most active speaker is constantly changing, as can happen in videoconferences, the enhancement layer will be constantly switching. The inventor of the present invention has recognised a potential problem with such constant and rapid switching. Under such circumstances, the first frame may need to be converted into an Intra frame (EI) if it was actually a predicted frame (EP) from a speaker who was previously only a secondary active speaker.
To address this potential problem, the video base layer streams 552, 562, 572 and 582 and video enhancement layer streams 555, 565, 575 and 585 from the packet-filtering module 610 are preferably input to a de-packetisation function 680. The de-packetisation function 680 demultiplexes the video streams and provides the demultiplexed video streams to a video decoder and buffer function 670.
To synchronise and co-ordinate the video decoding, the video decoder and buffer function 670 receives the indication of the most active speaker 622. After extracting the video stream information for the most active speaker, the video decoder and buffer function 670 provides bi-directionally predicted (BP) 675 and/or predicted (EP) video stream data of the most active speaker 622 to an 'EP frame to EI frame Transcoding Module' 660. The 'EP frame to EI frame Transcoding Module' 660 processes the input video streams to provide the primary speaker enhancement layer video stream as an Intra-coded (EI) frame.
The primary speaker enhancement layer video stream is then input to a packetisation function 650, where it is packetised and input to the switching module 640. The switching module 640 then combines the primary speaker enhancement layer video stream with the video base layer streams 552, 562, 572 and 582 of the secondary active speakers, and routes the combined multimedia stream to the packet routing module 630. The packet routing module then routes the information to the participants in accordance with the method of FIG. 5.
In the preferred embodiment of the present invention, the video switching module 640 uses the output of the 'EP frame to EI frame Transcoding module' 660 when it determines that the primary speaker has changed.
It is within the contemplation of the invention that one or more modules similar to module 660 could also be included in the MP 600, to perform the same function for the secondary speakers when they are deemed to have changed. Otherwise, in the embodiment that uses a single 'EP frame to EI frame Transcoding module' 660 to transcode the video stream of only the primary speaker, when, say, an inactive speaker becomes a secondary active speaker, the speaker identification module 620 (or switching module 640) may make a request for a new Intra frame. Alternatively, the switching module 640 may wait for a new Intra frame of the new secondary active speaker before sending the corresponding video base layer stream to all the participants.
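The decision logic might look like the following sketch, where frames are represented as simple (kind, payload) pairs; the transcoding step itself (module 660) is assumed to be supplied, and all names are the editor's:

```python
def first_enhancement_frame(frame, primary_changed, transcode_ep_to_ei):
    """Choose the first enhancement-layer frame after a primary switch.

    frame              : (kind, payload), kind in {"EI", "EP", "B"}
    primary_changed    : True if the most active speaker just changed
    transcode_ep_to_ei : the EP-frame-to-EI-frame step of module 660
    """
    kind, payload = frame
    if primary_changed and kind == "EP":
        # A predicted frame cannot be decoded without its reference, so
        # the new stream must (re)start from an intra-coded EI frame.
        return ("EI", transcode_ep_to_ei(payload))
    return frame

def admit_new_secondary(participant, request_intra_frame):
    # For a newly promoted secondary speaker, the MP may instead request
    # a fresh intra frame from the endpoint (or simply wait for one).
    request_intra_frame(participant)
```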
In addition to the preferred embodiment of the present invention, where more than one enhancement layer is available for use, it is within the contemplation of the invention that more classes of speakers can be used. By using more classes of speakers, a finer scalability of the multimedia messages can be attained, as the identification of speakers is improved, especially for large videoconferences.
It is also within the contemplation of the invention that predicted frame to Intra frame conversion could be added for one or more of the base layer streams. In this manner, the switching module 640 can quickly switch between the base layers without having to wait for a new Intra frame.
FIG. 7 shows the video display 710 of a wireless device 700 taking part in a videoconference using the preferred embodiment of the present invention. By implementing the inventive concepts hereinbefore described, improved video communication is achieved. In particular, for a given bandwidth, the participants are now able to receive better video quality for the most active speaker 720, by lowering the video quality of the lesser (secondary) active speakers 730 and providing no video for the inactive speakers. In order to provide such improved video conferencing, the video communication device receives the enhancement layer and base layer of the most active speaker 720, the base layers of the secondary active speakers 730, and no video from the inactive speakers.
In such a manner, a video communication unit can provide a constantly updated video image of the most active speaker in a larger, higher resolution display, whilst smaller displays can display secondary (lesser) active speakers.
The wireless device 700 preferably has a primary video display 710 for displaying a higher quality video image of the most active speaker, and one or more second, distinct displays for displaying respective lesser active speakers. Preferably, the manipulation of the respective video images into the respective displays is performed by a processor (not shown) that is operably coupled to the video displays. The processor receives an indication of a most active speaker 720 and lesser active speakers, and determines which video image received should be displayed in the first display and which video image(s) received from the lesser active speakers 730 should be displayed in the second display. Advantageously, the second display may be configured to provide a lower quality video image of the lesser active speakers, thereby saving cost.
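The display routing can be sketched as follows (an editorial illustration; `show` is an assumed display interface, not an API from the patent):

```python
def route_images_to_displays(most_active, lesser_active,
                             primary_display, secondary_displays):
    """Map incoming video images to the device's displays (FIG. 7).

    most_active        : decoded image of the most active speaker
                         (base + enhancement layers, highest quality)
    lesser_active      : list of decoded base-layer-only images,
                         in priority order
    primary_display    : the large, higher-quality screen (710)
    secondary_displays : the smaller, lower-quality screens
    """
    primary_display.show(most_active)
    for image, display in zip(lesser_active, secondary_displays):
        display.show(image)   # base layer only: lower quality is acceptable
    # Inactive speakers transmit no video, so nothing is drawn for them.
```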
It is anticipated that MCU-based systems will facilitate multimedia communications over IP based networks in the future. Therefore, the inventor of the present invention envisages that the herein described techniques could be incorporated in any H.323/SIP based multipoint multimedia conferences, or in any systems that make use of an MCU.
A preferred application of the aforementioned invention is in the Third Generation Partnership Project (3GPP) specification for the wide-band code-division multiple access (WCDMA) standard. In particular, the invention can be applied to the IP Multimedia Domain (described in the 3G TS 25.xxx series of specifications), which is planned to incorporate an H.323/SIP MCU into the 3GPP network. The MCU will be hosted by the Media Resource Function (MRF) 890A; see FIG. 8.
FIG. 8 shows a 3GPP (UMTS) communication system/network 800, in a hierarchical form, which is capable of being adapted in accordance with the preferred embodiment of the present invention. The communication system 800 is compliant with, and contains network elements capable of operating over, a UMTS and/or a GPRS air-interface.
The network is conveniently considered as comprising: (i) a user equipment domain 810, made up of: (a) a user SIM (USIM) domain 820 and (b) a mobile equipment domain 830; and (ii) an infrastructure domain 840, made up of: (c) an access network domain 850 and (d) a core network domain 860, which is, in turn, made up of (at least): (di) a serving network domain 870, (dii) a transit network domain 880 and (diii) an IP multimedia domain 890, with multimedia being provided by SIP (IETF RFC 2543).
In the mobile equipment domain 830, the UE 830A receives data from a user SIM 820A in the USIM domain 820 via the wired Cu interface. The UE 830A communicates data with a Node B 850A in the network access domain 850 via the wireless Uu interface. Within the network access domain 850, the Node Bs 850A contain one or more transceiver units and communicate with the rest of the cell-based system infrastructure, for example RNC 850B, via an Iub interface, as defined in the UMTS specification.
The RNC 850B communicates with other RNCs (not shown) via the Iur interface. The RNC 850B communicates with a SGSN 870A in the serving network domain 870 via the Iu interface. Within the serving network domain 870, the SGSN 870A communicates with a GGSN 870B via the Gn interface, and the SGSN 870A communicates with a VLR server 870C via the Gs interface. In accordance with the preferred embodiment of the present invention, the SGSN 870A communicates with the MCU (not shown) that resides within the media resource function (890A) in the IP Multimedia domain 890. The communication is performed via the Gi interface.
The GGSN 870B (and/or SGSN) is responsible for UMTS (or GPRS) interfacing with a Public Switched Data Network (PSDN) 880A, such as the Internet, or a Public Switched Telephone Network (PSTN). The SGSN 870A performs a routing and tunnelling function for traffic within, say, a UMTS core network, whilst the GGSN 870B links to external packet networks, in this case ones accessing the UMTS mode of the system.
The RNC 850B is the UTRAN element responsible for the control and allocation of resources for numerous Node Bs 850A; typically 50 to 100 Node Bs may be controlled by one RNC 850B. The RNC 850B also provides reliable delivery of user traffic over the air interfaces. RNCs communicate with each other (via the Iur interface) to support handover and macro diversity.
The SGSN 870A is the UMTS Core Network element responsible for Session Control and interface to the Location Registers (HLR and VLR). The SGSN is a large centralised controller for many RNCs.
The GGSN 870B is the UMTS Core Network element responsible for concentrating and tunnelling user data within the core packet network to the ultimate destination (e.g., an internet service provider (ISP)).
Such user data includes multimedia and related signalling data to/from the IP multimedia domain 890. Within the IP multimedia domain 890, the MRF is split into a Multimedia Resource Function Controller (MRFC) 892A and a Multimedia Resource Function Processor (MRFP) 891A. The MRFC 892A provides the Multipoint Controller (MC) functionalities, whereas the MRFP 891A provides the Multipoint Processor (MP) functionalities, as described previously.
The protocol used across the Mr reference point/interface 893A is SIP (as defined by RFC 2543). The call-state control function (CSCF) 895A acts as a call server and handles multimedia call signalling.
Thus, in accordance with the preferred embodiment of the invention, the elements SGSN 870A, GGSN 870B and all parts within the MRF 890A are adapted to facilitate multimedia messages as hereinbefore described. Furthermore, the UE 830A, Node B 850A and RNC 850B may also be adapted to facilitate improved multimedia messages as hereinbefore described.
More generally, the adaptation may be implemented in the respective communication units in any suitable manner.
For example, new apparatus may be added to a conventional communication unit, or alternatively existing parts of a conventional communication unit may be adapted, for example by reprogramming one or more processors therein.
As such, the required adaptation may be implemented in the form of processor-implementable instructions stored on a storage medium, such as a floppy disk, hard disk, PROM, RAM or any combination of these or other storage media.
It is also within the contemplation of the invention that such adaptation of multimedia messages may alternatively be controlled, implemented in full or implemented in part by adapting any other suitable part of the communication system 800.
Although the above elements are typically provided as discrete and separate units (on their own respective software/hardware platforms), divided across the mobile equipment domain 830, access network domain 850 and the serving network domain 870, it is envisaged that other configurations can be applied.
Further, in the case of other network infrastructures, such as a GSM network, implementation of the processing operations may be performed at any appropriate node such as any other appropriate type of base station, base station controller, mobile switching centre or operational and management controller, etc.
Alternatively, the aforementioned steps may be carried out by various components distributed at different locations or entities within any suitable network or system.
It will be understood that the video conferencing method using layered video coding, preferably when applied in a centralised videoconference, as described above, provides the following advantages:
(i) The identification of the speakers is much improved compared to traditional systems, because the bandwidth is shared to allow one or more enhancement layers and several base layers to be sent instead of only one full quality video stream.
(ii) The video switching when the active speaker changes is much smoother using the inventive concepts herein described, because several states are defined: the most active speaker, secondary active speakers and inactive speakers.
(iii) The video quality of the most active speaker is improved.
(iv) Improved video communication units can display a variety of speakers, with each displayed image being dependent upon a priority level associated with the respective video communication unit's transmission.
Whilst the specific and preferred implementations of the embodiments of the present invention are described above, it is clear that one skilled in the art could readily apply variations and modifications of such inventive concepts.
In summary, a method of relaying video images in a multimedia videoconference between a plurality of multimedia user equipment has been described. The method includes the steps of transmitting layered video images by a number of the plurality of user equipment, wherein the layered video images include a base layer and one or more enhancement layers, and receiving the transmitted layered video images at a multipoint control unit. A number of base layer video images of a number of active speakers are selected, together with one or more enhancement layers of a most active speaker. The multipoint control unit transmits the number of base layer video images of the number of active speakers, and the one or more enhancement layers of the most active speaker, to one or more of the plurality of multimedia user equipment.
In addition, a video conferencing arrangement for relaying video images between a plurality of user equipment has been described. The video conferencing arrangement includes a multipoint control unit adapted to receive a number of layered video images transmitted by a number of the plurality of user equipment. The layered video images include a base layer and one or more enhancement layers. A video-switching module is operably coupled to the multipoint control unit and selects a number of base layer video images of a number of active speakers and one or more enhancement layers of a most active speaker. The multipoint control unit is further adapted to transmit the number of base layer video images of a number of active speakers and one or more enhancement layers of the most active speaker to one or more of the plurality of user equipment.
Furthermore, a wireless device for participating in a videoconference has been described, where a number of participants transmit video images. The wireless device includes a video display having a first display and one or more second, distinct displays for displaying respective participants from the plurality of participants; and a processor, operably coupled to the video display, for receiving an indication of a most active speaker and less active speakers. The processor determines that the video image received from the most active speaker should be displayed in the first display, offering a higher quality video image, and that the video images received from the number of less active speakers should be displayed in the one or more second displays, offering a lower quality video image.
In addition, a multipoint processor has been described that includes one or more receiving ports adapted to receive layered video streams, including base layer video streams and enhancement layer video streams, from a plurality of user equipment. A switching module is operably coupled to the one or more receiving ports, to select a number of base layer video images of a number of active speakers, and one or more enhancement layers of the most active speaker for transmitting to one or more user equipment.
A video communication system and video communication unit and media resource function, adapted to facilitate or perform the aforementioned inventive concepts have also been described.
Thus, an improved video conferencing arrangement and method of operation have been described, wherein the aforementioned disadvantages associated with prior art arrangements have been substantially alleviated.

Claims (27)

1. A method of relaying video images in a multimedia videoconference between a plurality of multimedia user equipment (550, 560, 570, 580), the method comprising the steps of: transmitting layered video images by a number of said plurality of user equipment, wherein said layered video images include a base layer (552, 562, 572, 582) and one or more enhancement layers (555, 565, 575, 585); receiving said transmitted layered video images at a multipoint control unit (520); selecting a number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of a most active speaker; and transmitting, by said multipoint control unit (520), said number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of the most active speaker to one or more of the plurality of multimedia user equipment (550, 560, 570, 580).

2. The method of relaying video images in a multimedia videoconference according to Claim 1, wherein the step of selecting further comprises the step of: analysing a number of audio data streams (590), transmitted by said plurality of multimedia user equipment (550, 560, 570, 580), in order to determine the number of active speakers and/or said most active speaker.

3. The method of relaying video images in a multimedia videoconference according to Claim 1 or Claim 2, the method further characterised by the steps of: assigning a priority level to each layered video image and/or said audio data stream transmitted by a respective user equipment; and selecting a number of base layer video images (535) and one or more enhancement layers (540) for transmitting to said one or more of said plurality of multimedia user equipment (550, 560, 570, 580), based on said assigned priority level.

4. The method of relaying video images in a multimedia videoconference according to any preceding Claim, the method further characterised by the step of: transcoding (660) a first predicted frame of a video image of the most active speaker to an intra-coded frame, for enhancing the video quality of the most active speaker.

5. The method of relaying video images in a multimedia videoconference according to any preceding Claim, the method further characterised by the step of: receiving by said multipoint control unit (520), when more than one enhancement layer is available, an indication of a class of said one or more speakers with each layered video image transmission, in order to provide a finer scalability of said video images.

6. The method of relaying video images in a multimedia videoconference according to any preceding Claim, the method further characterised by the step of: converting a predicted frame into an Intra-coded frame for one or more base layer video streams.
7. A video conferencing arrangement for relaying video images between a plurality of user equipment (550, 560, 570, 580), the video conferencing arrangement comprising: a multipoint control unit (520), adapted to receive a number of layered video images transmitted by a number of said plurality of user equipment, wherein said layered video images include a base layer (552, 562, 572, 582) and one or more enhancement layers (555, 565, 575, 585); and a video switching module (530), operably coupled to said multipoint control unit (520) and adapted to select a number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of a most active speaker; wherein said multipoint control unit (520) is further adapted to transmit said number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of the most active speaker to one or more of the plurality of user equipment (550, 560, 570, 580).

8. The video conferencing arrangement according to Claim 7, further characterised by: a predicted frame to intra-coded frame transcoding module (660), operably coupled to said video switching module (530), to provide a most active speaker enhancement layer video stream, as an Intra-coded frame, if said multipoint control unit (520) received said frame initially as a predicted frame.

9. The video conferencing arrangement according to Claim 7 or Claim 8, further characterised by: a speaker identification module (620) that analyses a number of audio streams (590) in order to determine a number of active speakers and/or said most active speaker.

10. The video conferencing arrangement according to Claim 9, wherein said speaker identification module (620) allocates a priority level based on a determined activity of each participant to determine one or more of: a most active speaker (622), any other active speakers (625), and any inactive speakers.

11. A wireless device (700) for participating in a videoconference where a plurality of participants transmit video images, the wireless device (700) comprising: a video display (710) having a first display and one or more second distinct displays for displaying respective participants (720, 730) from the plurality of participants; and a processor, operably coupled to said video display, for receiving an indication of a most active speaker (720) and lesser active speakers (730), and determining that said video image received from said most active speaker (720) should be displayed in said first display offering a higher quality video image, and said video images received from said number of said lesser active speakers (730) should be displayed in said one or more second displays offering a lower quality video image.
12. A multipoint processor comprising: one or more receiving ports adapted to receive layered video streams, including base layer video streams (552, 562, 572, 582) and enhancement layer video streams (555, 565, 575, 585), from a plurality of user equipment (550, 560, 570, 580); and a switching module (640), operably coupled to said one or more receiving ports, selecting a number of base layer video images of a number of active speakers (535) and one or more enhancement layers (540) of the most active speaker, for transmitting to one or more user equipment (550, 560, 570, 580).

13. The multipoint processor according to Claim 12, further characterised by: a speaker identification module (620) operably coupled to said one or more receiving ports for analysing a number of audio streams (590) received from a number of said plurality of user equipment, in order to determine a number of active speakers and/or said most active speaker.

14. The multipoint processor according to Claim 12 or Claim 13, wherein said speaker identification module (620) allocates a priority level based on a determined activity of a number of participants to determine one or more of the following: a most active speaker (622), any other active speakers (625), and any inactive speakers.

15. The multipoint processor according to any of preceding Claims 12 to 14, further characterised by: a predicted frame to intra-coded frame transcoding module (660) operably coupled to said switching module (640), to convert the enhancement layer video stream of said most active speaker to an Intra-coded frame if it has been received at a respective port as a predicted frame.
  16. A video communication system adapted to perform any of the method steps of Claims 1 to 6, or adapted to incorporate the video conferencing arrangement of any of Claims 7 to 10, or adapted to incorporate the multipoint processor of any of Claims 12 to 15.
  17. The video communication system according to Claim 16, wherein the video communication system is compatible with the UMTS communication standard (800) having an Internet Protocol multimedia domain (890) to facilitate videoconferencing communication.
  18. A media resource function (890A) adapted to perform any of the method steps of Claims 1 to 6, or adapted to incorporate the video conferencing arrangement of any of Claims 7 to 10, or adapted to incorporate the multipoint processor of any of Claims 12 to 15.
  19. A video communication unit (700) adapted to receive layered videoconference images generated in accordance with the method of Claims 1 to 6.
  20. A video communication unit adapted to generate layered videoconference images in accordance with the method of Claims 1 to 6, or to transmit layered videoconference images generated in accordance with the method of Claims 1 to 6.
  21. The video communication unit according to Claim 19, wherein the video communication unit is one of: a Node B (850A), an RNC (850B), an SGSN (870A), a GGSN (870B), an MRF (890A).
  22. The method of relaying video images in a multimedia videoconference of Claims 1 to 6, or the video conferencing arrangement of any of Claims 7 to 10, or the multipoint processor of any of Claims 12 to 15, or the video communication system of Claim 16 or 17, or the media resource function (890A) of Claim 18, or the video communication unit of Claim 19, 20, or 21, adapted to facilitate videoconference images based on the H.323 standard or the SIP standard.
  23. A storage medium storing processor-implementable instructions for controlling a processor to carry out the method of any of Claims 1 to 6.
  24. A video switching method in centralised videoconferences using layered video coding substantially as hereinbefore described with reference to, and/or as illustrated by, FIG. 5 or FIG. 6 of the accompanying drawings.
  25. A multipoint processor for use in a centralised videoconference using layered video coding substantially as hereinbefore described with reference to, and/or as illustrated by, FIG. 5 or FIG. 6 of the accompanying drawings.
  26. A wireless device having one or more displays substantially as hereinbefore described with reference to, and/or as illustrated by, FIG. 7 of the accompanying drawings.
  27. A communication system offering videoconferencing using layered video coding substantially as hereinbefore described with reference to, and/or as illustrated by, FIG. 8 of the accompanying drawings.
GB0202101A 2002-01-30 2002-01-30 Video conferencing system and method of operation Expired - Fee Related GB2384932B (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
GB0202101A GB2384932B (en) 2002-01-30 2002-01-30 Video conferencing system and method of operation
PCT/EP2002/014337 WO2003065720A1 (en) 2002-01-30 2002-12-16 Video conferencing and method of operation
CNA028277430A CN1618233A (en) 2002-01-30 2002-12-16 Video conferencing system and method of operation
JP2003565169A JP2005516557A (en) 2002-01-30 2002-12-16 Video conferencing system and operation method
KR10-2004-7011846A KR20040079973A (en) 2002-01-30 2002-12-16 Video conferencing and method of operation
HK04100650A HK1058450A1 (en) 2002-01-30 2004-01-30 Video conferencing system and method of operation
FI20041039A FI20041039A (en) 2002-01-30 2004-07-29 Video conferencing and implementation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0202101A GB2384932B (en) 2002-01-30 2002-01-30 Video conferencing system and method of operation

Publications (3)

Publication Number Publication Date
GB0202101D0 GB0202101D0 (en) 2002-03-13
GB2384932A true GB2384932A (en) 2003-08-06
GB2384932B GB2384932B (en) 2004-02-25

Family

ID=9930013

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0202101A Expired - Fee Related GB2384932B (en) 2002-01-30 2002-01-30 Video conferencing system and method of operation

Country Status (7)

Country Link
JP (1) JP2005516557A (en)
KR (1) KR20040079973A (en)
CN (1) CN1618233A (en)
FI (1) FI20041039A (en)
GB (1) GB2384932B (en)
HK (1) HK1058450A1 (en)
WO (1) WO2003065720A1 (en)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2537944C (en) * 2003-10-08 2010-11-30 Cisco Technology, Inc. System and method for performing distributed video conferencing
US8659636B2 (en) 2003-10-08 2014-02-25 Cisco Technology, Inc. System and method for performing distributed video conferencing
CN100401765C (en) * 2005-03-24 2008-07-09 华为技术有限公司 Video conference controlling method
KR100695206B1 (en) 2005-09-12 2007-03-14 엘지전자 주식회사 Mobile communication terminal for sharing device buffer and sharing buffer method using the same
JP2007158410A (en) 2005-11-30 2007-06-21 Sony Computer Entertainment Inc Image encoder, image decoder, and image processing system
KR100666995B1 (en) * 2006-01-16 2007-01-10 삼성전자주식회사 Method and system for providing the differential media data of meltimedia conference
US7822811B2 (en) * 2006-06-16 2010-10-26 Microsoft Corporation Performance enhancements for video conferencing
US8334891B2 (en) 2007-03-05 2012-12-18 Cisco Technology, Inc. Multipoint conference video switching
US8264521B2 (en) 2007-04-30 2012-09-11 Cisco Technology, Inc. Media detection and packet distribution in a multipoint conference
KR100874024B1 (en) * 2007-09-18 2008-12-17 주식회사 온게임네트워크 Station and method for internet broadcasting interaction type-content and record media recoded program realizing the same
EP2046041A1 (en) * 2007-10-02 2009-04-08 Alcatel Lucent Multicast router, distribution system,network and method of a content distribution
CA2727569C (en) 2008-06-09 2017-09-26 Vidyo, Inc. Improved view layout management in scalable video and audio communication systems
US8319820B2 (en) * 2008-06-23 2012-11-27 Radvision, Ltd. Systems, methods, and media for providing cascaded multi-point video conferencing units
KR101234495B1 (en) * 2009-10-19 2013-02-18 한국전자통신연구원 Terminal, node device and method for processing stream in video conference system
KR101636716B1 (en) 2009-12-24 2016-07-06 삼성전자주식회사 Apparatus of video conference for distinguish speaker from participants and method of the same
US8553068B2 (en) * 2010-07-15 2013-10-08 Cisco Technology, Inc. Switched multipoint conference using layered codecs
WO2012100410A1 (en) * 2011-01-26 2012-08-02 青岛海信信芯科技有限公司 Method, video terminal and system for enabling multi-party video calling
EP2700244B1 (en) 2011-04-21 2016-06-22 Shah Talukder Flow-control based switched group video chat and real-time interactive broadcast
KR101183864B1 (en) 2012-01-04 2012-09-19 휴롭 주식회사 Hub system for supporting voice/data share among wireless communication stations and method thereof
CN103533294B (en) * 2012-07-03 2017-06-20 中国移动通信集团公司 The sending method of video data stream, terminal and system
JP6174501B2 (en) * 2014-02-17 2017-08-02 日本電信電話株式会社 Video conference server, video conference system, and video conference method
CN106464842B (en) * 2014-03-31 2018-03-02 宝利通公司 Method and system for mixed topology media conference system
CN105450976B (en) * 2014-08-28 2018-08-07 南宁富桂精密工业有限公司 video conference processing method and system
CN109076128B (en) * 2016-02-29 2020-11-27 铁三角有限公司 Conference system
EP3270371B1 (en) * 2016-07-12 2022-09-07 NXP USA, Inc. Method and apparatus for managing graphics layers within a graphics display component
US10708728B2 (en) * 2016-09-23 2020-07-07 Qualcomm Incorporated Adaptive modulation order for multi-user superposition transmissions with non-aligned resources
CN107968768A (en) * 2016-10-19 2018-04-27 中兴通讯股份有限公司 Sending, receiving method and device, system, the video relaying of Media Stream
CN106572320A (en) * 2016-11-11 2017-04-19 上海斐讯数据通信技术有限公司 Multiparty video conversation method and system
JP6535431B2 (en) 2017-07-21 2019-06-26 レノボ・シンガポール・プライベート・リミテッド Conference system, display method for shared display device, and switching device
CN111314738A (en) * 2018-12-12 2020-06-19 阿里巴巴集团控股有限公司 Data transmission method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0724362A1 (en) * 1995-01-30 1996-07-31 International Business Machines Corporation Priority controlled transmission of multimedia streams via a telecommunication line
US5684527A (en) * 1992-07-28 1997-11-04 Fujitsu Limited Adaptively controlled multipoint videoconferencing system
WO2001065848A1 (en) * 2000-03-02 2001-09-07 Koninklijke Philips Electronics N.V. System and method for improving video transmission over a wireless network.
WO2002058390A1 (en) * 2001-01-17 2002-07-25 Ericsson Inc. Adaptive display for video conferences

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5629736A (en) * 1994-11-01 1997-05-13 Lucent Technologies Inc. Coded domain picture composition for multimedia communications systems
US6314302B1 (en) * 1996-12-09 2001-11-06 Siemens Aktiengesellschaft Method and telecommunication system for supporting multimedia services via an interface and a correspondingly configured subscriber terminal
CN1190081C (en) * 1997-03-17 2005-02-16 松下电器产业株式会社 Method and apparatus for processing, transmitting and receiving dynamic image data

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7653251B2 (en) 2004-08-25 2010-01-26 Nec Corporation Method, apparatus, system, and program for switching image coded data
EP1633120A3 (en) * 2004-08-25 2008-02-27 Nec Corporation Method, apparatus, system, and program for switching image coded data
EP1633120A2 (en) * 2004-08-25 2006-03-08 Nec Corporation Method, apparatus, system, and program for switching image coded data
WO2006034618A1 (en) * 2004-09-28 2006-04-06 Zte Corporation A method for perfoming a video conference by terminal dial-up
FR2875665A1 (en) * 2005-01-04 2006-03-24 France Telecom Video bit stream highlighting method for transmitting stream to videoconference participants, involves adjusting value of encoding quality parameter of video bit stream based on measured value of predefined parameter of audio bit stream
WO2006097226A1 (en) * 2005-03-14 2006-09-21 Sony Ericsson Mobile Communications Ab Communication terminals that vary a video stream based on how it is displayed
US7535484B2 (en) 2005-03-14 2009-05-19 Sony Ericsson Mobile Communications Ab Communication terminals that vary a video stream based on how it is displayed
EP1718077A2 (en) * 2005-04-29 2006-11-02 Broadcom Corporation System and method for video teleconferencing via a video bridge
EP1718077A3 (en) * 2005-04-29 2008-07-16 Broadcom Corporation System and method for video teleconferencing via a video bridge
US8279260B2 (en) 2005-07-20 2012-10-02 Vidyo, Inc. System and method for a conference server architecture for low delay and distributed conferencing applications
EP1966917A4 (en) * 2005-09-07 2011-07-06 Vidyo Inc System and method for a conference server architecture for low delay and distributed conferencing applications
EP1966917A2 (en) * 2005-09-07 2008-09-10 Vidyo, Inc. System and method for a conference server architecture for low delay and distributed conferencing applications
US9338213B2 (en) 2005-09-07 2016-05-10 Vidyo, Inc. System and method for a conference server architecture for low delay and distributed conferencing applications
US8872885B2 (en) 2005-09-07 2014-10-28 Vidyo, Inc. System and method for a conference server architecture for low delay and distributed conferencing applications
US8436889B2 (en) 2005-12-22 2013-05-07 Vidyo, Inc. System and method for videoconferencing using scalable video coding and compositing scalable video conferencing servers
US8773494B2 (en) * 2006-08-29 2014-07-08 Microsoft Corporation Techniques for managing visual compositions for a multimedia conference call
US9635314B2 (en) 2006-08-29 2017-04-25 Microsoft Technology Licensing, Llc Techniques for managing visual compositions for a multimedia conference call
EP2060104A1 (en) * 2006-08-29 2009-05-20 Microsoft Corporation Techniques for managing visual compositions for a multimedia conference call
US20080068446A1 (en) * 2006-08-29 2008-03-20 Microsoft Corporation Techniques for managing visual compositions for a multimedia conference call
EP2060104A4 (en) * 2006-08-29 2014-11-05 Microsoft Corp Techniques for managing visual compositions for a multimedia conference call
US8502858B2 (en) 2006-09-29 2013-08-06 Vidyo, Inc. System and method for multipoint conferencing with scalable video coding servers and multicast
US7869705B2 (en) 2008-01-21 2011-01-11 Microsoft Corporation Lighting array control
US8130257B2 (en) 2008-06-27 2012-03-06 Microsoft Corporation Speaker and person backlighting for improved AEC and AGC
WO2011056942A1 (en) * 2009-11-04 2011-05-12 Qualcomm Incorporated Controlling video encoding using audio information
US8780978B2 (en) 2009-11-04 2014-07-15 Qualcomm Incorporated Controlling video encoding using audio information
EP2540080A4 (en) * 2010-02-24 2017-06-14 Ricoh Company, Limited Transmission system
GB2480138A (en) * 2010-05-07 2011-11-09 Intel Corp Multi-User Feedback Influencing Delivered Audiovisual Quality
GB2480138B (en) * 2010-05-07 2012-04-18 Intel Corp System,method,and computer program product for multi-user feedback to influence audiovisual quality
US8848020B2 (en) 2010-10-14 2014-09-30 Skype Auto focus
WO2012072276A1 (en) * 2010-11-30 2012-06-07 Telefonaktiebolaget L M Ericsson (Publ) Transport bit-rate adaptation in a multi-user multi-media conference system
GB2491852A (en) * 2011-06-13 2012-12-19 Thales Holdings Uk Plc Rendering Active Speaker Image at Higher Resolution than Non-active Speakers at a Video Conference Terminal

Also Published As

Publication number Publication date
WO2003065720A1 (en) 2003-08-07
HK1058450A1 (en) 2004-05-14
GB0202101D0 (en) 2002-03-13
JP2005516557A (en) 2005-06-02
GB2384932B (en) 2004-02-25
KR20040079973A (en) 2004-09-16
CN1618233A (en) 2005-05-18
FI20041039A (en) 2004-09-29

Similar Documents

Publication Publication Date Title
GB2384932A (en) Video conferencing system that provides higher quality images for most active speaker
US11503250B2 (en) Method and system for conducting video conferences of diverse participating devices
US8289369B2 (en) Distributed real-time media composer
US7627629B1 (en) Method and apparatus for multipoint conferencing
US8514265B2 (en) Systems and methods for selecting videoconferencing endpoints for display in a composite video image
KR100880150B1 (en) Multi-point video conference system and media processing method thereof
CN101755454B (en) Method and apparatus for determining preferred image format between mobile video telephones
CN101147400A (en) Split screen multimedia video conference
US9743043B2 (en) Method and system for handling content in videoconferencing
CN104980683A (en) Implement method and device for video telephone conference
US20140002584A1 (en) Method of selecting conference processing device and video conference system using the method
CN105122791A (en) Method and a device for optimizing large scaled video conferences
CN111385515B (en) Video conference data transmission method and video conference data transmission system
EP2227013A2 (en) Virtual distributed multipoint control unit
GB2378601A (en) Replacing intra-coded frame(s) with frame(s) predicted from the first intra-coded frame
CN112839197B (en) Image code stream processing method, device, system and storage medium
Mankin et al. The design of a digital amphitheater
Jia et al. Efficient 3G324M protocol Implementation for Low Bit Rate Multipoint Video Conferencing.
Chatras Telepresence: Immersive Experience and Interoperability
Gharai et al. High Definition Conferencing: Present, Past and Future

Legal Events

Date Code Title Description
REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1058450

Country of ref document: HK

PCNP Patent ceased through non-payment of renewal fee

Effective date: 20080130