WO2015105408A1

WO2015105408A1 - Self-learning and intelligent system for continually improving quality and performance of multimedia conference

Info

Publication number: WO2015105408A1
Application number: PCT/MY2015/000002
Authority: WO
Inventors: Syed Osman AlHaddad SYED MOHAMED
Original assignee: Wafina Sdn. Bhd.
Priority date: 2014-01-08
Filing date: 2015-01-07
Publication date: 2015-07-16
Also published as: MY174662A

Abstract

The present invention relates to a system for delivering a high performance multimedia conference based on self-learning and intelligent processing. The invention also includes a system for implementing a modular software based Multi-site Connection Unit (MCU) which further improves the performance and quality of service of multimedia conferencing systems. By learning various factors that affect quality and performance of a multimedia conference and by responding to such learning by intelligently generating an optimum conferencing policy to initiate, maintain and terminate a session of the multimedia conference, the present invention continually improves the quality and performance of multimedia conference. The modular software based MCU reduces the average resource requirements of multimedia conferencing servers by presenting a dynamically scalable software MCU architecture. The conference policy is adaptively corrected and updated using feedback.

Description

SELF-LEARNING AND INTELLIGENT SYSTEM FOR CONTINUALLY IMPROVING QUALITY AND PERFORMANCE OF MULTIMEDIA CONFERENCE

FIELD OF INVENTION

The present invention generally relates to a self-learning and intelligent system for continually improving quality and performance of a multimedia conference.

BACKGROUND OF INVENTION

Multimedia applications are gaining wide acceptance in internet and other network domains. A multimedia conference is a computer network based communication and collaboration platform. It includes video, audio and text conferencing facilities with other built in services like data, whiteboard, desktop sharing, file sharing etc. Companies today use multimedia conferences at an increasing pace. A multimedia conference is simple if there are only two parties in the conference. The conference can be set up and maintained in a similar way as in public telephone networks. But when there are more participants, it requires sophisticated hardware and software facilities.

Traditionally, different types of traffic required dedicated networks; for example, video was transmitted over the Integrated Services Digital Network (ISDN) and voice via the Public Switched Telephone Network (PSTN). In the internet era, using a common channel to handle all kinds of traffic became a necessity. The main advantages are reduced operational cost compared with running separate networks and simpler installation and maintenance, with corresponding economic benefits. Presenting text, graphics, video, animation, sound and files in an integrated way over Internet Protocols is becoming a trend.

An important issue in multimedia conference is performance of operation and quality of service (QoS). QoS includes providing all the clients with undisrupted audio and video quality. QoS comprises measurable factors like network availability, bandwidth, jitter, delay and loss, and immeasurable factors such as emission priority and discard priority. When a multimedia conferencing system can provide the expected video and audio quality to all the participants, with all other built in services performing as expected, the performance of the multimedia conferencing session is said to be good. When the components of the multimedia conference are delivered with unexpected and unsatisfactory quality, the performance of the multimedia system is said to be degraded or bad. Multimedia conference is a form of human to human communication. By nature, users expect real time behaviour from multimedia conferencing applications. To make it real time, the performance of the multimedia system must be good. The result is a requirement for the maximum possible QoS.

In the past, multimedia conferences were not widely used because of the low video quality caused by limitations of network components. As technology has developed and device capabilities have improved, the quality of multimedia conferences has improved considerably. For example, frame rate is one of the factors on which quality depends, and achievable frame rates have risen sharply in recent times. Another factor is the growth of networks that support multicast. There is also a considerable increase in bandwidth capabilities.

Today, the internet can handle different types of flows simultaneously, like audio, video, text, file etc., but some barriers remain, such as congestion, capacity constraints and interaction diversity. Different publications propose different ways to overcome these barriers. The following three are the most important:

• Over Provisioning

• Separate Networks

• QoS

Over Provisioning: It means increasing bandwidth so that the various traffic types can be carried at once without delay or conflict. It is a good solution for small networks, but when expansion is required, there might be a need to multiply network resource capacities. Because of high resource consumption and costly implementation, over provisioning isn't always a desirable solution.

Separate networks: This is another approach, which isolates network components for each kind of traffic and so solves the interference among voice, video and data. But the issues of high bandwidth requirement and delay between various components like audio and video still remain. Furthermore, cost is another concern in this solution. It is possible to solve the bandwidth problem by over provisioning each separate network, but then the cost increases sharply. The above two methods are less preferred due to cost and resource overuse, unless a special situation arises.

QoS: QoS is the ability to treat different types of traffic differently in order to meet their requirements. Implementing QoS needs a predefined level of network capacity. Once that capacity is supplied, QoS manages the traffic in the best way without additional resources.

End to end QoS of multimedia applications can be considered from three aspects: application, network and end points. The network and the different end points have distinct QoS parameters. Considering these facts, QoS can be classified as follows:

• Application level QoS:

User related metrics like throughput, latency, availability and continuity of service (frame size, frame rate, image, audio clarity etc.).

• System level QoS:

End point system requirements such as CPU and OS requirements. A variety of systems can be connected to a conference, including Personal Computers, IP telephones, mobile phones, tablet PCs etc. The performance characteristics of these devices vary widely.

• Network level QoS:

These are the most significant parameters of QoS, which are communication related, such as bandwidth, jitter, delay, loss and reliability (network availability).

Some of the QoS requirements are presented as follows:

Bandwidth and throughput: Bandwidth is the available capacity of the connection between two terminals, most commonly expressed in bits per second (bps). Throughput differs slightly from bandwidth in that it stands for the effective bandwidth actually provided by the network.

Delay or latency: It specifies the time it takes for a packet to travel from the source to the destination. Applications and network devices can cause delay.

Jitter (delay variation): Jitter is the variation in the interval between subsequent packets. It is caused by network congestion, route changes etc.

Loss: It is the proportion of packets, out of all packets sent, that are not received at the destination. The success of QoS depends on reducing this factor.

Reliability: Some applications are sensitive to packet loss such as real-time applications. Thus there must be some mechanism either in application or network to minimize the packet loss, such as forward error correction (FEC).
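The network level metrics listed above can be made concrete with a short calculation. The following is a minimal sketch only: the `Packet` record, the `qos_metrics` function and the choice of jitter estimator (mean absolute difference between consecutive one-way delays) are assumptions made for illustration and are not taken from this disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    """Hypothetical per-packet record used only for this illustration."""
    size_bytes: int
    sent_at: float                 # send time in seconds
    received_at: Optional[float]   # receive time in seconds, None if lost


def qos_metrics(packets: List[Packet]) -> dict:
    """Compute throughput, mean delay, jitter and loss ratio for one flow."""
    received = [p for p in packets if p.received_at is not None]
    if not received:
        raise ValueError("at least one packet must have been received")

    # Loss: share of sent packets that never arrived.
    loss_ratio = 1 - len(received) / len(packets)

    # Delay: time from leaving the source to reaching the destination.
    delays = [p.received_at - p.sent_at for p in received]
    mean_delay = sum(delays) / len(delays)

    # Jitter (assumed estimator): mean absolute change in delay
    # between consecutive received packets.
    changes = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(changes) / len(changes) if changes else 0.0

    # Throughput: bits actually delivered per second of observation.
    duration = max(p.received_at for p in received) - min(p.sent_at for p in packets)
    throughput_bps = 8 * sum(p.size_bytes for p in received) / duration

    return {
        "throughput_bps": throughput_bps,
        "mean_delay_s": mean_delay,
        "jitter_s": jitter,
        "loss_ratio": loss_ratio,
    }


sample = [
    Packet(1000, 0.00, 0.05),
    Packet(1000, 0.02, 0.08),
    Packet(1000, 0.04, None),   # lost packet
    Packet(1000, 0.06, 0.10),
]
metrics = qos_metrics(sample)
```

Throughput here counts only delivered bits, which is why it can fall below the nominal bandwidth of the link when packets are lost.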

Different mechanisms are proposed to handle various issues relating to QoS in various publications and articles.

Presently there are different technologies available to set up a conference and maintain QoS.

There are two challenges for a good conferencing platform; providing facilities for a large number of clients and giving all of them highest possible QoS.

Generally, in multimedia conference, in order to provide QoS, different technologies are used. Two most important of them are Differentiated Service (DiffServ) and Integrated Service (IntServ).

IntServ performs QoS with the use of resource reservation and admission control mechanisms. IntServ relies on the Resource Reservation Protocol (RSVP) to request the expected QoS from the network and reserve bandwidth. If the reservation attempt succeeds, the application can begin the communication; if not, the application may reduce its requirements to reach an agreement with the network. IntServ assures network QoS metrics such as bandwidth, loss and delay, so it can be called hard QoS. There are two main shortcomings of IntServ. One is that it has a non-scalable architecture. Because of the increasing overhead of continuous signaling and controlling flows due to the RSVP architecture, IntServ is not suitable for enterprise networks. Another shortcoming is that all the devices along the path between end points must be RSVP enabled to satisfy the required QoS. Because of these shortcomings IntServ is not much used in multimedia conferencing platforms.

The DiffServ model of QoS works by classifying traffic into different classes. DiffServ uses a field in the IP packet header, called the DiffServ Code Point (DSCP), to mark the service a packet should get from the network. DSCP has two popular values:

• Expedited Forwarding (EF)

• Assured Forwarding (AF)

EF: It provides the packet with low latency, loss and jitter to achieve the highest possible priority from the network. So it fits voice only solutions like Voice over IP (VoIP). Expedited Forwarding might not work well with multimedia conference: since voice and video would be marked with the same priority and video packets are larger, the delay for voice packets increases while they wait for video packets to be processed. In addition, the small size of EF queues leads to increased video packet loss.

AF: It guarantees the delivery of packets as long as the path is not oversubscribed. If congestion happens it will drop the packets according to a twelve DSCP value pattern. Some reasonable amount of jitter, delay and loss are tolerated. DiffServ model of QoS with DSCP value of AF is usually preferred in multimedia conference.
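For orientation, the code points behind EF and the twelve AF values can be computed directly. The numeric values below are the standard IETF DiffServ code points, which the text above does not list; the function and variable names are illustrative only.

```python
# Expedited Forwarding uses a single code point (binary 101110).
EF_DSCP = 46


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Return the DSCP for AF<class><drop>, e.g. AF41 -> 34.

    There are four AF classes, each with three drop precedences,
    which gives the twelve AF code points.
    """
    if af_class not in (1, 2, 3, 4):
        raise ValueError("AF class must be 1..4")
    if drop_precedence not in (1, 2, 3):
        raise ValueError("drop precedence must be 1..3")
    return 8 * af_class + 2 * drop_precedence


def tos_byte(dscp: int) -> int:
    """DSCP occupies the upper six bits of the former IPv4 TOS byte."""
    return dscp << 2


# All twelve AF code points, keyed by their conventional names.
AF_TABLE = {
    f"AF{c}{d}": af_dscp(c, d)
    for c in (1, 2, 3, 4)
    for d in (1, 2, 3)
}
```

Within a class, a higher drop precedence means the packet is discarded earlier under congestion, so a conferencing application might, as one possible choice, mark video with AF41 and less critical traffic with a higher drop precedence.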

These are some of the shortcomings of DiffServ, as mentioned in several publications.

Provisioning: Unlike RSVP/IntServ, DiffServ needs to be provisioned. Setting up the various classes throughout the network requires knowledge of the applications and traffic statistics for aggregates of traffic on the network. This process of application discovery and profiling can be time-consuming, although tools such as NBAR application discovery, protocol analysers, and Remote Monitoring (RMON) probes can make these activities easier.

Billing and Monitoring: Management is still a big issue. Even though packets/sec, bytes/sec, and many other counters are available via the class-based Management Information Base (MIB), billing and monitoring are still difficult issues. For example, it may not be sufficient to prove to a customer that 9 million VoIP packets got the EF PHB treatment at all times, since it is possible that the qualitative nature of the calls that the customer made was very poor.

Loss of Granularity: Even though QoS assurances are being made at the class level, it may be necessary to drill down to the flow-level to provide the requisite QoS. For example, although all HTTP traffic may have been classified as gold, and a bandwidth of 100Mbps assigned to it, there is no inherent mechanism to ensure that a single flow does not use up that allocated bandwidth.

QoS and Routing: One of the biggest drawbacks of both the IntServ and DiffServ models is the fact that signalling/provisioning happens separately from the routing process. There may be a path in the network that has the required resources, even when RSVP/DiffServ fails to find the resources. True QoS, with maximum network utilization, will arrive with the combination of traditional QoS and routing.

Another important aspect in multimedia conference is providing conferencing facilities to a large number of clients and a wide range of client systems like personal computers (PCs), tablet PCs, mobile phones, IP phones etc. A corporation with a large number of employees may require conferencing between hundreds of its employees, and a successful multimedia conferencing system should facilitate this. The requirements are high when a large university tries to conduct an online class room with multimedia facilities for its global student community. Often more than one conference needs to be run simultaneously. Even if the participant count in a single multimedia conference is low, the total number of people using a single conference facility will be large when multiple conferences are provided by a single conference service provider. This requirement can be high, for example, when a university is conducting multiple classes at a time or when several universities are simultaneously using the multimedia services offered by a single global conferencing service provider. In order to ease the job of connecting and maintaining calls, different companies use call administrators to set up and manage the call. Point to point communication, in which each participating party in the conference communicates with each of the others, is an inefficient technique and requires large bandwidth. Another set up, called a Multipoint Control Unit (MCU), is used to provide a centralized call management facility.

An MCU is a device used to handle conferences with more than two participants. It is also called a multi-site control unit. It can be a dedicated hardware device or MCU software executed on a server. An MCU thus connects the various devices in the conference, and these devices are called end points (EP) or clients. End points can be personal computers, IP telephones, mobile phones, tablet PCs etc. The people who use the end points to take part in a multimedia conference are called participants of the multimedia conference, or participants. In some cases participants can also be intelligent computer systems which can interact with human beings and/or with each other in a multimedia conference.

All the data from the end points are given to the MCU, and the MCU mixes all the data and sends the mixed data to all the EPs. The mixed data contain video, audio, instant text messages, data etc. from all the end points, so that each end point can receive the stream from a single source. Therefore an MCU is sometimes referred to as a "bridge". Most MCUs allow multiple conferences simultaneously, so that different conferences can be carried out by a single MCU. There are situations where the number of participants in a multimedia conference is so large that a single MCU alone cannot handle the conference. The same problem arises when the MCU can handle a single conference but cannot handle multiple conferences due to unavailability of resources. In this situation multiple MCUs are used to handle all the participants in a conference or to handle multiple conferences. The MCUs can be in one location or in geographically different locations, like one in Kuala Lumpur, one in New Delhi, one in Washington DC etc. It describes a routing method to utilize the available resources of the MCUs so that the maximum number of participants or the maximum number of conferences can be allowed at a time. The limit depends on the sum of the available input ports of each MCU. The geographical locations of the MCUs are selected in such a way as to make the routing process easy.
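The statement that the limit depends on the sum of the available input ports can be illustrated with a small allocation sketch. This is a hypothetical greedy assignment written for illustration; it is not the routing method of any cited publication, and the function name and the example port counts are assumptions.

```python
from typing import Dict, List


def assign_endpoints(endpoints: List[str], free_ports: Dict[str, int]) -> Dict[str, str]:
    """Assign each end point to an MCU that still has a free input port.

    Greedy rule (an assumption for this sketch): always pick the MCU with
    the most free ports left, breaking ties by MCU name.
    """
    remaining = dict(free_ports)

    # The conference fits only if the summed free ports cover all end points.
    if len(endpoints) > sum(remaining.values()):
        raise ValueError("not enough free input ports across all MCUs")

    assignment = {}
    for endpoint in endpoints:
        mcu = max(sorted(remaining), key=lambda name: remaining[name])
        assignment[endpoint] = mcu
        remaining[mcu] -= 1
    return assignment


# Hypothetical free port counts for two MCU sites.
ports = {"kuala_lumpur": 2, "new_delhi": 1}
plan = assign_endpoints(["ep1", "ep2", "ep3"], ports)
```

A location-aware variant would prefer the MCU nearest to each end point and fall back to this rule only when that MCU is full.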

There are several publications relating to the various technical aspects of the video conferencing system, especially MCU.

US Patent No. US 7,456,858 B2 describes a distributed MCU set up to manage conference calls with large number of participants or to manage multiple conferences.

US Patent No US 7,492,730 B2 describes cascading local conferences into multi-site conferences.

US Published Patent Application No. US 2013/0044647 A1 describes a method of bandwidth extension in an MCU.

These prior art documents describe some of the relevant aspects of the present invention. Some details about the general background art of multimedia conference and distributed architectures are taken from the above publications. There are many hardware based and software based multimedia conferencing platforms available in the market. Hardware based solutions are costly and only affordable to big enterprises. Therefore there is an increasing trend for multimedia conferencing applications to become software centric, using the internet and client server architecture without adding much dedicated conferencing hardware. Software based architectures use at least one software based MCU running on a server.

There are many technologies in use for provisioning and providing QoS in multimedia conferencing services, including those explained here. There are several factors which affect QoS, like bandwidth, jitter, delay, loss and reliability (network availability). There are also user centric factors, like the user's expectations of frame size, frame rate, image and audio clarity etc., which affect the QoS concept. These factors are in turn dependent on many other factors. For example, delay or jitter in a network may be caused by congestion due to excess traffic in the network, and the excess traffic may be due to some other predictable or unpredictable factors, like a large number of people viewing a football match online.

But all the current technologies use fixed policies or algorithms for conference set up, scheduling the MCUs, managing the conference and providing QoS. These policies or algorithms remain constant until another version of the conference software is released. Sometimes a modification to these policies or algorithms is made only when a bug or flaw is detected or the system crashes due to some error. Adaptive QoS methods available today also adjust the system performance by using fixed policies. These methods cannot learn, and thereby adapt to, changes happening in the network, client device specifications or participant nature.

The drawbacks of IntServ QoS have made DiffServ QoS an option in multimedia conferencing systems using Internet Protocol, even though IntServ provides a guaranteed QoS. DiffServ also has to tackle several drawbacks of its own, some of which were explained earlier. Most of these drawbacks can be attributed to the fact that the policy developers for these QoS models assume that all the stakeholders of a multimedia application on the internet follow a fixed behavioural model. For example, a user's expectation of QoS may change from time to time. Network conditions can also change, and such changes can be predicted by learning the network condition over time. Learning based intelligence can therefore help assure good QoS for a multimedia conference.

In order to increase the number of participants, multiple MCUs are nowadays cascaded. Different MCUs can be located at different business locations of the service provider, and clients or end points can be allotted to each MCU based on the location of the end point. There are some publications and implementations that describe or implement techniques to assign end points to an MCU. The nature of the networks in the internet can change at any time. The change may be temporary or permanent. For example, the maximum allowed bandwidth for a server hosting an MCU may be increased by the service provider. This is a permanent change in the nature of the network.

In another case, there may be a temporary breakdown or slowdown of services due to increased traffic in a network. This breakdown or slowdown may be predictable, for example due to a million people viewing a football match, or unpredictable like a broken undersea cable.

Another case is the nature of the end point used by a participant. Sometimes a participant uses a fixed device for a long period of time in order to participate in multimedia conferences. Some participants frequently change the devices they use to participate in multimedia conferences.

There are many other properties of the network, end points, participants or other systems in a multimedia conferencing service that can change over time. There are multiple problems with using a fixed policy or algorithms for call set up, MCU scheduling and call management in reference to the above described changing nature of the network, clients, end points or other systems in a multimedia conferencing service. The above policies and algorithms lack the intelligence to track the changes which can affect quality and performance and adjust the policies so as to provide an expected QoS for the multimedia conference.

Another issue with the existing technologies concerns software based Multi-site Connection Units (MCUs). The existing software based MCUs run as a single software module with a fixed number of input ports limited by the software application. These are very large software applications which demand high computing resources, like memory and processing power, all the time. When the number of participants is equal to or very near the number of available ports of the software MCU, the system works fine. The same is the case when there is a large number of simultaneous multimedia conferences in a single software MCU, so that all or nearly all of the input ports are assigned to end points. But these two cases may not happen all the time. When the number of participants in a conference is much lower than the number of available ports of a software MCU, the resources of the server are wasted and the power efficiency of the system is degraded. This is because the software MCU contains many components, like audio and video mixers, that require more resources and power. The whole software MCU runs as a single application which requires large memory and processing power. A multimedia conference with a small number of participants requires less processing power and fewer resources. If the facilities of the software MCU can be dynamically scaled based on requirement, power, memory and processing resources can be saved.

SUMMARY OF THE INVENTION

Accordingly, the object of the present invention is to provide a system which overcomes the described disadvantages of the prior art. Accordingly, the present invention provides a self-learning and intelligent system for continually improving quality and performance of a multimedia conference comprising:

(a) networked computer systems comprising (i) a means to collect information and metrics from the internet or any network in which the multimedia conference is established or from persons or devices taking part in the multimedia conference in the form of computer readable data, (ii) a computer memory system for storing the collected computer-readable data for further processing, (iii) a processor and a self-learning, adaptive, intelligent and knowledge processing computer application and algorithm for manipulating the computer-readable data and (iv) a hardware and software communication interface for communicating computer-readable data between systems and subsystems,

(b) at least one self-learning, adaptive, and intelligent computer system comprising: (i) a means of accepting the computer readable data in various formats, (ii) a means of accepting conferencing policy information for previous sessions of various multimedia conferences, (iii) the self-learning, adaptive, intelligent and knowledge processing computer application and algorithm for processing computer readable data in a format together with information regarding at least one previous session of a multimedia conference and generating an updated conference policy for a currently running multimedia conference or generating a new conferencing policy for a session of a multimedia conference that is to be initiated in future and (iv) a means of implementing the conference policy thus updated to maintain and terminate a currently running multimedia conference with improved quality and performance or a means of implementing the newly generated conference policy to initiate, maintain and terminate a new session of a multimedia conference with improved quality and performance and

(c) a system for feedback and evaluation of success of the updated conference policy or newly generated conference policy for a multimedia conference, in terms of improved quality and performance, comprising: (i) a means for collecting information and metrics from the internet or any network in which the multimedia conference with the updated conferencing policy or new conferencing policy is established and/or from persons and/or devices taking part in multimedia conference in the form of computer readable data, wherein the collected information and metrics provide details regarding quality of service (QoS) and performance of the multimedia conference established with the updated conferencing policy or the new conferencing policy, (ii) a means to store the information and metrics so collected in a computer readable memory for further processing, (iii) a means to use a self-learning, adaptive, intelligent and knowledge processing computer application and algorithm for processing the information and metrics and (iv) a means to update the current policy if QoS and performance of operation is not improved or keeping the current policy if the QoS and performance of operation is improved.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is now described in further detail by reference to figures wherein,

Fig. 1 is an exemplary implementation of a multimedia conferencing system using a conference controller;

Fig. 2 is an exemplary implementation of another conference controller with an external Master MCU running on master server;

Fig. 3 is an exemplary implementation of yet another conference controller with a modular software MCU, both of them running as a single application; and

Fig. 4 is the architectural block diagram of the Modular software based Multi-site Connection Unit (Modular Software MCU) whereby the dotted lines illustrate dynamic linking.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention generally relates to a self-learning and intelligent system for continually improving quality and performance of multimedia conferencing systems. A first aspect of the invention is to provide a system for continuously improving performance and quality of a multimedia conference by self-learning and intelligent processing (SLIP) method, which is summarized here. The performance and quality of a currently running multimedia conference session is continuously improved to provide the performance and quality expected by each participant. The quality and performance of a new session of multimedia conference will be improved as compared to the previous sessions of that conference.

The term previous session of a multimedia conference, as used in relation to a new session or a future session of a multimedia conference, is defined as (i) a conference with identical or similar participating clients, (ii) a conference with participating clients at identical or similar geographical locations, (iii) a conference with participants using identical or similar kinds of end points, or (iv) a conference with identical or similar hardware and/or software and/or network conditions, or any combination thereof.

The system collects various information and metrics from any networks in which a multimedia conference is established and also from persons and devices taking part in the multimedia conference, in the form of computer readable data. The collected information and metrics are stored in memories in various formats, for example an organized knowledge format. At least one self-learning, adaptive, and intelligent computer system is used for processing the above computer readable data together with information, metrics and conference policies from at least one previous session of a multimedia conference. Based on this intelligent processing, an updated conference policy for a currently running multimedia conference or a new conferencing policy for a future session of multimedia conference is generated. Implementation of the updated conferencing policy on a currently running multimedia conference session, or implementation of the new conferencing policy on a future multimedia conference session, will improve the quality and performance of the respective multimedia conferences.
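The keep-or-update rule stated in the summary, applied to a policy generated as described above, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Policy` fields, the `qos_score` weighting and the `revise` step are hypothetical stand-ins, since the disclosure does not specify the learning algorithm or the contents of a conference policy.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Policy:
    """Hypothetical conference policy with a single tunable parameter."""
    video_kbps: int


def qos_score(metrics: Dict[str, float]) -> float:
    """Assumed scalar quality measure: higher is better."""
    return -(metrics["delay_ms"] + metrics["jitter_ms"] + 1000 * metrics["loss_ratio"])


def revise(policy: Policy) -> Policy:
    """Placeholder for the learning step: lower the video bitrate by 20%."""
    return Policy(video_kbps=max(128, policy.video_kbps * 4 // 5))


def feedback_step(policy: Policy, before: Dict[str, float], after: Dict[str, float]) -> Policy:
    """Keep the policy if measured QoS improved, otherwise update it."""
    if qos_score(after) > qos_score(before):
        return policy
    return revise(policy)


current = Policy(video_kbps=1000)
baseline = {"delay_ms": 120, "jitter_ms": 30, "loss_ratio": 0.05}
improved = {"delay_ms": 80, "jitter_ms": 10, "loss_ratio": 0.01}

kept = feedback_step(current, baseline, improved)      # QoS improved
updated = feedback_step(current, baseline, baseline)   # QoS not improved
```

In a fuller system the `revise` step would draw on metrics and policies from previous sessions as defined above, rather than applying a fixed adjustment.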

There is also a system for feedback and evaluation of success of the updated conference policy or newly generated conference policy for a multimedia conference, in terms of improved quality and performance. The feedback and evaluation process ensures a continual improvement in the quality and performance of a multimedia conference.

A second aspect of the invention is a modular software based multipoint connection unit (modular software MCU). The performance of a multimedia conference and related applications can be further improved by the modular software MCU. The modular software MCU comprises a core controller, dynamically linkable micro MCU modules and dynamically linkable media accelerator modules. A dynamically linkable micro MCU module can accommodate a fixed number of participants when linked to the core controller.

In some embodiments of the invention, the core controller can act as a standalone MCU for multimedia conferences with a small number of participants. For this, the core controller contains an embedded micro MCU. If more participants are to be added to a conference than the embedded micro MCU in the core controller can support, the required number of micro MCU modules is dynamically linked to the core controller. But in some embodiments, the core controller performs control functions only and needs at least one dynamically linkable micro MCU to support end points in a multimedia conference.

The multisite processors (MP) in the embedded micro MCU and in the dynamically linkable micro MCU modules perform processing on the multimedia contents, like format conversion, mixing etc. If more processing capability is required in any multimedia conference, then a sufficient number of dynamically linkable media accelerator modules is linked to the core controller. A dynamically linkable media accelerator module provides high performance processing on audio, video, data, text or other multimedia components communicated in a multimedia conference. This modular approach can save processing power, memory and other computational resources used in enabling multimedia conferences. It facilitates a capability on demand approach where the capabilities of a multimedia conferencing system are scaled based on requirement.
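The capability on demand idea reduces, for the micro MCU modules, to a simple count. The sketch below assumes each linkable module supports the same fixed number of participants; the function name, parameter names and the example capacities are illustrative and not taken from the disclosure.

```python
def micro_mcus_needed(participants: int, module_capacity: int, embedded_capacity: int = 0) -> int:
    """Number of micro MCU modules to link to the core controller.

    embedded_capacity is the number of participants the embedded micro MCU
    can serve on its own; use 0 for a core controller that performs control
    functions only.
    """
    if participants < 0 or embedded_capacity < 0:
        raise ValueError("counts must not be negative")
    if module_capacity <= 0:
        raise ValueError("module_capacity must be positive")

    # Participants that do not fit into the embedded micro MCU.
    overflow = max(0, participants - embedded_capacity)

    # Ceiling division without floating point.
    return -(-overflow // module_capacity)


# Hypothetical capacities: embedded micro MCU serves 4, each module serves 8.
small = micro_mcus_needed(3, module_capacity=8, embedded_capacity=4)
medium = micro_mcus_needed(20, module_capacity=8, embedded_capacity=4)
control_only = micro_mcus_needed(5, module_capacity=8)
```

A small conference links no modules at all, which is where the resource saving over a monolithic software MCU comes from.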

The primary advantage of the present invention is that it can dynamically improve the QoS of multimedia conference and related multimedia applications. The system itself automatically improves the performance and QoS without any hardware or software update. The system can learn what is affecting quality and performance and intelligently respond to that learning by improving the quality and performance of multimedia services. The present invention can intelligently adapt to different requirements of quality and performance.

The present invention generates various conferencing policies on how to initiate, maintain and terminate various multimedia services, as in multimedia conferencing services. The present invention can work in conjunction with other QoS techniques, for example DiffServ. It overcomes the drawbacks of present QoS techniques, which use fixed methods and policies. DiffServ needs to be provisioned. Setting up the various classes throughout the network requires knowledge of the applications and traffic statistics for aggregates of traffic on the network. This process is time consuming with existing technologies. The self-learning process disclosed in the present invention can continuously learn various applications and traffic statistics intelligently and can make predictions and suggestions that can make provisioning easy. In the case of billing and monitoring, the present invention can provide qualitative information on various services in a multimedia conference through intelligent learning and avoid billing and monitoring based on traffic only. The present invention will also continuously improve the quality of the services, in order to achieve the expected QoS.

Based on intelligent self-learning the present invention can make predictions on expected data transfer by different applications and ensure smooth bandwidth allocation to various end points in a multimedia conferencing system. Through the SLIP process, the present invention can find a routing path that can assure enough resources to ensure expected performance and quality. Since the SLIP process updates its policies based on the learning, it can suggest various alternate routes in which a multimedia conference can be best enabled. Thus the present invention can assure maximum network utilization.
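
The route selection described above can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the names `Route` and `pick_route`, the per-route predictions and all numbers are assumptions introduced here. It assumes the learning process has already produced a predicted bandwidth and latency for each candidate route.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical candidate route with learned/predicted characteristics."""
    name: str
    predicted_bandwidth_kbps: float
    predicted_latency_ms: float


def pick_route(routes: List[Route], required_kbps: float) -> Optional[Route]:
    """Return the lowest-latency route that still meets the bandwidth need."""
    feasible = [r for r in routes if r.predicted_bandwidth_kbps >= required_kbps]
    if not feasible:
        return None  # no candidate can assure the expected quality
    return min(feasible, key=lambda r: r.predicted_latency_ms)


# Illustrative values only.
routes = [
    Route("A", 2000, 80),
    Route("B", 800, 20),
    Route("C", 3000, 45),
]
best = pick_route(routes, required_kbps=1500)
```

Route B has the lowest latency but is excluded because it cannot supply the required bandwidth, so the sketch falls back to the next best feasible route.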

The system can accommodate the varying nature of the endpoints used by the clients, frequent changes in the location of the client, changes in the network condition and the many other factors that affect the quality and performance of multimedia conference and related applications, and it presents the best performance for future sessions by learning from past and present sessions. The scalable and dynamically linkable software MCU in the SLIP multimedia conference architecture can provide a highly efficient solution for a resource managed conference architecture. The modular software MCU consists of control applications, dynamically linkable micro MCU applications and dynamically linkable media accelerators in a single server. Only the required numbers of micro MCUs or media accelerators are cascaded, which means system resources are used efficiently.

Distributed MCUs explained in the prior arts can accommodate a large number of clients and/or conferences simultaneously, but the present invention has the additional advantage of predicting the failure of an MCU through SLIP and accordingly reallocating the conference so that all the participants can have the same performance and QoS.

In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, but other embodiments may be utilized and other changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims. In the following description, numerous specific details are set forth to provide a thorough understanding of the invention. However, it is understood that the invention may be practiced without these specific details. In other instances, well-known structures and techniques known to one of ordinary skill in the art have not been shown in detail in order not to obscure the invention.

Fig. 1 illustrates a block diagram of an exemplary embodiment of a global multimedia conferencing system using the present invention. The illustrated embodiment is for reference only and one skilled in the art can appreciate the abstractions used. The system can be used by a large corporation having plurality of computing platforms, conference devices or conference end points, networking infrastructure and other systems and devices required for multimedia conference at plurality of locations or a global conference service provider having computing platforms, networking infrastructure and other systems and devices required for multimedia conference at plurality of locations in order to provide multimedia conferencing services to other corporations or individuals who possess multimedia conference end points and other required facilities for taking part in a multimedia conferencing service. The conference end points in this plurality of locations communicate with each other using internet or any other packet switching or circuit switching networks or combination of both. The communication between the conference end points at the above plurality of locations forms a multimedia conferencing session. The multimedia conferencing session between various conference end points is initiated, maintained and terminated by various computing platforms, computing applications and network infrastructures like master servers (105), conference servers (104) and conference controller (106A), MCUs etc.

In this invention, a global multimedia conferencing service provider, or a company that wants to set up a global multimedia conference of people, uses a collection of computer based applications, hereafter called multimedia conferencing system or multimedia conferencing architecture, to initiate, maintain and terminate multimedia conference using various kinds of conference end devices like PCs, tablet PCs, mobile phones etc.

The multimedia conferencing system is distributed globally in different computing devices of different capabilities and performance. Some of the computer based applications reside in powerful computers called webservers. In one exemplary implementation of the system, the global multimedia conferencing service provider has a plurality of sites or zones, where each site has a plurality of multimedia end points associated with at least one hardware MCU or at least one software MCU running on a web server. One unique aspect of this invention is modular software MCU in which every Software MCU will be composed of several dynamically linkable modules that can be linked together, depending upon the number of end points, to become single functional software MCU.

In one exemplary implementation of the invention, the global multimedia conferencing service provider has computer based applications, running on webservers called administration servers. The main part of the multimedia conferencing system called the Conference Control System or Conference Controller (106A) resides in the administration servers. The administration servers are also called master servers (105). The Conference Controller (106A) and related applications residing in the administration servers are the important part of SLIP based multimedia conferencing architecture disclosed here. There can be several preferred embodiments of the conference controller, examples of which are illustrated in Figs. 2 and 3. Not all the components of multimedia conferencing architecture are explained in this patent specification. Many existing components of multimedia conferencing architecture like gatekeepers, gateways etc. are not explained in this patent specification. A person skilled in the art can appreciate that, the abstractions used in this patent specification are for concentrating on the disclosed invention and its preferred applications. Explaining entire multimedia conferencing architecture is not intended in this patent specification.

In one exemplary implementation of the system, the participants log in to the website of the global conferencing service provider, using a web browser, to take part in a multimedia conference. In another implementation of the system, all the participants install a multimedia conferencing application on their respective conferencing devices. These applications are called conference interfaces. This application is used to take part in the multimedia conference without using a web browser. In both of the above implementations, an application called embedded learning module is installed on the respective conferencing devices to learn many parameters that can affect the performance and quality of a multimedia conferencing system. The embedded learning module is an additional system that can help the SLIP process.

A unique part of this invention is that, in the above mentioned multimedia conferencing system, a SLIP based system is implemented to dynamically improve QoS and performance characteristics of multimedia services.

A series of processes, hereafter called a conferencing policy, is executed by the present invention to begin, execute and terminate a particular multimedia conference in order to give maximum satisfaction for the participants in terms of quality and performance of the multimedia conference. For example, a particular participant will feel an improved video quality in a particular conference session as compared to a previous conference session. In another example, participants will feel a reduced connecting time with the other participants as compared to a previous conference session.

A conferencing policy is a series of methods and processes executed by the conference control system to initiate, execute and terminate a particular session of a multimedia conference. The complete steps in beginning, executing and terminating a particular session of multimedia conference are defined by the conferencing policy. It includes selecting the location of MCUs, linking various geographically distributed MCUs, linking various end points to the most suitable MCU, selecting the number of micro MCUs required in the modular software MCU, linking the dynamically linkable micro MCU modules (1, 2....n) in a modular software MCU, linking the dynamically linkable media controllers in a modular software MCU, providing best routing policies to the routing devices if required, determining the quality of multimedia content required for each participant, determining the QoS parameters for each participant, managing the computational resources for optimum efficiency and thus providing ecofriendly operation, integration with existing QoS techniques, ensuring compatibility with existing multimedia conferencing protocols etc. The method consists of a first step of registering all the participants who want to take part in the multimedia conference. One or more specific participants can be assigned as administrators to control a particular multimedia conference session, if required. Registration can be done by each participant at any one of their conferencing devices or the registration can be done by the administrators by providing details of all the participants who are supposed to take part in a particular conferencing session. In both cases, the system will collect more details of each participant through a series of self-learning actions. These details may include, but are not limited to, details like visual ability, audibility and intelligence of each participant.
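
As a purely illustrative aid, the elements of a conferencing policy enumerated above could be held in a simple record. The class name `ConferencingPolicy`, its field names and the sample values below are assumptions made for this sketch; the specification does not prescribe any particular data format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConferencingPolicy:
    """Hypothetical container for decisions a conferencing policy covers."""
    conference_id: str
    mcu_locations: List[str] = field(default_factory=list)
    endpoint_to_mcu: Dict[str, str] = field(default_factory=dict)
    micro_mcu_count: int = 1           # micro MCU modules to link
    media_accelerator_count: int = 0   # media accelerator modules to link
    video_quality: Dict[str, str] = field(default_factory=dict)

    def unassigned(self, endpoints: List[str]) -> List[str]:
        """List end points that the policy has not yet linked to an MCU."""
        return [e for e in endpoints if e not in self.endpoint_to_mcu]


# Illustrative values only.
policy = ConferencingPolicy(
    conference_id="weekly-review",
    mcu_locations=["site-1", "site-2"],
    endpoint_to_mcu={"alice-pc": "site-1", "bob-phone": "site-2"},
    micro_mcu_count=2,
    video_quality={"alice-pc": "high", "bob-phone": "low"},
)
missing = policy.unassigned(["alice-pc", "bob-phone", "carol-tablet"])
```

Keeping the decisions in one record makes it straightforward to compare the policy of one session with the policy generated for the next.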

Any one of the administrators can start a multimedia conference by registering a new conference or by initiating an already conducted conference. If a new conference is being registered, the administrator adds participants to the conference. A unique identifier or name is given to the newly registered conference. The administrator can schedule future conferences. After this, invitations are sent to each participant to join the conference at the scheduled time. During the invitation the participants are requested to provide information like whether they will participate in the conference, their expected conferencing device, their expected location at the time of the conference, what features of the conferencing system they expect to use etc. Based on this information and metrics and/or information and metrics gathered during participant registration and/or information and metrics gathered during conference registration by the administrator and/or much other information and metrics gathered using the self-learning process, and thereafter processing the above information and metrics, the multimedia conference system will select a conferencing policy for setting up the conference in order to provide maximum QoS to all the participants. If the conference is an already conducted one, in addition to the above mentioned information, the system will consider much other information gathered by learning all the previous occurrences of that conference with the same identifier or name. The policy for setting up the next occurrence of an already conducted conference will be based on processing all the information gathered by learning the previous occurrences of the conference. Even if the name or identifier is different, the present invention will learn from other conferences with similar or identical participants or any other similar or identical multimedia conferences, in order to set a conferencing policy based on SLIP.
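
One simple way to find "other conferences with similar or identical participants" is to compare participant sets. The sketch below uses Jaccard similarity, which is a technique chosen here for illustration and not one named in the specification. The function names, the 0.5 threshold and the sample data are likewise assumptions.

```python
from typing import Dict, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_conference(
    participants: Set[str],
    history: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the past conference whose participants overlap most, if any
    reaches the threshold; otherwise return None."""
    best_id: Optional[str] = None
    best_score = threshold
    for conf_id, past_participants in history.items():
        score = jaccard(participants, past_participants)
        if score >= best_score:
            best_id, best_score = conf_id, score
    return best_id


# Illustrative values only.
history = {
    "sales-sync": {"ann", "bob", "cy"},
    "design-review": {"dee", "eve"},
}
match = most_similar_conference({"ann", "bob", "cy", "dan"}, history)
```

The learned information attached to the matched conference could then seed the policy for the new one, even though its identifier differs.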

As mentioned earlier, a unique part of this multimedia conferencing system is the self-learning based quality and performance improvement. Learning can be defined as an intelligent process of gaining knowledge or skill by studying, practicing, being taught, or experiencing something. This definition of learning is more or less implemented in this invention so as to gather information and metrics regarding various factors that affect quality and performance of a multimedia conferencing system. The learning or self-learning is implemented by various systems and applications that take part in multimedia conference, including computer system. The learned information and metrics are processed to develop a conferencing policy that can improve the performance and quality rendered in multimedia conference.

One kind of learning is learning behaviour, intelligence and physical capabilities of the participants like hearing power, vision power etc. These factors can affect the apparent quality of a multimedia conference. For example, a person with high hearing power senses a low quality audio streaming more clearly than a person with low hearing power. The same is the case with vision power during video streaming. A person with very good vision power may require only a low quality video streaming whereas a person with low visual capabilities requires a higher quality video streaming in order to get similar satisfaction levels.

Intelligence of a participant can affect the apparent quality and performance of a multimedia conferencing system. Intelligence is often defined as the general mental ability to learn and apply knowledge to manipulate the environment, as well as the ability to reason and have abstract thought. Intelligence can also be defined as the adaptability to a new environment or to changes in the current environment, the ability to evaluate and judge, the ability to comprehend complex ideas, the capacity for original and productive thought, the ability to learn quickly and learn from experiences and even the ability to comprehend relationships. In a multimedia conferencing system where several participants are involved in conferencing, intelligence of the participants may determine how each participant interacts with the various components in the multimedia conference. A less intelligent person may take more time to adapt to different features than a person with more intelligence. A person with high intelligence will evaluate and judge performance and QoS of the multimedia conference while leaning more towards the actual cause of the multimedia conference, whereas a person with lesser intelligence can deviate from the actual cause of the multimedia conference and concentrate on features that are of less use in a particular multimedia conferencing session.

One kind of behavioral learning implemented in this invention is the participant's nature of changing end points or devices in a multimedia conferencing system. For example, a participant may be using his personal computer during office hours on weekdays, but will be using his mobile devices during non-office hours or weekends. The pattern of the participants' device usage is learned by the self-learning system and the present invention will calculate and allocate required resources for the particular device. For example, bandwidth and other resource requirements for a mobile hand held device are different from bandwidth and other resource requirements for a personal computer or dedicated conferencing device. With the participants' pattern of device usage known, the present invention can allocate resources intelligently and ensure a better QoS and performance for all the participants.
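
The device usage pattern described above can be illustrated with a simple frequency count per time slot. This is a minimal sketch under stated assumptions: the class `DeviceUsageModel`, the two time slots and the 09:00 to 17:00 definition of office hours are hypothetical, and a real embodiment may use any learning method.

```python
from collections import Counter, defaultdict
from datetime import datetime
from typing import Optional


def time_slot(ts: datetime) -> str:
    """Classify a timestamp as weekday office hours or everything else."""
    is_weekday = ts.weekday() < 5      # Monday=0 ... Friday=4
    is_office_hour = 9 <= ts.hour < 17  # assumed office hours
    return "office" if is_weekday and is_office_hour else "off_hours"


class DeviceUsageModel:
    """Counts which device each participant used in each time slot."""

    def __init__(self) -> None:
        self._counts = defaultdict(Counter)

    def observe(self, participant: str, ts: datetime, device: str) -> None:
        self._counts[(participant, time_slot(ts))][device] += 1

    def predict(self, participant: str, ts: datetime) -> Optional[str]:
        """Return the most frequently seen device for this slot, if any."""
        counts = self._counts.get((participant, time_slot(ts)))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


# Illustrative observations only.
model = DeviceUsageModel()
model.observe("ann", datetime(2015, 1, 5, 10, 0), "pc")      # Monday morning
model.observe("ann", datetime(2015, 1, 6, 11, 0), "pc")      # Tuesday morning
model.observe("ann", datetime(2015, 1, 10, 10, 0), "phone")  # Saturday
weekday_guess = model.predict("ann", datetime(2015, 1, 12, 14, 0))
weekend_guess = model.predict("ann", datetime(2015, 1, 11, 9, 0))
```

The predicted device could then drive the resource allocation for the next session, for example a lower bandwidth reservation when a mobile device is expected.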

Another kind of learning implemented in this invention is regarding geographical locations of the participants in a multimedia conferencing system. A participant may be taking part in multimedia conference from single location or may be travelling during a particular multimedia conferencing session. Some participants may be at different geographical locations for different conferencing sessions. The SLIP learns the nature of a particular participant's location changes and finds out if there is a pattern of location changes in time. The system can also predict the location of a particular participant and allocate resources according to the geographical location of the participant. This can render improved QoS and performance of operation.

Another kind of learning implemented in this invention is regarding various network related parameters like bandwidth, latency, speed etc. Bandwidth, typically measured in bits, kilobits, or megabits per second, is the rate at which data flows over the network. Just as more water flows through a wide river than a small, narrow creek, a high bandwidth network generally can deliver more information than a low bandwidth network in the same amount of time. Because this can make the network feel faster, high bandwidth networks and connections often are called "high-speed".

Latency, usually measured in milliseconds, is the time that elapses between a request for information and its arrival. A high latency can degrade the performance of even the largest capacity network to a tremendous degree. Because it takes time for a signal to pass through wire, some latency will always be present, but slow servers, inefficient data packing, and excessive network hopping can collectively increase transmission delay.

Excess latency gives a network a low-speed feel. If a connection takes a long time to respond, many users will complain the connection is "slow", even though the bandwidth is high. A network's speed is essentially a subjective evaluation of the combination of bandwidth and latency. If a network has high bandwidth but the latency is high, the network is essentially low speed. Even if the latency is low, if the application requires more bandwidth than the network provides, the network is at low speed. Therefore a compromise must be reached, depending upon the application, to ensure an expected speed for the network. In case of applications like a multimedia conferencing system, bandwidth and latency are some of the criteria that affect QoS and performance of operation. In case of a multimedia conferencing system that allows several participants and allows several concurrent conferences at the same time, bandwidth requirements are high. If the QoS and performance of operation required by the participants are high, then the bandwidth requirement will also be high. There are also several other factors that require high bandwidth from the network.
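
The interplay of bandwidth and latency described above can be made concrete with a deliberately simplified model in which the time to deliver a payload is the latency plus the payload size divided by the bandwidth. The function name and all numbers are illustrative assumptions; the model ignores protocol overhead, congestion and retransmission.

```python
def response_time_s(payload_bits: float, bandwidth_bps: float, latency_s: float) -> float:
    """Simplified delivery time: fixed latency plus serialization time."""
    return latency_s + payload_bits / bandwidth_bps


SMALL = 80_000        # about 10 kilobytes, e.g. a short control message
LARGE = 80_000_000    # about 10 megabytes, e.g. a shared file

# Link A: high bandwidth (100 Mbit/s) but high latency (300 ms).
# Link B: low bandwidth (2 Mbit/s) but low latency (20 ms).
small_a = response_time_s(SMALL, 100_000_000, 0.300)
small_b = response_time_s(SMALL, 2_000_000, 0.020)
large_a = response_time_s(LARGE, 100_000_000, 0.300)
large_b = response_time_s(LARGE, 2_000_000, 0.020)
```

Under these assumed numbers the high bandwidth link is slower for the small payload, because latency dominates, and faster for the large payload, because bandwidth dominates. This is why the compromise depends on the application.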

Even if the bandwidth is high, latency in the network can affect performance of a multimedia conference. Latency causes multimedia contents like video and audio to be displaced in the time domain. Small latencies are not very important in the case of data transfer, where the complete information is rebuilt even if different pieces of data arrive at different times. Video and audio arriving at different times in a multimedia conferencing system affect the performance of the conference and the satisfaction of the participants. The case is worse if consecutive pieces of audio arrive at different times so that the reconstructed audio is unintelligible or displaced in the time domain.
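
The time displacement between audio and video described above could be measured as in the following sketch. It assumes, hypothetically, that each audio and video unit carries a shared capture timestamp and that arrival times are recorded in milliseconds; the function names and the 80 ms tolerance are illustrative values, not figures from the specification.

```python
from typing import Dict, List


def av_skew_ms(audio_arrivals: Dict[int, float],
               video_arrivals: Dict[int, float]) -> Dict[int, float]:
    """For each capture timestamp present in both streams, return
    video arrival time minus audio arrival time, in milliseconds."""
    shared = audio_arrivals.keys() & video_arrivals.keys()
    return {t: video_arrivals[t] - audio_arrivals[t] for t in sorted(shared)}


def out_of_sync(skews: Dict[int, float], tolerance_ms: float = 80.0) -> List[int]:
    """Capture timestamps whose skew exceeds the assumed tolerance."""
    return [t for t, skew in skews.items() if abs(skew) > tolerance_ms]


# Illustrative arrival times only (capture timestamp -> arrival time in ms).
audio = {0: 40.0, 20: 62.0, 40: 85.0}
video = {0: 55.0, 20: 190.0, 40: 90.0}
skews = av_skew_ms(audio, video)
late = out_of_sync(skews)
```

Measurements of this kind are one example of a metric that a learning module could report for use in policy generation.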

By learning a particular multimedia conferencing session over time, and learning different occurrences of the same conferencing group, and also from similar conferences with similar or identical participants, and from many similar or identical multimedia conferences, the present invention is able to generate a conferencing policy that gives improved QoS and performance of operation to a particular session of multimedia conference compared to the previous sessions of the same conference.

The factors mentioned above are not the only factors learned by the self-learning system in the invention. There are several other factors that may affect the quality of a multimedia conferencing system. New factors also arise over time, which can affect the quality of a multimedia conferencing system. The SLIP multimedia conferencing system invented and disclosed here can find out various new factors, if any, that affect the quality and performance of multimedia conferencing systems and respond to the finding through intelligent processing and conference policy generation for a currently running or future multimedia conference.

All the information, data and metrics acquired by self-learning as explained above are processed to develop a new conferencing policy for the next session of the same multimedia conference or an updated conferencing policy for a currently running multimedia conference. The processed information, data and metrics are also used to generate conference policies for other conferences with more or less the same participants, other conferences with similar end devices, other conferences with similar geographical locations of the participants etc.

Various artificial intelligence processing algorithms and computational methods are used in the disclosed invention to develop a conferencing policy for a multimedia conference. Some examples are neural networks, fuzzy logic etc. An intelligent processing method consisting of a combination of various artificial intelligence algorithms is used in some embodiments of the invention. In all the embodiments the result will be a continual improvement in the QoS and performance of multimedia conference.

In one embodiment of the disclosed system, a multimedia conference is initiated, maintained and terminated by a master server (105) and computer applications running on it. In the exemplary embodiment of Fig. 1, only one master server (105) is shown. In some cases there will be only one master server (105) at a time. But there can be more than one master server at a time and the master servers (105) for a particular conference are selected depending on the location of the participants and various other factors. Master server (105) contains two main computer applications running on it, namely the Conference Controller (106A) and Master MCU (107).

Conference Controller is the important part of the SLIP multimedia conferencing architecture. Architecture of an embodiment of Conference Controller (106B) is illustrated in Fig. 2. The dotted lines for the block of Master MCU (107) mean that Master MCU functionality is external to the Conference Controller (106B).

Any software MCU or hardware MCU can be designated as Master MCU. The Conference Controller (106B) and Master MCU communicate with each other using an application programming interface or any other similar technology.

In another embodiment of the disclosed system, the modular software based MCU or modular software MCU is integrated with the Conference Controller (106C) as a single application, and the modular software MCU (314) works as the Master MCU. This preferred embodiment is illustrated in Fig. 3. In the process of providing multimedia conferencing services to geographically distributed end points, there can be many MCUs cascaded together in different ways as explained in many other publications and mentioned in this patent specification. But in the above exemplary implementation of the disclosed invention, that is SLIP multimedia conferencing architecture, the Master MCU (314) will be the first MCU initiated and used in the conference, whatever the number of other MCUs. The other MCUs, referred to as local MCUs (103), are connected to the Master MCU (314) directly or through other local MCUs (103).

Another unique part of the invention disclosed in this patent specification is modular software MCU. The architecture of modular software MCU is shown in Fig. 4. The modular software MCU consists of at least one core controller (400), dynamically linkable micro MCU modules (1, 2....n) and dynamically linkable media accelerators (1A, 2A...nA). The modular software MCU uses dynamically linkable modular software approach to improve efficiency, speed and performance of software MCUs.

The architecture of Fig. 3 has additional performance benefits. The SLIP multimedia conferencing architecture can improve the quality and performance of multimedia conferencing session based on SLIP and by using the modular software MCU (314) the performance and quality of the multimedia conference is further improved.

In the architecture of Fig. 3, the core controller of the modular software MCU is executed first in the conference control architecture, and the core controller dynamically links the required numbers of dynamically linkable micro MCU modules (1, 2....n) to provide facility for more participants and dynamically linkable media accelerators (1A, 2A...nA) to provide additional processing power if required.
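
The capability on demand behaviour of the core controller can be sketched as follows. This is an assumption-laden illustration, not the disclosed implementation: the class `CoreController`, the per-module capacities and the module naming are hypothetical, and real dynamic linking of software modules is represented here only by adding and removing names in a list.

```python
import math
from typing import List


class CoreController:
    """Links only as many modules as the current conference load requires."""

    def __init__(self, endpoints_per_micro_mcu: int, streams_per_accelerator: int) -> None:
        self.endpoints_per_micro_mcu = endpoints_per_micro_mcu
        self.streams_per_accelerator = streams_per_accelerator
        self.micro_mcus: List[str] = []
        self.accelerators: List[str] = []

    def scale(self, endpoints: int, transcoded_streams: int) -> None:
        """Link or unlink modules so capacity matches the given load."""
        need_mcus = math.ceil(endpoints / self.endpoints_per_micro_mcu)
        need_accels = math.ceil(transcoded_streams / self.streams_per_accelerator)
        self._resize(self.micro_mcus, need_mcus, "micro-mcu")
        self._resize(self.accelerators, need_accels, "accelerator")

    @staticmethod
    def _resize(modules: List[str], target: int, prefix: str) -> None:
        while len(modules) < target:      # link additional modules
            modules.append(f"{prefix}-{len(modules) + 1}")
        del modules[target:]              # unlink modules no longer needed


# Illustrative capacities and loads only.
core = CoreController(endpoints_per_micro_mcu=8, streams_per_accelerator=4)
core.scale(endpoints=20, transcoded_streams=5)
busy = (len(core.micro_mcus), len(core.accelerators))
core.scale(endpoints=6, transcoded_streams=0)
quiet = (len(core.micro_mcus), len(core.accelerators))
```

When the load drops, the surplus modules are released again, which is the resource saving the modular approach aims at.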

In the exemplary embodiment of Fig. 1, there will be conference servers (104) at a plurality of locations which communicate with the master server (105) directly or through intermediate servers. Conference servers (104) connect to various conference end points through a local MCU (103). The local MCU (103) can be a software MCU running in the conference servers (104) or the local MCU can be a dedicated hardware MCU. The local MCU (103) can also be a modular software MCU running in the conference servers (104). An application called Local Controller (102) running on the conference servers (104) facilitates communication of the local MCU (103) with master servers (105) and manages MCU connections within a locality. Within each location, each of the local end points communicates with its associated local MCU (103) via internet or any other packet switching or circuit switching networks or combination of both. Those skilled and experienced in the art will appreciate that the number of locations, end points, MCUs and conference servers (104) or any other systems or facilities shown in Fig. 1 is only exemplary and is chosen for convenience of presentation. The disclosed invention does not concentrate on how the master server communicates with conference servers (104), how MCUs communicate with each other or how MCUs communicate with conference end points. These are explained in many other publications. The uniqueness of this invention resides in the intelligent self-learning process implemented within the multimedia conferencing architecture shown in Fig. 1, whereas the intelligent learning process and other methods and devices invented and disclosed here are aimed at improving overall QoS and performance of operation of multimedia conference.

In all the embodiments mentioned above the master server (105) initiates, maintains and terminates a multimedia conference between pluralities of locations, where each location may have one or more conference servers (104). But this may not be the case in some other embodiments of the present invention. The area of operation of a conference server (104) is not limited by its geographical location, but a conference server (104) is selected in order to provide easy routing of multimedia contents in the multimedia conference and thereby maximum possible quality and performance. Descriptions for cascading several MCUs for a multisite conference and related activities are mentioned in various other publications. The present description concentrates on the unique part of the disclosed invention, the self-learning and intelligent architecture for quality and performance improvement, related methods, applications, systems and devices with reference to some exemplary embodiments.

The selection of master servers (105), conference servers (104), MCUs, other facilities required in a multimedia conferencing system, their geographical locations and the service areas for master servers (105) and conference servers (104) are determined by the multimedia conferencing policy, or conferencing policy, for a multimedia conferencing session.

One of the major components of the above mentioned embodiment of the invention is the Conference Controller (106A, 106B, 106C).

The internal architecture of the Conference Controllers (106B, 106C) of the preferred embodiments is explained below with respect to Figs. 2 and 3.

FRONT END MODULE (201)

The Front End Module (201) interacts with the users of the conferencing system, sometimes called participants. It has several functions like registering a new user, registering a new conference with existing users etc., scheduling and managing conferences etc. The Front End Module (201) is composed of several functional sub modules including Registration module, User Interaction Module, Registration Intelligence Module, User Data Base, Conference Data Base and Intelligent Data Base Updater.

Registration Module gathers the details of a new user during a user registration or of a new conference during the new conference registration process. User registration is required for every participant when they are using the facilities of the conferencing architecture for the first time. A unique user identifier is provided to every participant in order to distinguish them from other users. User registration can be done in the websites of the global conference service provider or through dedicated applications. In another way it can be a computer based registration form to be filled by the registrants. If a new user is registering, he has to provide his personal and contact details like name, e-mail, mobile number etc. and other details like location, devices used for conferencing etc. Registration Intelligence Module intelligently learns and gathers several other details regarding a user while he is registering for a conference. These details include his vision power, hearing power, clarity of speech, his intelligence etc. In order to gather the above information, the registration module makes use of various intelligent algorithms and devices like the computer's camera, microphone etc. The details of registered users are stored in the User Data Base. These details are further updated based on intelligent self-learning during any multimedia conferencing session or during any other relevant occasions. During the registration process, a software application called embedded learning module is downloaded and installed on various devices used by a user as conference end points. The embedded learning module resides in user devices, learns and gathers information and factors that affect quality and performance of a multimedia conferencing system, converts the learned information into a data format and sends it to the Conference Controller (106B, 106C). This information is used by the Learning Engine (209) along with other learned information and metrics to generate the conference policy for a particular session of multimedia conference. More detailed description of the learning process is provided later in this patent specification.

Each user is given a unique identifier or name. Any user can initiate a new conference by designating himself as the conference administrator. Then the conference administrator registers for a new conference. Each conference is provided a unique identifier. The conference administrator can add participants in an already registered conference by providing their name and contact details if they are not already registered or their unique conference identifier if they are already registered with the conferencing system. The conference administrator can schedule the next session of the conference or many future sessions of the conference. Invitations are sent to the participants added by the conference administrator or the invitations are automatically sent by the Conference Controller (106B, 106C).

Intelligence is added in the process of conference scheduling. The Conference Controller (106B, 106C) can advise the conference administrator on the best time and date for a future conference, based on intelligent SLIP, so that the scheduled multimedia conference will be of best quality and performance compared to the previous conferences.

The invited participants can accept or reject the invitation. If a participant accepts the invitation and is not previously registered with the multimedia conferencing system, the participant has to go through the registration process described earlier. Intelligent learning of the new user is done by the registration intelligence module as described earlier in this patent specification. Learning the user behaviour, attributes and other details is not only done during registration; it is a continuous process carried out during and after different multimedia conferencing sessions in the disclosed invention.

CALL AND CONNECTION SET UP MODULE (202)

Call and Connection Set up Module (202) sends invitation to a group of participants to take part in a particular session of a conference. Call and Connection Set up Module (202) takes information regarding registered users from the Front End Module (201). Each conference has a unique identification and a list of participants along with many other details. The Call and Connection Set up Module (202) also initiates the login process of the participants through authentication steps like password requests. Intelligence is implemented in the login process for authentication and security, wherein the implemented intelligence is based on the SLIP. An authentic user can be differentiated from hackers or non-authentic users based on the learned information, metrics and data and further processing. The present invention can thus make multimedia conferencing systems secure and immune to internet crimes.

CONFERENCE DATABASE (203)

Conference Database (203) has the details of all the registered conferences and details of the participants of each conference. Each conference has a unique ID, the list of participants, location of each participant, the end devices used by the participants at various times, and several other details of the conference. The details stored in the conference data base are constantly updated based on the intelligent self-learning process as one of the processes for improving the quality and performance of multimedia conferences.

CONFERENCE TABLE (204)

Conference table (204) contains the details of all active conferencing sessions currently going on and their participants. A section of the data in the conference table (204) is taken from the Conference Database (203) and another section of the data is updated dynamically during the conference. The conference table entries for a particular conference are initiated when a conference is started and cleared when the conference is terminated. The dynamically updated data fields of the conference table (204) are saved to the corresponding fields in the Conference Database (203) before the conference table (204) for a particular session of the conference is cleared.
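
The life cycle of a conference table entry described above can be illustrated with a minimal sketch. It assumes that the persistent store to which the dynamic fields are written back is the Conference Database (203), represented here by a plain dictionary; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConferenceTable:
    """Hypothetical in-memory table of active sessions (conference table 204)."""
    database: dict                               # stands in for Conference Database (203)
    active: dict = field(default_factory=dict)   # conference_id -> table entry

    def start_session(self, conference_id):
        # The static section is copied from the database; the dynamic section starts empty.
        static = dict(self.database[conference_id]["static"])
        self.active[conference_id] = {"static": static, "dynamic": {}}

    def update(self, conference_id, key, value):
        # Dynamic fields change while the session is running.
        self.active[conference_id]["dynamic"][key] = value

    def end_session(self, conference_id):
        # Save the dynamic fields back to the database, then clear the entry.
        entry = self.active.pop(conference_id)
        self.database[conference_id].setdefault("dynamic", {}).update(entry["dynamic"])
```

For example, after start_session("conf-1"), update("conf-1", "end_device", "tablet") and end_session("conf-1"), the table no longer holds the session and the database record carries the learned end device.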

MCU DATABASE (212)

MCU Database (212) contains details of all MCUs that are available to be used in a multisite multimedia conference conducted by the global multimedia conference service provider or a corporation which wants to conduct a global Multimedia Conference or any other entity which wants to conduct a multisite multimedia conference using SLIP multimedia conferencing architecture. MCUs can be software MCUs that are running on master servers (105) or conference servers (104), or hardware MCUs connected to master servers (105) or conference servers (104) or modular software MCUs disclosed in this patent specification running on master servers (105) or conference servers (104) or any independent hardware or software MCU. The master server (105) contains the Master MCU (314) which are connected or cascaded to various local MCUs (103) for setting up a global multimedia conference. The Master MCU (314) can also connect to various end points at its service area. The conference servers (104) contain one or more local MCUs (103) which are connected to end points at the service area of a particular conference server (104).

MCU MANAGER (211)

MCU manager (211) facilitates connecting the Master MCU (314) to various end points in the service area of a master server (105) and also facilitates connecting the Master MCU (314) to various local MCUs (103) to facilitate a smooth multimedia conference in terms of maximum QoS and performance of operation. The number of MCUs selected and the location of the conference servers (104) containing the local MCUs (103) will be determined by the conference policy generated by the Policy Engine (205). Connecting or cascading MCUs as used herein should not be understood as implying a direct connection between the MCUs so connected or cascaded. It should be understood that cascaded MCUs can include intervening structures, such as routers, other MCUs etc. Also there may be many ways by which master servers (105) communicate with conference servers (104) or Master MCUs (314) connect or cascade with local MCUs (103). A person skilled in the art can appreciate this.

MCU TABLE (213)

MCU table (213) contains information about all MCUs currently in use for various multimedia conferencing sessions by the SLIP multimedia conferencing architecture. It also contains information about conferencing sessions which are using a particular MCU. In case a new conference requires some of the MCUs from the MCU data base, the MCU manager (211) gathers information from currently used MCUs regarding available ports or conferencing facilities at the currently used MCUs. If there are enough facilities available, the conference manager allocates a currently used MCU for the new conference. If there are no facilities available at currently used MCUs, a new MCU is selected from the MCU database (212). The above mentioned method of selecting MCUs is determined by conference policies generated for a particular session of the conference. The significant part of the disclosed invention is the self-learning and successive intelligent processing executed during a particular conferencing session or any other relevant occasions, for improving QoS and performance of operation of the currently running conference or future sessions of the same multimedia conference. The same multimedia conference here implies a multimedia conference with the same conference identifier. The quality improvement can also be achieved for other multimedia conferences, wherein those multimedia conferences have similar or identical participants, or have participants from similar or identical geographical locations, or use similar or identical end devices etc.
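
The MCU selection order described above (reuse a currently used MCU when it has spare facilities, otherwise take a new one from the MCU database) might look like the following sketch. Free ports are used as the only measure of available facilities, which is an assumption; the Mcu class and the allocate_mcu function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mcu:
    mcu_id: str
    total_ports: int
    used_ports: int = 0

    @property
    def free_ports(self) -> int:
        return self.total_ports - self.used_ports


def allocate_mcu(in_use: list, database: list, ports_needed: int) -> Optional[Mcu]:
    """Prefer an MCU already in use; otherwise select an idle one from the database."""
    # First pass: MCUs listed in the MCU table (213) that still have enough ports.
    for mcu in in_use:
        if mcu.free_ports >= ports_needed:
            mcu.used_ports += ports_needed
            return mcu
    # Second pass: an MCU from the MCU database (212) that is not yet in use.
    in_use_ids = {m.mcu_id for m in in_use}
    for mcu in database:
        if mcu.mcu_id not in in_use_ids and mcu.free_ports >= ports_needed:
            mcu.used_ports += ports_needed
            in_use.append(mcu)
            return mcu
    return None  # no single MCU can host the request
```

In the specification the selection is governed by the conference policy, so a fuller version would take the policy as an input rather than hard-coding a first-fit rule.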

The self-learning, intelligent processing and successive improvement of quality and performance of the SLIP multimedia conferencing architecture is mainly implemented by various modules in the Conference Controller (106B, 106C) namely Dynamic Data Collector module (208), Learning Engine (209), Quality Analyzer Module (210), Policy Database (206), Learning Database (207) and Policy Engine (205). The functioning of these modules and the self-learning and resultant quality and performance improvement in a multimedia conference is explained below.

DYNAMIC DATA COLLECTOR (208)

Dynamic Data Collector Module (208) monitors execution of a particular session of multimedia conference in a plurality of ways. Responsive to identifying a factor, hereafter called a quality element, that may affect quality or performance of a multimedia conferencing session, it assigns an identifier to the quality element, tracks the quality element during the entire conferencing session or any other relevant occasions, and continuously collects and stores all relevant information, data and metrics concerning the quality element in a Learning Database (207) to be further processed by the Learning Engine (209). The collected information, data and metrics are stored in computer readable format, for example an organized knowledge format.

The relevant factors that affect the QoS and performance of operation of a multimedia conferencing system, herein called quality elements, are varied. They may include, but are not limited to, the following:

a) Network information including delay, jitter, bandwidth, loss and reliability in a particular session of multimedia conference.

b) User centric information including throughput, latency, availability and continuity of service (frame size, frame rate, image and audio clarity) in a particular session of multimedia conference.

c) User behavior during a multimedia conference. An example of this is a participant periodically switching on and switching off the multimedia facility in the conference. Another example of this is a participant periodically changing the end points in a conference. Another example is a participant's interest in some particular kinds of contents during a multimedia conferencing session. These examples do not exhaust all the possibilities in learning the behaviour of a particular participant during a conference session.

d) User attributes like audible power, vision power, clarity in speech etc. Some preliminary information regarding various registered users' attributes is gathered by the registration intelligence module during the registration process and stored in the learning data base. Dynamic Data Collector Module (208) further updates the information regarding a conference participant's attributes during a particular session of the multimedia conference.

e) Client device related information including processing power, screen size and mobility during a particular session of multimedia conference.
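
The collection step performed by the Dynamic Data Collector Module (208) could be sketched as below. The sketch assumes that a quality element is named by a category and a metric name, that identifiers are simple sequential strings, and that the Learning Database (207) can be represented by a dictionary of time-stamped samples; none of these details come from the specification.

```python
import itertools
from collections import defaultdict


class DynamicDataCollector:
    """Hypothetical collector that tags quality elements and stores their samples."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.element_ids = {}                  # (category, name) -> identifier
        self.learning_db = defaultdict(list)   # identifier -> [(timestamp, value)]

    def observe(self, category, name, timestamp, value):
        key = (category, name)
        # Assign an identifier the first time a quality element is seen.
        if key not in self.element_ids:
            self.element_ids[key] = f"QE-{next(self._ids)}"
        element_id = self.element_ids[key]
        # Track the element for the rest of the session by appending samples.
        self.learning_db[element_id].append((timestamp, value))
        return element_id
```

Calling observe("network", "jitter_ms", 0.0, 12.5) twice with different timestamps returns the same identifier, while a device related element such as ("device", "screen_size_in") receives a new one.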

QUALITY ANALYZER MODULE (210)

The Quality Analyzer Module (210) dynamically measures the quality of various services of a multimedia conference during a particular session of the multimedia conference. The Quality Analyzer Module (210) uses various methods to get the quality information from various conference end points. One method is to ask a participant directly through a web form. Another method is learning a participant's behavior during a session of multimedia conference. Another method is to use information and metrics from the embedded learning module and/or other systems and applications in the client device.

LEARNING ENGINE (209)

The Learning Engine (209) implements various intelligent learning algorithms to process the data collected by the Dynamic Data Collector to facilitate quality and performance improvement. The data, information and metrics collected by the Dynamic Data Collector are stored in the Learning Data Base in an organized knowledge format. These data, information and metrics are continuously transferred to the Learning Engine (209). The Learning Engine (209) divides each conferencing session into different conference slices of equal time length. The Learning Engine (209) collects quality information corresponding to each conference slice from the Quality Analyzer Module (210) and learns the relationship between the organized knowledge and quality information from the Quality Analyzer Module (210). The Learning Engine (209) also makes use of information and metrics regarding the previous conference policies of the multimedia conference. Information and metrics regarding the previous conferencing policies are stored in the Policy Database (206). The Learning Engine (209) then makes a quality map which is a multidimensional graph of quality against various quality elements along with intelligent suggestions for quality and performance improvement. A quality element can be any factor that affects the quality and performance of multimedia conferencing sessions as mentioned earlier, where information and metrics regarding these factors are collected by the Dynamic Data Collector Module (208).
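
The slicing step and a one-dimensional cut of the quality map can be illustrated with the following sketch. It assumes integer session and slice lengths in seconds, one quality score per slice from the Quality Analyzer Module (210), and a single quality element; the function names are hypothetical and the learning algorithms themselves are not shown.

```python
from statistics import mean


def slice_means(samples, session_seconds, slice_seconds):
    """Average (timestamp, value) samples inside equal-length conference slices.

    Returns one entry per slice; a slice without samples yields None.
    """
    n_slices = -(-session_seconds // slice_seconds)  # ceiling division on integers
    buckets = [[] for _ in range(n_slices)]
    for timestamp, value in samples:
        index = min(int(timestamp // slice_seconds), n_slices - 1)
        buckets[index].append(value)
    return [mean(b) if b else None for b in buckets]


def quality_map(element_means, quality_scores):
    """Pair each slice's quality element value with that slice's quality score."""
    return [(m, q) for m, q in zip(element_means, quality_scores) if m is not None]
```

With several quality elements, the same pairing repeated per element would give the multidimensional map; fitting a model to those pairs is where the actual learning would take place.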

Various artificial intelligence and self-learning algorithms are used by the Learning Engine (209) and Policy Engine (205) to process the information and metrics as mentioned earlier.

POLICY ENGINE (205)

Policy Engine (205) automatically generates conferencing policies for a particular session of multimedia conference based on the quality map generated by the Learning Engine (209). The conferencing policy can be a sequence of processes and methods like allocating computing and network resources for the multimedia conference, selecting geographical location of the computing and networking resources, selecting services rendered to each participant in the multimedia conference, and contain all other processes and methods required to initiate, maintain and terminate multimedia conferencing sessions. Conferencing policies can be generated for currently running multimedia conferences or future multimedia conferences. If a quality improvement or performance of operation improvement is required for a currently running conference based on intelligent learning or due to request from any concerned participants or systems in a multimedia conference, the Policy Engine (205) updates the conferencing policy for that currently running conference, where the updated policy will improve the quality and performance of operation of the above mentioned multimedia conference.

The conferencing policy for the next session of a particular multimedia conference will be generated based on intelligently learning all previous sessions of the conference, where the aim of the changed policy is to improve the quality and performance of the multimedia conferencing session.
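
The keep-or-update decision that closes the feedback loop, as recited in claim 1 (a policy is kept if QoS and performance improved and updated otherwise), can be sketched as follows. A single scalar QoS score per session and a caller-supplied revise function are assumptions made for illustration; how a revised policy is actually derived is left to the Policy Engine (205).

```python
def next_policy(current_policy, previous_qos, current_qos, revise):
    """Keep the current policy if QoS improved; otherwise return a revised policy.

    revise: hypothetical callable that maps a policy to an updated policy.
    """
    if current_qos > previous_qos:
        return current_policy
    return revise(current_policy)
```

For instance, with a policy of {"local_mcus": 2} and a revise function that adds one local MCU, a drop in the score from 3.5 to 3.0 yields {"local_mcus": 3}, while a rise leaves the policy unchanged.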

A second aspect of the invention used for improving quality and performance of a multimedia system is a Modular Software based Multi-site Connection Unit. It is also called a modular software multipoint connection unit or modular software MCU. The modular software MCU connects end points (EPs) or clients in a multimedia conferencing system. An end point can be a personal computer, an IP telephone, a mobile phone, a tablet PC or any device capable of audio, video or multimedia conference as mentioned earlier in this patent specification.

The modular software MCU can implement centralized, decentralized or hybrid multipoint capabilities. The definitions of centralized, decentralized and hybrid multipoint capabilities are explained in many publications in the art. The modular software MCU has facilities to integrate many other features which are relevant to multimedia conference like rate matching or lip synchronization.

In one embodiment of the invention the modular software MCU consists of three main software components, the Core Controller (400), dynamically linkable micro MCU modules (1, 2....n) and dynamically linkable media accelerators (1A, 2A....nA). This is illustrated in Fig. 4. In an exemplary implementation, the modular software MCU will be running as a single application along with the Conference Controller (106C) in a web server. This is shown in Fig. 3. When the Conference Controller (106C) is loaded into the memory for the purpose of multimedia conference, the Core Controller (400) of the modular software MCU is initiated. Although not shown in any of the accompanying drawings, the Core Controller (400) may also contain an embedded micro MCU module so that the Core Controller (400) can work as a standalone MCU, without any dynamically linked micro MCU module or dynamically linked media controller, if the participating end points are few in number and the processing capability required is small. Each of the dynamically linkable micro MCU modules (1, 2....n) consists of two main system components, a Multipoint Controller (MC) and one or more Multipoint Processors (MP). The MC in the modular software MCU is responsible for creating and managing conferences between three or more endpoints. It performs capabilities exchange on behalf of every participating endpoint. All capabilities may be common for all endpoints or some of them may have a different set. The MC in the modular software MCU is able to change the capabilities during a session.

A Multipoint Processor (MP) in the modular software MCU is a multipoint system component that processes audio, video, data streams or any multimedia components communicated in a multimedia conference. The multipoint processors change the formats of multimedia streams from different endpoints and mix or switch among them.

Dynamically linkable media controllers provide additional processing capability to the multimedia conferencing system using the modular software MCU, if the processing capability of the MP is not enough.

All the data from the end points are given to the modular software MCU, and the modular software MCU mixes all the data and sends the mixed data to all the end points. The mixed data contain video, audio, data stream, instant text messages or any other multimedia components from all the end points so that each end point can receive multimedia inputs from a single source. The modular software MCU can allow multiple conferences so that more than one simultaneous conference can be carried out by a single modular software MCU. The core controller keeps track of the multiple conferences allowed in a single modular software MCU.

In an example case, the Core Controller (400) with embedded micro MCU module can conduct a multimedia conference with a small number of clients or end points, for example ten clients, without linking to any micro MCU module or media accelerator. In order to add more participants to the conference, the Core Controller (400) links sufficient number of micro MCU modules so that the total number of participants are handled by the embedded micro MCU and the dynamically linked micro MCU modules. If the processing power of all the multisite processors in the linked micro MCU modules is not enough to process audio, video, data stream or any multimedia components, one or more media accelerator modules are linked to the core controller.
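
The scaling rule in this example case can be expressed as a small calculation. The capacity figures (ten end points for the embedded micro MCU, ten ports per linked module) and the abstract load units are assumptions chosen to match the example; the function names are hypothetical.

```python
import math


def modules_to_link(participants, embedded_ports=10, ports_per_module=10):
    """Micro MCU modules to link beyond the embedded micro MCU."""
    extra = max(0, participants - embedded_ports)
    return math.ceil(extra / ports_per_module)


def accelerators_to_link(required_load, available_load, accelerator_capacity):
    """Media accelerator modules needed when the MPs cannot carry the load."""
    shortfall = max(0, required_load - available_load)
    return math.ceil(shortfall / accelerator_capacity)
```

Under these assumed capacities, a ten-party conference needs no linked module, while the one-hundred-participant conference used later as an exemplary operation would need nine.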

Dynamically linkable micro MCUs are software modules which can function as a complete MCU in association with the Core Controller (400). A software module may be defined as including objects, functions, routines, procedures, libraries, and/or applications or any kind of software components. Each dynamically linkable micro MCU module (1, 2....n) contains a fixed number of input ports and is capable of handling a small multimedia conference when linked to the Core Controller (400). If more participants are to be handled, more dynamically linkable micro MCU modules (1, 2....n) are linked to the Core Controller (400).

Dynamically linkable media accelerators (1A, 2A....nA) are software modules with digital signal processing capabilities to process audio, video, data stream and other multimedia components in a multimedia conference and other related applications. Some examples of processing are format conversions, filtering, echo cancellation, frequency domain and time domain synthesis etc.

The unique part of the disclosed invention is dynamically linkable modular software components for the software MCUs in the intelligent SLIP multimedia conferencing architecture. The core control module can dynamically link any number of dynamically linkable micro MCU modules (1, 2....n) or media accelerator modules. A dynamically linkable micro MCU module (1, 2....n) or a dynamically linkable media accelerator module (1A, 2A....nA) is only loaded into memory when needed and unloaded from memory when it is no longer needed. There are several advantages and performance improvements in this approach, like resource saving, energy efficiency, increase in processing speed etc. One of the advantages of this approach is that the dynamically linkable software modules are in memory only as long as they are needed, resulting in more efficient use of memory. Another advantage is that applications will typically load more quickly when using this approach because not all the modules needed to run the program are loaded when the application initially loads.
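
Load-on-demand linking of this kind can be sketched with a registry of module factories. This is an illustrative stand-in only: in Python the objects are created and released rather than loaded from shared libraries, and the class name, method names and factory registry are hypothetical.

```python
class CoreController:
    """Hypothetical core controller that links modules only while they are needed."""

    def __init__(self, factories):
        self._factories = factories   # name -> zero-argument callable creating a module
        self.linked = {}              # name -> live module object

    def link(self, name):
        # Create the module the first time it is requested, then reuse it.
        if name not in self.linked:
            self.linked[name] = self._factories[name]()
        return self.linked[name]

    def unlink(self, name):
        # Drop the reference so the module can be released from memory.
        self.linked.pop(name, None)
```

Nothing is instantiated when the controller itself is constructed, which mirrors the faster initial load described above.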

The Core Controller (400) along with the Conference Controller (106C) requires very little memory, loads quickly into memory and executes with high performance. If the participants in a multimedia conference are few, the Core Controller (400) itself can work as a complete MCU along with the SLIP conferencing architecture, in one of the embodiments. Only if more participants are to be added are more dynamically linkable micro MCU modules (1, 2....n) linked in sufficient number. And only if high processing power is required are sufficient numbers of dynamically linkable media accelerator modules (1A, 2A....nA) linked. This approach also facilitates energy efficient and green computing concepts.

In one embodiment of the invention, in order to bind or link different micro MCU modules, the output of a particular micro MCU is connected to the input of another micro MCU module. When a single micro MCU module is linked to the Core Controller (400), the output of the embedded micro MCU in the Core Controller (400) is connected to the input of the linked micro MCU module. Those skilled in the art can appreciate that the term connected in this context does not mean a physical connection, but a software facility in which data can be transferred from one application to another. In all the above linking processes there is a flow of control signals and data streams between the various linked modules.

An exemplary operation of the presently invented self-learning based multimedia conferencing system is as follows. A company wants to conduct a multimedia conference between one hundred of its employees who are geographically distributed across four continents of the globe. A particular employee in the headquarters of the company logs into the website of the global multimedia conferencing service provider, registers as a user and also designates himself as the conference administrator. There are provisions in the registration process to provide administrative control rights to more than one participant.

The registration application in the webpage of the global multimedia conferencing service provider gathers much information from the registrant, including personal and behavioural details. Some of the information is gathered in the form of questionnaires and registration forms. Some other information is gathered by a self-learning process during registration. The conference administrator then downloads a self-learning application, called embedded learning module, to his conferencing devices, his mobile devices and many other personal devices. These applications are further used to gather other information through self-learning.

The conference administrator then registers for a new conference. The SLIP system then provides a unique identifier or name for the conference. The conference administrator then schedules the date and time for the next conference or for many future conferences. The conference administrator then adds the primary details of the other 99 participants in the registration page of the website of the global multimedia conferencing service provider. After this, invitations will be sent to all the participants added by the conference administrator. These invitations may be sent through e-mail, social networking or other communication and collaboration platforms. If the invited persons accept the invitation, many details are gathered from them using questionnaires, registration forms and the self-learning method. All the participants then download a self-learning application, called embedded learning module, to their conferencing devices. These applications are further used to gather other information through self-learning.

Before the scheduled time of the conference, the embedded learning module installed in the conferencing devices, mobile devices and other personal devices will execute the learning process in order to gather information, data and metrics from the participants, participants' devices, their network service providers etc., where these information, data and metrics may affect the performance of operation and QoS of a multimedia conferencing session.

Before the scheduled conference, notifications are sent to all the participants asking whether their participation in the conference is confirmed. A list of confirmed participants is then kept by the conference control system.

At the scheduled time of the conference, all the participants in the conference log into the website of the global multimedia conferencing service provider using any web browser. In another embodiment of the system, a dedicated multimedia conferencing application will be installed in client devices used for multimedia conference and that multimedia conferencing application will automatically connect to the website of the multimedia conferencing service provider. In both the embodiments, the multimedia conferencing session is initiated based on the conferencing policy generated for that particular session of the multimedia conference.

The self-learning process is executed before, during and after a particular session of the multimedia conferencing system. This learning process learns and gathers various information, data and metrics that may affect the QoS and performance of operation of multimedia conferencing sessions. Some of these learning processes are explained earlier in this patent specification. The gathered information, data and metrics are processed by the conference control system to generate conferencing policies that improve the QoS and performance of operation of the next session of the multimedia conference with the same conference identifier or name. The system can also improve the quality and performance of a currently running conference by generating an updated conferencing policy using the SLIP method. The above mentioned processing can also help in generating conferencing policies for multimedia conferences that may be related to the above multimedia conference by having more or less the same participants, participants at the same geographical locations or similar factors. When the multimedia conference with the above mentioned conference identifier is initiated the next time, all the participants will experience an improved QoS and performance of operation as compared to the previous session. This improvement in QoS and performance of operation is the result of the self-learning process, intelligent processing of the information, data and metrics gathered during the self-learning process and the conferencing policies generated for the particular session of the multimedia conference.

The foregoing summary is not intended to summarize each potential embodiment or every aspect of the present disclosure, and many other features and advantages of the present disclosure will become apparent upon reading the following detailed description of the embodiments with the accompanying drawing and appended claims. Even though the foregoing descriptions are based on a global conferencing service provider, the invention can be used by any corporation, company or person who wants to conduct a multimedia conference involving two or more participants.

Furthermore, although specific exemplary embodiments are described in detail to illustrate the inventive concepts to a person skilled in art, such embodiments are susceptible to various modifications and alternative forms. Accordingly, the figures and written descriptions are not intended to limit the scope of the inventive concepts in any manner. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The overall aim of the invention is to provide maximum possible QoS, performance of operation, user satisfaction and environment friendliness in multimedia conferences.

Claims

1. A self-learning and intelligent system for continually improving quality and performance of a multimedia conference comprising of:

(a) networked computer systems comprising of (i) a means to collect information and metrics from the internet or any network in which the multimedia conference is established or from persons or devices taking part in the multimedia conference in the form of computer readable data (208), (ii) a computer memory system (203, 204, 206, 207, 212, 213) for storing the collected computer-readable data for further processing,

(iii) a processor and a self-learning, adaptive, intelligent and knowledge processing computer application and algorithm for manipulating the computer-readable data; and

(iv) a hardware and software communication interface for communicating computer- readable data between systems and subsystems;

(b) at least one self-learning, adaptive, and intelligent computer system comprising of: (i) a means of accepting the computer readable data in various formats, (ii) a means of accepting conferencing policy information for previous sessions of various multimedia conferences, (iii) the self-learning, adaptive, intelligent and knowledge processing computer application and algorithm for processing computer readable data in a format together with information regarding at least one previous session of a multimedia conference and generating an updated conference policy for a currently running multimedia conference or generating a new conferencing policy for a session of a multimedia conference that is to be initiated in future and (iv) a means of implementing the conference policy thus updated to maintain and terminate a currently running multimedia conference with improved quality and performance or a means of implementing the newly generated conference policy to initiate, maintain and terminate a new session of a multimedia conference with improved quality and performance; and

(c) a system for feedback and evaluation of success of the updated conference policy or newly generated conference policy for a multimedia conference, in terms of improved quality and performance, comprising of: (i) a means for collecting information and metrics from the internet or any network in which the multimedia conference with the updated conferencing policy or new conferencing policy is established and/or from persons and/or devices taking part in multimedia conference in the form of computer readable data, wherein the collected information and metrics provide details regarding quality of service (QoS) and performance of the multimedia conference established with the updated conferencing policy or the new conferencing policy, (ii) a means to store the information and metrics so collected in a computer readable memory for further processing, (iii) a means to use a self-learning, adaptive, intelligent and knowledge processing computer application and algorithm for processing the information and metrics and (iv) a means to update the current policy if QoS and performance of operation is not improved or keeping the current policy if the QoS and performance of operation is improved.

2. The system of claim 1, wherein the term previous session of a multimedia conference as referred to a new session or a future session of multimedia conference is defined as (i) a conference with identical or similar participating clients, (ii) a conference with participating clients at identical or similar geographical locations, (iii) a conference with participants using identical or similar kind of end points, or (iv) a conference with identical or similar hardware and/or software and/or network conditions or any combination thereof.

3. The system of claim 1, wherein the system is integrated with any other quality and performance improvement systems.

4. The system of claim 1, wherein a conferencing policy is a sequence of processes used in a multimedia conference, like allocating computing and network resources, selecting geographical location of the computing and networking resources, selecting services rendered to each participant in the conference, and all other processes required to initiate, maintain and terminate multimedia conference sessions.

5. The system of claim 2, wherein the information and metrics collected in machine readable format include network related parameters such as delay, jitter, bandwidth, data loss and data reliability, wherein the information and metrics are collected during client registration for a multimedia conference and/or during registration of a multimedia conference and/or before a particular session of a multimedia conference and/or during a particular session of a multimedia conference and/or after a particular session of multimedia conference.

6. The system of claim 2, wherein the information and metrics collected in machine readable format include various client-centric information such as throughput, latency, and availability or continuity of service, for example frame size, frame rate, and image and/or audio clarity, wherein the information and metrics are collected during client registration for a multimedia conference and/or during registration of a multimedia conference and/or before a particular session of a multimedia conference and/or during a particular session of a multimedia conference and/or after a particular session of a multimedia conference.

7. The system of claim 2, wherein the information and metrics collected in machine readable format include various end point related information such as processing power, screen size, audio capabilities, video capabilities, signal processing capabilities, memory capacity, network connection capacity or mobility of the end points, wherein the information and metrics are collected during client registration for a multimedia conference and/or during registration of a multimedia conference and/or before a particular session of a multimedia conference and/or during a particular session of a multimedia conference and/or after a particular session of multimedia conference.

8. The system of claim 2, wherein the information and metrics collected in machine readable format include information and metrics regarding online behaviour of client, wherein the information and metrics are collected during client registration for a multimedia conference and/or during registration of a multimedia conference and/or before a particular session of a multimedia conference and/or during a particular session of a multimedia conference and/or after a particular session of multimedia conference.

9. The system of claim 2, wherein the information and metrics collected in machine readable format include information and metrics regarding intelligence of client, wherein the information and metrics are collected during client registration for a multimedia conference and/or during registration of a multimedia conference and/or before a particular session of a multimedia conference and/or during a particular session of a multimedia conference and/or after a particular session of multimedia conference.

10. The system of claim 2, wherein the information and metrics collected in machine readable format include information and metrics regarding the client such as vision power, hearing power and clarity in speech, wherein the information and metrics are collected during client registration for a multimedia conference and/or during registration of a multimedia conference and/or before a particular session of a multimedia conference and/or during a particular session of a multimedia conference and/or after a particular session of a multimedia conference.

11. The system of claim 2, wherein the information and metrics collected in machine readable format include information and metrics regarding client's regular pattern of changing end points used in a multimedia conference, wherein the information and metrics are collected during client registration for a multimedia conference and/or during registration of a multimedia conference and/or before a particular session of a multimedia conference and/or during a particular session of a multimedia conference and/or after a particular session of multimedia conference.
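Claims 5 to 11 combine several categories of information with five collection phases. A minimal Python sketch of a record that tags each sample with both is given below; the enum members mirror the wording of the claims, while the names `Phase`, `Category`, `MetricSample` and `group_by_phase` and the numeric value format are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List


class Phase(Enum):
    """When a sample was collected (the five phases named in claims 5 to 11)."""
    CLIENT_REGISTRATION = "client_registration"
    CONFERENCE_REGISTRATION = "conference_registration"
    BEFORE_SESSION = "before_session"
    DURING_SESSION = "during_session"
    AFTER_SESSION = "after_session"


class Category(Enum):
    """What a sample describes."""
    NETWORK = "network"                  # claim 5: delay, jitter, bandwidth, loss
    CLIENT_CENTRIC = "client_centric"    # claim 6: throughput, latency, frame rate
    END_POINT = "end_point"              # claim 7: processing power, screen size
    ONLINE_BEHAVIOUR = "behaviour"       # claim 8
    CLIENT_INTELLIGENCE = "intelligence" # claim 9
    CLIENT_ABILITY = "ability"           # claim 10: vision, hearing, speech
    END_POINT_PATTERN = "pattern"        # claim 11: pattern of changing end points


@dataclass
class MetricSample:
    category: Category
    phase: Phase
    values: Dict[str, float]


def group_by_phase(samples: Iterable[MetricSample]) -> Dict[Phase, List[MetricSample]]:
    """Bucket samples by collection phase for later processing."""
    buckets: Dict[Phase, List[MetricSample]] = defaultdict(list)
    for sample in samples:
        buckets[sample.phase].append(sample)
    return dict(buckets)
```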

12. The system of claim 2, wherein the system provides intelligent software applications in the end points used by clients of the multimedia conference for aiding the intelligent learning process.

13. The system of claim 1, wherein the performance of a multimedia conference is further improved by a modular software Multisite Connection Unit (Modular Software MCU), wherein the Modular Software MCU comprises:

a) at least one dynamically linkable micro Multisite Connection Unit module (dynamically linkable micro MCU module) (1, 2....n) with a fixed number of input ports that can accommodate a fixed number of conference end points,

b) a Core Controller (400) to identify a requirement of a new multisite multimedia conference with more than two participants, or to identify a requirement to add one or more conference end points to an existing multisite multimedia conference, or to identify a requirement of one or more simultaneous multisite multimedia conferencing sessions while one or more conferences are going on, and to link the required number of dynamically linkable micro MCU modules (1, 2....n) to the Core Controller (400) to provide as many input ports as required for the new conference, or to add as many end points as required to an existing conference, or to add as many conferences as required while one or more conferences are going on, and

c) at least one dynamically linkable media accelerator module (1A, 2A....nA) that can be dynamically linked to the Core Controller (400) and/or another dynamically linkable micro MCU module (1, 2....n) and/or another dynamically linkable media accelerator module (1A, 2A....nA) if high performance processing is required on audio, video, data, text or other multimedia components communicated in a multimedia conference.
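The port-scaling behaviour of the Core Controller can be illustrated with a short Python sketch. It assumes every micro MCU module offers the same fixed number of ports (4 here, chosen arbitrarily) and models linking as appending to a list; the names `MicroMCU`, `CoreController`, `ensure_ports` and `link_accelerator` are hypothetical and do not describe the actual implementation.

```python
import math
from dataclasses import dataclass, field
from typing import List

PORTS_PER_MODULE = 4  # assumed fixed port count per micro MCU module


@dataclass
class MicroMCU:
    """Hypothetical dynamically linkable micro MCU module."""
    module_id: int
    ports: int = PORTS_PER_MODULE


@dataclass
class CoreController:
    """Hypothetical core controller that links modules on demand."""
    modules: List[MicroMCU] = field(default_factory=list)
    accelerators: List[str] = field(default_factory=list)

    def capacity(self) -> int:
        """Total input ports offered by the currently linked modules."""
        return sum(m.ports for m in self.modules)

    def ensure_ports(self, required: int) -> int:
        """Link just enough modules to offer `required` ports.

        Returns the number of modules newly linked.
        """
        missing = max(0, required - self.capacity())
        to_link = math.ceil(missing / PORTS_PER_MODULE)
        for _ in range(to_link):
            self.modules.append(MicroMCU(module_id=len(self.modules) + 1))
        return to_link

    def link_accelerator(self, name: str) -> None:
        """Link a media accelerator when heavier media processing is needed."""
        self.accelerators.append(name)
```

With these assumptions, a new ten-party conference on an empty controller links three modules (twelve ports), and adding three more end points later links one further module.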

14. The system of claim 13, wherein one or more dynamically linkable micro MCU modules (1, 2....n) are dynamically linked to the Core Controller (400) while the Core Controller (400) is being executed.

15. The system of claim 13, wherein one or more dynamically linkable media accelerator modules (1A, 2A....nA) are dynamically linked to the Core Controller (400) while the Core Controller (400) is being executed.

16. The system of claim 13, wherein the dynamically linkable micro MCU modules (1, 2....n) contain at least one micro multipoint controller (micro MC) responsible for creating and managing conferences between a fixed number of multimedia conference end points.

17. The system of claim 13, wherein the dynamically linkable micro MCU modules (1, 2....n) contain at least one micro multipoint processor (micro MP) responsible for processing audio, video, data, text and other components communicated in a multimedia conference.

18. The system of claim 13, wherein a means is provided to increase the system's capability to connect more conference end points to a multimedia conference by linking one or more dynamically linkable micro MCU modules (1, 2....n) to the Core Controller (400).

19. The system of claim 13, wherein a means is provided to allow more than one simultaneous multimedia conference with the system by linking one or more dynamically linkable micro MCU modules (1, 2....n) to the Core Controller (400).

20. The system of claim 13, wherein a means is provided to increase the system's capability to process audio, video, data, text or other multimedia components communicated in a multimedia conference by linking one or more dynamically linkable micro MCU modules (1, 2....n) to the Core Controller (400).

21. The system of claim 13, wherein a means is provided to provide additional capability to the system to process audio, video, data, text or other multimedia components communicated in a multimedia conference by linking one or more media accelerator modules to the Core Controller (400).

22. The system of claim 13, wherein a means is provided to improve performance of a particular multimedia conferencing session which includes the Core Controller (400) selecting the right number of dynamically linkable micro MCU modules (1, 2....n) to be linked to the Core Controller (400).

23. The system of claim 13, wherein a means is provided to improve performance of a particular multimedia conferencing session which includes the Core Controller (400) selecting the right number of media accelerator modules to be linked to the Core Controller (400).

24. The system of claim 13, wherein an additional geographically distributed modular software MCU is cascaded to provide high performance multimedia conferencing facilities to a number of geographically distributed clients and/or end points, if that multimedia conference cannot be efficiently handled by a particular modular software MCU.

25. The system of claim 13, wherein an additional modular software MCU is cascaded with hardware Multisite Connection Units (hardware MCU) and/or software Multisite Connection Units (software MCU) to provide multimedia conferencing facilities to a number of geographically distributed clients and/or end points.