WO2001026308A9 - A dynamic programmable routing architecture with quality of service support - Google Patents

A dynamic programmable routing architecture with quality of service support

Info

Publication number
WO2001026308A9
WO2001026308A9 (PCT/US2000/027663)
Authority
WO
WIPO (PCT)
Prior art keywords
network
controller
communications network
computers
route
Prior art date
Application number
PCT/US2000/027663
Other languages
French (fr)
Other versions
WO2001026308A2 (en)
WO2001026308A3 (en)
Inventor
Aurel A Lazar
Mahesan Nandikesan
Original Assignee
Xbind Inc
Aurel A Lazar
Mahesan Nandikesan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xbind Inc, Aurel A Lazar, Mahesan Nandikesan filed Critical Xbind Inc
Priority to AU78677/00A priority Critical patent/AU7867700A/en
Publication of WO2001026308A2 publication Critical patent/WO2001026308A2/en
Publication of WO2001026308A3 publication Critical patent/WO2001026308A3/en
Publication of WO2001026308A9 publication Critical patent/WO2001026308A9/en

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q11/00Selecting arrangements for multiplex systems
    • H04Q11/04Selecting arrangements for multiplex systems for time-division multiplexing
    • H04Q11/0428Integrated services digital network, i.e. systems for transmission of different types of digitised signals, e.g. speech, data, telecentral, television signals
    • H04Q11/0478Provisions for broadband connections
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5619Network Node Interface, e.g. tandem connections, transit switching
    • H04L2012/562Routing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5619Network Node Interface, e.g. tandem connections, transit switching
    • H04L2012/5623Network design, dimensioning, topology or optimisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5629Admission control
    • H04L2012/5631Resource management and allocation
    • H04L2012/5632Bandwidth allocation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5638Services, e.g. multimedia, GOS, QOS
    • H04L2012/5646Cell characteristics, e.g. loss, delay, jitter, sequence integrity
    • H04L2012/5651Priority, marking, classes

Definitions

  • the present invention relates to the automatic discovery of the topology, capacity and other state information of a network for computation of routes satisfying quality of service requirements.
  • the present invention comprises a communications network consisting of a set of switches and computers with network interface cards interconnected via links.
  • the goal of the invention is to collect and present views of the network.
  • two views are presented in the preferred embodiment of the invented system, namely the networking capacity graph of an ATM network, and a set of routes between pairs of ATM switches or computers.
  • One skilled in the art should understand the invention's general applicability to other types of connection-oriented networks (including soft-state networks).
  • the invention applies to a virtual network as well in that the latter is logically a collection of virtual paths.
  • the invention also provides a framework and application programming interface (API) for implementing a large class of route computation algorithms without the need for any distributed communication on the part of the latter, and for providing interested software entities the ability to register to receive alerts of any changes to a view of the network.
  • An objective of the invention is to simplify and modularize the operations of systems such as route computing engines and connection controllers, though it is not limited to these.
  • PNNI provides protocol-oriented interfaces between "peers", but does not provide APIs between routing and signaling.
  • the present invention addresses the need for networking capacity graph discovery and distribution, as well as independent and simultaneous route calculation and virtual circuit creation.
  • a system discovers the networking capacity graph at every computer of the network. In other words, a copy of the networking capacity graph becomes available at every computer.
  • route computation algorithms need only read the local copy of the networking capacity graph.
  • Independent route computation algorithms may run simultaneously at various nodes on the network. Each of these algorithms however, need perform local computations only, without the need for any distributed computation or communication.
  • the advantage of this approach is that every computer has a complete view of the networking capacity graph and hence may make control and management decisions locally, ranging from computing routes to dimensioning virtual networks.
  • a system discovers the networking capacity graph by running software entities known as "controllers" at every computer in the network.
  • these controllers communicate with each other via a packet-switched network, which may or may not be overlaid on top of the ATM network.
  • the packet-switched network is taken to be IP. (However, no IP routing protocols are required if ATM is used as a link layer for IP.)
  • they discover ATM links between ATM switches and/or computers.
  • the controllers keep track of the connection state of all the links that each of them controls. These locally discovered or maintained pieces of state information are distributed by the controllers to every other computer. As a result, all the computers become aware of the entire networking capacity graph.
  • controllers continue monitoring, thus discovering any failure or recovery of links, switches or computers. All such discoveries are forwarded to controllers at every computer so that the view of the networking capacity graph is up-to-date at every computer.
  • All software entities that expressed an interest in such changes are alerted. Examples of such software entities are route computation engines and connection controllers.
  • a route computation engine may re-compute routes and store them in the route repository, while a connection controller may reroute a call if, for example, a link or ATM switch utilized by the call fails.
  • FIG. 1 shows the components of an ATM cell, namely a 5-byte header and a 48-byte payload
  • FIG. 2 shows the header of an ATM cell when used in the UNI format
  • FIG. 3 shows the format of the ATM cell header when used in the NNI format
  • FIG. 4 shows the format of an AAL-5 packet
  • FIG. 5 shows an external computer connected to the control port of an ATM switch. The former controls the latter through this connection;
  • FIG. 6 illustrates the representation of an ATM network as a graph
  • FIG. 7 illustrates a sample schedulable region with two traffic classes
  • FIG. 8 illustrates a networking capacity graph corresponding to the physical connection graph of FIG. 6;
  • FIG. 9 illustrates a computer with only a C-GOC for controlling its NIC(s);
  • FIG. 10 illustrates a computer running a C-GOC and an S-GOC.
  • the S-GOC controls a switch attached to the computer via an ATM link;
  • FIG. 11 illustrates the set of meta virtual circuit segments that are set up on an ATM switch by its controller
  • FIG. 12 illustrates the communication of two computers directly connected to each other by an ATM link
  • FIG. 13 illustrates the communication of the control computer (of a switch) with a computer connected directly to the switch;
  • FIG. 14 illustrates the communication of the control computers of switches that are directly connected to each other;
  • FIG. 15 illustrates the communication of the control computers of switches that are connected to each other via a virtual path.
  • ATM networks are considered to meet the requirements of broadband networks that flexibly support the quality of service requirements of multimedia applications.
  • ATM networks are well known in the art. In the present invention, only a very small subset of the ATM capabilities described by the ITU and ATM Forum is used. For example, the Traffic Management, PNNI, UNI and APS specifications of the ATM Forum are not used in the system we discuss herein. This is very important as it simplifies the system tremendously. However, the invention requires augmentation in the area of switch control.
  • an ATM network is a packet-switched connection-oriented network in which every packet is of a fixed size, namely a 53 byte cell. It consists of ATM switches and computers with ATM interface cards interconnected via links - fiber, copper, wireless or otherwise.
  • a cell contains two parts, the header and the payload. The header contains information that is used for switching a cell arriving at an input port of a switch to an output port of the switch.
  • An ATM cell is shown in FIG 1, and its content is as specified in the International Telecommunications Union's "B-ISDN ATM Layer Specification" (ITU-T I.361, 1993).
  • the first five bytes are the header and the remaining 48 bytes are the payload.
  • the header consists of, among other things, a virtual path identifier (VPI) and a virtual channel identifier (VCI) (FIG 2).
  • the VCI occupies sixteen bits, starting from bit position 12.
  • the VPI occupies either eight bits, as shown in FIG 2, or twelve bits, as shown in FIG 3.
  • the eight-bit case is used if the ATM cell is either transmitted or received by a network interface card attached to a computer.
  • the twelve-bit case is used otherwise.
  • the starting positions for the two cases are bit positions 0 and 4, respectively.
  • the VPI and VCI fields are used by switches for switching the cell to the appropriate output port. Moreover, they identify the connection that the cell belongs to and hence provide information about the quality of service requirements for it.
  • the GFC and CLP fields in the header are not used in the present invention.
  • the HEC field is used for error detection in the ATM header.
  • the PTI field is used for adaptation, which is described next.
  • There are five ATM adaptation layers known in the art (International Telecommunications Union, "B-ISDN ATM Adaptation Layer (AAL) Specification," ITU-T I.363, 1993).
  • the preferred embodiment of the present invention uses the fifth layer, namely AAL-5.
  • the format of an AAL-5 frame is shown in FIG 4.
  • One bit in the PTI field of the ATM header is used for indicating boundaries between AAL-5 frames.
  • the other two bits of the PTI field are not used in the present invention.
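The header layout just described can be captured in a short parser. This is an illustrative sketch, not part of the patent; the function name and the returned dictionary are assumptions, while the bit positions follow the text (VPI starting at bit 4 in the UNI format or bit 0 in the NNI format, VCI starting at bit 12):

```python
def parse_atm_header(header: bytes, nni: bool = False):
    """Parse the 5-byte ATM cell header into its fields.

    Bits are numbered from the most significant bit of the first
    32-bit word: GFC occupies bits 0-3 (UNI only), the VPI occupies
    8 bits starting at bit 4 (UNI) or 12 bits starting at bit 0 (NNI),
    and the VCI occupies 16 bits starting at bit 12 in both formats.
    """
    assert len(header) == 5
    word = int.from_bytes(header[:4], "big")   # first four header bytes
    if nni:
        vpi = (word >> 20) & 0xFFF             # 12-bit VPI, bits 0-11
    else:
        vpi = (word >> 20) & 0xFF              # 8-bit VPI, bits 4-11
    vci = (word >> 4) & 0xFFFF                 # 16-bit VCI, bits 12-27
    pti = (word >> 1) & 0x7                    # payload type; one bit marks AAL-5 frame ends
    clp = word & 0x1                           # cell loss priority (unused by the invention)
    hec = header[4]                            # header error control byte
    return {"vpi": vpi, "vci": vci, "pti": pti, "clp": clp, "hec": hec}
```

For example, a UNI header carrying VPI 7 and VCI 50 decodes back to those values regardless of the PTI and HEC contents.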
  • the candidate network must be connection- oriented, and provide individual control of each node via an interface.
  • the network could be a virtual network, in which case the links are virtual paths in an underlying network.
  • the switching could be packet based or circuit based.
  • the form of this node control interface is not important. It could be functional, protocol-oriented, or otherwise. The semantic requirements of this interface are delineated below, both for switches and for non-switch computers.
  • some packet switches may support capabilities to configure the packet multiplexers in the switch. Such capabilities enable a switch controller to tune the system so as to achieve larger schedulable regions, thus improving the system performance. These capabilities are not required for the present invention, but rather they are enhancements. Switches with such capabilities will be called switches with resource model support.
  • Some switches may support capabilities for reading the schedulable region or some other set of statistics that readily yield the schedulable region (defined below in the section on quality of service).
  • An example of such a set of statistics is the set of moments of the instantaneous bit-rate corresponding to each traffic class (also defined below). This set is sufficient for traffic classes with tight delay requirements since it is well known in the art that temporal correlation has little effect in the small-buffer case.
  • Switches with such capabilities will be called switches with Quality of Service (QoS) support. These capabilities are not essential for the present invention. They merely improve system performance by providing potentially better estimates of the schedulable region than what is obtained via the trivial approximation.
  • the switch receives the above messages from a controller connected via a special port (which may be one of the ATM ports, a serial port, or otherwise), bus, intra-CPU communication, or some other means.
  • the communication is via one of the ATM ports.
  • This port will be referred to as the control port.
  • the messages must be received at the control port on a pre-assigned VPI/VCI pair. Responses to these messages are sent out via the control port.
  • Some operating systems might provide capabilities for configuring the cell multiplexers on the ATM NICs. Such operating systems are said to support a resource model. These capabilities are not required for the present invention, but they enable the system to be tuned for better performance.
  • Some operating systems, with the aid of ATM NIC drivers might provide capabilities for reading the schedulable region or some other set of statistics that readily yield the schedulable region. An example of such a set of statistics was given above. Operating systems with the above capabilities are said to support QoS.
  • IP networks are well known in the art.
  • the present invention relies only on the TCP and IP protocols.
  • Internet routing protocols such as OSPF, RIP and BGP are not required.
  • a traffic class is a statistical model for the bit-rate of a digital information stream over time. It is represented using a set of quantitative parameters and a qualitative parameter. The former may consist of parameters such as the peak cell rate, average cell rate, etc. The latter is a qualitative characterization such as 'video', 'voice', 'audio' or 'data'. With each traffic class a set of quality of service constraints is attached. Examples of quality of service constraints include maximum cell delay, average cell delay and cell loss ratio.
  • a special traffic class, known as the premium class, is defined with no specific characteristics. The bandwidth assigned to calls of the premium class is chosen on a per-call basis. This is in contrast to the other traffic classes, where all calls of any single traffic class share the same characteristics. The terms 'call', 'connection', and 'stream' are used interchangeably in this document.
  • the vector consisting of the number of streams of each traffic class is termed the operating point of the system specified in the preceding paragraph.
  • the only exception is the component of the vector that represents the premium class. This component gives the sum of the bandwidths assigned to each of the calls of the premium class.
  • the operating point is said to be admissible if the multiplexer can provide the requested quality of service to each stream.
  • the set of all admissible operating points is called the schedulable region of the system. It is a capacity characterization of the multiplexer and also a stability concept.
  • a sample two-dimensional schedulable region (corresponding to two traffic classes) is given in FIG 7. The shaded area is the schedulable region.
  • With the multiplexer at the output port of every switch and network interface card, there is an associated schedulable region and operating point.
  • the schedulable region of the multiplexer will be called its networking capacity. It will also be referred to loosely as the networking capacity of the corresponding link.
  • the traditional capacity (measured in bits/second) will be referred to simply as the capacity of the port and link.
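The admissibility test described above amounts to checking whether the operating point lies inside the schedulable region. In the sketch below the region is modelled as an intersection of linear constraints; that representation, and every name in the code, is an illustrative assumption rather than the patent's data model:

```python
def admissible(operating_point, constraints):
    """Return True if the operating point (a vector of per-class
    stream counts) lies inside the schedulable region, modelled here
    as the set of points n with coeffs . n <= bound for every
    (coeffs, bound) constraint."""
    return all(
        sum(c * n for c, n in zip(coeffs, operating_point)) <= bound
        for coeffs, bound in constraints
    )

# Hypothetical two-class region, as in the two-dimensional example of
# FIG 7: 2*n_video + n_voice <= 10 and n_video <= 4.
region = [((2, 1), 10), ((1, 0), 4)]
```

A call-admission decision then reduces to testing whether the operating point that would result from accepting the call is still admissible.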
  • FIG 8 illustrates the networking capacity graph corresponding to the ATM network shown in FIG 6.
  • the concept is not limited to ATM networks, and can be used for all connection-oriented communication networks including telephone networks and soft-state networks.
  • the networking capacity graph is very useful for evaluating end-to-end network quality of service. This paragraph provides a calculus for such evaluations.
  • the end-to-end network delay bound (whether it be maximum delay or average delay) for this call is the sum of the delay bounds for classes c₁, c₂, and c₃ on the respective multiplexers, and the propagation delay on the links.
  • delay bounds along a path are additive.
  • Loss bounds are approximately additive.
  • the loss bounds on those output multiplexers are l₁, l₂, and l₃, respectively.
  • the loss along the route is bounded by 1 - (1-l₁)(1-l₂)(1-l₃), which is approximately l₁ + l₂ + l₃ since l₁, l₂, and l₃ are typically much smaller than 1. If they are not small, the exact expression should be used.
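The calculus above can be written out directly: delay bounds along a path are summed, and loss bounds combine as 1 - ∏(1 - lᵢ). Function names here are assumptions for illustration:

```python
def end_to_end_delay(hop_delay_bounds, propagation_delays):
    """Delay bounds along a path are additive: sum the per-class delay
    bound of each multiplexer on the route plus the propagation delay
    of each link."""
    return sum(hop_delay_bounds) + sum(propagation_delays)

def end_to_end_loss(hop_loss_bounds):
    """Exact route loss bound 1 - prod(1 - l_i); for small l_i this is
    approximately sum(l_i)."""
    survive = 1.0
    for loss in hop_loss_bounds:
        survive *= (1.0 - loss)
    return 1.0 - survive
```

For three hops with loss bounds 0.01, 0.02, and 0.03 the exact bound is 1 - 0.99·0.98·0.97 ≈ 0.0589, close to the additive approximation 0.06.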
  • the architecture consists of two parts, the hardware component and the software component.
  • the hardware component of the architecture consists of an ATM network and an IP network.
  • the ATM network consists of a set of ATM switches and a set of computers interconnected via ATM links. Each computer is equipped with an ATM network interface card. Each ATM switch is connected on one of its ports to a particular computer designated its control computer, via a single ATM link.
  • the IP network consists of a set of computers and a set of routers, Ethernet switches, Ethernet hubs, bridges, repeaters, etc. interconnected via a set of links.
  • the entities of the IP network are not necessarily distinct from those of the ATM network.
  • the computers in both networks could be the same.
  • the IP network may be implemented entirely on top of the ATM network, in which case the ATM layer acts as a link layer for IP traffic.
  • the software component of the architecture consists of a set of software modules that run on each computer.
  • each such set of modules is termed a group of controllers ("GOC"), and each module is termed a controller.
  • additional groups of controllers run, one for each ATM switch controlled by the computer.
  • S-GOC: the group of controllers representing an ATM switch
  • C-GOC: the group of controllers representing a computer
  • NDC: neighbor discovery controller
  • the database controller contains repositories and APIs for storing and accessing a networking capacity graph and a set of routes.
  • a C-GOC contains a network interface card controller and an S-GOC contains an ATM switch controller. Two controllers that are part of the same group of controllers are said to be local with respect to each other.
  • a neighbor discovery controller discovers the neighbors of either the computer it resides on or the ATM switch it controls.
  • a state distribution controller distributes the discoveries made by its local neighbor discovery controller to other state distribution controllers in the network. In addition it receives information from other state distribution controllers and updates its local networking capacity graph repository.
  • the event channel controllers provide an event channel to aid the state distribution controllers.
  • the routing controller computes routes by reading its local networking capacity graph repository and writing the results to its local route repository.
  • the switch controller is an access point for other entities to set up and tear down connections on the switch, and reserve bandwidth and VPI/VCI space on the switch ports.
  • the network interface card controller is an access point for reserving bandwidth and VPI/VCI space on the NIC.
  • a network interface card controller is a piece of software that provides two functions: First, it controls and manages the set of resources of each of the network interface cards on a computer. This set of resources consists of the input and output VPI/VCI spaces and the output multiplexers. Second, it provides admission control services for these resources.
  • the network interface card controller provides means to request the creation and tear down of virtual circuit originations and virtual circuit terminations. Each of the above is created by opening a socket. However, the following reservations are made first in order to avoid conflict and also to guarantee quality of service.
  • a virtual circuit origination requires reservation of VPI, VCI and a slot of the specified traffic class in the schedulable region for the specified port (NIC). In the case of the premium class, multiple slots will be reserved. The number of slots is proportional to the requested bandwidth. This applies to all the operations below that involve the premium class, and will not be repeated.
  • a virtual circuit termination requires reservation of VPI and VCI for the specified port. All of the tear down operations involve releasing the reserved resources.
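The reservation bookkeeping described for the network interface card controller might be sketched as follows. The class, its method names, and the bandwidth represented by one premium slot are all assumptions for illustration, not the patent's API:

```python
class NicController:
    """Sketch of NIC-controller admission control: an origination
    reserves a VPI/VCI pair plus a slot of the requested traffic
    class; premium-class calls reserve a number of slots proportional
    to the requested bandwidth; tear-down releases the resources."""

    SLOT_BANDWIDTH = 64_000  # assumed bandwidth (bit/s) represented by one premium slot

    def __init__(self, slots_per_class):
        self.free_slots = dict(slots_per_class)   # traffic class -> free slots
        self.used_vpivci = set()                  # reserved (VPI, VCI) pairs

    def create_origination(self, vpi, vci, traffic_class, bandwidth=0):
        if (vpi, vci) in self.used_vpivci:
            raise ValueError("VPI/VCI already reserved")
        slots = 1
        if traffic_class == "premium":
            slots = -(-bandwidth // self.SLOT_BANDWIDTH)  # ceil division
        if self.free_slots.get(traffic_class, 0) < slots:
            raise ValueError("admission control: no capacity")
        self.free_slots[traffic_class] -= slots
        self.used_vpivci.add((vpi, vci))
        return slots

    def tear_down(self, vpi, vci, traffic_class, slots):
        # Tearing down a circuit releases the reserved resources.
        self.used_vpivci.discard((vpi, vci))
        self.free_slots[traffic_class] += slots
```

A premium-class request for 128 kbit/s would thus reserve two 64-kbit/s slots under the assumed slot size.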
  • a switch controller is a piece of software that controls and manages an ATM switch.
  • a virtual circuit segment requires reservation of VPI and VCI on both the input and output ports. In addition, it requires the reservation of a slot of the specified traffic class in the schedulable region of the output port.
  • a virtual path segment requires the reservation of VPI on both the input and output ports.
  • a virtual path origination requires reservation of VPI and multiple slots of the premium class in the schedulable region on the specified port.
  • a virtual path termination requires reservation of VPI on the specified port.
  • a multicast tree segment can be built by creating multiple virtual circuit segments.
  • the switch controller polls the ATM switch periodically for a response.
  • a message that does not affect the state of the switch (such as the request for the number of ports on the switch) may be used for polling.
  • At the time of boot up of the switch controller, the switch itself need not be powered up and connected. It could be powered up and connected at a later time. Once a switch is detected by the polling, the switch controller proceeds as follows:
  • the switch controller continues to poll the switch. It considers the switch to be up and running as long as it receives responses to the polling messages. If it ceases to receive responses, it considers the switch to be down. At this point, it continues to poll, hoping to detect a switch at some future instant.
  • a neighbor discovery controller is responsible for discovering all neighbors of the switch or computer it represents. Two switches, a switch and a computer, or two computers are said to be neighbors if they are connected via a single link or a virtual path.
  • the NDC discovers its neighbors by sending a Hello message on all the ports of the switch or computer it represents, and listens for responses from those ports.
  • Virtual path origination-termination pairs are also treated as ports. The only difference is that physical ports may use any VPI when transmitting data, whereas virtual path origination-termination pairs must use the VPI assigned to them. For simplicity, the following presentation refers only to physical ports, but with the understanding that virtual path origination-termination pairs could also be treated as additional ports.
  • the Hello messages are sent over AAL-5 in the preferred embodiment. Each message contains the sender address, sender port number, receiver address and receiver port number.
  • the 'sender address' refers to the address of the switch or computer that the respective NDC represents.
  • if an NDC does not know the identity of the switch or computer at the other end of a port (the receiver), it sends Hello messages with 'receiver address' and 'receiver port number' set to zero.
  • Upon receiving a Hello message from the remote NDC, the local NDC becomes aware of the remote NDC's address and port number. Addresses are flat in the system described herein.
  • Two NDCs are said to be neighbors if they represent switches or computers that are neighbors.
  • An NDC transmits a Hello message to all its neighbors periodically. Whenever an NDC receives a Hello message in which the 'receiver address' field is non-zero, it considers that it has discovered ATM connectivity with the neighbor on the port on which it received the Hello message. The address of the neighbor is given by the field 'sender address' in the received Hello message.
  • the connectivity is assumed to be present until the NDC ceases to receive Hello messages from the neighbor for a significant duration. At such time, the connectivity is assumed to be lost. Hardware-level detection of link failure would improve the detection time.
  • In order for an NDC that represents a switch to communicate with neighboring NDCs, it must create a set of virtual circuit segments on the switch it represents. These virtual circuit segments will be referred to as meta virtual circuit segments. A pair of such meta virtual circuit segments must be created for each port.
  • the meta virtual circuit segments between the control port (FIG 5) and every port of an ATM switch are shown in FIG 11. These are set up by the NDC at bootup time by invoking the 'create virtual circuit segment' operation of the switch controller. Corresponding to each port p, two virtual circuit segments are created as follows: The first virtual circuit segment has input port p, input VPI 0, input VCI routing_vci, and output port c, output VPI 0, output VCI vci(p), where c is the identifier of the control port and vci(p) is the meta VCI assigned to port p. The value of vci(p) must be distinct for distinct values of p. The routing_vci is some constant VCI.
  • in the preferred embodiment, routing_vci is 50.
  • NDC controllers representing a computer communicate with their neighbors by sending and receiving Hello messages on VPI 0, VCI routing_vci.
  • the neighbor could be another computer (FIG 12) or an ATM switch (FIG 13).
  • NDC controllers representing ATM switches communicate with their neighbors as illustrated in FIG 14: Suppose that the VCI assigned to port p is vci(p), where vci(p) takes distinct values for distinct ports p. The function vci(.) can be different at different switches.
  • the NDC controller sends and receives messages on VPI 0, VCI vci(p).
  • the state distribution controllers coordinate in the following way with their neighbors to ensure that all Networking Capacity Graph Repositories contain the same information.
  • the Neighbor Discovery Controllers associated with the nodes in question discover the change and inform their local State Distribution Controller.
  • the state distribution controller first synchronizes its local NCG repository with that of the discovered neighbor. Then, the state distribution controller distributes all changes made to its NCG database by pushing the changes to the local event channel controller. This second step is carried out even if the topology change was due to a loss of neighbor connectivity.
  • In addition to maintaining a consistent topology database, the state distribution controllers also coordinate to maintain a consistent view of the link states in their NCG repositories.
  • the state distribution controllers periodically poll their local switch controller or NIC controller to read the schedulable regions, operating points and other link-state parameters of all the links attached to their node (switch or computer). Any significant changes (the threshold of which is determined by configuration) will result in the change being propagated to all other state distribution controllers via an event that is pushed on the event channel.
  • An event channel provides a model of communication whereby senders and receivers do not need to be aware of each other.
  • the event channel is implemented by a set of event channel controllers, one such controller executing on each computer.
  • the event channel is realized as follows: When a sender sends an event (using the term here interchangeably with message) to an event channel controller, the latter passes on the event to the following: (i) all listeners registered with that specific event channel controller; (ii) event channel controllers at all neighboring nodes. Each of the event channel controllers at neighboring nodes then in turn forwards the event to its neighbors. This process continues until all nodes have received the message.
  • the general technique just referred to is known as flooding.
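Flooding with local listener delivery can be sketched as below. The duplicate-suppression set is an assumption needed for the flood to terminate on topologies with cycles; class and method names are illustrative:

```python
class EventChannelController:
    """Sketch of the flooding scheme: each controller delivers an event
    to its registered local listeners and forwards it to every
    neighboring controller.  A seen-set drops events already handled,
    so the flood terminates even when the topology contains cycles."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []    # directly connected controllers
        self.listeners = []    # locally registered callbacks
        self.seen = set()      # event ids already processed

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def push(self, event_id, payload):
        if event_id in self.seen:
            return                         # duplicate: already flooded from here
        self.seen.add(event_id)
        for listener in self.listeners:    # (i) local delivery
            listener(payload)
        for n in self.neighbors:           # (ii) forward to all neighbors
            n.push(event_id, payload)
```

On a three-node ring, an event pushed at one node reaches a listener on any other node exactly once.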
  • the routing controller executes a routing algorithm to compute routes by reading the networking capacity graph from the local NCG database. It is triggered by an event that is pushed by the NCG repository when the networking capacity graph changes.
  • the routing algorithm computes a set of routes for each one of the configured traffic classes, and writes them into the local route repository. Since the route repository, the NCG, and the event channel have well-defined interfaces, any routing algorithm from a large class of routing algorithms can be plugged into the framework. In particular, any routing algorithm applicable to multirate circuit switching can be applied. Moreover, the algorithm could be changed dynamically while the system is running.
  • the route computation algorithm is changed by loading a new dynamic library on the fly.
  • the networking capacity graph repository provides APIs for reading and writing aspects of the networking capacity graph.
  • the write operations allow for adding and removing nodes and links to the networking capacity graph. They also allow for writing the schedulable region and statistics of the operating point for each link in the networking capacity graph. All the write operations trigger an event to be pushed to the event channel controller, so that all interested parties are alerted to the change in the networking capacity graph.
  • the read operations allow for reading the set of nodes, the set of links attached to a node, and the schedulable region and operating point statistics of a link.
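The repository behavior just described, writes that mutate the graph and push a change event while reads stay local, might be sketched as follows. Class and method names are assumptions, not the patent's API:

```python
class NcgRepository:
    """Sketch of the networking capacity graph repository: every write
    operation triggers an event pushed to the event channel controller
    (modelled here as a callback), so interested parties are alerted."""

    def __init__(self, push_event):
        self.nodes = set()
        self.links = {}              # (node_a, node_b) -> link state
        self.push_event = push_event # event channel controller hook

    def add_node(self, node):
        self.nodes.add(node)
        self.push_event(("node-added", node))

    def add_link(self, a, b, schedulable_region=None, operating_point=None):
        self.links[(a, b)] = {"sr": schedulable_region, "op": operating_point}
        self.push_event(("link-added", (a, b)))

    def write_link_state(self, a, b, schedulable_region, operating_point):
        self.links[(a, b)] = {"sr": schedulable_region, "op": operating_point}
        self.push_event(("link-changed", (a, b)))

    def links_of(self, node):
        # Read operation: the set of links attached to a node, served locally.
        return [link for link in self.links if node in link]
```

A routing controller would register for these events and re-read the graph when one arrives.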
  • the route repository provides interfaces for reading and writing routes. It also provides an interface for clearing all routes.
  • Routes are categorized based on source- destination pair and traffic class. Multiple routes may exist for each category. The organization of these routes may take one of many forms. For example, the routes may be ordered according to some criterion such as path length. The criterion would be determined by the route computation algorithm. Alternatively, the paths could be organized as a probability distribution, with each path assigned a probability by the route computation algorithm.
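A minimal sketch of such a route repository, keyed by source-destination pair and traffic class, with path length as an assumed ordering criterion (the names and the ordering choice are illustrative, not the patent's):

```python
class RouteRepository:
    """Sketch of the route repository: multiple routes may exist per
    (source, destination, traffic class) category; here they are kept
    ordered by ascending path length as one of the possible criteria
    a route computation algorithm might choose."""

    def __init__(self):
        self.routes = {}   # (src, dst, traffic class) -> list of paths

    def write(self, src, dst, traffic_class, path):
        key = (src, dst, traffic_class)
        self.routes.setdefault(key, []).append(path)
        self.routes[key].sort(key=len)   # assumed ordering criterion

    def read(self, src, dst, traffic_class):
        return self.routes.get((src, dst, traffic_class), [])

    def clear(self):
        self.routes.clear()
```

A connection controller would take the first route returned by `read` and set up virtual circuit segments along it.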
  • all controllers belonging to a given GOC are placed in the same process of the same computer. This will minimize the communication overhead between these controllers. These controllers were grouped together as they perform tightly coupled functions. It is possible to split controllers belonging to single GOC, and execute them on different computers, or on the same computer but on different processes. It is generally more efficient for the neighbor discovery controller of a S-GOC to be located on the control computer of the switch that the S-GOC represents, and the neighbor discovery controller of a C-GOC to be located on the computer it represents. The reason for this is that the placement of the NDC controllers is important to their function, which is to detect their neighbors.
• if a switch controller is located on a computer other than the control computer, then it would require some means of communicating with the ATM switch, either through a proxy on the control computer or via virtual circuits. The latter solution could complicate the topology discovery mechanism. Similarly, if a NIC controller is located on a computer other than the one it controls, then it would need some mechanism to communicate with that computer.
• a connection controller is a software entity that accepts requests for ATM connections between specified source and destination nodes with quality of service. Given a request, the connection controller can read the route repository for a route between the source and destination with the specified quality of service. If there is a route between the specified nodes satisfying the requested quality of service, the route repository will return such a route. The connection controller can then use that route to identify the switches along it, and communicate with those switches to set up virtual circuit segments on them. As a result, a virtual circuit between the source and destination nodes emerges. The means by which the connection controller communicates with the switches is not relevant here.

Connection Re-routing
  • the present invention is suitable for handling connection re-routing.
  • an existing connection is abruptly lost due to an ATM switch failure or an ATM link failure.
  • Such failures will be detected by one or more neighbor discovery controllers and distributed to all nodes via the event channel.
  • Any software entity desiring to receive an alert message indicating the change of topology may register with the NCG event channel in order to receive such an alert message.
• the software entity (e.g., a connection controller) that registered will receive such an alert and can react to it, for example by reading the route repository for an alternate route around the failure and setting up a new virtual circuit along that route.
  • the call would have been re-routed around the failed ATM switches and/or links.
• similar action can be taken to reroute affected virtual paths. If there is no route between the source and destination nodes after the failure, then re-routing is not possible.
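The connection setup and re-routing flow described above can be sketched as follows. This is a hedged sketch: the route-repository and switch-controller interfaces shown here are illustrative assumptions, not the patent's API.

```python
class ConnectionController:
    """Illustrative connection controller: reads a route from the
    route repository and sets up one virtual circuit segment per
    switch along the route."""
    def __init__(self, route_repo, switch_controllers):
        self.route_repo = route_repo                  # local route repository
        self.switch_controllers = switch_controllers  # node name -> controller

    def setup(self, src, dst, traffic_class):
        route = self.route_repo.get(src, dst, traffic_class)
        if route is None:
            return None   # no route satisfies the requested QoS
        segments = []
        # the interior nodes of the route are switches; create one
        # virtual circuit segment on each of them
        for node in route[1:-1]:
            seg = self.switch_controllers[node].create_vc_segment(
                route, traffic_class)
            segments.append(seg)
        # the concatenated segments form the end-to-end virtual circuit
        return segments
```

On a topology-change alert, re-routing amounts to tearing down the affected segments and calling `setup` again against the updated route repository.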
  • the present invention is suitable for visualizing the networking capacity graph of an ATM network.
• the management system can register itself with the event channel controller to receive updates, just like an NCG controller. Such updates can be used to visualize the networking capacity graph in real time on a graphical user interface (GUI).
  • the present invention is also suitable for a collection of management agents to monitor the health of their neighbors. In this way, if one or more management agents crash or otherwise cease to be operational, neighboring agents can detect the failure.
  • the present invention aids this monitoring by providing the agents with a list of neighboring nodes.

Abstract

A communications network collects and presents views of the network, including the networking capacity graph. The invention provides a framework and application programming interface (API) for implementing a large class of route computation algorithms without the need for any distributed communication on the part of the latter, and for providing interested software entities the ability to register to receive alerts of any changes to a view of the network. An objective of the invention is to simplify and modularize the operations of systems such as route computing engines and connection controllers, though it is not limited to these. The schedulable region of the network is distributed. Thus, remote objects desiring to do call admission control are enabled to perform independent route computation based on these views.

Description

A Dynamic Programmable Routing Architecture With Quality of
Service Support
This application claims priority under 35 U.S.C. 119(e) to Provisional Application No. 60/157,965, filed October 6, 1999, the entire contents of which are hereby incorporated by reference.
Field of the Invention
The present invention relates to the automatic discovery of the topology, capacity and other state information of a network for computation of routes satisfying quality of service requirements.
Background of the Invention
Two methods for networking system maintenance have received much attention recently. The first is the Private Network-Network Interface ("Private Network-Network Interface, Version 1.0", ATM Forum, March 1996). The other is proxy-PAR, being developed by the IETF. Methods related to topology discovery have been based on the control software residing on the ATM switches (U.S. Pat. No. 4,827,411 to Arrowood et al.; U.S. Pat. No. 5,796,736 to Suzuki). Although methods exist for controlling an ATM switch remotely, a need still exists for methods of discovering useful information beyond simple network topology and other state information.
Summary of the Invention
The present invention comprises a communications network consisting of a set of switches and computers with network interface cards interconnected via links. The goal of the invention is to collect and present views of the network. In particular, two views are presented in the preferred embodiment of the invented system, namely the networking capacity graph of an ATM network, and a set of routes between pairs of ATM switches or computers. One skilled in the art, however, should understand the invention's general applicability to other types of connection-oriented networks (including soft-state networks). Moreover, the invention applies to a virtual network as well in that the latter is logically a collection of virtual paths. The invention also provides a framework and application programming interface (API) for implementing a large class of route computation algorithms without the need for any distributed communication on the part of the latter, and for providing interested software entities the ability to register to receive alerts of any changes to a view of the network. An objective of the invention is to simplify and modularize the operations of systems such as route computing engines and connection controllers, though it is not limited to these.
Previous efforts in network system maintenance have been focused on topology discovery, and to a lesser extent on link-state discovery, where the link-state parameters considered have been link bandwidth, and cell delays and losses in the link. None have considered distributing the schedulable region, which captures much more than the link bandwidth, cell delays and cell losses. Thus, remote objects desiring to do call admission control have had to rely on a generic call admission control (GCAC) algorithm, which is a very rough approximation. The networking capacity graph of the present invention, on the other hand, addresses the need for comprehensive views of the entire network, thus permitting independent route computation based on these views. No prior efforts have discovered the networking capacity graph.
Previous methods for topology discovery and route computation have been integrated as one unit, thus not permitting these two functionalities to be implemented by different parties. Typically, a single entity performs both functions and there is no clear separation between the two functions. Similarly, those methods have integrated route computation with connection control, thus not permitting these two functionalities to be implemented by two different parties. For example, PNNI provides protocol-oriented interfaces between "peers", but does not provide APIs between routing and signaling. The present invention addresses the need for networking capacity graph discovery and distribution, as well as independent and simultaneous route calculation and virtual circuit creation.
A system according to the invention discovers the networking capacity graph at every computer of the network. In other words, a copy of the networking capacity graph becomes available at every computer. Thus, route computation algorithms need only read the local copy of the networking capacity graph. Independent route computation algorithms may run simultaneously at various nodes on the network. Each of these algorithms, however, need perform only local computations, without any distributed computation or communication. The advantage of this approach is that every computer has a complete view of the networking capacity graph and hence may make control and management decisions locally, ranging from computing routes to dimensioning virtual networks.
A system according to the invention discovers the networking capacity graph by running software entities known as "controllers" at every computer in the network. In the preferred embodiment described herein, these controllers communicate with each other via a packet-switched network, which may or may not be overlaid on top of the ATM network. In the preferred embodiment, the packet-switched network is taken to be IP. (However, no IP routing protocols are required if ATM is used as a link layer for IP.) In the process, they discover ATM links between ATM switches and/or computers. Moreover, the controllers keep track of the connection state of all the links that each of them controls. These locally discovered or maintained pieces of state information are distributed by the controllers to every other computer. As a result, all the computers become aware of the entire networking capacity graph.
During the course of network operation, the controllers continue monitoring, thus discovering any failure or recovery of links, switches or computers. All such discoveries are forwarded to controllers at every computer so that the view of the networking capacity graph is up-to-date at every computer. Whenever such a change occurs, all software entities that expressed an interest in such changes are alerted. Examples of such software entities are route computation engines and connection controllers. Upon receiving an alert, a route computation engine may re-compute routes and store them in the route repository, while a connection controller may reroute a call if, for example, a link or ATM switch utilized by the call fails.
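The alert mechanism described above — software entities registering interest in NCG changes and being notified of failures or recoveries — follows a simple publish/subscribe pattern, sketched here with illustrative names (not the patent's actual event-channel interface):

```python
class EventChannel:
    """Minimal sketch of an NCG event channel: interested software
    entities register callbacks and are alerted on every change."""
    def __init__(self):
        self._subscribers = []

    def register(self, callback):
        """Register interest in networking-capacity-graph changes."""
        self._subscribers.append(callback)

    def push(self, event):
        """Alert every registered party to a change."""
        for callback in self._subscribers:
            callback(event)
```

A route computation engine and a connection controller would each register a callback; a single `push` (e.g. on link failure) then triggers both a route recomputation and, if needed, a re-route of affected calls.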
Brief Description of the Drawing
The invention is described with reference to the several figures of the drawing, in which:
FIG. 1 shows the components of an ATM cell, namely a 5-byte header and a 48-byte payload;
FIG. 2 shows the header of an ATM cell when used in the UNI format;
FIG. 3 shows the format of the ATM cell header when used in the NNI format;
FIG. 4 shows the format of an AAL-5 packet;
FIG. 5 shows an external computer connected to the control port of an ATM switch. The former controls the latter through this connection;
FIG. 6 illustrates the representation of an ATM network as a graph;
FIG. 7 illustrates a sample schedulable region with two traffic classes;
FIG. 8 illustrates a networking capacity graph corresponding to the physical connection graph of FIG. 6;
FIG. 9 illustrates a computer with only a C-GOC for controlling its NIC(s);
FIG. 10 illustrates a computer running a C-GOC and an S-GOC. The S-GOC controls a switch attached to the computer via an ATM link;
FIG. 11 illustrates the set of meta virtual circuit segments that are setup on an ATM switch by its controller;
FIG. 12 illustrates the communication of two computers directly connected to each other by an ATM link;
FIG. 13 illustrates the communication of the control computer (of a switch) with a computer connected directly to the switch;
FIG. 14 illustrates the communication of the control computers of switches that are directly connected to each other; and
FIG. 15 illustrates the communication of the control computers of switches that are connected to each other via a virtual path.
Detailed Description
The present invention can be applied to any connection-oriented network, but for the sake of concreteness the following description assumes that an ATM network is used. ATM networks are considered to meet the requirements of broadband networks that flexibly support the quality of service requirements of multimedia applications.
ATM NETWORKS
ATM networks are well known in the art. In the present invention, only a very small subset of the ATM capabilities described by the ITU and ATM Forum is used. For example, the Traffic Management, PNNI, UNI and APS specifications of the ATM Forum are not used in the system we discuss herein. This is very important as it simplifies the system tremendously. However, the invention requires augmentation in the area of switch control.
As understood in the present document, an ATM network is a packet-switched connection-oriented network in which every packet is of a fixed size, namely a 53 byte cell. It consists of ATM switches and computers with ATM interface cards interconnected via links - fiber, copper, wireless or otherwise. A cell contains two parts, the header and the payload. The header contains information that is used for switching a cell arriving at an input port of a switch to an output port of the switch.
An ATM cell is shown in FIG. 1, and its content is as specified in the International Telecommunications Union's "B-ISDN ATM Layer Specification" (ITU-T I.361, 1993). The first five bytes are the header and the remaining 48 bytes are the payload. The header consists of, among other things, a virtual path identifier (VPI) and a virtual channel identifier (VCI) (FIG. 2). The VCI occupies sixteen bits, starting from bit position 12. The VPI occupies either eight bits, as shown in FIG. 2, or twelve bits, as shown in FIG. 3. The eight-bit case is used if the ATM cell is either transmitted or received by a network interface card attached to a computer. The twelve-bit case is used otherwise. The starting positions for the two cases are bit positions 0 and 4, respectively. The VPI and VCI fields are used by switches for switching the cell to the appropriate output port. Moreover, they identify the connection that the cell belongs to and hence provide information about its quality of service requirements. The GFC and CLP fields in the header are not used in the present invention. The HEC field is used for error detection in the ATM header. The PTI field is used for adaptation, which is described next. There are five ATM adaptation layers known in the art (International Telecommunications Union, "B-ISDN ATM Adaptation Layer (AAL) Specification," ITU-T I.363, 1993). The preferred embodiment of the present invention uses the fifth layer, namely AAL-5. The format of an AAL-5 frame is shown in FIG. 4. One bit in the PTI field of the ATM header is used for indicating boundaries between AAL-5 frames. The other two bits of the PTI field are not used in the present invention.
The above are the only features of an ATM network that are relevant to the present invention. However, the form of some of these features is not important to the present invention. Thus, the essential characteristics for the applicability of the present invention are summarized here. The candidate network must be connection-oriented, and provide individual control of each node via an interface. The network could be a virtual network, in which case the links are virtual paths in an underlying network. The switching could be packet based or circuit based. The form of these node control interfaces is not important. It could be functional, protocol oriented, or otherwise. The semantic requirements of this interface are delineated below, both for switches and for non-switch computers.
The basic capabilities for controlling a switch are given below.
• Read, write and remove virtual circuit segments and virtual path segments
• Clear all virtual circuit segments and virtual path segments
• Retrieve the number of ports on the switch
• Read the identifier, the bandwidth and the valid VPI/VCI ranges of each port.
• Get the identity of the control port
In addition, some packet switches may support capabilities to configure the packet multiplexers in the switch. Such capabilities enable a switch controller to tune the system so as to achieve larger schedulable regions, thus improving the system performance. These capabilities are not required for the present invention, but rather they are enhancements. Switches with such capabilities will be called switches with resource model support.
Some switches may support capabilities for reading the schedulable region or some other set of statistics that readily yield the schedulable region (defined below in the section on quality of service). An example of such a set of statistics is the set of moments of the instantaneous bit-rate corresponding to each traffic class (also defined below). This set is sufficient for traffic classes with tight delay requirements since it is well known in the art that temporal correlation has little effect in the small-buffer case. Switches with such capabilities will be called switches with Quality of Service (QoS) support. These capabilities are not essential for the present invention. They merely improve system performance by providing potentially better estimates of the schedulable region than what is obtained via the trivial approximation.
The switch receives the above messages from a controller connected via a special port (which may be one of the ATM ports, a serial port, or otherwise), bus, intra-CPU communication, or some other means. In the preferred embodiment, the communication is via one of the ATM ports. This port will be referred to as the control port. The messages must be received at the control port on a pre-assigned VPI/VCI pair. Responses to these messages are sent out via the control port.
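The switch-control capabilities listed above can be pictured with a small in-memory sketch. Names and data layout are assumptions for illustration; a real controller would exchange these requests as messages over the control port on the pre-assigned VPI/VCI pair.

```python
class SwitchController:
    """Illustrative in-memory model of the basic switch-control
    capabilities: the switching table maps an input (port, VPI, VCI)
    triple to an output (port, VPI, VCI) triple."""
    def __init__(self, num_ports, bandwidth_bps, vpi_range, vci_range):
        self._ports = {p: {"bandwidth": bandwidth_bps,
                           "vpi_range": vpi_range,
                           "vci_range": vci_range}
                       for p in range(num_ports)}
        self._table = {}        # switching table
        self._control_port = 0  # identity of the control port (assumed)

    def write_segment(self, inp, outp):
        """Write a virtual circuit/path segment (switching-table entry)."""
        self._table[inp] = outp

    def read_segment(self, inp):
        return self._table.get(inp)

    def remove_segment(self, inp):
        self._table.pop(inp, None)

    def clear_all(self):
        """Clear all virtual circuit and virtual path segments."""
        self._table.clear()

    def num_ports(self):
        return len(self._ports)

    def port_info(self, port):
        """Identifier-keyed bandwidth and valid VPI/VCI ranges of a port."""
        return self._ports[port]

    def control_port(self):
        return self._control_port
```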
The basic capabilities for controlling network interface cards (NICs) on a computer are as follows. These capabilities are typically provided via the operating system:
• Create and remove sockets that are mapped to specified VPI/VCI pairs and a specified AAL on a specified network interface card.
• Send and receive AAL frames on a specified socket.
• Read the number of NICs and the identifier of each NIC on the computer
• Read the bandwidth and valid VPI/VCI ranges of each NIC.
Some operating systems, with the aid of ATM NIC drivers, might provide capabilities for configuring the cell multiplexers on the ATM NICs. Such operating systems are said to support a resource model. These capabilities are not required for the present invention, but they enable the system to be tuned for better performance. Some operating systems, with the aid of ATM NIC drivers, might provide capabilities for reading the schedulable region or some other set of statistics that readily yield the schedulable region. An example of such a set of statistics was given above. Operating systems with the above capabilities are said to support QoS.
IP NETWORKS
IP networks are well known in the art. The present invention relies only on the TCP and IP protocols. In particular, Internet routing protocols such as OSPF, RIP and BGP are not required. The reason for this is that all IP communication in the invented system is between neighbors, connected by physical links or a virtual path.
QUALITY OF SERVICE MODEL
A traffic class is a statistical model for the bit-rate of a digital information stream over time. It is represented using a set of quantitative parameters and a qualitative parameter. The former may consist of parameters such as the peak cell rate, average cell rate, etc. The latter is a qualitative characterization such as 'video', 'voice', 'audio' or 'data'. With each traffic class a set of quality of service constraints is attached. Examples of quality of service constraints include maximum cell delay, average cell delay and cell loss ratio. A special traffic class, known as the premium class, is defined with no specific characteristics. The bandwidth assigned to calls of the premium class is chosen on a per-call basis. This is in contrast to the other traffic classes, where all calls of any single traffic class share the same characteristics. The terms 'call', 'connection', and 'stream' are used interchangeably in this document.
The vector consisting of the number of streams of each traffic class is termed the operating point of the system specified in the preceding paragraph. The only exception is the component of the vector that represents the premium class. This component gives the sum of the bandwidths assigned to each of the calls of the premium class. For such a system, the operating point is said to be admissible if the multiplexer can provide the requested quality of service to each stream. The set of all admissible operating points is called the schedulable region of the system. It is a capacity characterization of the multiplexer and also a stability concept. A sample two-dimensional schedulable region (corresponding to two traffic classes) is given in FIG 7. The shaded area is the schedulable region.
A trivial lower bound to the schedulable region is obtained as follows. By 'lower bound' is meant that the schedulable region contains the lower bound. Denote the line speed of the multiplexer by C and the peak rates of the traffic classes by p1, ..., pn, where n is the number of traffic classes. The lower bound is given by the region in N^n bounded by the hyperplane whose intercepts are C/p1, ..., C/pn; here N = {0, 1, 2, ...}. The above lower bound is called the peak-rate approximation to the schedulable region.
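The peak-rate approximation amounts to a one-line admission test: an operating point (n1, ..., nn) lies in the trivial lower bound when the sum of per-class stream counts times per-class peak rates does not exceed the line speed. A minimal sketch (ignoring the premium class, whose vector component is a bandwidth sum rather than a stream count):

```python
def peak_rate_admissible(operating_point, peak_rates, line_speed):
    """Peak-rate approximation to the schedulable region:
    accept the operating point if sum_i n_i * p_i <= C."""
    load = sum(n * p for n, p in zip(operating_point, peak_rates))
    return load <= line_speed
```

For example, on a 155 Mb/s link with class peak rates of 10 Mb/s and 2 Mb/s, the operating point (10, 20) carries 140 Mb/s of peak load and is admitted, while (14, 20) carries 180 Mb/s and is rejected.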
With the multiplexer at the output port of every switch and network interface card, there is an associated schedulable region and operating point. The schedulable region of the multiplexer will be called its networking capacity. It will also be referred to loosely as the networking capacity of the corresponding link. The traditional capacity (measured in bits/second) will be referred to simply as the capacity of the port and link.
The description of an ATM network by the combination of its topology and the schedulable region of each link is called the networking capacity graph (NCG) of the network. FIG. 8 illustrates the networking capacity graph corresponding to the ATM network shown in FIG. 6. The concept is not limited to ATM networks, and can be used for all connection-oriented communication networks including telephone networks and soft-state networks. The networking capacity graph is very useful for evaluating end-to-end network quality of service. This paragraph provides a calculus for such evaluations. Consider the route starting at A and traversing the nodes E, F and C (FIG. 6), and consider the QoS of a call along this path. Suppose that the call occupies classes c1, c2, and c3 on the output multiplexers at computer A, switch E and switch F, respectively. Then the end-to-end network delay bound (whether it be maximum delay or average delay) for this call is the sum of the delay bounds for classes c1, c2, and c3 on the respective multiplexers, plus the propagation delay on the links. In other words, delay bounds along a path are additive. Loss bounds are approximately additive. Suppose that the loss bounds on those output multiplexers are l1, l2, and l3, respectively. Then the loss along the route is bounded by 1 - (1-l1)(1-l2)(1-l3), which is approximately l1 + l2 + l3 since l1, l2, and l3 are typically much smaller than 1. If they are not small, the exact expression should be used.
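The end-to-end QoS calculus above reduces to two small functions — delay bounds add, and the exact loss bound composes multiplicatively (with the additive form as a small-loss approximation):

```python
def end_to_end_delay_bound(per_hop_delay_bounds, propagation_delays):
    """Delay bounds along a path are additive: sum of per-class
    multiplexer delay bounds plus link propagation delays."""
    return sum(per_hop_delay_bounds) + sum(propagation_delays)

def end_to_end_loss_bound(per_hop_loss_bounds):
    """Exact loss bound along a route: 1 - prod_i(1 - l_i).
    Approximately sum(l_i) when each l_i is much smaller than 1."""
    survive = 1.0
    for loss in per_hop_loss_bounds:
        survive *= (1.0 - loss)
    return 1.0 - survive
```

For the route A–E–F–C, the three per-hop values would be the bounds for classes c1, c2, c3 at A, E and F, respectively.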
ARCHITECTURE OVERVIEW
In order to meet an objective of the invention to collect and present system views to objects such as route computing engines and connection controllers, the system requires certain software components to be located at every computer on the network and to communicate with each other. Such a placement of software will permit the system to discover the relevant attributes of the entire ATM network and make them available locally at every computer. The architecture consists of two parts, the hardware component and the software component. The hardware component of the architecture consists of an ATM network and an IP network. The ATM network consists of a set of ATM switches and a set of computers interconnected via ATM links. Each computer is equipped with an ATM network interface card. Each ATM switch is connected on one of its ports to a particular computer designated its control computer, via a single ATM link. The IP network consists of a set of computers and a set of routers, Ethernet switches,
Ethernet hubs, bridges, repeaters, etc., interconnected via a set of links. The entities of the IP network are not necessarily distinct from those of the ATM network. For example, the computers in both networks could be the same. Moreover, the IP network may be implemented entirely on top of the ATM network, in which case the ATM layer acts as a link layer for IP traffic.
The software component of the architecture consists of a set of software modules that run on each computer. In the preferred embodiment, each such set of modules is termed a group of controllers ("GOC"), and each module is termed a controller. In the case of a control computer, additional groups of controllers run, one for each ATM switch controlled by the computer. A group of controllers on a computer controlling an ATM switch will be designated S-GOC (FIG 10) and a group of controllers associated with a computer will be designated C-GOC (FIG 9). In the event that a computer controls multiple ATM switches, there will be as many S-GOCs running on the computer. Both the C-GOC and the S-GOC contain the following controllers:
• Neighbor discovery controller (NDC)
• State distribution controller
• Event channel controller
• Routing controller
• Database controllers
The database controller contains repositories and APIs for storing and accessing a networking capacity graph and a set of routes. In addition to the above controllers, a C-GOC contains a network interface card controller and a S-GOC contains an ATM switch controller. Two controllers that are part of the same group of controllers are said to be local with respect to each other.
The controllers presented above cooperate with each other to perform the following functions. A neighbor discovery controller discovers the neighbors of either the computer it resides on or the ATM switch it controls. A state distribution controller distributes the discoveries made by its local neighbor discovery controller to other state distribution controllers in the network. In addition it receives information from other state distribution controllers and updates its local networking capacity graph repository. The event channel controllers provide an event channel to aid the state distribution controllers. The routing controller computes routes by reading its local networking capacity graph repository and writing the results to its local route repository. The switch controller is an access point for other entities to setup and tear down connections on the switch, and reserve bandwidth and VPI/VCI space on the switch ports. The network interface card controller is an access point for reserving bandwidth and VPI/VCI space on the NIC.
ARCHITECTURE COMPONENTS
A network interface card controller is a piece of software that provides two functions: First, it controls and manages the set of resources of each of the network interface cards on a computer. This set of resources consists of the input and output VPI/VCI spaces and the output multiplexers. Second, it provides admission control services for these resources.
The network interface card controller provides means to request the creation and tear down of virtual circuit originations and virtual circuit terminations. Each of the above is created by opening a socket. However, the following reservations are made first in order to avoid conflicts and also to guarantee quality of service. A virtual circuit origination requires reservation of a VPI, a VCI and a slot of the specified traffic class in the schedulable region for the specified port (NIC). In the case of the premium class, multiple slots will be reserved. The number of slots is proportional to the requested bandwidth. This applies to all the operations below that involve the premium class, and will not be repeated. A virtual circuit termination requires reservation of a VPI and a VCI for the specified port. All of the tear down operations involve releasing the reserved resources.

A switch controller is a piece of software that controls and manages an ATM switch. It performs tasks related to discovering the configuration of the switch, such as the number of ports and port bandwidths, and also performs connection setup and tear down. In particular, it provides means to request the creation and tear down of virtual circuit segments, virtual path segments, and virtual path originations and terminations. Virtual circuit segments and virtual path segments are created by writing entries in the switching table. However, the following reservations are made first in order to avoid conflicts and also to guarantee quality of service. A virtual circuit segment requires reservation of a VPI and a VCI on both the input and output ports. In addition, it requires the reservation of a slot of the specified traffic class in the schedulable region of the output port. A virtual path segment requires the reservation of a VPI on both the input and output ports. In addition, it requires the reservation of multiple slots of the premium class in the schedulable region of the output port.
A virtual path origination requires reservation of VPI and multiple slots of the premium class in the schedulable region on the specified port. A virtual path termination requires reservation of VPI on the specified port. A multicast tree segment can be built by creating multiple virtual circuit segments.
After bootup, the switch controller polls the ATM switch periodically for a response. A message that does not affect the state of the switch (such as the request for the number of ports on the switch) may be used for polling. At the time of boot up of the switch controller, the switch itself need not be powered up and connected. It could be powered up and connected at a later time. Once a switch is detected by the polling, the switch controller proceeds as follows:
• Clear the switching table of the switch
• Get the identity of the control port
• Get the number of ports, port bandwidths, and VPI/VCI ranges for each port.
• Determine if the switch supports QoS (by executing some QoS-related request).
• If the switch supports QoS, use it to read the schedulable region. Also use it to get any future updates of the schedulable region.
• If the switch does not support QoS, use the trivial approximation to the schedulable region.
Once it completes the above operations, the switch controller continues to poll the switch. It considers the switch to be up and running as long as it receives responses to the polling messages. If it ceases to receive responses, it considers the switch to be down. At this point, it continues to poll, hoping to detect a switch at some future instant.
A neighbor discovery controller (NDC) is responsible for discovering all neighbors of the switch or computer it represents. Two switches, a switch and a computer, or two computers are said to be neighbors if they are connected via a single link or a virtual path. The NDC discovers its neighbors by sending a Hello message on all the ports of the switch or computer it represents, and listening for responses from those ports. Virtual path origination-termination pairs are also treated as ports. The only difference is that physical ports may use any VPI when transmitting data, whereas virtual path origination-termination pairs must use the VPI assigned to them. For simplicity, the following presentation refers only to physical ports, but with the understanding that virtual path origination-termination pairs could also be treated as additional ports. The Hello messages are sent over AAL-5 in the preferred embodiment. Each Hello message contains the sender address, sender port number, receiver address and receiver port number.
The 'sender address' refers to the address of the switch or computer that the respective NDC represents. When an NDC does not know the identity of the switch or computer at the other end of a port (the receiver), it sends Hello messages with 'receiver address' and 'receiver port number' set to zero. Upon receiving a Hello message from the remote NDC, the local NDC becomes aware of the remote NDC's address and port number. Addresses are flat in the system described herein.
Two NDCs are said to be neighbors if they represent switches or computers that are neighbors. An NDC transmits a Hello message to all its neighbors periodically. Whenever an NDC receives a Hello message in which the 'receiver address' field is non-zero, it considers that it has discovered ATM connectivity with the neighbor on the port on which it received the Hello message. The address of the neighbor is given by the field 'sender address' in the received Hello message.
Once ATM connectivity with a neighbor has been discovered, the connectivity is assumed to be present until the NDC ceases to receive Hello messages from the neighbor for a significant duration. At that time, the connectivity is assumed to be lost. Hardware-level detection of link failure would improve the detection time.
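The Hello exchange and liveness rule described above can be sketched as follows. The message field names, the dictionary representation, and the timeout constant are hypothetical choices for this sketch:

```python
# Sketch of NDC neighbor discovery: learn a neighbor's address from any Hello,
# declare connectivity only when a Hello names us (non-zero receiver address),
# and age a neighbor out after a silence interval. AAL-5 transport is abstracted.

HELLO_TIMEOUT = 3.0  # the "significant duration" without Hellos (assumed value)

class NeighborDiscovery:
    def __init__(self, my_address):
        self.my_address = my_address
        self.neighbors = {}   # port -> (neighbor_address, time_last_heard)

    def make_hello(self, port):
        nbr = self.neighbors.get(port)
        return {
            "sender_address": self.my_address,
            "sender_port": port,
            # zero when the identity of the far end is not yet known
            "receiver_address": nbr[0] if nbr else 0,
            "receiver_port": 0,
        }

    def on_hello(self, port, msg, now):
        # Record who is at the far end; report True only when connectivity
        # is considered discovered (the Hello carries a non-zero receiver).
        self.neighbors[port] = (msg["sender_address"], now)
        return msg["receiver_address"] != 0

    def live_neighbors(self, now):
        # Connectivity is assumed present until Hellos stop for HELLO_TIMEOUT.
        return {p: a for p, (a, t) in self.neighbors.items()
                if now - t <= HELLO_TIMEOUT}
```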
In order for an NDC that represents a switch to communicate with neighboring NDCs, it must create a set of virtual circuit segments on the switch it represents. These virtual circuit segments will be referred to as meta virtual circuit segments. A pair of such meta virtual circuit segments must be created for each port.
The meta virtual circuit segments between the control port (FIG 5) and every port of an ATM switch are shown in FIG 11. These are set up by the NDC at bootup time by invoking the 'create virtual circuit segment' operation of the switch controller. Corresponding to each port p, two virtual circuit segments are created as follows: the first virtual circuit segment has input port p, input VPI 0, input VCI routing_vci, and output port c, output VPI 0, output VCI vci(p), where c is the identifier of the control port and vci(p) is the meta VCI assigned to port p. The value of vci(p) must be distinct for distinct values of p. The routing_vci is some constant VCI, and the value chosen for routing_vci must be the same at every switch in the entire network. In FIG 11, routing_vci = 50. The second virtual circuit segment is set up in the opposite direction to the first one. Note that the case p = c is not excluded.

NDCs representing a computer communicate with their neighbors by sending and receiving Hello messages on VPI 0, VCI routing_vci. The neighbor could be another computer (FIG 12) or an ATM switch (FIG 13). NDCs representing ATM switches communicate with their neighbors as illustrated in FIG 14: suppose that the VCI assigned to port p is vci(p), where vci(p) takes distinct values for distinct ports p. The function vci(.) can be different at different switches. Then, in order to communicate with a neighbor attached to port p, the NDC sends and receives messages on VPI 0, VCI vci(p). In FIG 14, p = 6 and vci(p) = 106 for switch A, and p = 3, vci(p) = 103 for switch B. Since vci(p) takes distinct values for distinct values of p, when the NDC receives a message on a particular VCI (e.g., 106), it knows the port of the switch on which the message was received (port 6 in the example). Communication between NDCs representing ATM switches interconnected via a virtual path is very similar; it is illustrated in FIG 15.
In the preferred embodiment, all communication between neighboring controllers, except the Hello messages, is achieved by running IP over the meta virtual circuits. The Hello messages use AAL5 over the meta virtual circuits.
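A minimal sketch of the per-port meta virtual circuit segment layout described above. The representation of a segment endpoint as a (port, VPI, VCI) tuple and the particular vci(p) = 100 + p assignment are illustrative assumptions (the disclosure only requires vci(p) to be distinct per port):

```python
# Meta virtual circuit segments: for every port p, one segment from
# (p, VPI 0, routing_vci) to the control port on meta VCI vci(p), and one
# segment in the opposite direction.

ROUTING_VCI = 50          # must be the same constant at every switch

def meta_vci(port):
    # One possible distinct-per-port assignment (FIG 14 shows values like 103, 106)
    return 100 + port

def meta_segments(ports, control_port):
    """Return the pair of meta VC segments for each port, as (input, output)
    endpoint tuples of the form (port, vpi, vci)."""
    segments = []
    for p in ports:
        inbound = ((p, 0, ROUTING_VCI), (control_port, 0, meta_vci(p)))
        outbound = ((control_port, 0, meta_vci(p)), (p, 0, ROUTING_VCI))
        segments.append((inbound, outbound))
    return segments
```

Because each meta VCI is distinct, a message arriving at the control port on VCI 106 immediately identifies port 6 as its origin, which is exactly the property the NDC relies on.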
The state distribution controllers coordinate in the following way with their neighbors to ensure that all networking capacity graph repositories contain the same information. Whenever a topology change occurs, the neighbor discovery controllers associated with the nodes in question discover the change and inform their local state distribution controller. In the case where a node is discovered, the state distribution controller first synchronizes its local NCG repository with that of the discovered neighbor. Then, the state distribution controller distributes all changes made to its NCG repository by pushing the changes to the local event channel controller. This second step is carried out even if the topology change was due to a loss of neighbor connectivity.
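The synchronization step described above might, under the assumption that NCG repository entries carry version numbers (an assumption not made explicit in the text), be sketched as:

```python
# Hypothetical NCG repository merge: each entry is key -> (version, value).
# Keeping the newer version of each entry makes the merge order-insensitive,
# so two repositories converge regardless of which side initiates the sync.

def synchronize(local, remote):
    """Merge the remote NCG repository into the local one.
    Returns the keys that changed locally, which would then be pushed
    to the local event channel controller for distribution."""
    changed = []
    for key, (version, value) in remote.items():
        if key not in local or local[key][0] < version:
            local[key] = (version, value)
            changed.append(key)
    return changed
```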
In addition to maintaining a consistent topology database, the state distribution controllers also coordinate to maintain a consistent view of the link states in their NCG controllers. The state distribution controllers periodically poll their local switch controller or NIC controller to read the schedulable regions, operating points, and other link-state parameters of all the links attached to their node (switch or computer). Any significant change (the threshold of which is determined by configuration) is propagated to all other state distribution controllers via an event that is pushed on the event channel.

An event channel provides a model of communication whereby senders and receivers do not need to be aware of each other. In the preferred embodiment, the event channel is implemented by a set of event channel controllers, one such controller executing on each computer. When a software entity sends a message to an event channel controller along with an event type, all receivers registered with that event channel controller under that event type will receive the message. In addition, for certain event types, all receivers registered with any event channel controller in the network will receive the message. Events of each type are given a sequence number.

In the preferred embodiment, the event channel is realized as follows: when a sender sends an event (using the term here interchangeably with message) to an event channel controller, the latter passes on the event to (i) all listeners registered with that specific event channel controller, and (ii) the event channel controllers at all neighboring nodes. Each of the event channel controllers at neighboring nodes in turn forwards the event to its neighbors. This process continues until all nodes have received the message. The general technique just described is known as flooding.
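A minimal sketch of the flooding technique just described, using the per-event sequence numbers mentioned above to suppress duplicates so the flood terminates. The controller wiring and in-process method calls are a hypothetical stand-in for the real inter-node transport:

```python
# Flooding over event channel controllers: an event identified by a
# (source, sequence number) pair is delivered to local listeners and
# forwarded to neighbors exactly once per node.

class EventChannelController:
    def __init__(self, name):
        self.name = name
        self.neighbors = []       # neighboring EventChannelControllers
        self.seen = set()         # (source, seqno) pairs already processed
        self.listeners = []       # locally registered callbacks

    def push(self, source, seqno, payload):
        key = (source, seqno)
        if key in self.seen:      # duplicate arriving over another path: drop
            return
        self.seen.add(key)
        for listener in self.listeners:
            listener(payload)
        for neighbor in self.neighbors:
            neighbor.push(source, seqno, payload)
```

In a cyclic topology the same event reaches a node over several paths; the `seen` set is what guarantees each listener is invoked once and the flood stops.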
When each event channel controller receives this event, it passes the event on to the listeners registered with it.

The routing controller executes a routing algorithm to compute routes by reading the networking capacity graph from the local NCG repository. It is triggered by an event that is pushed by the NCG repository when the networking capacity graph changes. The routing algorithm computes a set of routes for each of the configured traffic classes and writes them into the local route repository. Since the route repository, the NCG repository, and the event channel have well-defined interfaces, any routing algorithm from a large class of routing algorithms can be plugged into the framework. In particular, any routing algorithm applicable to multirate circuit switching can be applied. Moreover, the algorithm can be changed dynamically while the system is running; in the preferred embodiment, the route computation algorithm is changed by loading a new dynamic library on the fly.

The networking capacity graph repository provides APIs for reading and writing aspects of the networking capacity graph. The write operations allow for adding and removing nodes and links of the networking capacity graph. They also allow for writing the schedulable region and statistics of the operating point for each link in the networking capacity graph. All write operations trigger an event to be pushed to the event channel controller, so that all interested parties are alerted to the change in the networking capacity graph. The read operations allow for reading the set of nodes, the set of links attached to a node, and the schedulable region and operating point statistics of a link.

The route repository provides interfaces for reading and writing routes. It also provides an interface for clearing all routes. Routes are categorized based on source-destination pair and traffic class. Multiple routes may exist for each category.
The organization of these routes may take one of many forms. For example, the routes may be ordered according to some criterion such as path length. The criterion would be determined by the route computation algorithm. Alternatively, the paths could be organized as a probability distribution, with each path assigned a probability by the route computation algorithm.
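One possible reading of the route repository interfaces described above, with routes keyed by source-destination pair and traffic class and organized as a probability distribution over paths. All names and the sampling behavior of `read` are illustrative assumptions; the ordered-by-path-length organization would simply replace the sampling with a sorted lookup:

```python
import random

# Sketch of a route repository: write, read, and clear interfaces, with
# multiple weighted routes per (source, destination, traffic class) category.

class RouteRepository:
    def __init__(self):
        self.routes = {}   # (src, dst, traffic_class) -> [(path, probability)]

    def write(self, src, dst, traffic_class, weighted_paths):
        """Install the routes computed for one category by the routing algorithm."""
        self.routes[(src, dst, traffic_class)] = weighted_paths

    def read(self, src, dst, traffic_class, rng=random):
        """Pick a route for the category, sampling by the assigned probabilities.
        Returns None when no route exists for the category."""
        entries = self.routes.get((src, dst, traffic_class), [])
        if not entries:
            return None
        paths, weights = zip(*entries)
        return rng.choices(paths, weights=weights, k=1)[0]

    def clear(self):
        """Clear all routes, as provided for by the interface described above."""
        self.routes.clear()
```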
In the preferred embodiment, all controllers belonging to a given GOC are placed in the same process on the same computer. This minimizes the communication overhead between these controllers, which were grouped together because they perform tightly coupled functions. It is possible to split the controllers belonging to a single GOC and execute them on different computers, or on the same computer but in different processes. It is generally more efficient for the neighbor discovery controller of an S-GOC to be located on the control computer of the switch that the S-GOC represents, and for the neighbor discovery controller of a C-GOC to be located on the computer it represents. The reason is that the placement of the NDCs is important to their function, which is to detect their neighbors. If a switch controller is located on a computer other than the control computer, it would require some means of communicating with the ATM switch, either through a proxy on the control computer or via virtual circuits; the latter solution could make the topology discovery mechanism complicated. Similarly, if a NIC controller is located on a computer other than the one it controls, it would need some mechanism to communicate with that computer.
APPLICATIONS OF THE INVENTION
The following is a list of applications of the present invention. It is illustrative only and does not limit the use of the present invention in other ways.

Connection Setup
The present invention is suitable for use by connection controllers. A connection controller is a software entity that accepts requests for ATM connections between specified source and destination nodes with quality of service. Given a request, the connection controller can read the route repository for a route between the source and destination with the specified quality of service. If there is a route between the specified nodes satisfying the requested quality of service, the route repository will return such a route. The connection controller can then use that route to identify the switches along the route, and communicate with those switches to set up virtual circuit segments on them. As a result, a virtual circuit between the source and destination nodes emerges. The means by which the connection controller communicates with the switches is not relevant here.

Connection Re-routing
The present invention is suitable for handling connection re-routing. Consider the case where an existing connection is abruptly lost due to an ATM switch failure or an ATM link failure. Such failures will be detected by one or more neighbor discovery controllers and distributed to all nodes via the event channel. Any software entity desiring to receive an alert message indicating the change of topology may register with the NCG event channel in order to receive such an alert message. Upon receiving an alert message, the software entity (e.g., a connection controller) can compare the old route of the call with the new route in the route repository and decide which virtual circuit segments of the old call need to be torn down and which virtual circuit segments need to be newly created. Upon carrying out these actions, the call will have been re-routed around the failed ATM switches and/or links, and the users of the call will resume reception of ATM cells from each other. Similar action can be taken to reroute affected virtual paths. In case there is no route between the source and destination nodes after the failure, re-routing is not possible.

Garbage Collection
The scenario of the preceding section, where no route is found between the source and destination after an ATM switch or link failure, would result in virtual circuits passing through the failed nodes becoming fragmented. The virtual circuit segments that belong to these fragments ("garbage") need to be torn down ("garbage collected"). Upon being notified of the failure (topology change), a software entity, such as the one that created the connection originally, can proceed to garbage collect the dangling virtual circuit segments of the connection. A similar procedure can be used to garbage collect fragmented virtual paths.

Network Management
The present invention is suitable for visualizing the networking capacity graph of an ATM network. The management system can register itself with the event channel controller to receive updates, just like an NCG controller. Such updates can be used to visualize the networking capacity graph in real time on a graphical user interface (GUI).
The present invention is also suitable for a collection of management agents to monitor the health of their neighbors. In this way, if one or more management agents crash or otherwise cease to be operational, neighboring agents can detect the failure. The present invention aids this monitoring by providing the agents with a list of neighboring nodes.
Other embodiments of the invention will be apparent to those skilled in the art from a consideration of the specification or practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
What is claimed is:

Claims

1. A communications network comprising: a first node and a second node interconnected through a network; each of said nodes comprising a computer; each of said computers executing a controller; each controller having the ability to independently and simultaneously create virtual circuit segments.
2. The communications network of claim 1, wherein said controllers are adapted to perform one or more functions from the group consisting of neighbor discovery, state distribution, event channel, routing, and database functions.
3. The communications network of claim 1, wherein each controller automatically discovers the networking capacity graph of the network.
4. The communications network of claim 3, wherein each controller contains a networking capacity graph repository.
5. The communications network of claim 2, wherein each controller routing function provides an interface for loading an algorithm for computing routes.
6. The communications network of claim 5, wherein said loading may be from a dynamic library.
7. The communications network of claim 5, wherein the routing algorithm may be a multi-rate circuit switching algorithm.
8. The communications network of claim 5, wherein said routing algorithm uses schedulable regions to represent the capacity of a multiplexer.
9. The communications network of claim 5, wherein each controller contains a route repository.
10. The communications network of claim 8, wherein the statistics of the operating point of the schedulable region is used as link state advertisements for registered software entities seeking alerts regarding changes to the network state.
11. The communications network of claim 3, wherein remote objects seeking call admission may compute a set of routes based upon said networking capacity graph.
12. The communications network of claim 1, wherein each controller continually monitors the network for discovery and forwarding of information relating to any failures or recovery of network links, switches or computers.
13. A method of discovery and distribution of a networking capacity graph as a complete description of a network comprising at least two computers, said method comprising the steps of: at each of the at least two computers, discovering links to adjacent computers on the network; at each of the at least two computers, querying the discovered connections to obtain remote connection information; at each of the at least two computers, discovering the networking capacity graph of the network; at each of the at least two computers, independently and simultaneously computing routes by applying possibly different algorithms to the networking capacity graph; creating virtual circuit segments; at each of the at least two computers, continually monitoring and distributing information relating to the state of the computer and adjacent computers.
14. The method of claim 13, wherein any of the steps of discovering connections, obtaining remote connection information, discovering the networking capacity graph, computing routes, creating virtual circuit segments, or monitoring and distributing state information is executed by a controller.
15. The method of claim 14, wherein each controller discovering the networking capacity graph may contain and write to a networking capacity graph repository.
16. The method of claim 14, wherein each controller computing a route may load an algorithm for computing routes.
17. The method of claim 14, wherein each controller computing a route may dynamically load an algorithm.
18. The method of claim 14, wherein each controller computing a route may load a multi-rate circuit switching algorithm.
19. The method of claim 14, wherein each controller computing a route may contain and write to a route repository.
20. The method of claim 14, wherein each controller communicating with registered software entities seeking alerts regarding changes to the network state uses the statistics of the operating point of the schedulable region as link state advertisements.
21. The method of claim 14, wherein remote objects seeking call admission may compute a set of routes based upon the networking capacity graph.
22. The method of claim 14, wherein each controller monitoring and distributing state information discovers and communicates information relating to any failures or recovery of network links, switches or computers.
23. The communications network of claim 1, wherein said nodes comprise switches connected on their control ports to a computer.
24. The communications network of claim 1, wherein said communications network comprises a virtual network.
25. The method of claim 13, wherein the network comprises a virtual network.
26. The method of claim 25, wherein discovering links comprises discovering logical connections on the virtual network.
PCT/US2000/027663 1999-10-06 2000-10-06 A dynamic programmable routing architecture with quality of service support WO2001026308A2 (en)

Priority Applications (1)
AU78677/00A AU7867700A (en) 1999-10-06 2000-10-06 A dynamic programmable routing architecture with quality of service support

Applications Claiming Priority (2)
US15796599P (1999-10-06)
US60/157,965 (1999-10-06)

Publications (3)
WO2001026308A2 (2001-04-12)
WO2001026308A3 (2001-11-08)
WO2001026308A9 (2001-12-06)




