LOAD BALANCE AND FAULT TOLERANCE IN A NETWORK SYSTEM
Cross Reference To Related Applications
This application claims priority to U.S. Provisional Application No. 60/095,652, filed August 7, 1998.
Background of the Invention
The present invention relates to load balancing and fault tolerance amongst computer servers functioning to track Internet/Intranet transactions. In particular, the present invention relates to a system of load balancing and fault tolerance utilizing a lightweight algorithm and continually cycling processes to reduce exchange of server state information.
A mechanism is described to achieve both load balancing and fault tolerance. The backup systems provide load balancing services while active. When a system fails, the remaining available systems take over the failed system's load. A master system determines which participating system owns a decision track. Ownership of a decision track indicates responsibility for executing a contact gathering process and an event evaluation process. Step evaluation processes are distributed among available systems within the same peer group.
By way of example, access to distributed networks such as the internet has increased greatly in recent years and has challenged commerce to use the internet advantageously. Thousands of internet and intranet, hereinafter Inet, sites have been added to networks. A great expenditure of time and effort has been invested in creating a myriad of resources available to Inet browsers. As a means to benefit from the Inet forum, it is useful to have tools to interact with those browsing the Inet, such as being able to track parties contacting a particular Inet-site. It is important that these tools be reliable and responsive, as an Inet contact may be the first and possibly only type of contact made with an individual. The creation of virtual worlds online has further increased the importance of reliability and responsiveness. Purveyors of the Inet desire interactions that further emulate a real life commercial experience. Virtual storefront owners, corporate home pages, online catalogue vendors, and a myriad of other Inet-site owners find it useful to be able to emulate the real life experience. As the complexity of an Inet interaction increases, the expectations of an individual making contact via the Inet also increase. Contact requires a fast, reliable response.
A traditional method of increasing transaction speed is to increase the speed of the processor units running the application. Processors, however, have limits on the maximum throughput available. Increasing demands cannot always be met by a faster processing box. Another method of increasing transaction speed is through shared processing amongst a plurality of processing units. Typically, however, this has involved very complicated hardware and software solutions requiring a sizable investment in man hours and expense. Often these types of solutions are not warranted for a dedicated Inet application.
To be effective, a system needs to be right sized to the task at hand. Consequently, there remains a need for a simple, cost effective means of sharing a processing load and also providing fault tolerance.
Summary of the Invention
Accordingly, this invention provides a method of load balancing and fault tolerance amongst a plurality of servers on a computer network, such as the internet or a private network. In a preferred embodiment of the invention, a programmed computer server divides processing work into tracks of work that may be referred to as decision tracks. Each decision track comprises a series of conditions that are to be tested by records of a database. As conditions of the decision track are tested and met, an appropriate action is taken in response to the condition met. In addition, actions can be sequenced so as to achieve a desired result, as illustrated in Table 1 below.
Table 1. Decision Track
Condition 1    If Condition is Met    Then Action 1
Condition 2    If Condition is Met    Then Action 2
Condition 3    If Condition is Met    Then Action 3
Condition 4    If Condition is Met    Then Action 4
Condition 5    If Condition is Met    Then Action 5
Decision tracks are constructed so that they can be claimed by a computer server. A plurality of computer servers is arranged into a peer group networked together. The network enables the computer servers to communicate with each other. Servers coordinate within the peer group to claim individual decision tracks. Thereafter, the server owning a decision track processes initial work pertaining to that track and also allocates blocks of work from that decision track to other servers on the network that have advertised for work. In the event a server ceases to communicate with the other servers in the peer group, all uncompleted work assigned to the non-communicating server is reallocated to another server.
Peer groups also elect a master server. The presence of a master server signifies that a group is functioning and able to handle common tasks. In addition to normal peer group server functions, a master server handles tasks pertaining to the peer group as a whole, such as e-mail. The master is typically elected by a simple device, such as selecting the server with the lowest machine number.
For the purposes of this disclosure an Internet can refer to, for example, a network comprising computers exceeding the boundaries of a private network. An Intranet can refer to, for example, computers within a private network. An Inet can refer to an Internet and/or an Intranet adhering to an internet protocol or similar protocol. An Inet-site is, for example, a site available on either an Internet or an Intranet. A network, for example, can have a computer acting as a server and a computer acting as a client. A contact can, for example, be an access to an electronic interface such as a web site, or other contents of a stored memory such as a hard drive or dynamic random access memory of a server. A client can be a person, a node operator, or broadly, a machine or electronic device making such contact, or causing a node of a network to make such a contact. Real time is meant to be read broadly to signify on a basis timely to or in relation to an individual event.
Other advantages and features of the present invention will become apparent from the following description, including the drawings and the claims.
Brief Description of the Drawing
FIG. 1 illustrates a typical configuration supporting this invention.
FIG. 2 illustrates the query process of a decision track.
FIG. 3 illustrates a load balancing sequence.
Description of the Preferred Embodiments
According to the present invention, an apparatus, method, and system are described for load balancing and fault tolerance comprising a plurality of computer servers 110 networked together into a peer group 120 and also networked to a database server 130. The network provides a means of communication between servers. Work is divided into tracks 135 and distributed amongst the peer group servers according to the availability of each server to accommodate additional work. Utilizing a multitude of servers to process work lessens the work required of any single server and speeds the response of the system. The ability of a peer group to allocate work amongst available servers, and then reallocate work if a particular server should become unavailable, provides fault tolerance.
Servers periodically notify other peer group servers of their presence on a network by way of a well-known device such as a "hello" message or an "advertisement." Such advertisements are performed on a periodic cycle. A preferred periodic cycle is about 15 seconds. However, periodic cycles may be any length that is appropriate based on network characteristics, such as the number of nodes, the speed of communication, and the speed of the processor units. Generally, any periodic cycle between 5 seconds and 120 seconds is acceptable.
If an advertisement is not received from a server for a predetermined number of periodic cycles, for example 4 cycles of 15 seconds each, the other peer group servers will consider the mute server unavailable. Any work previously allocated to a server subsequently determined to be unavailable is reallocated amongst the available servers.
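The availability test described above can be sketched as follows. This is a minimal illustration only, assuming each server keeps a mapping from server ID to the timestamp of the last advertisement it received; the function name, the mapping, and the use of a strict "older than" comparison are assumptions made for the example and are not specified in this disclosure.

```python
# Values taken from the examples in the text; both are configurable.
ADVERT_PERIOD_S = 15.0   # preferred periodic cycle, in seconds
MISSED_CYCLES = 4        # example number of silent cycles before a server is "mute"


def unavailable_servers(last_seen, now, period=ADVERT_PERIOD_S, missed=MISSED_CYCLES):
    """Return the IDs of servers whose last advertisement is older than
    period * missed seconds (hypothetical helper, strict comparison assumed)."""
    timeout = period * missed
    return sorted(sid for sid, seen in last_seen.items() if now - seen > timeout)


# Hypothetical sample data: server ID -> time of last advertisement (seconds).
last_seen = {1: 100.0, 2: 130.0, 3: 45.0}
down = unavailable_servers(last_seen, now=160.0)
```

With the sample data, only server 3 has been silent for more than 60 seconds, so only its work would be reallocated.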
In a preferred embodiment of the present invention, work is structured so that it may be executed by decision tracks. Each decision track comprises a series of queries to be made against records of a database. If the conditions of a query are met 210, then an appropriate action may be taken; if the conditions are not met, then a next record, or a next set of conditions, is queried.
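One way to picture a decision track is as an ordered list of condition and action pairs applied to each record. The sketch below is illustrative only: the Step class, the run_track function, the record fields, and the sample conditions are all hypothetical and are not taken from this disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Step:
    """One row of a decision track: a condition and the action it triggers."""
    name: str
    condition: Callable[[Record], bool]
    action: Callable[[Record], None]


def run_track(steps: List[Step], records: List[Record]) -> List[str]:
    """Test every record against every step in order; when a condition is
    met, take the associated action, otherwise move on to the next one."""
    fired = []
    for record in records:
        for step in steps:
            if step.condition(record):
                step.action(record)
                fired.append(f"{record['id']}:{step.name}")
    return fired


# Hypothetical two-step track over hypothetical contact records.
steps = [
    Step("welcome", lambda r: r["visits"] == 1, lambda r: r.update(greeted=True)),
    Step("follow_up", lambda r: r["visits"] >= 3, lambda r: r.update(flagged=True)),
]
records = [{"id": "a", "visits": 1}, {"id": "b", "visits": 5}, {"id": "c", "visits": 2}]
fired = run_track(steps, records)
```

In this sample, record "a" meets the first condition, record "b" meets the second, and record "c" meets neither, so no action is taken on it.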
During operation, a peer group computer server 110 will claim ownership of one or more decision tracks 135. After determination of ownership 310, a computer server 110 performs initial work such as, for example, contact gathering 320. Contact gathering comprises creation of a set of contact records 145 that are to be put on a particular step of the decision track 135. After the contact gathering is complete, blocks of work comprising steps are created and can either be distributed to other computer servers 110 in the peer group 330 or performed by the owning server. A block of work may consist of, by way of example, a set of contact records ready for the next step of a decision track to be performed on them, or a list of steps to be executed on a particular record. After the distribution, a server evaluates events 340 for any changes in conditions and cycles through the process again.
Distribution of work is effectuated by a response to advertisements or requests for work sent out by various computer servers 110 included in a peer group 120. When a server is capable of accepting additional work, it will send an advertisement to the other servers in the peer group requesting work, such as, for example, a step list block 350. The requesting computer server then executes the indicated steps 360. An owner computer server 110 that receives such an advertisement may send a block of work to the advertising computer server 110 to be processed. In this manner there is a continual load sharing of available work.
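The request-driven distribution described above might be modeled as a queue held by the owning server. The following is a sketch under stated assumptions: the TrackOwner class, its method names, and the string block identifiers are hypothetical, and message transport between servers is omitted entirely.

```python
from collections import deque


class TrackOwner:
    """Hypothetical owner-side bookkeeping for blocks of work on one track."""

    def __init__(self, blocks):
        self.pending = deque(blocks)   # blocks not yet handed out
        self.assigned = {}             # block -> ID of the server working on it

    def on_work_request(self, server_id):
        """Hand the next pending block to a server that advertised for work."""
        if not self.pending:
            return None
        block = self.pending.popleft()
        self.assigned[block] = server_id
        return block

    def on_block_done(self, block):
        """Forget a block once the worker reports it complete."""
        self.assigned.pop(block, None)

    def on_server_unavailable(self, server_id):
        """Requeue uncompleted blocks held by a server that stopped advertising."""
        lost = [b for b, s in self.assigned.items() if s == server_id]
        for block in lost:
            del self.assigned[block]
            self.pending.append(block)
        return lost


owner = TrackOwner(["block-1", "block-2", "block-3"])
first = owner.on_work_request(server_id=2)
second = owner.on_work_request(server_id=3)
owner.on_block_done(first)
lost = owner.on_server_unavailable(server_id=3)
```

Because work is only sent in response to a request, a busy server simply stops asking, and no exchange of detailed load state is needed in this sketch.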
In one preferred embodiment, decision track ownership is claimed by attaching a claim counter to an advertisement broadcast by a server. A server will claim a decision track and set a counter to a predetermined interval, for example two. Each time the server broadcasts an advertisement, the counter is decremented by one. When the counter reaches zero, the decision track is authoritatively owned by the claiming server. Other peer group servers may challenge the claim for a decision track by claiming it for themselves during the counter interval.
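The claim counter can be sketched as a small state holder attached to each outgoing advertisement. The Claim class and its method names below are hypothetical; the initial counter value of two follows the example in the text, and treating a challenged claim as "not owned, pending arbitration" is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """Hypothetical state of one server's claim on one decision track."""
    track_id: str
    claimant: int
    counter: int = 2          # example predetermined interval from the text
    challenged: bool = False

    def on_advertisement_sent(self) -> None:
        """Decrement the counter each time the claimant advertises."""
        if self.counter > 0:
            self.counter -= 1

    def on_rival_claim(self) -> None:
        """Record a competing claim, but only while the interval is open."""
        if self.counter > 0:
            self.challenged = True

    @property
    def owned(self) -> bool:
        """Ownership is authoritative once the counter hits zero unchallenged."""
        return self.counter == 0 and not self.challenged


claim = Claim("track-A", claimant=7)
claim.on_advertisement_sent()
pending = claim.owned          # counter is 1, so not yet owned
claim.on_advertisement_sent()  # counter reaches 0
```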
If two or more servers claim ownership of the same decision track, ownership election reverts to an arbitration routine. Arbitration determines ownership by a simple criterion, such as awarding the track to the server with the fewest owned tracks. Where two or more servers have an equal number of tracks, ownership is awarded to the server with the lowest machine ID.
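The arbitration rule above reduces to a two-part ordering: fewest owned tracks first, then lowest machine ID. A minimal sketch, in which the arbitrate function and the owned_counts mapping are hypothetical names, and servers absent from the mapping are assumed to own no tracks:

```python
def arbitrate(claimants, owned_counts):
    """Pick the claimant with the fewest owned tracks, breaking ties by
    lowest machine ID (hypothetical helper)."""
    return min(claimants, key=lambda sid: (owned_counts.get(sid, 0), sid))


# Servers 4 and 9 each own one track; server 2 owns three.
winner = arbitrate([4, 2, 9], {4: 1, 2: 3, 9: 1})
```

Here servers 4 and 9 tie on track count, so the lower machine ID, 4, is awarded the track. Since every server applies the same deterministic rule to the same advertised data, each can reach the same result without further negotiation.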
In a preferred embodiment, each server 110 maintains a table 155 that stores the time of the most recent advertisement from each server and the decision tracks owned by each server. Each server queries the table to test whether a predetermined period has elapsed without notification from any of the peer group servers. If a predetermined period has elapsed without notification from a particular server, the non-communicating server is deemed to be unavailable. All decision tracks owned by unavailable servers are reallocated to the remaining servers. Reallocation is accomplished in much the same manner as initial election. A server will advertise a claim of ownership of a decision track belonging to a server determined to be unavailable. If the advertisement is not challenged within a predetermined number of advertisement cycles, ownership is awarded to the advertising server.
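A table of this kind might look like the following sketch, which finds the decision tracks that have lost their owner and are therefore open to new claims. The PeerEntry class, the orphaned_tracks function, and the sample values are hypothetical and are used only to illustrate the lookup.

```python
from dataclasses import dataclass, field


@dataclass
class PeerEntry:
    """Hypothetical row of table 155: last advertisement time and owned tracks."""
    last_advert: float
    tracks: set = field(default_factory=set)


def orphaned_tracks(table, now, timeout):
    """Collect tracks owned by servers that have been silent for longer
    than timeout seconds; these become candidates for reallocation."""
    orphans = set()
    for entry in table.values():
        if now - entry.last_advert > timeout:
            orphans |= entry.tracks
    return sorted(orphans)


# Hypothetical table: server 2 last advertised 80 seconds ago.
table = {
    1: PeerEntry(200.0, {"A", "B"}),
    2: PeerEntry(120.0, {"C"}),
    3: PeerEntry(195.0),
}
orphans = orphaned_tracks(table, now=200.0, timeout=60.0)
```

A remaining server would then advertise a claim on each orphaned track using the same counter-based procedure as in the initial election.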
The allocation and reallocation process acts as fault tolerance. A decision track 135 will not be without an owner for more than the predetermined period. After the predetermined period has elapsed another server 110 takes ownership and the work of the decommissioned server commences again. Each peer group server 110 includes a copy of each decision track 135 as well as the table recording ownership of the various decision tracks. As a server 110 begins functioning as an owner, it records ownership in the table, and commences to perform the work allocated to the owner of that decision track.
A database server 130 stores the contact data records 145 referenced in the various blocks of work performed by peer group servers executing decision tracks. Typically, there is only one database server 130 from which all records are processed. In this manner all peer group servers have access to the same data.
A peer group 120 will also elect a master server 140. The advertised presence of a master server declares that network connectivity exists, that the peer group is communicating properly, and that operations may commence. Elections for a master server 140 are based on a simple criterion such as the lowest ID of the servers involved. In a periodic cycle the master will broadcast an "advertisement" or hello message, declaring its presence to other servers in the peer group. One preferred embodiment of a periodic cycle is 15 seconds. Another preferred embodiment of a periodic cycle is between 5 seconds and 120 seconds. The duration of the periodic cycle will depend on the speed of the network and the processing power of the servers.
If the presence of a master server 140 has not been detected by a peer group 120, through receipt of an advertisement from a master server 140, for a period of some number of periodic cycles, for example 4 cycles, the peer group elects a new master server. A period may comprise more or fewer cycles depending on the criticality of the timing for the work being performed and the processing power of the servers.
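Master election and re-election can be sketched in a few lines. The function names and parameters below are hypothetical; the 15 second cycle and 4 cycle threshold follow the examples in the text, and excluding the silent master from the new election is an assumption of this sketch.

```python
def elect_master(live_ids):
    """Elect the server with the lowest ID, or None if no server is live."""
    return min(live_ids) if live_ids else None


def current_master(master_id, last_master_advert, now, live_ids,
                   period=15.0, missed=4):
    """Keep the current master while its advertisements are fresh;
    otherwise elect a new one from the remaining live servers."""
    if master_id is not None and now - last_master_advert <= period * missed:
        return master_id
    # Assumption: a master that has gone silent is not a candidate.
    return elect_master([sid for sid in live_ids if sid != master_id])


still_master = current_master(1, last_master_advert=100.0, now=130.0, live_ids=[1, 2, 3])
new_master = current_master(1, last_master_advert=100.0, now=200.0, live_ids=[1, 2, 3])
```

In the second call the master has been silent for 100 seconds, which exceeds the 60 second window, so the lowest remaining ID takes over.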
Decision tracks 135 and the criteria for each step of a decision track 135 can be created and manipulated via a user interface 165. In a preferred embodiment, a graphical representation is created for each step of a decision making process, correlating to each step of a decision track. The graphical representation can facilitate accurate processing of data and ease of use. Another method for creating decision tracks would include a written language statement defining criteria for each decision.
In a preferred embodiment of this invention, a software program on a computer readable medium is loaded on a plurality of servers. The software program comprises a front-end application that allows users to access a variety of the features designed to load balance and provide fault tolerance. Features are grouped into different categories according to the type of users. A security scheme allows user access to a feature according to category.
An administrator can be responsible for secure configuration and maintenance of the decision track software. The administrator can configure databases and external access methods and define access rights of various users. The administrator is also responsible for defining the synchronization relationships with other servers.
Decision tracks 135 may also define a series of actions to take based on different trigger events. Trigger events may be time-based single events, time-based recurring events, or external input and query result events. Queries may be directed to a database. In addition, queries may be run against external databases accessible through Structured Query Language (SQL). Conditionals control the transition of individual query results to the next state in the decision track.
The methods and mechanisms described here are not limited to any particular hardware or software configuration, or to any particular communications modality, but rather they may find applicability in any communications or computer network environment. In a preferred embodiment of this invention, a software program comprising computer readable code on a computer readable medium is loaded onto a plurality of servers.
The software program additionally comprises a front-end application that allows users to access a variety of the features designed to automate load sharing and fault tolerance.
The techniques described here may be implemented in hardware or software, or a combination of the two. Preferably, the techniques are implemented in computer programs executing on one or more programmable computers that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), and suitable input and output devices. The programmable computers may be either general-purpose computers or special-purpose, embedded systems. In either case, program code is applied to data entered with or received from an input device to perform the functions described and to generate output information. The output information is applied to one or more output devices.
Each program is preferably implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.
Each such computer program is preferably stored on a storage medium or device (e.g., CD-ROM, hard disk, magnetic diskette, or memory chip) that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described. The system also may be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner.
The invention described has broad application to a wide range of electronic interaction environments and a number of embodiments based upon the principles disclosed are possible.
What is claimed is: