WO2002037799A2 - Controlled server loading - Google Patents
- Publication number
- WO2002037799A2 (PCT/US2001/047013)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- server
- dispatcher
- requests
- connections
- connection
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/566—Grouping or aggregating service requests, e.g. for unified processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
- H04L69/161—Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/30—Definitions, standards or architectural aspects of layered protocol stacks
- H04L69/32—Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
- H04L69/322—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
- H04L69/329—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/40—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass for recovering from a failure of a protocol instance or entity, e.g. service redundancy protocols, protocol state redundancy or protocol service redirection
Definitions
- the present invention relates generally to controlled loading of servers, including standalone and cluster-based Web servers, to thereby increase server performance. More particularly, the invention relates to methods for controlling the amount of data processed concurrently by such servers, including the number of connections supported, as well as to servers and server software embodying such methods.
- a variety of Web servers are known in the art for serving the needs of the over 100 million Internet users. Most of these Web servers provide an upper bound on the number of concurrent connections they support. For instance, a particular Web server may support a maximum of 256 concurrent connections. Thus, if such a server is supporting 255 concurrent connections when a new connection request is received, the new request will typically be granted. Furthermore, most servers attempt to process all data requests received over such connections (or as many as possible) simultaneously. In the case of HTTP/1.0 connections, where only one data request is associated with each connection, a server supporting a maximum of 256 concurrent connections may attempt to process as many as 256 data requests simultaneously. In the case of HTTP/1.1 connections, where multiple data requests per connection are permitted, such a server may attempt to process in excess of 256 data requests concurrently.
- each server in the pool (also referred to as a back-end server) typically supports some maximum number of concurrent connections, which may be the same as or different than the maximum number of connections supported by other servers in the pool. Thus, each back-end server may continue to establish additional connections (with the dispatcher or with clients directly, depending on the implementation) upon request until its maximum number of connections is reached.
- the operating performance of a server at any given time is a function of, among other things, the amount of data processed concurrently by the server, including the number of connections supported and the number of data requests serviced.
- what is needed is a means for dynamically managing the number of connections supported concurrently by a particular server, and/or the number of data requests processed concurrently, in such a manner as to improve the operating performance of the server.
- a dispatcher is preferably interposed between clients and one or more back-end servers, and preferably monitors the performance of each back-end server (either directly or otherwise).
- For each back-end server, the dispatcher preferably also controls, in response to the monitored performance, either or both of the number of concurrently processed data requests and the number of concurrently supported connections to thereby control the back-end servers' performance.
- the dispatcher uses a packet capture library for capturing packets at OSI layer 2 and implements a simplified TCP/IP protocol in user-space (vs. kernel space) to reduce data copying.
- COTS: commercial off-the-shelf
- a server for providing data to clients includes a dispatcher having a queue for storing requests received from clients, and at least one back-end server.
- the dispatcher stores in the queue one or more of the requests received from clients when the back-end server is unavailable to process the one or more requests.
- the dispatcher retrieves the one or more requests from the queue for forwarding to the back-end server when the back-end server becomes available to process them.
- the dispatcher determines whether the back-end server is available to process the one or more requests by comparing a number of connections concurrently supported by the back-end server to a maximum number of concurrent connections that the back-end server is permitted to support, where the maximum number is less than a maximum number of connections which the back-end server is capable of supporting concurrently.
- a method for controlled server loading includes the steps of defining a maximum number of concurrent connections that a server is permitted to support, limiting a number of concurrent connections supported by the server to the maximum number, monitoring the server's performance while it supports the concurrent connections, and dynamically adjusting the maximum number as a function of the server's performance to thereby control a performance factor for the server.
- a method for controlled server loading includes the steps of receiving a plurality of data requests from clients, forwarding a number of the data requests to a server for processing, and storing at least one of the data requests until the server completes processing at least one of the forwarded data requests.
- a method for controlled server loading includes the steps of defining a maximum number of data requests that a server is permitted to process concurrently, monitoring the server's performance, and dynamically adjusting the maximum number in response to the monitoring step to thereby adjust the server's performance.
- a method for controlled loading of a cluster-based server having a dispatcher and a plurality of back-end servers includes the steps of receiving at the dispatcher a plurality of data requests from clients, forwarding a plurality of the data requests to each of the back-end servers for processing, and storing at the dispatcher at least one of the data requests until one of the back-end servers completes processing one of the forwarded data requests.
- a method for controlled loading of a cluster-based server having a dispatcher and a plurality of back-end servers includes the steps of defining, for each back-end server, a maximum number of data requests that can be processed concurrently, monitoring the performance of each back-end server, and dynamically adjusting the maximum number for at least one of the back-end servers in response to the monitoring step to thereby adjust the performance of the cluster-based server.
- a server for providing data to clients includes an OSI layer 4 dispatcher having a queue for storing connection requests received from clients, and at least one back-end server.
- the dispatcher stores in the queue one or more of the connection requests received from clients when the back-end server is unavailable to process the one or more connection requests.
- the dispatcher retrieves the one or more connection requests from the queue for forwarding to the back-end server when the back-end server becomes available to process the one or more connection requests.
- the dispatcher also determines whether the back-end server is available to process the one or more connection requests by comparing a number of connections concurrently supported by the back-end server to a maximum number of concurrent connections that the back-end server is permitted to support, where this maximum number is less than a maximum number of connections which the back-end server is capable of supporting concurrently.
- a method for controlled server loading includes receiving a plurality of connection requests from clients, establishing, in response to some of the connection requests, a number of concurrent connections between a server and clients, and storing at least one of the connection requests until one of the established connections is terminated.
- a method for controlled server loading includes defining a maximum number of concurrent connections that a server is permitted to support, monitoring the server's performance, and dynamically adjusting the maximum number in response to the monitoring to thereby adjust the server's performance.
- a method for controlled loading of a cluster-based server having a dispatcher and a plurality of back-end servers.
- the method includes receiving at the dispatcher a plurality of connection requests from clients, forwarding a plurality of the connection requests to each of the back-end servers, each back-end server establishing a number of concurrent connections with clients in response to the connection requests forwarded thereto, and storing at the dispatcher at least one of the connection requests until one of the concurrent connections is terminated.
- a method for controlled loading of a cluster-based server having a dispatcher and a plurality of back-end servers.
- the method includes defining, for each back-end server, a maximum number of concurrent connections that can be supported, monitoring the performance of each back-end server, and dynamically adjusting the maximum number for at least one of the back-end servers in response to the monitoring to thereby adjust the performance of the cluster-based server.
- a computer server for providing data to clients includes a dispatcher for receiving data requests from a plurality of clients, and at least one back-end server.
- the dispatcher establishes at least one persistent connection with the back-end server, and forwards the data requests received from the plurality of clients to the back-end server over the persistent connection.
- a method for reducing connection overhead between a dispatcher and a server includes establishing a persistent connection between the dispatcher and the server, receiving at the dispatcher at least a first data request from a first client and a second data request from a second client, and forwarding the first data request and the second data request from the dispatcher to the server over the persistent connection.
- a method for reducing connection overhead between a dispatcher and a server includes establishing a set of persistent connections between the dispatcher and the server, maintaining the set of persistent connections between the dispatcher and the server while establishing and terminating connections between the dispatcher and a plurality of clients, receiving at the dispatcher data requests from the plurality of clients over the connections between the dispatcher and the plurality of clients, and forwarding the received data requests from the dispatcher to the server over the set of persistent connections.
- a method for reducing back-end connection overhead in a cluster-based server includes establishing a set of persistent connections between a dispatcher and each of a plurality of back-end servers, maintaining each set of persistent connections while establishing and terminating connections between the dispatcher and a plurality of clients, receiving at the dispatcher data requests from the plurality of clients over the connections between the dispatcher and the plurality of clients, and forwarding each received data request from the dispatcher to one of the servers over one of the persistent connections.
- a computer-readable medium has computer-executable instructions stored thereon for implementing any one or more of the servers and methods described herein.
- Fig. 1 is a block diagram of a server having an L7/3 dispatcher according to one embodiment of the present invention.
- Fig. 2 is a block diagram of a cluster-based server having an L7/3 dispatcher according to another embodiment of the present invention.
- Fig. 3 is a block diagram of a server having an L4/3 dispatcher according to a further embodiment of the present invention.
- Fig. 4 is a block diagram of a cluster-based server having an L4/3 dispatcher according to yet another embodiment of the present invention.
- Fig. 5 is a block diagram of a simplified TCP/IP protocol implemented by the L7/3 dispatcher of Fig. 2.
- Fig. 6 is an activity diagram illustrating the processing of packets using the simplified TCP/IP protocol of Fig. 5.
- Fig. 7(a) is a state diagram for the L7/3 dispatcher of Fig. 2 as it manages front-end connections.
- Fig. 7(b) is a state diagram for the L7/3 dispatcher of Fig. 2 as it manages back-end connections.
- Fig. 8 illustrates a two-dimensional server mapping array for storing connection information.
- Fig. 9 is a block diagram illustrating the manner in which back-end connections are maintained.
- Fig. 10 illustrates the manner in which the dispatcher of Fig. 2 translates sequence information for a packet passed from a back-end connection to a front-end connection.
- Corresponding reference characters indicate corresponding features throughout the several views of the drawings.
- a Web server according to one preferred embodiment of the present invention is illustrated in Fig. 1 and indicated generally by reference character 100.
- the server 100 includes a dispatcher 102 and a back-end server 104 (the phrase "back-end server" does not require server 100 to be a cluster-based server).
- the dispatcher 102 is configured to support Open Systems Interconnection (OSI) layer seven (L7) switching (also known as content-based routing), and includes a queue 106 for storing data requests (e.g., HTTP requests) received from exemplary clients 108, 110, as further explained below.
- the dispatcher 102 is transparent to both the clients 108, 110 and the back-end server 104. That is, the clients perceive the dispatcher as a server, and the back-end server perceives the dispatcher as one or more clients.
- the dispatcher 102 preferably maintains a front-end connection 112, 114 with each client 108, 110, and a dynamic set of persistent back-end connections 116, 118, 120 with the back-end server 104.
- the back-end connections 116-120 are persistent in the sense that the dispatcher 102 can forward multiple data requests to the back-end server 104 over the same connection.
- the dispatcher can preferably forward data requests received from different clients to the back-end server 104 over the same connection, when desirable. This is in contrast to using client-specific back-end connections, as is done for example in prior art L7/3 cluster-based servers. As a result, back-end connection overhead is markedly reduced.
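The connection sharing described above can be sketched as follows. This is only an illustration under stated assumptions: the names `PersistentConnection` and `forward_all` are hypothetical, and the simple rotation over the pool is a placeholder choice that the text does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentConnection:
    """One long-lived dispatcher-to-back-end connection (hypothetical model)."""
    conn_id: int
    # (client, request) pairs that were sent over this connection, in order
    served: list = field(default_factory=list)


def forward_all(requests, num_connections):
    """Forward (client, request) pairs over a fixed pool of persistent connections.

    The pool is created once and reused for every client; plain rotation is
    used here only to keep the example short (an assumption).
    """
    pool = [PersistentConnection(i) for i in range(num_connections)]
    for i, item in enumerate(requests):
        pool[i % num_connections].served.append(item)
    return pool
```

With four clients and two connections, each back-end connection ends up carrying requests from more than one client, so no per-client back-end connection has to be set up or torn down.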
- non-persistent and/or client-specific back-end connections may be employed.
- the set of back-end connections 116-120 is dynamic in the sense that the number of connections maintained between the dispatcher 102 and the back-end server 104 may change over time, including while the server 100 is in use.
- the front-end connections 112, 114 may be established using HTTP/1.0, HTTP/1.1 or any other suitable protocol, and may or may not be persistent.
- Each back-end connection 116-120 preferably remains open until terminated by the back-end server 104 when no data request is received over that connection within a certain amount of time (e.g., as defined by HTTP/1.1), or until terminated by the dispatcher 102 as necessary to adjust the performance of the back-end server 104, as further explained below.
- the back-end connections 116-120 are initially established using the HTTP/1.1 protocol (or any other protocol supporting persistent connections) either before or after the front-end connections 112-114 are established.
- the dispatcher may initially define and establish a default number of persistent connections to the back-end server before, and in anticipation of, establishing the front-end connections.
- This default number is typically less than the maximum number of connections that can be supported concurrently by the back-end server 104 (e.g., if the back-end server can support up to 256 concurrent connections, the default number may be five, ten, one hundred, etc., depending on the application).
- this default number represents the number of connections that the back-end server 104 can readily support while yielding good performance.
- the default number of permissible connections selected for any given back-end server will depend upon that server's hardware and/or software configuration, and may also depend upon the particular performance metric (e.g., request rate, average response time, maximum response time, throughput, etc.) to be controlled, as discussed further below.
- the dispatcher 102 may establish the back-end connections on an as-needed basis (i.e., as data requests are received from clients) until the default (or subsequently adjusted) number of permissible connections for the back-end server 104 is established.
- the dispatcher may establish another back-end connection immediately, or when needed.
- the performance of a server may be enhanced by limiting the amount of data processed by that server at any given time. For example, by limiting the number of data requests processed concurrently by a server, it is possible to reduce the average response time and increase server throughput.
- the dispatcher 102 is configured to establish connections with clients and accept data requests therefrom to the fullest extent possible while, at the same time, limiting the number of data requests processed by the back-end server 104 concurrently. In the event that the dispatcher 102 receives a greater number of data requests than what the back-end server 104 can process efficiently (as determined with reference to a performance metric for the back-end server), the excess data requests are preferably stored in the queue 106.
- the dispatcher 102 will preferably not forward another data request over that same connection until it receives a response to the previously forwarded data request.
- the maximum number of data requests processed by the back-end server 104 at any given time can be controlled by dynamically controlling the number of back-end connections 116-120. Limiting the number of concurrently processed data requests prevents thrashing of server resources by the back-end server's operating system, which could otherwise degrade performance.
- a back-end connection over which a data request has been forwarded, and for which a response is pending, may be referred to as an "active connection."
- a back-end connection over which no data request has as yet been forwarded, or over which no response is pending, may be referred to as an "idle connection."
- Data requests arriving from clients at the dispatcher 102 are forwarded to the back-end server 104 for processing as soon as possible and, in this embodiment, in the same order that such data requests arrived at the dispatcher.
- Upon receiving a data request from a client, the dispatcher 102 selects an idle connection for forwarding that data request to the back-end server 104. When no idle connection is available, data requests received from clients are stored in the queue 106. Thereafter, each time an idle connection is detected, a data request is retrieved from the queue 106, preferably on a FIFO basis, and forwarded over the formerly idle (now active) connection.
- the system may be configured such that all data requests are first queued, and then dequeued as soon as possible (which may be immediately) for forwarding to the back-end server 104 over an idle connection.
- After receiving a response to a data request from the back-end server 104, the dispatcher 102 forwards the response to the corresponding client.
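The idle/active connection handling and FIFO queueing described above might be modeled roughly as below. This is a minimal sketch, not the patented implementation: the class `L7Dispatcher`, its method names, and the use of integer connection ids are all assumptions made for illustration.

```python
from collections import deque


class L7Dispatcher:
    """Toy model: at most one outstanding request per back-end connection."""

    def __init__(self, num_backend_connections):
        # connection id -> client awaiting a response, or None when idle
        self.pending = {c: None for c in range(num_backend_connections)}
        self.queue = deque()   # FIFO of (client, request) waiting for a connection
        self.sent = []         # (conn, client, request) in forwarding order

    def _idle_connection(self):
        for conn, client in self.pending.items():
            if client is None:
                return conn
        return None

    def on_request(self, client, request):
        conn = self._idle_connection()
        if conn is None:
            self.queue.append((client, request))   # no idle connection: queue it
        else:
            self._send(conn, client, request)

    def on_response(self, conn, response):
        client = self.pending[conn]
        self.pending[conn] = None                  # connection is idle again
        if self.queue:                             # reuse it for the oldest waiter
            self._send(conn, *self.queue.popleft())
        return client, response                    # response goes back to its client

    def _send(self, conn, client, request):
        self.pending[conn] = client                # connection is now active
        self.sent.append((conn, client, request))
```

Because a connection only becomes eligible again once its response has arrived, the number of requests the back-end processes at once never exceeds the number of back-end connections.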
- Client connections are preferably processed by the dispatcher 102 on a first come, first served (FCFS) basis.
- the dispatcher preferably denies additional connection requests (e.g., TCP requests) received from clients (e.g., by sending an RST to each such client).
- the dispatcher 102 ensures that already established front-end connections 112-114 are serviced before requests for new front-end connections are accepted.
- the dispatcher may establish additional front-end connections upon request until the maximum number of front-end connections that can be supported by the dispatcher 102 is reached, or until the number of data requests stored in the queue 106 exceeds the defined threshold.
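The admission rule for new front-end connections can be expressed as a small decision function. The function name, parameter names, and the string return values are hypothetical; only the two conditions (dispatcher connection limit and queue threshold) and the RST refusal come from the text.

```python
def admit_connection(open_front_end, max_front_end, queued_requests, queue_threshold):
    """Decide how to answer a new client connection request.

    Returns "ACCEPT" to establish the connection, or "RST" to refuse it.
    """
    if open_front_end >= max_front_end:
        return "RST"      # dispatcher cannot support another front-end connection
    if queued_requests > queue_threshold:
        return "RST"      # backlog exceeds the defined threshold
    return "ACCEPT"
```

Refusing new connections while the backlog is large gives priority to requests from clients that are already connected.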
- the dispatcher 102 maintains a variable number of persistent connections 116-120 with the back-end server 104.
- the dispatcher 102 implements a feedback control system by monitoring a performance metric for the back-end server 104 and then adjusting the number of back-end connections 116-120 as necessary to adjust the performance metric as desired. For example, suppose a primary performance metric of concern for the back-end server 104 is overall throughput. If the monitored throughput falls below a minimum level, the dispatcher 102 may adjust the number of back-end connections 116-120 until the throughput returns to an acceptable level.
- the dispatcher 102 may also be configured to adjust the number of back-end connections 116-120 so as to control a performance metric for the back-end server 104 other than throughput, such as, for example, average response time, maximum response time, etc.
- the dispatcher 102 is preferably configured to maintain the performance metric of interest within an acceptable range of values, rather than at a single specific value.
- the dispatcher can independently monitor the performance metric of concern for the back-end server 104.
- the back-end server may be configured to monitor its performance and provide performance information to the dispatcher.
- the dispatcher 102 may immediately increase the number of back-end connections 116-120 as desired (until the maximum number of connections which the back-end server is capable of supporting is reached). To decrease the number of back-end connections, the dispatcher 102 preferably waits until a connection becomes idle before terminating that connection (in contrast to terminating an active connection over which a response to a data request is pending).
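One way to picture the feedback loop is a two-step routine: compute a target connection count from the monitored metric, then move toward it while closing only idle connections. This sketch assumes the metric is an average response time kept within a band `[low, high]`, and assumes a step size of one; the text fixes neither the control law nor the direction of adjustment, so `target_connections` and `apply_target` are illustrative only.

```python
def target_connections(current, metric, low, high, step=1, floor=1, ceiling=256):
    """Return the desired number of back-end connections.

    Assumption: `metric` is average response time, so a value above `high`
    means the server is overloaded and a value below `low` leaves headroom.
    `ceiling` stands for the most connections the server is capable of.
    """
    if metric > high:
        return max(floor, current - step)
    if metric < low:
        return min(ceiling, current + step)
    return current            # inside the acceptable range: leave unchanged


def apply_target(states, target):
    """Move a list of 'idle'/'active' connection states toward `target`.

    New connections are opened immediately. Only idle connections are closed;
    the second return value counts closures deferred until an active
    connection becomes idle.
    """
    states = list(states)
    while len(states) < target:
        states.append("idle")
    excess = len(states) - target
    kept = []
    for state in states:
        if excess > 0 and state == "idle":
            excess -= 1       # terminate this idle connection
        else:
            kept.append(state)
    return kept, excess
```

Keeping the metric inside a band rather than at a single value avoids opening and closing connections on every small fluctuation.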
- the dispatcher 102 and the back-end server 104 may be implemented as separate components, as illustrated generally in Fig. 1. Alternatively, they may be integrated in a single computer device having at least one processor.
- the dispatcher functionality may be integrated into a conventional Web server (having sufficient resources) for the purpose of enhancing server performance.
- the server 100 achieved nearly three times the performance, measured in terms of HTTP request rate, of a conventional Web server.
- a cluster-based server 200 according to another preferred embodiment of the present invention is shown in Fig. 2, and is preferably implemented in a manner similar to the embodiment described above with reference to Fig. 1, except as noted below.
- the cluster-based server 200 employs multiple back-end servers 202, 204 for processing data requests provided by exemplary clients 206, 208 through an L7 dispatcher 210 having a queue 212.
- the dispatcher 210 preferably manages a dynamic set of persistent back-end connections 214-218, 220-224 with each back-end server 202, 204, respectively.
- the dispatcher 210 also controls the number of data requests processed concurrently by each back-end server at any given time in such a manner as to improve the performance of each back-end server and, thus, the cluster-based server 200.
- the dispatcher 210 preferably refrains from forwarding a data request to one of the back-end servers 202-204 over a particular connection until the dispatcher 210 receives a response to a prior data request forwarded over the same particular connection (if applicable).
- the dispatcher 210 can control the maximum number of data requests processed by any back-end server at any given time simply by dynamically controlling the number of back-end connections 214-224.
- Although Fig. 2 illustrates the dispatcher 210 as having three persistent connections 214-218, 220-224 with each back-end server 202, 204, it should be apparent from the description below that the set of persistent connections between the dispatcher and each back-end server may include more or fewer than three connections at any given time, and the number of persistent connections in any given set may differ at any time from that of another set.
- the default number of permissible connections initially selected for any given back-end server will depend upon that server's hardware and/or software configuration, and may also depend upon the particular performance metric (e.g., request rate, throughput, average response time, maximum response time, etc.) to be controlled for that back-end server. Preferably, the same performance metric is controlled for each back-end server.
- An "idle server" refers to a back-end server having one or more idle connections, or to which an additional connection can be established by the dispatcher without exceeding the default (or subsequently adjusted) number of permissible connections for that back-end server.
- Upon receiving a data request from a client, the dispatcher preferably selects an idle server, if available, and then forwards the data request to the selected server. If no idle server is available, the data request is stored in the queue 212. Thereafter, each time an idle connection is detected, a data request is retrieved from the queue 212, preferably on a FIFO basis, and forwarded over the formerly idle (now active) connection.
- the system may be configured such that all data requests are first queued and then dequeued as soon as possible (which may be immediately) for forwarding to an idle server.
- the dispatcher preferably forwards data requests to these idle servers on a round-robin basis.
- the dispatcher can forward data requests to the idle servers according to another load sharing algorithm, or according to the content of such data requests (i.e., content-based dispatching).
- Upon receiving a response from a back-end server to which a data request was dispatched, the dispatcher forwards the response to the corresponding client.
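Selecting among idle servers on a round-robin basis could look like the sketch below. The `Backend` fields and the `RoundRobinSelector` class are hypothetical names; the idle test follows the definition of an "idle server" given above (an idle connection exists, or another connection may be opened within the permitted number).

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    permitted: int        # permitted number of connections for this server
    open_conns: int = 0   # connections currently established
    active: int = 0       # connections with a response pending

    def is_idle(self):
        # An idle connection exists, or a new one fits within the permitted number.
        return self.active < self.open_conns or self.open_conns < self.permitted


class RoundRobinSelector:
    def __init__(self, backends):
        self.backends = backends
        self.next_index = 0

    def pick(self):
        """Return the next idle back-end in rotation, or None if none is idle."""
        n = len(self.backends)
        for offset in range(n):
            i = (self.next_index + offset) % n
            if self.backends[i].is_idle():
                self.next_index = (i + 1) % n
                return self.backends[i]
        return None       # caller would place the request in the queue
```

A `None` result corresponds to the case where no idle server is available and the data request is stored in the queue.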
- a Web server according to another preferred embodiment of the present invention is illustrated in Fig. 3 and indicated generally by reference character 300. Similar to the server 100 of Fig. 1, the server 300 of Fig. 3 includes a dispatcher 302 and a back-end server 304. However, in this particular embodiment, the dispatcher 302 is configured to support Open Systems Interconnection (OSI) layer four (L4) switching. Thus, connections 314-318 are made between exemplary clients 308-312 and the back-end server 304 directly rather than with the dispatcher 302.
- the dispatcher 302 includes a queue 306 for storing connection requests (e.g., SYN packets) received from clients 308-312.
- the dispatcher 302 monitors a performance metric for the back-end server 304 and controls the number of connections 314-318 established between the back-end server 304 and clients 308-312 to thereby control the back-end server's performance.
- the dispatcher 302 is an L4/3 dispatcher (i.e., it implements layer 4 switching with layer 3 packet forwarding), thereby requiring all transmissions between the back-end server 304 and clients 308-312 to pass through the dispatcher.
- the dispatcher 302 can monitor the back-end server's performance directly.
- the dispatcher can monitor the back-end server's performance via performance data provided to the dispatcher by the back-end server, or otherwise.
- the dispatcher 302 monitors a performance metric for the back-end server 304 (e.g., average response time, maximum response time, server packet throughput, etc.) and then dynamically adjusts the number of connections 314-318 to the back-end server 304 as necessary to adjust the performance metric as desired.
- the number of connections is dynamically adjusted by controlling the number of connection requests (e.g., SYN packets), received by the dispatcher 302 from clients 308-312, that are forwarded to the back-end server 304.
- Fig. 4 illustrates a cluster-based embodiment of the Web server 300 shown in Fig. 3.
- a cluster-based server 400 includes an L4/3 dispatcher 402 having a queue 404 for storing connection requests, and several back-end servers 406, 408.
- connections 410-420 are made between exemplary clients 422, 424 and the back-end servers 406, 408 directly.
- the dispatcher 402 preferably monitors the performance of each back-end server 406, 408 and dynamically adjusts the number of connections therewith, by controlling the number of connection requests forwarded to each back-end server, to thereby control their performance.
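One way to picture connection control by withholding connection requests is the sketch below. The text does not specify a control rule, so the rule used here (raise or lower the connection limit by one depending on whether the measured average response time is below or above a target) is purely an assumption, as are the names `ConnectionGate`, `on_syn`, `on_close` and `on_metric`.

```python
from collections import deque


class ConnectionGate:
    """Hold connection requests (SYNs) and release them under a connection limit."""

    def __init__(self, limit, target_response_time):
        self.limit = limit                  # connections currently permitted
        self.target = target_response_time  # desired average response time
        self.active = 0                     # connections already forwarded
        self.syn_queue = deque()            # stands in for queue 306 / 404

    def on_syn(self, syn):
        self.syn_queue.append(syn)
        return self._release()

    def on_close(self):
        self.active -= 1
        return self._release()

    def on_metric(self, avg_response_time):
        # Assumed rule: shrink the limit when the server is slower than the
        # target, grow it when the server is faster.
        if avg_response_time > self.target and self.limit > 1:
            self.limit -= 1
        elif avg_response_time < self.target:
            self.limit += 1
        return self._release()

    def _release(self):
        # Forward queued SYNs, oldest first, while there is headroom.
        forwarded = []
        while self.syn_queue and self.active < self.limit:
            forwarded.append(self.syn_queue.popleft())
            self.active += 1
        return forwarded
```

Each method returns the list of connection requests that may now be forwarded to the back-end server; existing connections are never torn down by this sketch, so a lowered limit only takes effect as connections close.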
- All functions of the dispatcher 210 are preferably implemented via a software application implementing a simplified TCP/IP protocol, shown in Fig. 5, and running in user space (in contrast to kernel space) on commercial off-the-shelf ("COTS") hardware and operating system software.
- this software application runs under the Linux operating system or another modern UNIX system supporting libpcap, a publicly available packet capture library, and POSIX threads. As a result, the dispatcher can capture the necessary packets in the datalink layer.
- When a packet arrives at the datalink layer of the dispatcher 210, the packet is preferably applied to each filter defined by the dispatcher, as shown in Figure 5.
- the packet capture device then captures all the packets in which it is interested. For example, the packet capture device can operate in a promiscuous mode, during which all packets arriving at the datalink layer are copied to a packet capture buffer and then filtered, through software, according to, e.g., their source IP or MAC address, protocol type, etc. Matching packets can then be forwarded to the application making the packet capture call, whereas non-matching packets can be discarded.
- packets arriving at the datalink layer can be filtered through hardware (e.g., via a network interface card) in addition to or instead of software filtering.
- interrupts are preferably generated at the hardware level only when broadcast packets or packets addressed to that hardware are received.
- two packet capture devices are used to capture packets from the clients 206-208 and the back-end servers 202-204, respectively. These packets are then decomposed and analyzed using the simplified TCP/IP protocol, as further described below. Packets seeking to establish or terminate a connection are preferably handled by the dispatcher 210 immediately. Packets containing data requests (e.g., HTTP requests) are stored in the queue 212 when all of the back-end connections 214-224 are active.
- a data request is dequeued, combined with corresponding TCP and IP headers, and sent to this server using a raw socket (raw sockets are provided in many operating systems, e.g., UNIX, so that users can read and write raw network protocol datagrams with a protocol field that is not processed by the kernel).
- Packets containing response data from a back-end server are combined with appropriate TCP and IP headers and passed to the corresponding client using raw sockets. This process is illustrated by the activity diagram of Figure 6.
- All packets transmitted to establish and terminate a connection are short and in sequence except for retransmitted packets.
- the dispatcher acts like a gateway, whose function is to simply change packet header fields and pass packets.
- the sequence window in this embodiment is simplified to have a size of one, to deal with connection setup and termination.
- b. Timers.
- Retransmission is done in TCP to avoid data loss when the sender does not receive an acknowledgement within a certain period. Since the back-end servers are distributed in the same LAN, data loss is rare.
- When establishing a connection with a client, since the client is active, the client will retransmit the same packet if it does not receive the packet from the dispatcher.
- When terminating a connection with the client, if the dispatcher does not receive any response from the client for a certain period, the dispatcher will disconnect the connection. Therefore, retransmission can be omitted.
- Persist timer. This is set when the other end of a connection advertises a window of zero, thereby stopping TCP from sending data. When it expires, one byte of data is sent to determine if the window has opened. This is not applicable since bulk data transmission will not occur when establishing and terminating connections.
- 3. Delayed acknowledgement. This is used to improve the efficiency of the transmission. It is not applicable to establishing and terminating connections because an immediate response can be given, but could be used to acknowledge an HTTP request. Because maintaining an alarm or maintaining a time record and polling for each connection is expensive, this problem is solved by sending an acknowledgement to each HTTP request immediately after it is received.
- c. Option Field.
- MSS: Maximum Segment Size
- window scale
- For simplicity, only the MSS option is implemented in this embodiment.
- a two-dimensional server-mapping array is used to store the connection information between the dispatcher and the back-end servers.
- a linked list could be used.
- Each server is preferably associated with a unique index number, and newly added servers are assigned larger index numbers.
- Each connection to a back-end server is identified by a port number, which is used by the dispatcher to set up that connection.
- a third dimension, port number layer, is preferably used to keep the number of connections fixed.
- When a client connects to an Apache server using HTTP/1.1, the server will close the connection when it receives a bad request, which may be a non-existent URL. In this situation, the connection becomes unusable for a certain period of time (which varies by operating system). This means the port number is disabled. In order to maintain the active connection number, a new connection to the same server is preferably opened. Thus, a new memory space must be allocated for the connection. To efficiently use memory space and manage the connection set, the port number manager uses layers to assign a different port number and stores its information in the same slot. As shown in Figure 8, a port number is uniquely determined by the index of the server, the connection index of this server, the index of port number layers, and the port start number.
- For example, if the port start number is defined as 10000, the port number used by the dispatcher to set up the first connection to the first back-end server will be 10000, and the second connection to the first back-end server will use 10001.
- If the number of permissible connections to a particular back-end server is, for example, eight, the port number used by the dispatcher to set up the first connection to the second back-end server is 10008.
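The port numbering in this example can be reproduced with a small function. The text states only that a port number is determined by the server index, the connection index, the layer index and the port start number; the exact formula below, and in particular the layer stride (one full block of ports for all servers per layer), is an assumption chosen to match the values 10000, 10001 and 10008 given above. The function name and parameter names are hypothetical.

```python
def port_number(server_index, connection_index, layer_index=0, *,
                port_start=10000, connections_per_server=8, server_count=2):
    """Map (server, connection, layer) indices to a dispatcher port number."""
    if not 0 <= server_index < server_count:
        raise ValueError("server_index out of range")
    if not 0 <= connection_index < connections_per_server:
        raise ValueError("connection_index out of range")
    # Assumption: each layer shifts the whole port block by one block size,
    # so a replacement port for the same slot never collides with layer 0.
    layer_stride = server_count * connections_per_server
    return (port_start
            + layer_index * layer_stride
            + server_index * connections_per_server
            + connection_index)
```

Under these assumptions, the slot for the first connection to the first server would move from port 10000 to port 10016 when it is reopened on layer 1, while keeping the same array slot.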
- two queues are preferably used: a not-in-use queue 902; and an idle queue 904.
- all port numbers are initially inserted into the not-in-use queue 902 in such a way that each back-end server has an equal chance to be connected to by the dispatcher.
- When the dispatcher receives a connection request from a client, it removes a port number from the head of the not-in-use queue 902 and uses it to set up a connection with the corresponding back-end server. This port number is placed in the idle queue 904 once the connection is established.
- When a data request arrives from a client, the dispatcher matches the data request with an idle port, dequeues the associated port number from the idle queue 904, and forwards the data request to the back-end server associated with the dequeued port number.
- When the load of the dispatcher decreases and one or more back-end connections do not receive a data request within a certain time interval (three minutes in this particular implementation), the back-end connection(s) are terminated by the corresponding back-end server(s), and the corresponding port numbers are placed back into the not-in-use queue 902.
- the idle queue stores port numbers associated with idle connections
- the not-in-use queue stores port numbers not associated with an existing connection. In this manner, the network resources and the resources of the back-end servers are used efficiently.
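The interplay of the two queues can be sketched as below. The class `PortPool` and its method names are hypothetical. Two details are assumptions rather than statements from the text: the initial interleaving of ports across servers (one simple way to give each server an equal chance), and the return of a port to the idle queue after its response completes.

```python
from collections import deque


class PortPool:
    """Track dispatcher ports in a not-in-use queue and an idle queue."""

    def __init__(self, ports_by_server):
        self.server_of = {}
        self.not_in_use = deque()  # ports with no existing connection (902)
        self.idle = deque()        # ports whose connection is idle (904)
        # Assumption: interleave ports so servers are connected to in turn.
        rounds = max((len(p) for p in ports_by_server.values()), default=0)
        for i in range(rounds):
            for server, ports in ports_by_server.items():
                if i < len(ports):
                    self.not_in_use.append(ports[i])
                    self.server_of[ports[i]] = server

    def open_connection(self):
        # Client connection request: take a port from the head of not-in-use
        # and mark it idle once the back-end connection is established.
        port = self.not_in_use.popleft()
        self.idle.append(port)
        return port

    def dispatch(self):
        # Data request: dequeue an idle port and report its back-end server.
        port = self.idle.popleft()
        return port, self.server_of[port]

    def response_done(self, port):
        # Assumption: the connection becomes idle again after the response.
        self.idle.append(port)

    def connection_closed(self, port):
        # The back-end server timed out an idle connection.
        self.idle.remove(port)
        self.not_in_use.append(port)
```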
- Each hash entry is uniquely identified by a tuple of client IP address, client port number, and a dispatcher port number.
- To calculate the hash value the client IP address and the client port number are used to get a hash index. Collision is handled using open addressing, which resolves the collision problems by polling adjacent slots until an empty one is found.
- To obtain the hash entry the client IP address and port number are compared to those of entries in the hash slot.
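The lookup scheme described here, a hash index derived from the client address and port with open addressing over adjacent slots, can be sketched as follows. The hash function (numeric IPv4 address plus port, modulo the table size), the table size, and the names `ConnectionTable`, `insert` and `lookup` are assumptions for illustration. Deletion is left out, since removing entries under linear probing needs extra care (for example, tombstones).

```python
import ipaddress


class ConnectionTable:
    """Open-addressing hash table keyed by (client IP, client port)."""

    def __init__(self, size=8):
        self.slots = [None] * size

    def _home(self, client_ip, client_port):
        # Assumed hash: numeric IPv4 address plus port, modulo table size.
        return (int(ipaddress.IPv4Address(client_ip)) + client_port) % len(self.slots)

    def insert(self, client_ip, client_port, dispatcher_port):
        start = self._home(client_ip, client_port)
        for step in range(len(self.slots)):
            i = (start + step) % len(self.slots)  # probe adjacent slots
            entry = self.slots[i]
            if entry is None or entry[:2] == (client_ip, client_port):
                self.slots[i] = (client_ip, client_port, dispatcher_port)
                return i
        raise RuntimeError("connection table is full")

    def lookup(self, client_ip, client_port):
        start = self._home(client_ip, client_port)
        for step in range(len(self.slots)):
            entry = self.slots[(start + step) % len(self.slots)]
            if entry is None:
                return None  # reached an empty slot: key is absent
            if entry[:2] == (client_ip, client_port):
                return entry[2]
        return None
```

Two clients whose keys hash to the same slot end up in neighboring slots, and each lookup compares the stored client address and port until it finds a match or an empty slot.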
- the dispatcher port numbers preferably have a one-to-one relationship with back-end servers.
- the hash index or map index that stores the information for a particular connection is preferably stored in the data request queue 212 shown in Fig. 2.
- a sequence number space is maintained by each side of a connection to control the transmission.
- When a packet arrives from a back-end server, it includes sequence information specific to the connection between the back-end server and the dispatcher. This packet must then be changed by the dispatcher to carry sequence information specific to the front-end connection between the dispatcher and the associated client.
- Fig. 10 provides an example of how the packet sequence number is changed while it is passed by the dispatcher. The four sequence numbers are represented using the following symbols:
- X: the sequence number of the next byte to be sent to the client by the dispatcher.
- Y: the sequence number of the next byte to be sent to the dispatcher by the client.
- A: the sequence number of the next byte to be sent to the server by the dispatcher.
- In step (1), after the dispatcher sends a client's request to a selected back-end server, it saves the initial sequence numbers X0 and B0.
- In step (3), the dispatcher receives the first response packet from the back-end server with the sequence number B0 and the acknowledgement number A1. Since this is the first response, the dispatcher searches the header of the packet for the content-length field and records the total bytes that the server is sending to the client.
- In step (4), the dispatcher changes the sequence number to X0 and the acknowledgement number to Y0 and forwards the packet to the client.
- the address space and checksum of the packet are also updated accordingly every time the packet is passed.
- In step (5), the dispatcher receives the acknowledgement from the client with the sequence number Y0 and the acknowledgement number Z.
- the dispatcher compares Z with X0; if Z>X0, then the dispatcher updates X0 to X1; otherwise, it keeps X0.
- In step (6), the dispatcher changes the sequence number to A1 and the acknowledgement number to B1 and sends it to the back-end server.
- the dispatcher calculates the remaining packet length to be received. Since the remaining packet length is greater than zero, the dispatcher waits for the next packet.
- step (5) the dispatcher receives the acknowledge
- the dispatcher changes the sequence number to X1 and the acknowledgement number to Y0 and sends the packet to the client.
- In step (9), the dispatcher receives the acknowledgement from the client and repeats the same work done in step (5).
- the dispatcher preferably does not acknowledge the amount of data it receives from the server. Instead, it passes the packet on to the client and acknowledges it only after it receives the acknowledgement from the client. In this way, the server is responsible for the retransmission when it has not received an acknowledgment within a certain period, and the client is responsible for the flow control if it runs out of buffer space.
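The sequence-number rewriting in the steps above can be sketched with fixed offsets between the two connections. This is an equivalent reformulation for illustration, not the bookkeeping described in the text (which records values such as X0 and B0 and updates them per packet): it assumes the dispatcher forwards payload bytes unchanged, so the difference between the server-side and client-side sequence spaces stays constant. The class and method names are hypothetical, and the address and checksum updates are not shown.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


class SequenceTranslator:
    """Rewrite seq/ack numbers between the back-end and front-end connections."""

    def __init__(self, x0, y0, a1, b0):
        # A server packet (seq=B0, ack=A1) must leave as (seq=X0, ack=Y0).
        self.b_to_x = (x0 - b0) % MOD  # server sequence space -> client-facing
        self.y_to_a = (a1 - y0) % MOD  # client sequence space -> server-facing

    def server_to_client(self, seq, ack):
        # seq is in the server's space (B); ack refers to dispatcher bytes (A).
        return (seq + self.b_to_x) % MOD, (ack - self.y_to_a) % MOD

    def client_to_server(self, seq, ack):
        # seq is in the client's space (Y); ack refers to dispatcher bytes (X).
        return (seq + self.y_to_a) % MOD, (ack - self.b_to_x) % MOD
```

For example, with X0 = 1000, Y0 = 2000, A1 = 5000 and B0 = 7000, a client acknowledgement Z = 1500 is forwarded to the server as acknowledgement number 7500, which plays the role of B1 in step (6).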
- the TIME_WAIT state is provided for a sender to wait for a period of time to allow the acknowledgement packet sent by the sender to die out in the network. A soft timer and a queue are preferably used to keep track of this time interval.
- When a connection enters the TIME_WAIT state, its hash index is placed in the TIME_WAIT queue.
- the queue is preferably checked every second to determine whether the interval has exceeded a certain period. For UNIX, this interval is one minute, but in the particular implementation of the invention under discussion, because of the short transmission time and short route, it is preferably set to one second.
- the soft timer, which is realized by reading the system time each time after the program has finished processing one packet, is preferably used instead of a kernel alarm to eliminate the overhead involved in the interrupt caused by the kernel.
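A soft timer of this kind might look like the sketch below: no alarm or signal is installed, and the program simply reads the clock when it calls `expire()`, for instance after processing each packet. The names `TimeWaitQueue`, `enter` and `expire` are hypothetical, and the default clock is `time.monotonic` rather than the system time mentioned in the text, a substitution made here so that wall-clock adjustments cannot disturb the interval.

```python
import time
from collections import deque


class TimeWaitQueue:
    """Release TIME_WAIT connections by polling a clock instead of using alarms."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval  # TIME_WAIT duration in seconds
        self.clock = clock        # injectable so the timer can be tested
        self.entries = deque()    # (entered_at, hash_index), oldest first

    def enter(self, hash_index):
        # Record when the connection entered TIME_WAIT.
        self.entries.append((self.clock(), hash_index))

    def expire(self):
        # Entries are in arrival order, so only the head needs checking.
        now = self.clock()
        released = []
        while self.entries and now - self.entries[0][0] >= self.interval:
            released.append(self.entries.popleft()[1])
        return released
```

Calling `expire()` from the packet-processing loop returns the hash indices whose TIME_WAIT interval has elapsed, so their table entries can be reclaimed.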
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Computer And Data Communications (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
According to the invention, standalone or cluster-based servers, in particular Web servers, control the amount of data processed concurrently by such servers, so as to thereby control the servers' operating performance. Preferably, a dispatcher is interposed between clients and one or more back-end servers, and monitors the performance of each back-end server (either directly or otherwise). Also preferably, the dispatcher controls, for each back-end server and in response to the monitored performance, either the number of data requests processed concurrently, or the number of connections supported concurrently, or both, so as to control the performance of those back-end servers. In one embodiment, the dispatcher uses a packet capture library to capture packets at OSI layer 2 and implements a simplified TCP/IP protocol in user space (as opposed to kernel space) in order to reduce data copying. Preferably, commercially available hardware and operating system software are used to take advantage of their price/performance ratio.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP01989983A EP1332600A2 (fr) | 2000-11-03 | 2001-11-05 | Controlled server loading |
AU2002228861A AU2002228861A1 (en) | 2000-11-03 | 2001-11-05 | Load balancing method and system |
Applications Claiming Priority (14)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US24579000P | 2000-11-03 | 2000-11-03 | |
US24578800P | 2000-11-03 | 2000-11-03 | |
US24578900P | 2000-11-03 | 2000-11-03 | |
US24585900P | 2000-11-03 | 2000-11-03 | |
US60/245,789 | 2000-11-03 | ||
US60/245,859 | 2000-11-03 | ||
US60/245,790 | 2000-11-03 | ||
US60/245,788 | 2000-11-03 | ||
US09/878,787 | 2001-06-11 | ||
US09/878,787 US20030046394A1 (en) | 2000-11-03 | 2001-06-11 | System and method for an application space server cluster |
US09/930,014 US20020055980A1 (en) | 2000-11-03 | 2001-08-15 | Controlled server loading |
US09/930,014 | 2001-08-15 | ||
US09/965,526 US20020055982A1 (en) | 2000-11-03 | 2001-09-26 | Controlled server loading using L4 dispatching |
US09/965,526 | 2001-09-26 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2002037799A2 (fr) | 2002-05-10 |
WO2002037799A3 (fr) | 2003-03-13 |
Family
ID=27569454
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2001/047013 WO2002037799A2 (fr) | Controlled server loading | 2000-11-03 | 2001-11-05 |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1332600A2 (fr) |
AU (1) | AU2002228861A1 (fr) |
WO (1) | WO2002037799A2 (fr) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2005060211A2 (fr) * | 2003-12-10 | 2005-06-30 | Aventail Corporation | Network appliance |
WO2006029771A1 (fr) * | 2004-09-13 | 2006-03-23 | Fujitsu Siemens Computers, Inc. | Computer system and method for providing services to clients over a network |
EP2023245A1 (fr) * | 2006-04-26 | 2009-02-11 | Nippon Telegraph and Telephone Corporation | Load control device and method thereof |
US7698388B2 (en) | 2003-12-10 | 2010-04-13 | Aventail Llc | Secure access to remote resources over a network |
US7770222B2 (en) | 2003-12-10 | 2010-08-03 | Aventail Llc | Creating an interrogation manifest request |
US7779469B2 (en) | 2003-12-10 | 2010-08-17 | Aventail Llc | Provisioning an operating environment of a remote computer |
US8005983B2 (en) | 2003-12-10 | 2011-08-23 | Aventail Llc | Rule-based routing to resources through a network |
US9628489B2 (en) | 2003-12-10 | 2017-04-18 | Sonicwall Inc. | Remote access to resources over a network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0491367A2 (fr) * | 1990-12-19 | 1992-06-24 | Bull HN Information Systems Inc. | Queue control method for an electronic mail system |
EP0794490A2 (fr) * | 1996-03-08 | 1997-09-10 | International Business Machines Corporation | Dynamic execution unit management for a high-performance server system |
EP0892531A2 (fr) * | 1997-06-19 | 1999-01-20 | Sun Microsystems Inc. | Network load balancing for a multi-computer server |
WO1999053415A1 (fr) * | 1998-04-15 | 1999-10-21 | Hewlett-Packard Company | Distributed processing over a network |
EP1035703A1 (fr) * | 1999-03-11 | 2000-09-13 | Lucent Technologies Inc. | Method and apparatus for load sharing on a wide area network |
US6141759A (en) * | 1997-12-10 | 2000-10-31 | Bmc Software, Inc. | System and architecture for distributing, monitoring, and managing information requests on a computer network |
2001
- 2001-11-05: EP application EP01989983A (published as EP1332600A2), not active, withdrawn
- 2001-11-05: WO application PCT/US2001/047013 (published as WO2002037799A2), not active, application discontinuation
- 2001-11-05: AU application AU2002228861A (published as AU2002228861A1), not active, abandoned
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0491367A2 (fr) * | 1990-12-19 | 1992-06-24 | Bull HN Information Systems Inc. | Queue control method for an electronic mail system |
EP0794490A2 (fr) * | 1996-03-08 | 1997-09-10 | International Business Machines Corporation | Dynamic execution unit management for a high-performance server system |
EP0892531A2 (fr) * | 1997-06-19 | 1999-01-20 | Sun Microsystems Inc. | Network load balancing for a multi-computer server |
US6141759A (en) * | 1997-12-10 | 2000-10-31 | Bmc Software, Inc. | System and architecture for distributing, monitoring, and managing information requests on a computer network |
WO1999053415A1 (fr) * | 1998-04-15 | 1999-10-21 | Hewlett-Packard Company | Distributed processing over a network |
EP1035703A1 (fr) * | 1999-03-11 | 2000-09-13 | Lucent Technologies Inc. | Method and apparatus for load sharing on a wide area network |
Non-Patent Citations (2)
Title |
---|
ARIEL COHEN, SAMPATH RANGARAJAN, AND HAMILTON SLYE: SECOND USENIX SYMPOSIUM ON INTERNET TECHNOLOGIES AND SYSTEMS, 11 - 14 October 1999, XP002203108 Boulder, Colorado * |
HUNT G D H ET AL: "Network Dispatcher: a connection router for scalable Internet services" COMPUTER NETWORKS AND ISDN SYSTEMS, NORTH HOLLAND PUBLISHING. AMSTERDAM, NL, vol. 30, no. 1-7, 1 April 1998 (1998-04-01), pages 347-357, XP004121412 ISSN: 0169-7552 * |
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8572249B2 (en) | 2003-12-10 | 2013-10-29 | Aventail Llc | Network appliance for balancing load and platform services |
US10313350B2 (en) | 2003-12-10 | 2019-06-04 | Sonicwall Inc. | Remote access to resources over a network |
WO2005060211A2 (fr) * | 2003-12-10 | 2005-06-30 | Aventail Corporation | Network appliance |
US8590032B2 (en) | 2003-12-10 | 2013-11-19 | Aventail Llc | Rule-based routing to resources through a network |
US7698388B2 (en) | 2003-12-10 | 2010-04-13 | Aventail Llc | Secure access to remote resources over a network |
US7770222B2 (en) | 2003-12-10 | 2010-08-03 | Aventail Llc | Creating an interrogation manifest request |
US7779469B2 (en) | 2003-12-10 | 2010-08-17 | Aventail Llc | Provisioning an operating environment of a remote computer |
US8005983B2 (en) | 2003-12-10 | 2011-08-23 | Aventail Llc | Rule-based routing to resources through a network |
US10218782B2 (en) | 2003-12-10 | 2019-02-26 | Sonicwall Inc. | Routing of communications to one or more processors performing one or more services according to a load balancing function |
US8255973B2 (en) | 2003-12-10 | 2012-08-28 | Chris Hopen | Provisioning remote computers for accessing resources |
US8301769B2 (en) | 2003-12-10 | 2012-10-30 | Aventail Llc | Classifying an operating environment of a remote computer |
US8438254B2 (en) | 2003-12-10 | 2013-05-07 | Aventail Llc | Providing distributed cache services |
WO2005060211A3 (fr) * | 2003-12-10 | 2008-01-24 | Aventail Corp | Network appliance |
US10135827B2 (en) | 2003-12-10 | 2018-11-20 | Sonicwall Inc. | Secure access to remote resources over a network |
US9736234B2 (en) | 2003-12-10 | 2017-08-15 | Aventail Llc | Routing of communications to one or more processors performing one or more services according to a load balancing function |
US8661158B2 (en) | 2003-12-10 | 2014-02-25 | Aventail Llc | Smart tunneling to resources in a network |
US10003576B2 (en) | 2003-12-10 | 2018-06-19 | Sonicwall Inc. | Rule-based routing to resources through a network |
US8700775B2 (en) | 2003-12-10 | 2014-04-15 | Aventail Llc | Routing of communications to a platform service |
US8959384B2 (en) | 2003-12-10 | 2015-02-17 | Aventail Llc | Routing of communications to one or more processors performing one or more services according to a load balancing function |
US9268656B2 (en) | 2003-12-10 | 2016-02-23 | Dell Software Inc. | Routing of communications to one or more processors performing one or more services according to a load balancing function |
US9628489B2 (en) | 2003-12-10 | 2017-04-18 | Sonicwall Inc. | Remote access to resources over a network |
US8615796B2 (en) | 2003-12-10 | 2013-12-24 | Aventail Llc | Managing resource allocations |
US9906534B2 (en) | 2003-12-10 | 2018-02-27 | Sonicwall Inc. | Remote access to resources over a network |
WO2006029771A1 (fr) * | 2004-09-13 | 2006-03-23 | Fujitsu Siemens Computers, Inc. | Computer system and method for providing services to clients over a network |
US8667120B2 (en) | 2006-04-26 | 2014-03-04 | Nippon Telegraph And Telephone Corporation | Load control device and method thereof for controlling requests sent to a server |
EP2023245A4 (fr) * | 2006-04-26 | 2012-08-15 | Nippon Telegraph & Telephone | Load control device and method thereof |
EP2023245A1 (fr) * | 2006-04-26 | 2009-02-11 | Nippon Telegraph and Telephone Corporation | Load control device and method thereof |
Also Published As
Publication number | Publication date |
---|---|
WO2002037799A3 (fr) | 2003-03-13 |
EP1332600A2 (fr) | 2003-08-06 |
AU2002228861A1 (en) | 2002-05-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20020055980A1 (en) | Controlled server loading | |
US20020055982A1 (en) | Controlled server loading using L4 dispatching | |
US8635363B2 (en) | System, method and computer program product to maximize server throughput while avoiding server overload by controlling the rate of establishing server-side network connections | |
US6928051B2 (en) | Application based bandwidth limiting proxies | |
US9954785B1 (en) | Intelligent switching of client packets among a group of servers | |
US6665304B2 (en) | Method and apparatus for providing an integrated cluster alias address | |
US5918021A (en) | System and method for dynamic distribution of data packets through multiple channels | |
US6389448B1 (en) | System and method for load balancing | |
US8463935B2 (en) | Data prioritization system and method therefor | |
US20020055983A1 (en) | Computer server having non-client-specific persistent connections | |
EP1494426B1 (fr) | Secure network processing | |
US5878228A (en) | Data transfer server with time slots scheduling base on transfer rate and predetermined data | |
US6014707A (en) | Stateless data transfer protocol with client controlled transfer unit size | |
EP1469653A2 (fr) | Object-aware transport-layer network processing engine | |
EP1864465A1 (fr) | Network communications for operating system partitions | |
WO2002037799A2 (fr) | Controlled server loading | |
US7392318B1 (en) | Method and system for balancing a traffic load in a half-duplex environment | |
WO2004071027A1 (fr) | Methods and systems for non-disruptive physical address resolution | |
KR20050061927A (ko) | Load balancing apparatus and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
WWE | Wipo information: entry into national phase | Ref document number: 2001989983; Country of ref document: EP |
WWP | Wipo information: published in national office | Ref document number: 2001989983; Country of ref document: EP |
REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
NENP | Non-entry into the national phase | Ref country code: JP |
WWW | Wipo information: withdrawn in national office | Country of ref document: JP |
WWW | Wipo information: withdrawn in national office | Ref document number: 2001989983; Country of ref document: EP |