US20170111478A1 - Load balancing utilizing adaptive thresholding
- Publication number
- US20170111478A1 (application US15/276,551)
- Authority
- US
- United States
- Prior art keywords
- server
- request
- queue
- client
- threshold
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1002—
- H04L67/42—
Definitions
- server farms implement some type of load balancing algorithm to distribute requests from client computers among the multiple servers.
- client devices generally issue requests to server devices for some kind of service and/or processing and the server devices process those requests and return suitable results to the client devices.
- workload distribution among the servers significantly affects the quality of service that the client devices receive from the servers.
- client devices number in the hundreds of thousands or millions, while the servers number in the hundreds or thousands. In such environments server load balancing becomes particularly important to system performance.
- One approach to increasing the effectiveness of load balancing, and the resulting system performance and throughput, is to efficiently find the servers that have lower load levels than other servers and assign new client requests to these servers. Identifying overloaded and under-utilized servers and distributing workload accordingly may be done in a central or a distributed manner.
- Central control of load balancing requires a dedicated controller, such as a master server, to keep track of all servers and their respective loads at all times, incurring certain administrative costs associated with keeping lists of servers and connections up-to-date.
- a master server constitutes a single point of failure in the system, requiring multiple mirrored master servers for more reliable operation.
- the reliability and scalability of the number of servers in the server farm can be dependent on the ability and efficiency of the dedicated controller to handle the increased number of servers.
- the client computer randomly selects a server. For example, a pseudo-random number generator may be utilized to select one of N servers.
- random selection of servers does not take the actual server loads into consideration and, thus, cannot avoid occasionally loading a particular server.
- Random server selection algorithms improve the average performance for request handling. This means such algorithms improve request handling for about 50% of the requests, but not for the majority of the requests.
- the client computing device can implement a weighted probability selection algorithm in which the selection of a server is determined, at least in part, on the reported load/resources of each server.
- server load information must be updated periodically at each client device to make optimal server selection based on server loads.
- the server load may be indicated by a length of a request queue at each server, a request processing latency, or other similar indicators.
- a round-robin algorithm for server assignment may be used where each request is sent to a next server according to a number indicated by a counter maintained at the client device.
- this approach does not distribute the load optimally among servers because the round-robin cycle in different clients could coincide, causing multiple clients to call the same server at the same time.
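By way of illustration only, the coincidence problem noted above for client-side round-robin assignment can be sketched as follows. The function names, the lockstep timing of the clients, and the starting counter values are assumptions made for this sketch and are not taken from the specification.

```python
from collections import Counter


def round_robin_targets(num_servers, start, num_requests):
    """Server indices chosen by one client whose counter starts at `start`."""
    return [(start + i) % num_servers for i in range(num_requests)]


def per_step_load(num_servers, client_starts, num_requests):
    """For each request step, count how many clients call each server.

    Assumption: all clients issue their requests in lockstep.
    """
    schedules = [
        round_robin_targets(num_servers, start, num_requests)
        for start in client_starts
    ]
    return [Counter(step) for step in zip(*schedules)]


# Three clients whose counters coincide all call the same server at each step.
aligned = per_step_load(4, [0, 0, 0], 4)
# Three clients with staggered counters call different servers at each step.
staggered = per_step_load(4, [0, 1, 2], 4)
```

In the aligned case every step places three simultaneous requests on a single server while the other servers remain idle, which is the sub-optimal distribution the text describes.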
- servers may be assigned to individual clients on a priority basis.
- a client may be assigned a number of servers according to a prioritized list of servers where the client sends a request to the server with the highest priority first, and next re-sends the request to the server with the next highest priority, if needed, and so on.
- a computer-implemented method for processing data requests includes obtaining a data request for a document and/or service available from a server, such as a Web server.
- a first request queue threshold, associated with a first server, and a second request queue threshold, associated with a second server, are compared to determine whether to process the data request. Based on the comparison of the two thresholds, the first request queue threshold is increased and the data request is processed.
- a system for load balancing including a first server, coupled with a network, for obtaining and processing a data request.
- a first data store coupled with the first server is provided for storing information associated with the data request.
- the system includes a second server, coupled with the network, for obtaining and processing the data request.
- a second data store coupled with the second server is also provided for storing information associated with the data request.
- the system also includes a first request queue associated with the first server, having a first threshold and a second request queue associated with the second server, having a second threshold.
- the first server increases the first threshold and processes the data request based on a comparison of the first threshold and the second threshold.
- a system for load balancing including a client component operating within a client computing device for transmitting a data request to a server.
- the system includes a first server, coupled with a network, for obtaining and processing a data request.
- a first data store coupled with the first server is provided for storing information associated with the data request.
- the system includes a second server, coupled with the network, for obtaining and processing the data request.
- a second data store coupled with the second server is also provided for storing information associated with the data request.
- the system also includes a first request queue associated with the first server, having a first threshold and a second request queue associated with the second server, having a second threshold.
- the first server increases the first threshold and processes the data request based on a comparison of the first threshold and the second threshold.
- a computer-implemented method for processing data requests including transmitting a first data request to a first server computing device.
- the first data request is rejected if a first request queue threshold associated with the first server computing device is exceeded.
- a second data request is formed by adding the first request queue threshold to the first data request, and transmitting the second data request to a second server computing device.
- the second data request is also rejected if a second request queue threshold associated with the second server computing device is greater than the first request queue threshold included in the second data request.
- a third data request is then formed, by adding the second request queue threshold to the first data request, and transmitted to the first server computing device.
- FIG. 1 is a block diagram depicting an illustrative client-server operating environment suitable for distributed load balancing, including a number of client devices and a number of server devices having a client request queue;
- FIG. 2A is a block diagram of the client-server operating environment of FIG. 1 illustrating the initiation of a request from a client device to a first server computing device;
- FIG. 2B is a block diagram of the client-server operating environment of FIG. 1 illustrating a first alternative in which the first server computing device accepts the request from the client computing device for processing;
- FIG. 2C is a block diagram of the client-server operating environment of FIG. 1 illustrating a second alternative in which the first server computing device rejects the request from the client computing device and the client computing device sends the request to a second server for processing;
- FIG. 2D is a block diagram of the client-server operating environment of FIG. 1 illustrating a first alternative in which the second server computing device rejects the request from the client computing device and the client computing device resends the request to the first server computing device;
- FIG. 2E is a block diagram of the client-server operating environment of FIG. 1 illustrating a second alternative in which the second server computing device accepts the request from the client computing device and adjusts its queue threshold;
- FIG. 3 is a flow diagram depicting an illustrative method for accepting requests and adjusting server computing device queue thresholds.
- the invention relates to load balancing in a client-server computing environment. Specifically, the invention relates to the balancing of server load using distributed routing of client requests.
- a client device initially transmits a data request to a selected first server device using any one of a variety of methods for selecting the server.
- the first server device processes the request and may reject the data request if its request queue threshold is exceeded. On rejection, the first server device includes its request queue threshold in a rejection message to the client device.
- the client device retransmits the data request, including the request queue threshold, to a second server device, selected in a similar manner.
- the second server device may reject the data request if the request queue threshold of the first server device is smaller than a request queue threshold of the second server device.
- the second server device includes its request queue threshold.
- the client device transmits the data request back to the first server device, including the request queue threshold of the second server device.
- the first server device processes the data request and adjusts its request queue threshold based on the request queue thresholds of the first and the second server devices.
- FIG. 1 is a block diagram depicting a sample client-server operating environment 100 suitable for distributed load balancing.
- Client devices 102 are coupled to server devices 106 via a network 104.
- the network 104 is the Internet and the client devices 102 communicate with the server devices 106 via Web protocols such as HTTP (Hypertext Transfer Protocol).
- the servers 106 may be Web servers arranged in a server farm accessible through the same URI (Uniform Resource Identifier).
- the client devices 102 generally search for documents using a query statement and the server devices 106 find documents that match the query and return Web pages to the client devices 102 , which are displayed in a Web browser on the client device 102 .
- the network 104 may be a LAN.
- the servers 106 may offer a number of services to the client devices 102 , such as FTP (File Transfer Protocol), database access services, file access services, application services, etc.
- the client request may be for data, such as Web pages, to be returned to the client 102 by the server 106 .
- the client request may indicate a request to perform some process or task at the server 106 , such as a registration or a data update at the server 106 , without returning any data. In all cases, however, the server 106 processes the request from the client 102 .
- Client devices may include, but are not limited to, a personal computer, a personal digital assistant (PDA), a mobile phone device, etc.
- the client device 102 may include an independent component for interacting with the queue processing application 112 for routing requests.
- the client device 102 may include a software component, integrated with another software component running on the client device 102 , for interacting with the queue processing application 112 for routing requests.
- the client device 102 may include a plug-in component integrated with a Web browser running on the client device 102 . Such plug-in component may be specifically used for interactions with the queue processing application 112 .
- the server device 106 may include a server queue 108 having a queue threshold 110 and a queue processing application 112 .
- the queue processing application 112 may be an independent software component and determines whether accepting the request causes the length of the server queue 108 to exceed the queue threshold 110 .
- the queue processing application 112 may be an integral part of another application, such as a search engine, running on the server device 106 .
- the documents queried may be stored in a data store 114 .
- Data store 114 may be local, such as a disk drive or disk farm, or may be remote such as a remote database.
- the client-server environment comprises client computing devices and server computing devices coupled together through the Internet.
- the client-server environment comprises client computing devices and server computing devices coupled together through a local area network (LAN) such as Ethernet.
- the clients and servers may be virtual applications running on the same physical machine.
- FIGS. 2A-2E illustrate distributed request routing in the client-server operating environment of FIG. 1 .
- the server queue 108 comprises fixed-size cells for storing fixed-size requests.
- the server queue 108 comprises an array of pointers to requests.
- the server queue 108 comprises an indexed table.
- the server queue 108 comprises a linked-list data structure.
- the queue processing application 112 takes the requests from the head of the server queue 108 and processes each request, possibly using data stored in data store 114 .
- the server device 106 may return some data to the client device 102 .
- the data returned to the client 102 may be an acknowledgement that the request has been processed.
- the requests from the clients 102 may be routed to any one of the multiple servers 106 , which offer the same services based on the same data and information stored in corresponding data stores 114 .
- the data stores 114 coupled with the corresponding server device 106 include the same data synchronized periodically to stay consistent.
- the multiple servers 106 may be coupled to the same data store 114 .
- the data store 114 comprises a search index suitable for use by search engines.
- the data store 114 may be a local or remote database.
- the data store 114 may be a file server or an application server.
- the routing of client requests to the servers is based on a distributed request routing algorithm with adaptive thresholding for dynamically adjusting the server queue thresholds 110 for each server device 106 .
- in a distributed request routing algorithm, no central control system exists for request routing. Rather, the request routing algorithm is implemented using all clients and servers in a cooperative and distributed manner, as more fully described below.
- the queue threshold 110 is used by each respective server device 106 to determine whether to accept a request sent by a client 102 for service.
- the queue processing application 112 compares the current queue load with the queue threshold 110. If accepting the request would cause the queue threshold 110 to be exceeded, the request is rejected; otherwise, the request is accepted for further processing by the server device 106.
- the client device 102 selects a first server device 106 using any of a variety of methods for selecting the server. As described above, the selection methods can include random selection, probabilistic selection, weighted probabilistic server selection, server assignment, and the like.
- a probability of selection is calculated for each server device 106 based on reported server loads/resources.
- the probability is inversely proportional to the server load. So, the server assigned the highest probability is the server with the lightest load.
- the server load is characterized by a length of the server queue 108 . The longer the length of the server queue 108 , the more load the server device 106 has. In this embodiment, the server with the shortest queue length is selected.
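By way of illustration only, the load-based selection described above can be sketched as follows, with server load represented by queue length. The function names and the weighting formula 1 / (queue length + 1) are assumptions chosen for this sketch; the specification states only that the probability is inversely proportional to the server load.

```python
import random


def selection_weights(queue_lengths):
    """Normalized selection probabilities that fall as queue length grows.

    Assumption: weight = 1 / (queue length + 1). The +1 avoids division by
    zero for an empty queue and is not taken from the specification.
    """
    raw = [1.0 / (length + 1) for length in queue_lengths]
    total = sum(raw)
    return [weight / total for weight in raw]


def pick_server(queue_lengths, rng=random):
    """Weighted probabilistic selection: lighter servers are chosen more often."""
    weights = selection_weights(queue_lengths)
    return rng.choices(range(len(queue_lengths)), weights=weights, k=1)[0]


def shortest_queue(queue_lengths):
    """Deterministic variant: index of the server with the shortest queue."""
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)
```

Either variant depends on the client holding reasonably current queue lengths, which is the information distribution problem the text attributes to load-aware selection.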
- server 106 may be selected randomly, for example, using a pseudo-random number generator. In yet another embodiment, the server 106 may be selected according to a pre-assigned order.
- a prioritized list of servers may be used by each client from which to select the next server for transmission of data requests.
- the server 106 may be selected according to a round-robin scheme. For purposes of illustration, in one embodiment, if the first server device 106 receives the request and the first queue threshold 110 is exceeded, the server device rejects the request and returns a rejection message to the client device 102 . The rejection message includes the first queue threshold 110 .
- the server 106 can accept the request from client 102 .
- the acceptance of the request from the client 102 is based on server load as represented by a length of the server queue 108 . If the threshold is not exceeded, then the request is accepted by the server 106 for further processing.
- the client device 102 may be notified of the acceptance of the request.
- Acceptance of the request by the server 106 includes placing the request at the back of the request queue 108 for later processing by the server 106 .
- the client device 102 is notified of the rejection via a rejection message through the network 104 .
- the first queue threshold 110 of the server queue 108 of the first server 106 is included in the rejection message.
- the rejection message is the request originally sent by the client device 102 with the first queue threshold 110 appended to the request.
- the rejection message may be a simple message including only the first queue threshold 110 and the server ID of the first server device 106. Those skilled in the art will appreciate that other configurations of rejection messages may be used.
- the client device 102 selects a second server to which the request is to be sent.
- the client device 102 includes the first queue threshold 110 in the request sent to the second server device 116 .
- the first queue threshold 110 may be included in a URI as a parameter for the second server 116 .
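By way of illustration only, carrying a queue threshold as a URI parameter can be sketched with the Python standard library as follows. The parameter name alt_threshold, the helper names, and the example URI are hypothetical; the specification does not name the parameter.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter name; the specification does not define one.
PARAM = "alt_threshold"


def with_threshold(uri, threshold):
    """Return `uri` with the rejecting server's queue threshold as a query parameter."""
    parts = urlsplit(uri)
    # Drop any threshold left over from an earlier rejection, keep other parameters.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != PARAM
    ]
    query.append((PARAM, str(threshold)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_threshold(uri):
    """Return the threshold carried in `uri`, or None if the request is a first attempt."""
    values = parse_qs(urlsplit(uri).query).get(PARAM)
    return int(values[0]) if values else None
```

A client could call with_threshold on the original request URI before resending, and the receiving server could call read_threshold to obtain the value for its comparison.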
- the second server 116 receives the request including the first queue threshold 110 .
- the second server device 116 treats the request the same way as did the first server device 106 , namely, the queue processing application 122 determines whether accepting the request causes the length of the server queue 118 to exceed a second queue threshold 120 .
- the second server 116 may determine that accepting the request will cause the server queue 118 to exceed the corresponding threshold 120 . If such determination is made, then the second server 116 compares the first queue threshold 110 with the second queue threshold 120 . If the first queue threshold 110 is less than the second queue threshold 120 , the second server 116 also rejects the request, as illustrated in FIG. 2D . The second server 116 rejects the request and sends a rejection message to the client device 102 , via the network 104 , indicating the rejection. The rejection message includes the queue threshold 120 of the second server 116 . The client device receives the rejection message and resends the request to the first server device 106 .
- the request resent to the first server device 106 includes the queue threshold 120 of the second server device 116 .
- the first server device 106 receives the resent request and processes the request in a manner similar to the second server device 116 , as described with respect to FIG. 2C .
- the queue processing application 112 compares the first queue threshold 110 of the first server device 106 to the second queue threshold 120 included in the resent request.
- the second queue threshold 120 is necessarily greater than the first queue threshold 110 because the same comparison with the same two queue thresholds was done at the second server device 116 resulting in the rejection and resending of the request to the first server device 106 .
- the queue processing application 112 adjusts the first queue threshold 110 to be equal to the second queue threshold 120 , equalizing the queue thresholds in the first and the second server devices 106 and 116 , respectively.
- the request sent to the second server 116 includes the first queue threshold 110 . If the first queue threshold 110 is greater than the second queue threshold 120 , the second server 116 accepts the request and adjusts the second queue threshold 120 to the same value as the first queue threshold 110 , equalizing the two queue thresholds. This way, the queue thresholds are equalized dynamically. Queue threshold equalization provides uniform load distribution across servers 106 and 116 by synchronizing queue thresholds.
- FIG. 2E is a block diagram of the operating environment of FIG. 1 illustrating the second server computing device accepting the request from the client computing device and adjusting its queue threshold. The second server device 116 accepts the request if the first queue threshold 110 is greater than the second queue threshold 120 , and adjusts the second queue threshold 120 , as discussed above.
- the second server device 116 processes the request using the data stored in the data store 124 .
- the second server device 116 may also accept the request if the first queue threshold 110 is equal to the second queue threshold 120 .
- the second queue threshold is increased by a constant amount to enable the second queue 118 to accept the request.
- the client device 102 is notified of the acceptance of the request by the second server device 116 and continues its interactions, such as receiving results of its request, with the second server device 116 .
- the distributed request routing algorithm described above can be partly implemented at the server device 106 , and partly at the client device 102 .
- the processing of the client requests and adjustment of server queue thresholds 110 is done at the server devices 106 using the threshold data exchanged between the first and the second server devices through the client device 102 via rejection messages and requests.
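By way of illustration only, the exchange of FIGS. 2A-2E can be sketched as a small in-process simulation in which the client relays thresholds between two servers. The class and function names, the in-memory queues, and the increment of one for equal thresholds are assumptions made for this sketch; a real deployment would exchange these values over the network 104.

```python
from collections import deque


class Server:
    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold
        self.queue = deque()

    def receive(self, request, other_threshold=None):
        """Return (accepted, threshold); the threshold is reported on rejection."""
        if len(self.queue) + 1 > self.threshold:
            # Reject a first attempt, or a rerouted request whose carried
            # threshold is smaller than this server's own threshold.
            if other_threshold is None or other_threshold < self.threshold:
                return False, self.threshold
            if other_threshold == self.threshold:
                self.threshold += 1  # assumed constant increment
            else:
                self.threshold = other_threshold  # equalize thresholds
        self.queue.append(request)  # accepted: place at the back of the queue
        return True, self.threshold


def send(request, first, second):
    """Client side: try the first server, then the second, then the first again."""
    accepted, t1 = first.receive(request)
    if accepted:
        return first.name
    accepted, t2 = second.receive(request, other_threshold=t1)
    if accepted:
        return second.name
    accepted, _ = first.receive(request, other_threshold=t2)
    return first.name if accepted else None
```

In this sketch no server contacts another directly; each learns the other's threshold only through the value the client attaches to a rerouted request.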
- FIG. 3 is a flow diagram depicting an illustrative method for accepting requests and adjusting server computing device queue thresholds.
- the request processing routine starts at block 300 and proceeds to block 305 where the server device 106 obtains a request sent from the client device 102 .
- the queue processing application 112 determines whether adding the request to the server queue 108 causes the pending queue threshold 110 to be exceeded. If it is determined that the pending queue threshold 110 will not be exceeded, the routine proceeds to block 340 where the request is processed by the server device 106 .
- the routine proceeds to decision block 315 where it is determined whether the request is an alternate request rejected by an alternate server device 116 and rerouted by the client device 102 to another server device 106.
- the alternate queue threshold 120 is included in the rerouted request for access by another server device 106 .
- the alternate queue threshold 120 is appended to a URI as a parameter which can be retrieved and used by the queue processing application 112 for queue threshold comparisons.
- the queue threshold 120 may be included in a header field of the request sent by the client device 102. If the request is an alternate request, the routine proceeds to decision block 320 where it is determined whether the alternate queue threshold 120 is greater than or equal to the pending queue threshold 110.
- the pending queue threshold 110 is set to equal the alternate queue threshold 120 . If the alternate queue threshold 120 is equal to the pending queue threshold 110 , the pending queue threshold 110 is increased by a constant amount to enable it to accept the request and the request is placed at the back of the pending queue 108 . At block 335 , the request is processed and the routine terminates at block 360 . Back at decision block 320 , if it is determined that the alternate queue threshold 120 is less than the pending queue threshold 110 , the request is rejected at block 325 .
- the rejection of the request is communicated to the client device 102 by sending a rejection message to the client device 102 including the pending queue threshold 110 .
- the client device 102 will reroute the request, including the pending queue threshold 110 , to another server device selected randomly.
- back at decision block 315, if it is determined that the request is not from an alternate server device 116, the request is rejected at block 325 because accepting the request would cause the pending queue threshold 110 to be exceeded.
- the routine terminates at block 360 .
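By way of illustration only, the accept, adjust, and reject decisions of the FIG. 3 routine can be sketched as a single pure function. The function name, the returned action strings, and the default increment of one are assumptions made for this sketch; block numbers in the comments refer to those named in the text.

```python
def handle_request(queue_len, pending_threshold, alternate_threshold=None, increment=1):
    """Return (action, new_pending_threshold) for one incoming request.

    `alternate_threshold` is None for a first attempt and carries the rejecting
    server's threshold for a rerouted (alternate) request.
    """
    # Adding the request does not exceed the pending threshold: process it.
    if queue_len + 1 <= pending_threshold:
        return "process", pending_threshold
    # Decision block 315: not an alternate request, so reject (block 325).
    if alternate_threshold is None:
        return "reject", pending_threshold
    # Decision block 320: compare the alternate and pending thresholds.
    if alternate_threshold > pending_threshold:
        return "process", alternate_threshold  # set pending equal to alternate
    if alternate_threshold == pending_threshold:
        return "process", pending_threshold + increment  # assumed constant amount
    return "reject", pending_threshold  # alternate is smaller (block 325)
```

A rejection returns the unchanged pending threshold, which is the value the server would include in its rejection message to the client.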
- the routine 300 proceeds to decision block 345 where it is determined whether to decrease the pending queue threshold 110 .
- the determination to decrease the pending queue threshold 110 is based on the length of the pending queue 108 .
- the queue processing application 112 continuously polls the length of the pending queue 108 to determine whether the length is less than a predetermined fraction of the pending queue threshold 110 . If so, then the pending queue threshold 110 is decreased. In one illustrative embodiment, the pending queue threshold 110 is reduced by a fixed amount.
- the pending queue threshold 110 is reduced by an amount which is a percentage of the current value, such as ten percent.
- the queue processing application 112 may be notified, via a system message generated by the server device 106, that an event associated with the pending queue length has taken place. The event may be specified based on the queue length being less than the pending queue threshold 110 for a predetermined length of time or a predetermined number of requests. If it is determined that the pending queue threshold 110 should be decreased, the routine proceeds to block 350, where the pending queue threshold 110 is decreased by an appropriate amount, as discussed above, and the routine terminates at block 360. If it is determined that the pending queue threshold 110 should not be decreased, the routine 300 terminates at block 360.
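By way of illustration only, the threshold decrease of blocks 345 and 350 can be sketched as follows. The function name, the default fraction of one half, and the floor of one are assumptions made for this sketch; the fixed-amount and ten-percent reductions follow the alternatives described above.

```python
def maybe_decrease(threshold, queue_len, fraction=0.5, step=None, percent=10, floor=1):
    """Return the possibly reduced pending queue threshold.

    The threshold is reduced only when the queue length is below
    `fraction` of the threshold (assumed default: one half).
    """
    if queue_len >= threshold * fraction:
        return threshold  # queue is not short enough: leave threshold unchanged
    if step is not None:
        cut = step  # fixed-amount reduction
    else:
        cut = max(1, threshold * percent // 100)  # percentage of the current value
    # Assumed floor so the server can always accept at least one request.
    return max(floor, threshold - cut)
```

Calling this function periodically, or in response to a queue-length event, would let a threshold that was raised during a burst drift back down once the queue drains.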
- Test and simulation results indicate that the embodiments of the present invention improve request handling performance, in a client-server computing environment, at the 99.9th percentile for different loads. This means that request handling performance is improved for almost all requests under various load conditions. Such performance improvements are very close to those achieved by hardware-based, central load distribution methods without the drawbacks of such methods discussed above. More specifically, the request handling performance is improved by lowering latency and queue thresholds.
Abstract
Methods and systems for processing data requests in a client-server computing environment, such as the Web, are disclosed. A client device initially transmits a data request to a randomly selected first server device. The first server device may reject the data request if its request queue threshold is exceeded. The client device retransmits the data request, including the request queue threshold, to a randomly selected second server device. The second server device may reject the data request if the request queue threshold of the first server device is smaller than a request queue threshold of the second server device. The client device transmits the data request back to the first server device, including the request queue threshold of the second server device. The first server device processes the data request and adjusts its request queue threshold based on the request queue thresholds of the first and second server devices.
Description
- Any and all applications for which a foreign or domestic priority claim is made are identified in the Application Data Sheet filed with the present application and are incorporated by reference under 37 CFR 1.57 and made a part of this specification.
- The ubiquity of computers in business, government, and private homes has resulted in availability of massive amounts of information from network-connected sources, such as data stores accessible through communication networks, such as the Internet. In recent years, computer communication and search tools have become widely available to facilitate the location and availability of information to users. Most computer communication and search tools implement a client-server architecture where a user client computer communicates with a remote server computer over a communication network. In order to achieve better system performance and throughput in the client-server architecture, large communication network bandwidths are needed as the number of client computers communicating with server computers increases.
- One approach to increasing communication bandwidths relates to employing multiple networked server computers offering the same services. These server computers may be arranged in server farms, in which a single server from the server farm receives and processes a particular request from a client computer. Typically, server farms implement some type of load balancing algorithm to distribute requests from client computers among the multiple servers. Generally described, in a typical client-server computing environment, client devices generally issue requests to server devices for some kind of service and/or processing and the server devices process those requests and return suitable results to the client devices. In an environment where multiple clients send requests to multiple servers, workload distribution among the servers significantly affects the quality of service that the client devices receive from the servers. In many modern client-server environments, client devices number in the hundreds of thousands or millions, while the servers number in the hundreds or thousands. In such environments server load balancing becomes particularly important to system performance.
- One approach to increasing the effectiveness of load balancing, and the resulting system performance and throughput, is to efficiently find the servers that have lower load levels than other servers and assign new client requests to these servers. Identifying overloaded and under-utilized servers and distributing workload accordingly may be done in a central or a distributed manner. Central control of load balancing requires a dedicated controller, such as a master server, to keep track of all servers and their respective loads at all times, incurring certain administrative costs associated with keeping lists of servers and connections up-to-date. Additionally, such a master server constitutes a single point of failure in the system, requiring multiple mirrored master servers for more reliable operation. Still further, the reliability and scalability of the number of servers in the server farm can be dependent on the ability and efficiency of the dedicated controller to handle the increased number of servers.
- Other approaches to finding and distributing workloads in a multi-server environment exist that relate to distributed, software-based approaches in which the client computers implement some type of load balancing software components. In one such approach, the client computer randomly selects a server. For example, a pseudo-random number generator may be utilized to select one of N servers. However, random selection of servers does not take the actual server loads into consideration and, thus, cannot avoid occasionally overloading a particular server. Random server selection algorithms improve the average performance for request handling. This means such algorithms improve request handling for about 50% of the requests, but not for the majority of the requests. In another approach, the client computing device can implement a weighted probability selection algorithm in which the selection of a server is based, at least in part, on the reported load/resources of each server. This approach must contend with the problem of information distribution among client devices. That is, server load information must be updated periodically at each client device to make optimal server selection based on server loads. The server load may be indicated by a length of a request queue at each server, a request processing latency, or other similar indicators. In yet another approach, a round-robin algorithm for server assignment may be used where each request is sent to a next server according to a number indicated by a counter maintained at the client device. Although simple to implement, this approach does not distribute the load optimally among servers because the round-robin cycles in different clients could coincide, causing multiple clients to call the same server at the same time. In yet another approach, servers may be assigned to individual clients on a priority basis. For example, a client may be assigned a number of servers according to a prioritized list of servers where the client sends a request to the server with the highest priority first, and next re-sends the request to the server with the next highest priority, if needed, and so on. As noted above, each of these approaches for server load distribution suffers from a particular problem that makes server selection and load distribution sub-optimal, causing low levels of performance.
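- The client-side selection approaches surveyed above can be sketched as follows. This is only a minimal illustration and not part of the disclosure: the function names and the `loads` list (one reported queue length per server) are hypothetical, and the weight 1 / (1 + load) is an assumed way of making the selection probability fall as the reported load rises.

```python
import itertools
import random

def pick_random(n_servers, rng=random):
    """Random selection: ignores server load entirely."""
    return rng.randrange(n_servers)

def make_round_robin(n_servers, start=0):
    """Round-robin: a per-client counter cycles through the servers."""
    counter = itertools.count(start)
    return lambda: next(counter) % n_servers

def pick_weighted(loads, rng=random):
    """Weighted probabilistic selection from reported loads.

    The weight 1 / (1 + load) is an assumption chosen so that lightly
    loaded servers are more likely to be picked.
    """
    weights = [1.0 / (1 + load) for load in loads]
    return rng.choices(range(len(loads)), weights=weights, k=1)[0]

def pick_by_priority(priority_list, already_tried):
    """Priority assignment: first server in the list not yet tried."""
    for server in priority_list:
        if server not in already_tried:
            return server
    return None
```

Two clients that both call `make_round_robin(3)` with the same `start` produce the same server sequence, which illustrates the coinciding-cycle problem noted above.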
- This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
- In one aspect of the invention a computer-implemented method for processing data requests is provided that includes obtaining a data request for a document and/or service available from a server, such as a Web server. A first request queue threshold, associated with a first server, and a second request queue threshold, associated with a second server, are compared to determine whether to process the data request. Based on the comparison of the two thresholds, the first request queue threshold is increased and the data request is processed.
- According to another aspect of the invention, a system for load balancing is provided including a first server, coupled with a network, for obtaining and processing a data request. A first data store coupled with the first server is provided for storing information associated with the data request. The system includes a second server, coupled with the network, for obtaining and processing the data request. A second data store coupled with the second server is also provided for storing information associated with the data request. The system also includes a first request queue associated with the first server, having a first threshold and a second request queue associated with the second server, having a second threshold. The first server increases the first threshold and processes the data request based on a comparison of the first threshold and the second threshold.
- According to yet another aspect of the invention, a system for load balancing is provided including a client component operating within a client computing device for transmitting a data request to a server. The system includes a first server, coupled with a network, for obtaining and processing a data request. A first data store coupled with the first server is provided for storing information associated with the data request. The system includes a second server, coupled with the network, for obtaining and processing the data request. A second data store coupled with the second server is also provided for storing information associated with the data request. The system also includes a first request queue associated with the first server, having a first threshold and a second request queue associated with the second server, having a second threshold. The first server increases the first threshold and processes the data request based on a comparison of the first threshold and the second threshold.
- According to yet another aspect of the invention, a computer-implemented method for processing data requests including transmitting a first data request to a first server computing device is provided. The first data request is rejected if a first request queue threshold associated with the first server computing device is exceeded. A second data request is formed by adding the first request queue threshold to the first data request, and transmitting the second data request to a second server computing device. The second data request is also rejected if a second request queue threshold associated with the second server computing device is greater than the first request queue threshold included in the second data request. A third data request is then formed, by adding the second request queue threshold to the first data request, and transmitted to the first server computing device.
- Other aspects and advantages of the present invention will become apparent from the detailed description that follows including the use of adaptive thresholds for balancing server loads.
- The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
- FIG. 1 is a block diagram depicting an illustrative client-server operating environment suitable for distributed load balancing, including a number of client devices and a number of server devices having a client request queue;
- FIG. 2A is a block diagram of the client-server operating environment of FIG. 1 illustrating the initiation of a request from a client device to a first server computing device;
- FIG. 2B is a block diagram of the client-server operating environment of FIG. 1 illustrating a first alternative of a first server computing device accepting the request from the client computing device for processing;
- FIG. 2C is a block diagram of the client-server operating environment of FIG. 1 illustrating a second alternative of the first server computing device rejecting the request from the client computing device and the client computing device sending the request to a second server for processing;
- FIG. 2D is a block diagram of the client-server operating environment of FIG. 1 illustrating a first alternative of the second server computing device rejecting the request from the client computing device and the client computing device resending the request to the first server computing device;
- FIG. 2E is a block diagram of the client-server operating environment of FIG. 1 illustrating a second alternative of the second server computing device accepting the request from the client computing device and adjusting its queue threshold; and
- FIG. 3 is a flow diagram depicting an illustrative method for accepting requests and adjusting server computing device queue thresholds.
- Generally described, the invention relates to load balancing in a client-server computing environment. Specifically, the invention relates to the balancing of server load using distributed routing of client requests. In accordance with an embodiment of the invention, a client device initially transmits a data request to a selected first server device using any one of a variety of methods for selecting the server. The first server device processes the request and may reject the data request if its request queue threshold is exceeded. On rejection, the first server device includes its request queue threshold in a rejection message to the client device. The client device retransmits the data request, including the request queue threshold, to a second server device, selected in a similar manner. The second server device may reject the data request if the request queue threshold of the first server device is smaller than a request queue threshold of the second server device. In a second rejection message to the client device, the second server device includes its request queue threshold. The client device transmits the data request back to the first server device, including the request queue threshold of the second server device. The first server device processes the data request and adjusts its request queue threshold based on the request queue thresholds of the first and the second server devices.
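- The client's part in the exchange outlined above can be sketched as follows. This is a hedged illustration only: `Reply`, `Send`, and `route` are hypothetical names, and the `send` callable stands in for whatever transport carries the request and the rejection message (for example, an HTTP call), which the description does not prescribe at this level.

```python
from typing import Callable, NamedTuple, Optional

class Reply(NamedTuple):
    accepted: bool
    threshold: int  # the replying server's request queue threshold

# send(server_id, request, carried_threshold) -> Reply
# A hypothetical transport supplied by the caller.
Send = Callable[[str, str, Optional[int]], Reply]

def route(request: str, first: str, second: str, send: Send) -> Optional[str]:
    """Return the id of the server that accepted the request, or None."""
    reply = send(first, request, None)               # initial attempt
    if reply.accepted:
        return first
    reply = send(second, request, reply.threshold)   # carry the first server's threshold
    if reply.accepted:
        return second
    reply = send(first, request, reply.threshold)    # carry the second server's threshold
    return first if reply.accepted else None
```

Note that the client never interprets the thresholds; it only relays them between servers, which keeps the comparison and adjustment logic on the server side.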
- The following detailed description describes illustrative embodiments of the invention. Although specific operating environments, system configurations, user interfaces, and flow diagrams may be illustrated and/or described, it should be understood that the examples provided are not exhaustive and do not limit the invention to the precise forms and embodiments disclosed. Persons skilled in the field of computer programming will recognize that the components and process elements described herein may be interchangeable with other components or elements or combinations of components or elements and still achieve the benefits and advantages of the invention. Although the present description may refer to the Internet, persons skilled in the art will recognize that other network environments that include local area networks, wide area networks, and/or wired or wireless networks, as well as standalone computing environments, such as personal computers, may also be suitable. In addition, although the below description describes a client-server architecture, those skilled in the art will recognize that the invention may be implemented in a peer-to-peer network as well.
- Prior to discussing the details of the invention, it will be appreciated by those skilled in the art that the following description is presented largely in terms of logic operations that may be performed by conventional computer components. These computer components, which may be grouped in a single location or distributed over a wide area, generally include computer processors, memory storage devices, display devices, input devices, etc. In circumstances where the computer components are distributed, the computer components are accessible to each other via communication links.
- In the following descriptions, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be apparent to one skilled in the art that the invention may be practiced without some or all of these specific details. In other instances, well-known process elements have not been described in detail in order not to unnecessarily obscure the invention.
- FIG. 1 is a block diagram depicting a sample client-server operating environment 100 suitable for distributed load balancing. Client devices 102 are coupled to server devices 106 via a network 104. In one illustrative embodiment, the network 104 is the Internet and the client devices 102 communicate with the server devices 106 via Web protocols such as HTTP (Hypertext Transfer Protocol). In this embodiment, the servers 106 may be Web servers arranged in a server farm accessible through the same URI (Uniform Resource Identifier). In a Web environment, the client devices 102 generally search for documents using a query statement and the server devices 106 find documents that match the query and return Web pages to the client devices 102, which are displayed in a Web browser on the client device 102. In another illustrative embodiment, for example, in a corporate environment, the network 104 may be a LAN. The servers 106 may offer a number of services to the client devices 102, such as FTP (File Transfer Protocol), database access services, file access services, application services, etc. In one embodiment, the client request may be for data, such as Web pages, to be returned to the client 102 by the server 106. In another embodiment, the client request may indicate a request to perform some process or task at the server 106, such as a registration or a data update at the server 106, without returning any data. In all cases, however, the server 106 processes the request from the client 102. Client devices may include, but are not limited to, a personal computer, a personal digital assistant (PDA), a mobile phone device, etc. In one illustrative embodiment, the client device 102 may include an independent component for interacting with the queue processing application 112 for routing requests. In another illustrative embodiment, the client device 102 may include a software component, integrated with another software component running on the client device 102, for interacting with the queue processing application 112 for routing requests. For example, the client device 102 may include a plug-in component integrated with a Web browser running on the client device 102. Such a plug-in component may be specifically used for interactions with the queue processing application 112.
- With continued reference to FIG. 1, the server device 106 may include a server queue 108 having a queue threshold 110 and a queue processing application 112. In one illustrative embodiment, the queue processing application 112 may be an independent software component that determines whether accepting the request causes the length of the server queue 108 to exceed the queue threshold 110. In another illustrative embodiment, the queue processing application 112 may be an integral part of another application, such as a search engine, running on the server device 106. In one embodiment, the documents queried may be stored in a data store 114. The data store 114 may be local, such as a disk drive or disk farm, or may be remote, such as a remote database.
- In an illustrative embodiment, the client-server environment comprises client computing devices and server computing devices coupled together through the Internet. In another illustrative embodiment, the client-server environment comprises client computing devices and server computing devices coupled together through a local area network (LAN) such as Ethernet. In yet another illustrative embodiment, the clients and servers may be virtual applications running on the same physical machine. Those skilled in the art will appreciate that the client and server components may take other forms comprising any combination of hardware and software without departing from the essence of a client-server architecture including requests from client components and processing of those requests by the server components.
- Although the above descriptions and the detailed descriptions that follow may refer to a single client and two servers, it will be appreciated by those skilled in the art that the present invention is not limited to a single client or two servers, but is equally applicable to any number of client and server machines/components. Additionally, even though the following descriptions may refer to the Web and Web-based protocols, those skilled in the art will appreciate that the techniques and systems described are equally applicable to other kinds of computing environments, such as LANs and multiple virtual servers embodied in a single machine. Still further, although the present invention will be described with regard to network-based client-server communications, the present invention may be applicable to network-based client-servers, virtual client-servers, or a combination thereof.
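- The server-side state described with reference to FIG. 1 (a server queue with a queue threshold, and a check on whether accepting one more request would exceed that threshold) can be sketched as follows. The class and method names are hypothetical, and a double-ended queue is only one assumed way to hold the pending requests.

```python
from collections import deque

class ServerQueue:
    """A request queue guarded by a queue threshold (names are illustrative)."""

    def __init__(self, threshold):
        self.threshold = threshold   # corresponds to the queue threshold
        self.requests = deque()      # corresponds to the server queue

    def would_exceed(self):
        """True if accepting one more request pushes the length past the threshold."""
        return len(self.requests) + 1 > self.threshold

    def enqueue(self, request):
        """Place an accepted request at the back of the queue."""
        self.requests.append(request)

    def next_request(self):
        """Take the request at the front of the queue, or None if it is empty."""
        return self.requests.popleft() if self.requests else None
```

With a threshold of 2, for instance, the first two requests are accepted and `would_exceed()` becomes true until one of them has been taken off the queue.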
- FIGS. 2A-2E illustrate distributed request routing in the client-server operating environment of FIG. 1. With reference to FIG. 2A, when a request arrives from the client device 102, the request is queued in the server queue 108 subject to the threshold 110, as more fully described below. In one illustrative embodiment, the server queue 108 comprises fixed-size cells for storing fixed-size requests. In another embodiment, the server queue 108 comprises an array of pointers to requests. In yet another embodiment, the server queue 108 comprises an indexed table. In yet another embodiment, the server queue 108 comprises a linked-list data structure. Those skilled in the art will appreciate that a queue may be implemented in many other ways while still maintaining the essential properties of a queue data structure. The queue processing application 112 takes the requests from the head of the server queue 108 and processes each request, possibly using data stored in the data store 114. Depending on the request, the server device 106 may return some data to the client device 102. In some cases the data returned to the client 102 may be an acknowledgement that the request has been processed. The requests from the clients 102 may be routed to any one of the multiple servers 106, which offer the same services based on the same data and information stored in corresponding data stores 114. In one illustrative embodiment, the data stores 114 coupled with the corresponding server devices 106 include the same data, synchronized periodically to stay consistent. In another illustrative embodiment, the multiple servers 106 may be coupled to the same data store 114. In one illustrative embodiment, the data store 114 comprises a search index suitable for use by search engines. In another illustrative embodiment, the data store 114 may be a local or remote database. In yet another illustrative embodiment, the data store 114 may be a file server or an application server. The routing of client requests to the servers is based on a distributed request routing algorithm with adaptive thresholding for dynamically adjusting the server queue thresholds 110 for each server device 106. As noted above, in a distributed request routing algorithm, no central control system exists for request routing. Rather, the request routing algorithm is implemented using all clients and servers in a cooperative and distributed manner, as more fully described below.
- The queue threshold 110 is used by each respective server device 106 to determine whether to accept a request sent by a client 102 for service. The queue processing application 112 compares the current queue load with the queue threshold 110. If accepting the request would cause the queue threshold 110 to be exceeded, then the request is rejected; otherwise, the request is accepted for further processing by the server device 106. In one illustrative embodiment, to route a request, the client device 102 selects a first server device 106 using any of a variety of methods for selecting the server. As described above, the selection methods can include random selection, probabilistic selection, weighted probabilistic server selection, server assignment, and the like. For example, in a weighted probabilistic server selection algorithm, a probability of selection is assigned to each server device 106, calculated from the reported server loads/resources. The probability is inversely proportional to the server load. So, the server assigned the highest probability is the server with the lightest load. In one embodiment, the server load is characterized by a length of the server queue 108. The longer the server queue 108, the more load the server device 106 has. In this embodiment, the server with the shortest queue length is selected. In another embodiment, the server 106 may be selected randomly, for example, using a pseudo-random number generator. In yet another embodiment, the server 106 may be selected according to a pre-assigned order. For example, a prioritized list of servers may be used by each client from which to select the next server for transmission of data requests. In yet another embodiment, the server 106 may be selected according to a round-robin scheme. For purposes of illustration, in one embodiment, if the first server device 106 receives the request and the first queue threshold 110 is exceeded, the server device rejects the request and returns a rejection message to the client device 102. The rejection message includes the first queue threshold 110.
- With reference now to FIG. 2B, in a first alternative, the server 106 can accept the request from the client 102. The acceptance of the request from the client 102 is based on server load as represented by a length of the server queue 108. If the threshold is not exceeded, then the request is accepted by the server 106 for further processing. The client device 102 may be notified of the acceptance of the request. Acceptance of the request by the server 106 includes placing the request at the back of the request queue 108 for later processing by the server 106.
- Referring to FIG. 2C, if the server 106 rejects the request, the client device 102 is notified of the rejection via a rejection message through the network 104. In one illustrative embodiment, the first queue threshold 110 of the server queue 108 of the first server 106 is included in the rejection message. In another illustrative embodiment, the rejection message is the request originally sent by the client device 102 with the first queue threshold 110 appended to the request. In another illustrative embodiment, the rejection message may be a simple message including only the first queue threshold 110 and the server ID of the first server device 106. Those skilled in the art will appreciate that other configurations of rejection messages may be used.
- The client device 102 selects a second server to which the request is to be sent. In one illustrative embodiment, the client device 102 includes the first queue threshold 110 in the request sent to the second server device 116. For example, the first queue threshold 110 may be included in a URI as a parameter for the second server 116. The second server 116 receives the request including the first queue threshold 110. The second server device 116 treats the request the same way as did the first server device 106, namely, the queue processing application 122 determines whether accepting the request causes the length of the server queue 118 to exceed a second queue threshold 120.
- As noted above, the second server 116 may determine that accepting the request will cause the server queue 118 to exceed the corresponding threshold 120. If such a determination is made, then the second server 116 compares the first queue threshold 110 with the second queue threshold 120. If the first queue threshold 110 is less than the second queue threshold 120, the second server 116 also rejects the request, as illustrated in FIG. 2D. The second server 116 then sends a rejection message to the client device 102, via the network 104, indicating the rejection. The rejection message includes the queue threshold 120 of the second server 116. The client device receives the rejection message and resends the request to the first server device 106. The request resent to the first server device 106 includes the queue threshold 120 of the second server device 116. The first server device 106 receives the resent request and processes the request in a manner similar to the second server device 116, as described with respect to FIG. 2C. At this point, the queue processing application 112 compares the first queue threshold 110 of the first server device 106 to the second queue threshold 120 included in the resent request. The second queue threshold 120 is necessarily greater than the first queue threshold 110 because the same comparison with the same two queue thresholds was done at the second server device 116, resulting in the rejection and resending of the request to the first server device 106. The queue processing application 112 adjusts the first queue threshold 110 to be equal to the second queue threshold 120, equalizing the queue thresholds in the first and the second server devices.
- As noted above, the request sent to the second server 116 includes the first queue threshold 110. If the first queue threshold 110 is greater than the second queue threshold 120, the second server 116 accepts the request and adjusts the second queue threshold 120 to the same value as the first queue threshold 110, equalizing the two queue thresholds. This way, the queue thresholds are equalized dynamically. Queue threshold equalization provides uniform load distribution across the servers. FIG. 2E is a block diagram of the operating environment of FIG. 1 illustrating the second server computing device accepting the request from the client computing device and adjusting its queue threshold. The second server device 116 accepts the request if the first queue threshold 110 is greater than the second queue threshold 120, and adjusts the second queue threshold 120, as discussed above. In one illustrative embodiment, the second server device 116 processes the request using the data stored in the data store 124. The second server device 116 may also accept the request if the first queue threshold 110 is equal to the second queue threshold 120. In this case, the second queue threshold is increased by a constant amount to enable the second queue 118 to accept the request. The client device 102 is notified of the acceptance of the request by the second server device 116 and continues its interactions, such as receiving results of its request, with the second server device 116.
- The distributed request routing algorithm described above can be partly implemented at the server device 106, and partly at the client device 102. The processing of the client requests and adjustment of server queue thresholds 110 is done at the server devices 106 using the threshold data exchanged between the first and the second server devices through the client device 102 via rejection messages and requests.
- FIG. 3 is a flow diagram depicting an illustrative method for accepting requests and adjusting server computing device queue thresholds. The request processing routine starts at block 300 and proceeds to block 305 where the server device 106 obtains a request sent from the client device 102. At decision block 310, the queue processing application 112 determines whether adding the request to the server queue 108 causes the pending queue threshold 110 to be exceeded. If it is determined that the pending queue threshold 110 will not be exceeded, the routine proceeds to block 340 where the request is processed by the server device 106. If it is determined that the pending queue threshold 110 will be exceeded by accepting the request, the routine proceeds to decision block 315 where it is determined whether the request is an alternate request rejected by an alternate server device 116 and rerouted by the client device 102 to another server device 106.
- As discussed above, the alternate queue threshold 120 is included in the rerouted request for access by another server device 106. With continued reference to FIG. 3, in one illustrative embodiment, the alternate queue threshold 120 is appended to a URI as a parameter which can be retrieved and used by the queue processing application 112 for queue threshold comparisons. In another illustrative embodiment, the queue threshold 120 may be included in a header field of the request sent by the client device 102. If the request is an alternate request, the routine proceeds to decision block 320 where it is determined whether the alternate queue threshold 120 is greater than or equal to the pending queue threshold 110. If so, at block 330, if the alternate queue threshold 120 is greater than the pending queue threshold 110, the pending queue threshold 110 is set to equal the alternate queue threshold 120. If the alternate queue threshold 120 is equal to the pending queue threshold 110, the pending queue threshold 110 is increased by a constant amount to enable it to accept the request, and the request is placed at the back of the pending queue 108. At block 335, the request is processed and the routine terminates at block 360. Back at decision block 320, if it is determined that the alternate queue threshold 120 is less than the pending queue threshold 110, the request is rejected at block 325. The rejection of the request is communicated to the client device 102 by sending a rejection message to the client device 102 including the pending queue threshold 110. As discussed above, after receiving the rejection message, the client device 102 will reroute the request, including the pending queue threshold 110, to another server device selected randomly. At decision block 315, if it is determined that the request is not from an alternate server device 116, the request is rejected at block 325 because accepting the request would cause the pending queue threshold 110 to be exceeded. The routine terminates at block 360.
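- The accept, reject, and adjust decisions of FIG. 3 described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the claimed implementation: `Server`, `handle`, and `BUMP` are hypothetical names, a plain list stands in for the pending queue, and the value 1 for the "constant amount" applied on a tie is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

BUMP = 1  # assumed value for the "constant amount" applied when thresholds tie

@dataclass
class Server:
    threshold: int                                   # pending queue threshold
    queue: List[str] = field(default_factory=list)   # pending queue

def handle(server: Server, request: str,
           alt_threshold: Optional[int] = None) -> Tuple[bool, int]:
    """Return (accepted, this server's threshold).

    alt_threshold is the threshold carried by a rerouted request, or None
    when the request comes straight from the client.
    """
    if len(server.queue) + 1 > server.threshold:      # decision block 310
        if alt_threshold is None:                     # decision block 315
            return False, server.threshold            # block 325: reject
        if alt_threshold < server.threshold:          # decision block 320
            return False, server.threshold            # block 325: reject
        if alt_threshold > server.threshold:          # block 330: equalize
            server.threshold = alt_threshold
        else:                                         # tie: raise by a constant
            server.threshold += BUMP
    server.queue.append(request)                      # back of the pending queue
    return True, server.threshold

# Worked example: both servers are full; thresholds are 2 and 3.
first = Server(threshold=2, queue=["a", "b"])
second = Server(threshold=3, queue=["c", "d", "e"])
print(handle(first, "r"))       # (False, 2): rejected, reports threshold 2
print(handle(second, "r", 2))   # (False, 3): 2 < 3, rejected, reports threshold 3
print(handle(first, "r", 3))    # (True, 3): threshold raised from 2 to 3, accepted
```

After the third call both thresholds equal 3, which is the equalization effect the description attributes to the exchange.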
- Returning to block 340, if the pending queue threshold has not been exceeded at decision block 310, the routine 300 proceeds to decision block 345 where it is determined whether to decrease the pending queue threshold 110. As discussed above, smaller queue lengths result in less request processing delay and increased overall system performance. Decreasing the queue threshold decreases the average queue length. The determination to decrease the pending queue threshold 110 is based on the length of the pending queue 108. In one illustrative embodiment, the queue processing application 112 continuously polls the length of the pending queue 108 to determine whether the length is less than a predetermined fraction of the pending queue threshold 110. If so, then the pending queue threshold 110 is decreased. In one illustrative embodiment, the pending queue threshold 110 is reduced by a fixed amount. In another illustrative embodiment, the pending queue threshold 110 is reduced by an amount which is a percentage of the current value, such as ten percent. In yet another illustrative embodiment, the queue processing application 112 may be notified, via a system message generated by the server device 106, that an event associated with the pending queue length has taken place. The event may be specified based on the queue length being less than the pending queue threshold 110 for a predetermined length of time or a predetermined number of requests. If it is determined that the pending queue threshold 110 should be decreased, the routine proceeds to block 350 where the pending queue threshold 110 is decreased by an appropriate amount, as discussed above, and the routine terminates at block 360. If it is determined that the pending queue threshold 110 should not be decreased, the routine proceeds to block 360, where it terminates.
- Test and simulation results indicate that the embodiments of the present invention improve request handling performance, in a client-server computing environment, at the 99.9th percentile for different loads. This means that request handling performance is improved for almost all requests under various load conditions. Such performance improvements are very close to those achieved by hardware-based, central load distribution methods without the drawbacks of such methods discussed above. More specifically, the request handling performance is improved by lowering latency and queue thresholds.
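- The threshold-decrease check at decision block 345 and block 350 of FIG. 3 can be sketched as follows. The function name and all default values are assumptions for illustration: the description only speaks of "a predetermined fraction", "a fixed amount", and a percentage "such as ten percent", and it does not state a lower bound, so the `floor` parameter and the minimum step of 1 are additions made here to keep the sketch well behaved.

```python
def maybe_decrease(threshold, queue_length, fraction=0.5,
                   fixed_step=None, percent=10, floor=1):
    """Return the (possibly lowered) queue threshold.

    The threshold is lowered only when the queue is shorter than
    fraction * threshold. fixed_step selects the fixed-amount embodiment;
    otherwise a percentage of the current value is subtracted.
    """
    if queue_length >= fraction * threshold:
        return threshold                              # queue busy enough: keep
    if fixed_step is not None:
        step = fixed_step                             # fixed-amount embodiment
    else:
        step = max(1, threshold * percent // 100)     # percentage embodiment
    return max(floor, threshold - step)               # never drop below floor
```

For example, with a threshold of 20 and 4 queued requests the percentage embodiment yields 18, while `fixed_step=5` yields 15; with 15 queued requests the threshold stays at 20.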
- While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.
Claims (1)
1. A computer-implemented method of operating a server computing device to process data requests, the method comprising:
receiving a first data request from a client computing device at a first server computing device, wherein the first data request includes a second queue threshold obtained from a second server computing device;
comparing, by the first server computing device, a first queue threshold associated with the first server computing device and the second queue threshold obtained from the second server computing device;
determining, by the first server computing device, whether to process the data request at the first server computing device based on the comparison of the first queue threshold and the second queue threshold; and
managing, by the first server computing device, the first queue threshold based on the comparison of the first and second queue thresholds; and
processing the data request, at the first server computing device, based on the comparison of the first and second queue thresholds.
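For illustration only, the comparison recited in claim 1 might be sketched in Python as shown below. The claim does not state which direction of the comparison leads to acceptance, so the rule used here (a full server raises its own threshold to match a larger threshold carried in the request, then accepts if it has room) is an assumption of this sketch, as are the `Server` class and its `handle` method.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Server:
    """Hypothetical server holding a pending queue and a queue threshold."""
    name: str
    threshold: int
    pending: List[str] = field(default_factory=list)

    def handle(self, request: str, carried_threshold: Optional[int] = None) -> bool:
        """Return True if the request is queued for processing, False if rejected.

        carried_threshold is the queue threshold obtained from another server
        and included in the request by the client.
        """
        full = len(self.pending) >= self.threshold

        # Assumed rule: compare the two thresholds only when this server is full.
        if full and carried_threshold is not None and carried_threshold > self.threshold:
            # Manage the local threshold based on the comparison.
            self.threshold = carried_threshold
            full = len(self.pending) >= self.threshold

        if full:
            return False  # the client would retry elsewhere with this server's threshold
        self.pending.append(request)
        return True
```

In this sketch a client whose request was rejected by one server would resend it to another server together with the rejecting server's threshold, for example `b.handle("r3", carried_threshold=a.threshold)`.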
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/276,551 US20170111478A1 (en) | 2007-03-30 | 2016-09-26 | Load balancing utilizing adaptive thresholding |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/694,667 US8159961B1 (en) | 2007-03-30 | 2007-03-30 | Load balancing utilizing adaptive thresholding |
US13/438,734 US8576710B2 (en) | 2007-03-30 | 2012-04-03 | Load balancing utilizing adaptive thresholding |
US14/071,245 US9456056B2 (en) | 2007-03-30 | 2013-11-04 | Load balancing utilizing adaptive thresholding |
US15/276,551 US20170111478A1 (en) | 2007-03-30 | 2016-09-26 | Load balancing utilizing adaptive thresholding |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/071,245 Continuation US9456056B2 (en) | 2007-03-30 | 2013-11-04 | Load balancing utilizing adaptive thresholding |
Publications (1)
Publication Number | Publication Date |
---|---|
US20170111478A1 (en) | 2017-04-20 |
Family
ID=45931413
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/694,667 Active 2029-01-01 US8159961B1 (en) | 2007-03-30 | 2007-03-30 | Load balancing utilizing adaptive thresholding |
US13/438,734 Active 2027-04-12 US8576710B2 (en) | 2007-03-30 | 2012-04-03 | Load balancing utilizing adaptive thresholding |
US14/071,245 Active US9456056B2 (en) | 2007-03-30 | 2013-11-04 | Load balancing utilizing adaptive thresholding |
US15/276,551 Abandoned US20170111478A1 (en) | 2007-03-30 | 2016-09-26 | Load balancing utilizing adaptive thresholding |
Family Applications Before (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/694,667 Active 2029-01-01 US8159961B1 (en) | 2007-03-30 | 2007-03-30 | Load balancing utilizing adaptive thresholding |
US13/438,734 Active 2027-04-12 US8576710B2 (en) | 2007-03-30 | 2012-04-03 | Load balancing utilizing adaptive thresholding |
US14/071,245 Active US9456056B2 (en) | 2007-03-30 | 2013-11-04 | Load balancing utilizing adaptive thresholding |
Country Status (1)
Country | Link |
---|---|
US (4) | US8159961B1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170111227A1 (en) * | 2014-05-23 | 2017-04-20 | Nec Europe Ltd. | Method for mounting a device at a server in a network |
US10547693B2 (en) * | 2012-09-07 | 2020-01-28 | Avigilon Corporation | Security device capability discovery and device selection |
US10567303B2 (en) | 2006-03-14 | 2020-02-18 | Amazon Technologies, Inc. | System and method for routing service requests |
Families Citing this family (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8159961B1 (en) | 2007-03-30 | 2012-04-17 | Amazon Technologies, Inc. | Load balancing utilizing adaptive thresholding |
US7797426B1 (en) | 2008-06-27 | 2010-09-14 | BitGravity, Inc. | Managing TCP anycast requests |
US8621065B1 (en) * | 2008-10-23 | 2013-12-31 | Amazon Technologies, Inc. | Dynamic blocking of suspicious electronic submissions |
US8769541B2 (en) | 2009-12-31 | 2014-07-01 | Facebook, Inc. | Load balancing web service by rejecting connections |
US8972551B1 (en) * | 2010-04-27 | 2015-03-03 | Amazon Technologies, Inc. | Prioritizing service requests |
EP2395710B1 (en) * | 2010-06-08 | 2013-11-06 | Alcatel Lucent | Device and method for data load balancing |
US20120151479A1 (en) | 2010-12-10 | 2012-06-14 | Salesforce.Com, Inc. | Horizontal splitting of tasks within a homogenous pool of virtual machines |
US8799378B2 (en) * | 2010-12-17 | 2014-08-05 | Microsoft Corporation | Non-greedy consumption by execution blocks in dataflow networks |
US8868730B2 (en) * | 2011-03-09 | 2014-10-21 | Ncr Corporation | Methods of managing loads on a plurality of secondary data servers whose workflows are controlled by a primary control server |
US9246985B2 (en) * | 2011-06-28 | 2016-01-26 | Novell, Inc. | Techniques for prevent information disclosure via dynamic secure cloud resources |
JP5724687B2 (en) * | 2011-07-04 | 2015-05-27 | 富士通株式会社 | Information processing apparatus, server selection method, and program |
US9241031B2 (en) * | 2011-08-02 | 2016-01-19 | Verizon Patent And Licensing Inc. | Selecting an auxiliary event-package server |
WO2013062599A1 (en) * | 2011-10-26 | 2013-05-02 | Box, Inc. | Enhanced multimedia content preview rendering in a cloud content management system |
US11232481B2 (en) | 2012-01-30 | 2022-01-25 | Box, Inc. | Extended applications of multimedia content previews in the cloud-based content management system |
US9742676B2 (en) * | 2012-06-06 | 2017-08-22 | International Business Machines Corporation | Highly available servers |
US20140181112A1 (en) * | 2012-12-26 | 2014-06-26 | Hon Hai Precision Industry Co., Ltd. | Control device and file distribution method |
US9900252B2 (en) * | 2013-03-08 | 2018-02-20 | A10 Networks, Inc. | Application delivery controller and global server load balancer |
US9967163B2 (en) * | 2013-04-16 | 2018-05-08 | Hitachi, Ltd. | Message system for avoiding processing-performance decline |
US10091066B2 (en) * | 2013-09-13 | 2018-10-02 | Abb Schweiz Ag | Integration method and system |
JP6305078B2 (en) * | 2014-01-29 | 2018-04-04 | キヤノン株式会社 | System and control method |
US9426215B2 (en) * | 2014-04-08 | 2016-08-23 | Aol Inc. | Determining load state of remote systems using delay and packet loss rate |
US10348837B2 (en) * | 2014-12-16 | 2019-07-09 | Citrix Systems, Inc. | Methods and systems for connecting devices to applications and desktops that are receiving maintenance |
US10554554B2 (en) | 2016-12-06 | 2020-02-04 | Microsoft Technology Licensing, Llc | Hybrid network processing load distribution in computing systems |
US10826841B2 (en) * | 2016-12-06 | 2020-11-03 | Microsoft Technology Licensing, Llc | Modification of queue affinity to cores based on utilization |
US10715424B2 (en) | 2016-12-06 | 2020-07-14 | Microsoft Technology Licensing, Llc | Network traffic management with queues affinitized to one or more cores |
CN108173894A (en) * | 2016-12-07 | 2018-06-15 | 阿里巴巴集团控股有限公司 | The method, apparatus and server apparatus of server load balancing |
EP3556078B1 (en) * | 2016-12-13 | 2021-10-13 | FIMER S.p.A. | A multi-client/multi-server managing method and system with a routine of rejection of already connected clients for balancing the system |
US11010193B2 (en) * | 2017-04-17 | 2021-05-18 | Microsoft Technology Licensing, Llc | Efficient queue management for cluster scheduling |
JP6931080B2 (en) | 2017-06-13 | 2021-09-01 | Google LLC | Transmission of high-latency digital components in a low-latency environment |
US10706053B2 (en) * | 2017-06-13 | 2020-07-07 | Oracle International Corporation | Method and system for defining an object-agnostic offlinable data storage model |
US11693906B2 (en) | 2017-06-13 | 2023-07-04 | Oracle International Corporation | Method and system for using access patterns to suggest or sort objects |
US10846283B2 (en) | 2017-06-13 | 2020-11-24 | Oracle International Corporation | Method and system for defining an adaptive polymorphic data model |
CN109271265B (en) * | 2018-09-29 | 2023-09-15 | 平安科技(深圳)有限公司 | Request processing method, device, equipment and storage medium based on message queue |
US11182205B2 (en) * | 2019-01-02 | 2021-11-23 | Mellanox Technologies, Ltd. | Multi-processor queuing model |
US10772062B1 (en) * | 2019-04-15 | 2020-09-08 | T-Mobile Usa, Inc. | Network-function monitoring and control |
US11099891B2 (en) * | 2019-04-22 | 2021-08-24 | International Business Machines Corporation | Scheduling requests based on resource information |
CN114615275A (en) * | 2022-03-04 | 2022-06-10 | 国家工业信息安全发展研究中心 | Distributed load balancing control method and device for cloud storage |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030187931A1 (en) * | 2002-03-29 | 2003-10-02 | Olsen Gregory P. | Facilitating resource access using prioritized multicast responses to a discovery request |
US20060212873A1 (en) * | 2005-03-15 | 2006-09-21 | Takashi Takahisa | Method and system for managing load balancing in data processing system |
Family Cites Families (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5325464A (en) | 1990-05-22 | 1994-06-28 | International Business Machines Corporation | Pyramid learning architecture neurocomputer |
US6185619B1 (en) | 1996-12-09 | 2001-02-06 | Genuity Inc. | Method and apparatus for balancing the process load on network servers according to network and serve based policies |
US5742762A (en) | 1995-05-19 | 1998-04-21 | Telogy Networks, Inc. | Network management gateway |
AU714336B2 (en) | 1996-07-25 | 1999-12-23 | Clearway Acquisition, Inc. | Web serving system with primary and secondary servers |
US5944782A (en) | 1996-10-16 | 1999-08-31 | Veritas Software Corporation | Event management system for distributed computing environment |
US6061722A (en) | 1996-12-23 | 2000-05-09 | T E Network, Inc. | Assessing network performance without interference with normal network operations |
US6256675B1 (en) | 1997-05-06 | 2001-07-03 | At&T Corp. | System and method for allocating requests for objects and managing replicas of objects on a network |
US6134244A (en) | 1997-08-30 | 2000-10-17 | Van Renesse; Robert | Method and system for optimizing layered communication protocols |
US6517587B2 (en) * | 1998-12-08 | 2003-02-11 | Yodlee.Com, Inc. | Networked architecture for enabling automated gathering of information from Web servers |
US6184829B1 (en) | 1999-01-08 | 2001-02-06 | Trueposition, Inc. | Calibration for wireless location system |
US6760775B1 (en) | 1999-03-05 | 2004-07-06 | At&T Corp. | System, method and apparatus for network service load and reliability management |
US6505254B1 (en) | 1999-04-19 | 2003-01-07 | Cisco Technology, Inc. | Methods and apparatus for routing requests in a network |
US6411967B1 (en) | 1999-06-18 | 2002-06-25 | Reliable Network Solutions | Distributed processing system with replicated management information base |
US6629149B1 (en) | 1999-08-17 | 2003-09-30 | At&T Corp. | Network system and method |
US7062556B1 (en) * | 1999-11-22 | 2006-06-13 | Motorola, Inc. | Load balancing method in a communication network |
US6560717B1 (en) | 1999-12-10 | 2003-05-06 | Art Technology Group, Inc. | Method and system for load balancing and management |
US6529953B1 (en) | 1999-12-17 | 2003-03-04 | Reliable Network Solutions | Scalable computer network resource monitoring and location system |
US6724770B1 (en) | 2000-02-17 | 2004-04-20 | Kenneth P. Birman | Multicast protocol with reduced buffering requirements |
US7058706B1 (en) | 2000-03-31 | 2006-06-06 | Akamai Technologies, Inc. | Method and apparatus for determining latency between multiple servers and a client |
US7240100B1 (en) | 2000-04-14 | 2007-07-03 | Akamai Technologies, Inc. | Content delivery network (CDN) content server request handling mechanism with metadata framework support |
US7937470B2 (en) | 2000-12-21 | 2011-05-03 | Oracle International Corp. | Methods of determining communications protocol latency |
US6757543B2 (en) | 2001-03-20 | 2004-06-29 | Keynote Systems, Inc. | System and method for wireless data performance monitoring |
GB0119145D0 (en) * | 2001-08-06 | 2001-09-26 | Nokia Corp | Controlling processing networks |
US20030167295A1 (en) * | 2002-03-01 | 2003-09-04 | Verity, Inc. | Automatic network load balancing using self-replicating resources |
US7047315B1 (en) | 2002-03-19 | 2006-05-16 | Cisco Technology, Inc. | Method providing server affinity and client stickiness in a server load balancing device without TCP termination and without keeping flow states |
KR100442610B1 (en) * | 2002-04-22 | 2004-08-02 | 삼성전자주식회사 | Flow control method of radius protocol |
US7650403B2 (en) * | 2002-11-20 | 2010-01-19 | Microsoft Corporation | System and method for client side monitoring of client server communications |
US7389510B2 (en) * | 2003-11-06 | 2008-06-17 | International Business Machines Corporation | Load balancing of servers in a cluster |
US7630313B2 (en) * | 2004-09-30 | 2009-12-08 | Alcatel-Lucent Usa Inc. | Scheduled determination of network resource availability |
US7665092B1 (en) * | 2004-12-15 | 2010-02-16 | Sun Microsystems, Inc. | Method and apparatus for distributed state-based load balancing between task queues |
US7685270B1 (en) | 2005-03-31 | 2010-03-23 | Amazon Technologies, Inc. | Method and apparatus for measuring latency in web services |
US7853953B2 (en) * | 2005-05-27 | 2010-12-14 | International Business Machines Corporation | Methods and apparatus for selective workload off-loading across multiple data centers |
US20060285509A1 (en) | 2005-06-15 | 2006-12-21 | Johan Asplund | Methods for measuring latency in a multicast environment |
US20070143460A1 (en) | 2005-12-19 | 2007-06-21 | International Business Machines Corporation | Load-balancing metrics for adaptive dispatching of long asynchronous network requests |
US7519734B1 (en) | 2006-03-14 | 2009-04-14 | Amazon Technologies, Inc. | System and method for routing service requests |
US7797406B2 (en) * | 2006-07-27 | 2010-09-14 | Cisco Technology, Inc. | Applying quality of service to application messages in network elements based on roles and status |
US8493858B2 (en) * | 2006-08-22 | 2013-07-23 | Citrix Systems, Inc | Systems and methods for providing dynamic connection spillover among virtual servers |
US8291108B2 (en) * | 2007-03-12 | 2012-10-16 | Citrix Systems, Inc. | Systems and methods for load balancing based on user selected metrics |
US8159961B1 (en) | 2007-03-30 | 2012-04-17 | Amazon Technologies, Inc. | Load balancing utilizing adaptive thresholding |
Also Published As
Publication number | Publication date |
---|---|
US8576710B2 (en) | 2013-11-05 |
US8159961B1 (en) | 2012-04-17 |
US9456056B2 (en) | 2016-09-27 |
US20140222895A1 (en) | 2014-08-07 |
US20120254300A1 (en) | 2012-10-04 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US9456056B2 (en) | Load balancing utilizing adaptive thresholding | |
US11418620B2 (en) | Service request management | |
US10567303B2 (en) | System and method for routing service requests | |
CN107801086B (en) | The dispatching method and system of more cache servers | |
CN104580538B (en) | A kind of method of raising Nginx server load balancing efficiency | |
JP4974888B2 (en) | Distributed request routing | |
CN101370035B (en) | Method and system for dynamic client/server network management using proxy servers | |
US20020083117A1 (en) | Assured quality-of-service request scheduling | |
CN106790340B (en) | Link scheduling method and device | |
CN109672711B (en) | Reverse proxy server Nginx-based http request processing method and system | |
US9438669B2 (en) | System and method for packetizing data stream in peer-to-peer (P2P) based streaming service | |
US7844708B2 (en) | Method and apparatus for load sharing and data distribution in servers | |
US8549078B2 (en) | Communications system providing load balancing based upon connectivity disruptions and related methods | |
US20120137017A1 (en) | System and method for controlling server usage in peer-to-peer (p2p) based streaming service | |
CN105025042B (en) | A kind of method and system of determining data information, proxy server | |
US20100057914A1 (en) | Method, apparatus and system for scheduling contents | |
Nakai et al. | Improving the QoS of web services via client-based load distribution | |
Bhowmik et al. | Distributed adaptive video streaming using inter-server data distribution and agent-based adaptive load balancing | |
Liaw et al. | A load balancing scheme for web server design | |
Lorenz et al. | Tuning of QoS Aware Load Balancing Algorithm (QoS–LB) for Highly Loaded Server Clusters | |
Chaudhari et al. | Load Balancing with Specialized Server Using Database | |
Ding et al. | A chord-based load balancing algorithm for P2P network | |
Shvayka | Load balancing in IoT applications using consistent hashing | |
Kwak et al. | Dynamic information‐based scalable hashing on a cluster of web cache servers | |
Goddard | ASSURED QUALITY-OF-SERVICE REQUEST SCHEDULING |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |