WO2016137496A1 - Responsive server identification among multiple data servers linked to a shared memory - Google Patents
- Publication number
- WO2016137496A1 (PCT/US2015/018019)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- server
- data
- servers
- response
- responsive
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/30—Definitions, standards or architectural aspects of layered protocol stacks
- H04L69/32—Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
- H04L69/322—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
Definitions
- Figure 1 shows an example of a device that supports responsive server identification among multiple data servers linked to a shared memory.
- Figure 2 shows an example of the server selection circuitry identifying a responsive server among multiple data servers linked to a shared memory.
- Figure 3 shows an example of the server selection circuitry selecting a responsive server for servicing a data query.
- Figure 4 shows an example of logic that the server selection circuitry may implement.
- Figure 5 shows another example of logic that the server selection circuitry may implement.
- Figure 6 shows an example of a computing device that supports responsive server identification among multiple data servers linked to a shared memory.
- Figure 1 shows an example of a device 100 that supports server selection for data servers linked to a shared memory.
- the device 100 may be a client device that executes a client application.
- the device 100 may be, for example, a desktop or laptop computer, a web server, a data center computing device, a personal computing device, and more.
- the device 100 may include a memory 102.
- the memory 102 may be any machine-readable storage medium such as Random Access Memory (RAM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a storage drive, an optical disk, a processor cache, or any other volatile or nonvolatile storage medium.
- the memory 102 may store a server response list 104, which may list multiple data servers linked to the device 100.
- the multiple data servers may be linked to a shared memory (not shown) storing data for querying.
- the device 100 (or application executing on the device 100) may request data stored in the shared memory through a data query.
- the device 100 may include server selection circuitry 110 through which the device 100 may determine a particular data server to send a data query through.
- the server selection circuitry 110 may be implemented through any combination of hardware (e.g., a processor) or machine-readable instructions (e.g., stored on a machine-readable medium).
- the server selection circuitry 110 includes a processor and a machine-readable medium storing executable instructions to perform any of the features described herein.
- the server selection circuitry 110 may identify a responsive server amongst the multiple data servers linked to the shared memory to which to send a data query to retrieve requested data. In doing so, the server selection circuitry 110 may apply any number of responsiveness criteria in determining whether a data server is responsive or unresponsive.
- the server selection circuitry 110 may track responsive and unresponsive data servers through the server response list 104.
- the server selection circuitry 110 maintains the server response list 104 by determining response times for the multiple data servers to service a data query, including concurrently sending the data query to the multiple data servers to track the response time for each of the multiple servers to service the data query.
- the server selection circuitry 110 may then identify responsive servers among the multiple data servers listed in the server response list 104 according to the response time to service the data query and select one of the responsive servers to send a subsequent data query to be serviced through the shared memory.
- FIG. 2 shows an example of the server selection circuitry 110 identifying a responsive server among multiple data servers linked to a shared memory.
- the server selection circuitry 110 is communicatively linked to the multiple data servers 202 including the servers labeled as data server A, data server B, data server C, and data server D.
- the multiple data servers 202 may be some or all of the data servers linked to a shared memory through which the server selection circuitry 110 may query for data.
- the server selection circuitry 110 may query data through any one of the multiple data servers 202.
- a data server may include any combination of circuitry or logic for performing a particular task or operation.
- the multiple data servers 202 may be part of a data serving system through which client applications (e.g., a web server) request data.
- An individual data server among the multiple data servers 202 may service a data query from a client application by retrieving the queried data from a shared memory 210 accessible to the multiple data servers 202.
- the server selection circuitry 110 may send data queries to the multiple data servers 202 through a communication network 204.
- the communication network 204 may include a local area network (LAN), for example as part of a data center implementation as client devices interface with backend servers.
- the multiple data servers 202 may be linked to a shared memory, such as the shared memory 210 shown in Figure 2.
- the shared memory 210 may implement a common memory namespace accessible to client applications through the multiple data servers 202.
- the shared memory 210 may take the form of a random access memory (RAM), which may be a dynamic RAM or a non-volatile RAM for example.
- the shared memory is byte-addressable, thus supporting access to a particular memory address or memory address range within the shared memory.
- the shared memory is block addressable.
- a memory medium that implements the shared memory may be volatile or non-volatile.
- the shared memory is a non-volatile computer storage medium, such as a non-volatile RAM, a hard drive, flash memory, optical disk, memristor array, solid state drive, and the like.
- the multiple data servers 202 and shared memory 210 may be implemented, for example, as a rack-scale system.
- the multiple data servers 202 may access the shared memory 210 through a high-speed memory fabric or network, supporting increased access speed to retrieve data as compared to shared disk systems.
- a data serving system or other system implementing the multiple data servers 202 may share physical resources of the rack-scale system with other applications or services, which may cause resource contention and delay in servicing data queries.
- the server selection circuitry 110 may reduce such delay by identifying responsive data servers among the multiple data servers 202.
- any of the multiple data servers 202 may be capable of servicing a data query, and thus the server selection circuitry 110 may leverage the common access to the shared memory 210 and flexibly select a particular data server that meets specific performance constraints or quality of service (QoS) requirements, e.g., as set forth by a client application.
- the server selection circuitry 110 may implement any combination of the features described for the probe phase and selection phase, e.g., operating at separate times, sequentially, concurrently, or according to any other combination.
- the server selection circuitry 110 may collect response data from the multiple data servers 202. Through the collected response data, the server selection circuitry 110 may categorize an individual data server as responsive or unresponsive, which may also be referred to as a responsiveness determination. In some examples, the server selection circuitry 110 collects response data as a response latency for an individual data server to service (e.g., respond to) a data query. As other examples, the server selection circuitry 110 may collect response data as bandwidth availability, resource consumption, or any other performance metric. As discussed in greater detail below, the server selection circuitry 110 may categorize responsive and unresponsive servers according to collected response data.
- the server selection circuitry 110 may send a data query to some or all of the multiple data servers 202.
- the server selection circuitry 110 sends the data query 220 to each of the data servers A, B, C, and D.
- the data query 220 may be a data query or request generated by a client application, such as a web search application querying for a search term.
- the server selection circuitry 110 may send the same data query 220 to more than one data server, even though any individual data server can service the data query 220.
- the server selection circuitry 110 may, for example, clone the data query 220 so as to send an instance of the data query 220 to some or all of the multiple data servers 202.
- the server selection circuitry 110 may concurrently send the data query 220 to the multiple data servers 202.
- the server selection circuitry 110 may duplicate and send the data query 220 to more than one data server, independent of whether any of the data servers has responded to the data query 220.
- the server selection circuitry 110 need not inject a calculated or deliberate delay between sending successive clones of the data query 220 to successive data servers among the multiple data servers 202, nor stop transmitting the data query 220 when another data server responds to an earlier-sent instance of the data query 220.
- the server selection circuitry 110 may unconditionally send the instances of the data query 220 to the multiple data servers.
- the server selection circuitry 110 may inject delay (e.g., to prevent overloading the shared memory or causing resource contention), but nonetheless send each of the cloned instances of the data query 220 to the multiple data servers without dependence on whether a response has been received for a previously sent instance of the data query 220.
- the server selection circuitry 110 may thus send the cloned copies of the data query 220 without hedging, and may instead concurrently (or with delay) send the data query 220 to multiple data servers 202 to service the query as well as to collect response data.
- the server selection circuitry 110 may record the response time for individual data servers to respond to the data query 220. Upon receiving responses to the data query 220, the server selection circuitry 110 may populate corresponding portions of the server response list 104 with the response times. In the example shown in Figure 2, the server selection circuitry 110 records the response times of 500 microseconds (µs), 450 µs, 950 µs, and 300 µs for the data servers A, B, C, and D respectively to respond to the data query 220. As the data query 220 may originate from a client application, the server selection circuitry 110 may determine the data query 220 as serviced upon receiving the first response from the multiple data servers 202. Thus, in Figure 2, the server selection circuitry 110 may receive the requested data when the data server D responds to the data query 220 with a shortest response time of 300 µs, providing the queried data returned by data server D to the client application.
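The probe behavior described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `send_query`, `timed_send`, `probe`, and the simulated service times are hypothetical names and values invented for the sketch (the sleeps are scaled-up stand-ins for the Figure 2 latencies), and a thread pool is just one possible way to send the cloned query concurrently.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical simulated service times in seconds (scaled-up stand-ins for
# the 500/450/950/300 microsecond values of Figure 2).
SIMULATED_SERVICE_SECONDS = {"A": 0.050, "B": 0.045, "C": 0.095, "D": 0.030}


def send_query(server, query):
    """Hypothetical stand-in for a network call to one data server."""
    time.sleep(SIMULATED_SERVICE_SECONDS[server])
    return f"result for {query!r} via server {server}"


def timed_send(server, query):
    """Send one cloned instance of the query and measure its response time."""
    start = time.perf_counter()
    result = send_query(server, query)
    return server, result, time.perf_counter() - start


def probe(servers, query):
    """Send the query to every server concurrently, without waiting on any
    earlier response, and record each server's response time."""
    response_list = {}
    first = None
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(timed_send, s, query) for s in servers]
        for future in as_completed(futures):
            server, result, elapsed = future.result()
            response_list[server] = elapsed
            if first is None:
                # The query counts as serviced on the first response.
                first = (server, result)
    return response_list, first


response_list, (fastest, answer) = probe(["A", "B", "C", "D"], "search term")
```

Every clone is submitted before any response is examined, so the sketch sends unconditionally rather than hedging; the slower responses are still awaited because their times feed the response list.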
- the server selection circuitry 110 may record response data for some or all of the multiple data servers 202.
- the server selection circuitry 110 may, for example, populate the server response list 104 through multiple, different data queries directed at different subsets of the pool of data servers linked to the shared memory 210.
- the server selection circuitry 110 may probe subsets of the multiple data servers 202 through different data queries until response data has been collected for the multiple data servers 202.
- the server selection circuitry 110 may probe 'k' number of distinct data servers through a first data query, cloning and concurrently sending the first data query to the 'k' distinct data servers.
- the 'k' distinct data servers may be some, but not all, of the multiple data servers linked to the shared memory and capable of servicing the data query.
- the server selection circuitry 110 may clone the first data query for sending to the data servers A and B, for instance.
- the server selection circuitry 110 may record the response times for the 'k' distinct data servers to service the first data query.
- the server selection circuitry 110 may clone and concurrently send a second data query to another data server subset among the multiple data servers 202, such as data servers C and D in Figure 2.
- the server selection circuitry 110 may thus record the response times for the next subset of the multiple data servers 202.
- the server selection circuitry 110 may continue to clone and send data queries until, for example, response data has been collected for each of the multiple data servers 202.
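A minimal sketch of probing in subsets of 'k' servers is shown below. The helper names (`chunks`, `populate_response_list`, `fake_probe`) are hypothetical, and the probe function is a deterministic stand-in that returns the Figure 2 response times instead of contacting real servers.

```python
from itertools import islice

# Response times from Figure 2, in microseconds.
FIGURE_2_TIMES_US = {"A": 500, "B": 450, "C": 950, "D": 300}


def chunks(items, k):
    """Yield successive subsets of at most k items."""
    iterator = iter(items)
    while batch := list(islice(iterator, k)):
        yield batch


def populate_response_list(servers, k, queries, probe_subset):
    """Pair each incoming query with the next subset of k servers and merge
    the recorded response times into one response list."""
    response_list = {}
    for subset, query in zip(chunks(servers, k), queries):
        response_list.update(probe_subset(subset, query))
    return response_list


calls = []


def fake_probe(subset, query):
    """Hypothetical probe: records the call and returns canned times."""
    calls.append((subset, query))
    return {server: FIGURE_2_TIMES_US[server] for server in subset}


response_list = populate_response_list(
    ["A", "B", "C", "D"], 2, ["first query", "second query"], fake_probe
)
```

With k = 2, the first query goes to servers A and B and the second to C and D, matching the example in the text. If fewer queries arrive than there are subsets, the remaining servers simply stay unprobed until later queries are available.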
- the server selection circuitry 110 may collect response data for the multiple data servers 202, which may be tracked through the server response list 104.
- the server selection circuitry 110 may identify responsive servers among the multiple data servers 202 and select a particular responsive server to query data through.
- Figure 3 shows an example of the server selection circuitry 110 selecting a responsive server for servicing a data query.
- the server selection circuitry 110 may perform a responsive server determination using the response data collected during the probe phase.
- the response data may be specified in the server response list 104, for example.
- the server selection circuitry 110 tracks response times for the data servers A, B, C, and D through the server response list 104.
- the server response list 104 indicates response times of 500 µs, 450 µs, 950 µs, and 300 µs for data servers A, B, C, and D respectively to service the data query 220.
- the server selection circuitry 110 may identify a data server as responsive according to any number of responsiveness criteria, one example of which is through a response latency threshold.
- the response latency threshold may specify a time boundary for servicing a data query that a data server is to satisfy to be categorized as responsive.
- the server selection circuitry 110 may identify responsive servers from the server response list 104 as those with an associated response time that does not exceed a response latency threshold.
- the server selection circuitry 110 may define a response latency threshold as 700 µs according to an application QoS requirement, and thus identify data servers A, B, and D as responsive and server C as unresponsive.
- the server response list 104 may provide an indication as to which of the multiple data servers 202 are determined as responsive according to the responsiveness criteria, and which are determined as unresponsive.
- the server response list 104 may include a responsiveness flag or bit indicating the categorization of a particular data server.
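The threshold comparison in this example can be written out as a short Python sketch. The function name `classify` and the dictionary layout are assumptions made for illustration; the response times and the 700 µs threshold are the values given in the text.

```python
# Response times recorded in the server response list (microseconds).
response_times_us = {"A": 500, "B": 450, "C": 950, "D": 300}

# Response latency threshold from the example (microseconds).
RESPONSE_LATENCY_THRESHOLD_US = 700


def classify(times_us, threshold_us):
    """Return a responsiveness flag per server: True when the recorded
    response time does not exceed the threshold."""
    return {server: t <= threshold_us for server, t in times_us.items()}


flags = classify(response_times_us, RESPONSE_LATENCY_THRESHOLD_US)
responsive = sorted(server for server, ok in flags.items() if ok)
unresponsive = sorted(server for server, ok in flags.items() if not ok)
```

Here `responsive` holds A, B, and D and `unresponsive` holds C, as in the example. A server whose time equals the threshold counts as responsive, since the criterion is "does not exceed".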
- the response latency threshold may be set in various ways.
- the server selection circuitry 110 defines a response latency threshold according to a QoS requirement specified by a client application that originates the data query.
- the application QoS requirement may specify a memory access bandwidth requirement, throughput requirement, resource availability requirement, or any other quality or performance metric.
- the server selection circuitry 110 may translate the application QoS requirement into a corresponding response latency threshold that satisfies the application QoS requirement.
- the server selection circuitry 110 may translate the application QoS requirement into corresponding responsiveness criteria for the response data that the server selection circuitry 110 may apply for the responsive server determination.
- the server selection circuitry 110 may identify a responsive data server among the multiple data servers 202 on an application-specific basis. As application QoS requirements may vary by application, the server selection circuitry 110 may flexibly support differentiated responsive server determination for different applications with varying quality and/or performance requirements.
- the server selection circuitry 110 may select a responsive server for handling a data query.
- the server selection circuitry 110 may select a particular responsive server among data servers A, B, and D for servicing a data query.
- the server selection circuitry 110 selects among the responsive servers in a round-robin manner.
- the server selection circuitry 110 randomly selects among the identified responsive servers. Selection techniques among multiple responsive servers may be feasible as any of the responsive data servers may be capable of servicing the data query, as the data servers share access to the shared memory 210.
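Both selection techniques named above, round-robin and random, can be sketched in a few lines of Python. The factory functions `round_robin_selector` and `random_selector` are hypothetical names introduced for this illustration only.

```python
import itertools
import random


def round_robin_selector(responsive_servers):
    """Return a function that cycles through the responsive servers in order."""
    cycle = itertools.cycle(list(responsive_servers))
    return lambda: next(cycle)


def random_selector(responsive_servers, rng=None):
    """Return a function that picks a responsive server uniformly at random."""
    rng = rng or random.Random()
    pool = list(responsive_servers)
    return lambda: rng.choice(pool)


next_server = round_robin_selector(["A", "B", "D"])
picks = [next_server() for _ in range(5)]  # A, B, D, A, B

random_server = random_selector(["A", "B", "D"])
random_pick = random_server()
```

Either policy works only because every responsive server can reach the same shared memory; neither selector needs to know where the requested data lives.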
- the server selection circuitry 110 may select responsive data servers for servicing the data queries, e.g., data servers with response times that meet the QoS requirements for the client application.
- the server selection circuitry 110 may maintain a QoS level for data service in a shared memory system, which may be provided through selection of responsive servers in the selection phase.
- the server selection circuitry 110 may implement the features described herein in combination. For example, during the selection phase, the server selection circuitry 110 may continue to monitor response times or otherwise collect response data for the multiple data servers 202, whether for responsive data servers, unresponsive data servers, or both. Even after an initial probe phase (e.g., upon startup of an application or physical device(s) implementing server selection circuitry 110), the server selection circuitry 110 may update the server response list 104 with collected response data and perform subsequent responsive and unresponsive server determinations for the multiple data servers 202.
- In some examples, the server selection circuitry 110 differentiates between different types of data queries.
- a client application may generate multiple data query types, examples of which include search queries, scan queries, add queries, change queries, delete queries, and more. Different data query types may be characterized with different QoS requirements or response latency thresholds.
- the server selection circuitry 110 may separately track response times and apply separate responsiveness criteria for the multiple data query types. Thus, for a particular data query type (e.g., web search data query), the server selection circuitry 110 may identify responsive servers specifically using response data relevant to the particular data query type, such as response times for the multiple data servers 202 in servicing the particular data query type.
- the server response list 104 separately tracks response times or other response data for multiple data query types.
- the server selection circuitry 110 may maintain multiple server response lists 104, e.g., individually for a particular data query type.
- the server selection circuitry 110 may apply varying response latency thresholds depending on the particular data query type being serviced, which may be preconfigured, customized, or set forth according to an application QoS requirement for the particular data query types.
- the server selection circuitry 110 may differentiate data queries of different sizes.
- the data size of returned data for the data query may affect performance, and thus whether application QoS requirements are met.
- data queries for selective data searches may return less result data whereas less selective or broader searches may return a greater amount of result data.
- the server selection circuitry 110 may differentiate between these data queries of differing size by separately collecting and tracking response data for data queries within particular data size tiers, for example.
- the server response list 104 may track response data for the multiple data servers with respect to the multiple data size tiers, which may support application of different or individualized response latency thresholds for different data size tiers.
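One way to picture separate tracking per data query type and data size tier is a response list keyed by (query type, size tier). The sketch below is an assumption-laden illustration: the class `TieredResponseList`, the tier names, the tier boundaries, and the threshold values are all hypothetical and are not taken from the source.

```python
from collections import defaultdict

# Hypothetical size tier boundaries (upper bound in bytes, tier name).
SIZE_TIER_BOUNDS_BYTES = [(1024, "small"), (1024 * 1024, "medium")]


def size_tier(result_bytes):
    """Map the size of returned data to a named tier."""
    for bound, name in SIZE_TIER_BOUNDS_BYTES:
        if result_bytes <= bound:
            return name
    return "large"


class TieredResponseList:
    """Tracks the latest response time per server, separately for each
    (query type, size tier) pair, with an individual threshold per pair."""

    def __init__(self, thresholds_us):
        self.thresholds_us = thresholds_us
        self.times_us = defaultdict(dict)

    def record(self, query_type, result_bytes, server, response_time_us):
        key = (query_type, size_tier(result_bytes))
        self.times_us[key][server] = response_time_us

    def responsive(self, query_type, tier):
        key = (query_type, tier)
        limit = self.thresholds_us[key]
        return sorted(s for s, t in self.times_us[key].items() if t <= limit)


# Hypothetical thresholds: tight for small searches, looser for large scans.
tracker = TieredResponseList({("search", "small"): 700, ("scan", "large"): 5000})
tracker.record("search", 512, "A", 500)
tracker.record("search", 800, "C", 950)
tracker.record("scan", 5 * 1024 * 1024, "A", 6000)
tracker.record("scan", 8 * 1024 * 1024, "C", 4000)
```

In this made-up data, server A is responsive for small searches but not for large scans, while server C is the reverse, which is the kind of differentiation a single shared threshold would miss.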
- the server selection circuitry 110 may reduce tail latencies.
- Tail latencies may refer to the latency value at the tail or upper end of a latency distribution (e.g., 95th or 99th percentile).
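As a small worked example of what a tail latency value is, the sketch below computes percentiles with the nearest-rank method. The method choice, the function name `percentile`, and the sample latencies are assumptions for illustration; the source does not prescribe how percentiles are computed.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("percentile of an empty sample set is undefined")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in microseconds: most queries are fast, a few are slow.
latencies_us = [300] * 90 + [900] * 8 + [5000] * 2

p50 = percentile(latencies_us, 50)
p95 = percentile(latencies_us, 95)
p99 = percentile(latencies_us, 99)
```

For this sample the median is 300 µs while the 95th and 99th percentiles are 900 µs and 5000 µs: a handful of slow responses dominates the tail even though the typical query is fast.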
- data queries or other web server operations may be subject to resource contention resulting from other applications or services concurrently utilizing the same physical resources.
- High server response times may occur in shared resource systems, which may result in high tail latency values.
- the server selection circuitry 110 may reduce the tail latency values for a data serving system, through specifically querying responsive servers among the multiple data servers 202, each of which may be capable of retrieving data from the shared memory.
- the server selection circuitry 110 may take advantage of the access speeds for a shared memory implemented as non-volatile memory (e.g., a memristor array). Doing so may likewise reduce tail latency values.
- FIG. 4 shows an example of logic 400 that the server selection circuitry 110 may implement.
- the server selection circuitry 110 may implement the logic 400 as hardware and/or machine-readable instructions, for example.
- the server selection circuitry 110 may execute the logic 400 as a process or method to identify a responsive server among multiple data servers linked to a shared memory.
- the server selection circuitry 110 may send a first data query to multiple data servers, the multiple data servers linked to a shared memory storing data requested by the first data query (402).
- each individual server among the multiple data servers can service the first data query by retrieving requested data from the shared memory.
- the multiple data servers and shared memory may be implemented as part of a rack-scale system.
- the multiple data servers to which the server selection circuitry 110 sends the first data query may be some or all of the data servers that access the shared memory.
- the server selection circuitry 110 may clone the first data query to send an instance of the first data query to each of the multiple data servers.
- the server selection circuitry 110 may track response times for the multiple data servers (404), and the response times may include a particular response time for each of the multiple data servers to service the first data query. In some examples, the server selection circuitry 110 tracks the response times through a server response list 104. The server selection circuitry 110 may identify a responsive server from among the multiple servers according to the response times (406), such as by applying a response latency threshold corresponding to an application QoS requirement. The server selection circuitry 110 may identify a responsive server by determining that the particular response time for the responsive server to service the first data query does not exceed the response latency threshold. Then, the server selection circuitry 110 may send a second data query to the responsive server (408).
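The four steps of logic 400 can be strung together in a compact Python sketch. It is a simplification under stated assumptions: `logic_400`, `fake_send`, and the canned times are hypothetical, the first query is sent sequentially here for brevity, and the first responsive server in list order is chosen for the second query, although any responsive server would do.

```python
# Canned response times in microseconds (the Figure 2 values).
TIMES_US = {"A": 500, "B": 450, "C": 950, "D": 300}


def fake_send(server, query):
    """Hypothetical transport: returns (result, response time in microseconds)."""
    return f"{query}@{server}", TIMES_US[server]


def logic_400(servers, first_query, second_query, send, threshold_us):
    # (402) Send the first data query to the multiple data servers.
    responses = {server: send(server, first_query) for server in servers}
    # (404) Track the response time of each server.
    response_times_us = {s: elapsed for s, (_, elapsed) in responses.items()}
    # (406) Identify responsive servers using the response latency threshold.
    responsive = [s for s in servers if response_times_us[s] <= threshold_us]
    if not responsive:
        return None
    # (408) Send the second data query to a responsive server.
    chosen = responsive[0]
    result, _ = send(chosen, second_query)
    return chosen, result


outcome = logic_400(["A", "B", "C", "D"], "first", "second", fake_send, 700)
```

With the 700 µs threshold the second query goes to server A; tightening the threshold to 400 µs would leave only server D eligible.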
- FIG. 5 shows an example of logic 500 that the server selection circuitry 110 may implement.
- the server selection circuitry 110 may implement the logic 500 as hardware and/or machine-readable instructions, for example.
- the server selection circuitry 110 may execute the logic 500 as a process or method to support identifying and selecting a responsive server among multiple data servers linked to a shared memory.
- the server selection circuitry 110 may monitor the multiple data servers to determine a previously identified responsive server as unresponsive, for example during the selection phase and after an initial probe phase.
- a data query may be sent to a responsive server (e.g., selected through the server response list 104).
- the server selection circuitry 110 may monitor and track the response time for the responsive server to service the data query.
- the data query may be a second data query sent to the responsive server (the first data query being sent as part of the probe phase or as the first data query described in Figure 4).
- the server selection circuitry 110 may perform, in response, a responsiveness verification process to determine whether the responsive server has become unresponsive.
- the responsiveness verification process may occur during normal operation of the server selection circuitry 110 in selecting data servers to query data through.
- the server selection circuitry 110 may send a subsequent data query for servicing by the responsive server (506).
- the subsequent data query may be sent as a third data query and at, for example, a subsequent time when the responsive server is selected by the server selection circuitry 1 10 for servicing a data query.
- the server selection circuitry 110 may determine that the responsive server has become unresponsive (510).
- the server selection circuitry 110 may also identify the responsive server as an unresponsive server instead (512), for example by updating the server response list 104 or otherwise categorizing the responsive server instead as unresponsive.
- the server selection circuitry 110 may determine that the responsive server has not become unresponsive (514) and continue to select the responsive server for handling data queries to the shared memory.
- the server selection circuitry 110 may determine not to send subsequent data queries to an unresponsive server (516), such as the previously responsive server now identified as unresponsive.
- the server selection circuitry 110 may employ any number or combination of responsiveness verification criteria to determine whether a previously identified responsive server has become unresponsive.
- a responsiveness verification criterion may be met when a threshold number of data queries serviced by the previously identified responsive server have a response time that exceeds the response latency threshold, e.g., when 3 or more data queries serviced by the previously identified responsive server have response times that exceed the response latency threshold.
- other responsiveness verification criteria for the previously identified responsive server include: (i) when a threshold number of serviced data queries have a response time exceeding the response latency threshold over a set or predetermined period of time; (ii) when a threshold percentage of the serviced data queries have a response time exceeding the response latency threshold (e.g., as measured over a predetermined number of previous data queries or over a predetermined period of time); and (iii) when a serviced data query exceeds the response latency threshold by more than a threshold time amount.
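The count-based, percentage-based, and excess-based criteria can be sketched as three independent checks over a window of recent response times. All names and default limits below are hypothetical; the window is kept as a fixed number of previous queries, and the time-bounded variant in criterion (i) is not modeled.

```python
from collections import deque


def _violations(times_us, threshold_us):
    """Count response times that exceed the response latency threshold."""
    return sum(t > threshold_us for t in times_us)


def count_exceeded(times_us, threshold_us, max_count=3):
    """Met when at least max_count serviced queries exceeded the threshold."""
    return _violations(times_us, threshold_us) >= max_count


def fraction_exceeded(times_us, threshold_us, max_fraction=0.5):
    """Met when at least max_fraction of serviced queries exceeded the threshold."""
    if not times_us:
        return False
    return _violations(times_us, threshold_us) / len(times_us) >= max_fraction


def excess_exceeded(times_us, threshold_us, max_excess_us=1000):
    """Met when any query exceeded the threshold by more than max_excess_us."""
    return any(t - threshold_us > max_excess_us for t in times_us)


# Window over the most recent queries serviced by one server (microseconds).
recent_us = deque([500, 800, 650, 720, 900], maxlen=10)
THRESHOLD_US = 700

became_unresponsive = (
    count_exceeded(recent_us, THRESHOLD_US)
    or fraction_exceeded(recent_us, THRESHOLD_US)
    or excess_exceeded(recent_us, THRESHOLD_US)
)
```

In this window three of five queries exceed 700 µs, so the count and percentage checks are both met while the excess check is not. Combining the checks with `or` is one possible policy; a stricter deployment could require several to agree.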
- the server selection circuitry 110 may employ any combination of responsiveness verification criteria to determine a previously identified responsive server as unresponsive.
- the server selection circuitry 110 applies the responsiveness verification criteria through normal operation of server selection and data querying, e.g., by performing responsiveness verification as a background process in the selection phase.
- the server selection circuitry 110 employs a dedicated responsiveness verification, e.g., directly in response to or specifically upon detecting that a response time for the previously identified responsive server has exceeded a response latency threshold.
- the server selection circuitry 110 performs a probe phase as part of the responsiveness verification process.
- the server selection circuitry 110 may collect response data for some or all of the multiple data servers upon detecting that a particular data server has become unresponsive, including collecting response data for responsive data servers.
- the server selection circuitry 110 may detect whether multiple responsive servers have become unresponsive, e.g., when a newly executed application, job, or task consumes computation resources or bandwidth of multiple servers, compute nodes, or other hardware.
- the server selection circuitry 110 may monitor an unresponsive server to determine the unresponsive server has become responsive (518). For example, the server selection circuitry 110 may periodically probe unresponsive data servers to determine whether response data for the unresponsive data servers indicates the servers have become responsive. The server selection circuitry 110 may clone a data query to send to both a selected responsive server (e.g., for servicing the data query) and an unresponsive server (e.g., to probe the unresponsive server). The server selection circuitry 110 may track the response time for the unresponsive server to service the data query, and deem the unresponsive server as responsive instead when the response time does not exceed a response latency threshold.
- the server selection circuitry 110 may apply various criteria for determining whether an unresponsive server has become responsive, e.g., any of the responsiveness verification criteria discussed above. Upon determining that an unresponsive server has become responsive, the server selection circuitry 110 may update the server response list 104 and select the now responsive server for servicing subsequent data queries.
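Re-probing an unresponsive server with a cloned query might look like the sketch below. The function `service_and_reprobe`, the transport `fake_send`, and the current response times are hypothetical, and a single response under the threshold is treated as sufficient to restore the server, which is the simplest of the criteria mentioned.

```python
# Hypothetical current response times in microseconds; server C has recovered.
CURRENT_TIMES_US = {"A": 500, "B": 450, "C": 600, "D": 300}


def fake_send(server, query):
    """Hypothetical transport: returns (result, response time in microseconds)."""
    return f"{query}@{server}", CURRENT_TIMES_US[server]


def service_and_reprobe(query, responsive, unresponsive, send, threshold_us):
    """Service the query through a responsive server while sending a cloned
    instance to one unresponsive server to check whether it has recovered."""
    result, _ = send(responsive[0], query)
    if unresponsive:
        candidate = unresponsive[0]
        _, elapsed_us = send(candidate, query)  # clone used only as a probe
        if elapsed_us <= threshold_us:
            # Update the response list: the server is responsive again.
            unresponsive.remove(candidate)
            responsive.append(candidate)
    return result


responsive = ["A", "B", "D"]
unresponsive = ["C"]
result = service_and_reprobe("query", responsive, unresponsive, fake_send, 700)
```

The client still gets its answer from a responsive server, so the probe costs one extra request rather than extra latency for the application.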
- Figure 6 shows an example of a computing device 600 that supports responsive server identification among multiple data servers linked to a shared memory.
- the computing device 600 may implement any of the functionality described herein, including according to any of the features described herein for the server selection circuitry 110.
- the computing device 600 may include a processor 610.
- the processor 610 may include a central processing unit (CPU), microprocessor, and/or any hardware device suitable for executing instructions stored on a machine-readable medium.
- the computing device 600 may include a machine-readable medium 620.
- the machine-readable medium 620 may be any electronic, magnetic, optical, or other physical storage device that stores executable instructions, such as the server selection instructions 622 shown in Figure 6.
- the machine-readable medium 620 may be, for example, Random Access Memory (RAM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a storage drive, an optical disk, and the like.
- the computing device 600 may execute instructions stored on the machine-readable medium 620 through the processor 610. Executing the instructions may cause the computing device 600 to perform any of the features described herein. One specific example is shown in Figure 6 through the server selection instructions 622. Executing the server selection instructions 622 may cause the computing device 600 to operate according to any combination of features of the server selection circuitry 110. As one example, executing the server selection instructions 622 may cause the computing device 600 to populate a server response list 104 in a probe phase, which may include sending a first data query to multiple data servers, the multiple data servers linked to a shared memory accessible to the multiple data servers, and recording a response time for the multiple data servers to service the first data query.
- Executing the server selection instructions 622 may further cause the computing device 600 to identify a selected data server from the server response list 104 for servicing a second data query in a selection phase, which may include defining a response latency threshold according to a quality of service requirement specified by an application; identifying, from the server response list 104, responsive servers with a response time that does not exceed a response latency threshold; and determining the selected data server for servicing the second data query from among the responsive servers.
- executing the server selection instructions 622 may cause the computing device 600 to monitor subsequent response times for the responsive servers in servicing data queries to determine whether a responsive server among the responsive servers has become unresponsive. Determination that a responsive server has become unresponsive may include identifying when a threshold number of the response times for the responsive server exceed the response latency threshold.
- server selection circuitry 110 may include circuitry in a controller, a microprocessor, or an application specific integrated circuit (ASIC), or may be implemented with discrete logic or components, or a combination of other types of analog or digital circuitry, combined on a single integrated circuit or distributed among multiple integrated circuits.
- a product, such as a computer program product, may include a storage medium and machine readable instructions stored on the medium, which when executed in an endpoint, computer system, or other device, cause the device to perform operations according to any of the description above.
- the processing capability of the systems, devices, and circuitry described herein, including the server selection circuitry 110, may be distributed among multiple system components, such as among multiple processors and memories, optionally including multiple distributed processing systems.
- Parameters, databases, and other data structures may be separately stored and managed, may be incorporated into a single memory or database, may be logically and physically organized in many different ways, and may be implemented in many ways, including data structures such as linked lists, hash tables, or implicit storage mechanisms.
- Programs may be parts (e.g., subroutines) of a single program, separate programs, distributed across several memories and processors, or implemented in many different ways, such as in a library, such as a shared library (e.g., a dynamic link library (DLL)).
- the DLL, for example, may store code that performs any of the system processing described above.
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer And Data Communications (AREA)
Abstract
In some examples, a method may be performed by server selection circuitry in a device. The method may include sending a first data query to multiple data servers, the multiple data servers linked to a shared memory storing data requested by the first data query. The method may also include tracking response times for the multiple data servers, where the response times include a particular response time for each of the multiple data servers to service the first data query, identifying a responsive server from among the multiple data servers according to the response times, and sending a second data query to the responsive server.
Description
RESPONSIVE SERVER IDENTIFICATION AMONG MULTIPLE DATA SERVERS LINKED TO A SHARED MEMORY
BACKGROUND
[0001] With rapid advances in technology, computing systems are increasingly prevalent in society today. Vast computing systems execute and support applications that communicate and process immense amounts of data, many times with performance constraints to meet the increasing demands of users. Increasing the efficiency, speed, and effectiveness of computing systems will further improve user experience.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] Certain examples are described in the following detailed description and in reference to the drawings.
[0003] Figure 1 shows an example of a device that supports responsive server identification among multiple data servers linked to a shared memory.
[0004] Figure 2 shows an example of the server selection circuitry identifying a responsive server among multiple data servers linked to a shared memory.
[0005] Figure 3 shows an example of the server selection circuitry selecting a responsive server for servicing a data query.
[0006] Figure 4 shows an example of logic that the server selection circuitry may implement.
[0007] Figure 5 shows another example of logic that the server selection circuitry may implement.
[0008] Figure 6 shows an example of a computing device that supports responsive server identification among multiple data servers linked to a shared memory.
DETAILED DESCRIPTION
[0009] Figure 1 shows an example of a device 100 that supports server selection for data servers linked to a shared memory. The device 100 may be a client device that executes a client application. Thus, the device 100 may be, for example, a desktop or laptop computer, a web server, a data center computing device, a personal computing device, and more.
[0010] The device 100 may include a memory 102. The memory 102 may be any machine-readable storage medium such as Random Access Memory (RAM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a storage drive, an optical disk, a processor cache, or any other volatile or nonvolatile storage medium. The memory 102 may store a server response list 104, which may list multiple data servers linked to the device 100. The multiple data servers may be linked to a shared memory (not shown) storing data for querying. The device 100 (or application executing on the device 100) may request data stored in the shared memory through a data query.
[0011] The device 100 may include server selection circuitry 110 through which the device 100 may determine a particular data server to send a data query through. The server selection circuitry 110 may be implemented through any combination of hardware (e.g., a processor) or machine-readable instructions (e.g., stored on a machine-readable medium). Thus, in some examples, the server selection circuitry 110 includes a processor and a machine-readable medium storing executable instructions to perform any of the features described herein.
[0012] As described in greater detail below, the server selection circuitry 110 may identify a responsive server amongst the multiple data servers linked to the shared memory to which to send a data query to retrieve requested data. In doing so, the server selection circuitry 110 may apply any number of responsiveness criteria in determining whether a data server is responsive or unresponsive. The server selection circuitry 110 may track responsive and unresponsive data servers through the server response list 104. In some examples, the server selection circuitry 110 maintains the server response list 104 by determining response times for the multiple data servers to service a data query, including concurrently sending the data query to the multiple data servers to track the response time for each of the multiple servers to service the data query. The server selection circuitry 110 may then identify responsive servers among the multiple data servers listed in the server response list 104 according to the response time to service the data query and select one of the responsive servers to send a subsequent data query to be serviced through the shared memory.
[0013] Figure 2 shows an example of the server selection circuitry 110 identifying a responsive server among multiple data servers linked to a shared memory. In the example shown in Figure 2, the server selection circuitry 110 is communicatively linked to the multiple data servers 202 including the servers labeled as data server A, data server B, data server C, and data server D. The multiple data servers 202 may be some or all of the data servers linked to a shared memory through which the server selection circuitry 110 may query for data.
[0014] The server selection circuitry 110 may query data through any one of the multiple data servers 202. A data server may include any combination of circuitry or logic for performing a particular task or operation. For example, the multiple data servers 202 may be part of a data serving system through which client applications (e.g., a web server) request data. An individual data server among the multiple data servers 202 may service a data query from a client application by retrieving the queried data from a shared memory 210 accessible to the multiple data servers 202. The server selection circuitry 110 may send data queries to the multiple data servers 202 through a communication network 204. The communication network 204 may include a local area network (LAN), for example as part of a data center implementation as client devices interface with backend servers.
[0015] The multiple data servers 202 may be linked to a shared memory, such as the shared memory 210 shown in Figure 2. The shared memory 210 may implement a common memory namespace accessible to client applications through the multiple data servers 202. The shared memory 210 may take the form of a random access memory (RAM), which may be a dynamic RAM or a non-volatile RAM for example. In some examples, the shared memory is byte-addressable, thus supporting access to a particular memory address or memory address range within the shared memory. In other examples, the shared memory is block addressable. A memory medium that implements the shared memory may be volatile or non-volatile. Thus, in some examples, the shared memory is a non-volatile computer storage medium, such as a non-volatile RAM, a hard drive, flash memory, optical disk, memristor array, solid state drive, and the like.
[0016] The multiple data servers 202 and shared memory 210 may be implemented, for example, as a rack-scale system. The multiple data servers 202 may access the shared memory 210 through a high-speed memory fabric or network, supporting increased access speed to retrieve data as compared to shared disk systems. A data serving system or other system implementing the multiple data servers 202 may share physical resources of the rack-scale system with other applications or services, which may cause resource contention and delay in servicing data queries. The server selection circuitry 110 may reduce such delay by identifying responsive data servers among the multiple data servers 202. In a rack-scale system, any of the multiple data servers 202 may be capable of servicing a data query, and thus the server selection circuitry 110 may leverage the common access to the shared memory 210 and flexibly select a particular data server that meets specific performance constraints or quality of service (QoS) requirements, e.g., as set forth by a client application.
[0017] Next, various features of the server selection circuitry 110 are described with respect to a probe phase and a selection phase. The server selection circuitry 110 may implement any combination of the features described for the probe phase and selection phase, e.g., operating at separate times, sequentially, concurrently, or according to any other combination.
Probe Phase
[0018] In a probe phase, the server selection circuitry 110 may collect response data from the multiple data servers 202. Through the collected response data, the server selection circuitry 110 may categorize an individual data server as responsive or unresponsive, which may also be referred to as a responsiveness determination. In some examples, the server selection circuitry 110 collects response data as a response latency for an individual data server to service (e.g., respond to) a data query. As other examples, the server selection circuitry 110 may collect response data as bandwidth availability, resource consumption, or any other performance metric. As discussed in greater detail below, the server selection circuitry 110 may categorize responsive and unresponsive servers according to collected response data.
[0019] To collect response data from the multiple data servers 202, the server selection circuitry 110 may send a data query to some or all of the multiple data servers 202. In the example shown in Figure 2, the server selection circuitry 110 sends the data query 220 to each of the data servers A, B, C, and D. The data query 220 may be a data query or request generated by a client application, such as a web search application querying for a search term. In sending the data query, the server selection circuitry 110 may send the same data query 220 to more than one data server, even though any individual data server can service the data query 220. The server selection circuitry 110 may, for example, clone the data query 220 so as to send an instance of the data query 220 to some or all of the multiple data servers 202.
[0020] The server selection circuitry 110 may concurrently send the data query 220 to the multiple data servers 202. In that regard, the server selection circuitry 110 may duplicate and send the data query 220 to more than one data server, independent of whether any of the data servers has responded to the data query 220. Put another way, the server selection circuitry 110 may refrain from injecting a calculated or deliberate delay between successive clones of the data query 220 sent to successive data servers among the multiple data servers 202, and from stopping the transmission of the data query 220 if another data server responds to an earlier sent instance of the data query 220.
[0021] To illustrate in another way, the server selection circuitry 110 may unconditionally send the instances of the data query 220 to the multiple data servers. In that regard, the server selection circuitry 110 may inject delay (e.g., to prevent overloading the shared memory or causing resource contention), but nonetheless send each of the cloned instances of the data query 220 to the multiple data servers without dependence on whether a response has been received for a previously sent instance of the data query 220. The server selection circuitry 110 may thus send the cloned copies of the data query 220 without hedging, and may instead concurrently (or with delay) send the data query 220 to multiple data servers 202 to service the query as well as to collect response data.
[0022] The server selection circuitry 110 may record the response time for individual data servers to respond to the data query 220. Upon receiving responses to the data query 220, the server selection circuitry 110 may populate corresponding portions of the server response list 104 with the response times. In the example shown in Figure 2, the server selection circuitry 110 records the response times of 500 microseconds (μs), 450 μs, 950 μs, and 300 μs for the data servers A, B, C, and D respectively to respond to the data query 220. As the data query 220 may originate from a client application, the server selection circuitry 110 may determine the data query 220 as serviced upon receiving the first response from the multiple data servers 202. Thus, in Figure 2, the server selection circuitry 110 may receive the requested data when the data server D responds to the data query 220 with a shortest response time of 300 μs, providing the queried data returned by data server D to the client application.
[0023] During the probe phase, the server selection circuitry 110 may record response data for some or all of the multiple data servers 202. The server selection circuitry 110 may, for example, populate the server response list 104 through multiple, different data queries directed at different subsets of the pool of data servers linked to the shared memory 210. The server selection circuitry 110 may probe subsets of the multiple data servers 202 through different data queries until response data has been collected for the multiple data servers 202. To illustrate, the server selection circuitry 110 may probe 'k' number of distinct data servers through a first data query, cloning and concurrently sending the first data query to the 'k' distinct data servers. The 'k' distinct data servers may be some, but not all, of the multiple data servers linked to the shared memory and capable of servicing the data query. For the example shown in Figure 2, the server selection circuitry 110 may clone the first data query for sending to the data servers A and B, for instance. The server selection circuitry 110 may record the response times for the 'k' distinct data servers to service the first data query.
[0024] Then, the server selection circuitry 110 may clone and concurrently send a second data query to another data server subset among the multiple data servers 202, such as data servers C and D in Figure 2. The server selection circuitry 110 may thus record the response times for the next subset of the multiple data servers 202. The server selection circuitry 110 may continue to clone and send data queries until, for example, response data has been collected for each of the multiple data servers 202. Thus, in a probe phase, the server selection circuitry 110 may collect response data for the multiple data servers 202, which may be tracked through the server response list 104.
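The probe phase described in paragraphs [0019] to [0024] can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `send(server, query)` callable is a hypothetical stand-in for whatever transport reaches a data server, a thread pool is one assumed way of sending clones concurrently, and the function names are invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def probe(servers, query, send):
    """Clone `query` to every server concurrently and record response times.

    `send(server, query)` is a hypothetical blocking call that returns the
    queried data. Returns a dict mapping server to response time in
    microseconds, playing the role of the server response list.
    """
    def timed_send(server):
        start = time.perf_counter()
        send(server, query)
        return server, (time.perf_counter() - start) * 1e6

    # Every clone is submitted unconditionally: no send waits on, or is
    # cancelled by, a response to an earlier clone (no hedging).
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        return dict(pool.map(timed_send, servers))


def probe_in_subsets(servers, queries, send, k):
    """Probe `k` servers per query; one query is consumed per subset."""
    response_list = {}
    for i, query in zip(range(0, len(servers), k), queries):
        response_list.update(probe(servers[i:i + k], query, send))
    return response_list
```

With four servers and `k = 2`, a first query would be cloned to servers A and B and a second query to servers C and D, matching the subsets used in the Figure 2 discussion. If fewer queries than subsets are supplied, the remaining servers simply stay unprobed in this sketch.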
Selection Phase
[0025] In a selection phase, the server selection circuitry 110 may identify responsive servers among the multiple data servers 202 and select a particular responsive server to query data through. Figure 3 shows an example of the server selection circuitry 110 selecting a responsive server for servicing a data query. In particular, the server selection circuitry 110 may perform a responsive server determination using the response data collected during the probe phase. The response data may be specified in the server response list 104, for example. In Figure 3, the server selection circuitry 110 tracks response times for the data servers A, B, C, and D through the server response list 104. In the particular example shown in Figure 3, the server response list 104 indicates response times of 500 μs, 450 μs, 950 μs, and 300 μs for data servers A, B, C, and D respectively to service the data query 220.
[0026] The server selection circuitry 110 may identify a data server as responsive according to any number of responsiveness criteria, one example of which is through a response latency threshold. The response latency threshold may specify a time boundary for servicing a data query that a data server is to satisfy to be categorized as responsive. Thus, the server selection circuitry 110 may identify responsive servers from the server response list 104 as those with an associated response time that does not exceed a response latency threshold. In the example shown in Figure 3, the server selection circuitry 110 may define a response latency threshold as 700 μs according to an application QoS requirement, and thus identify data servers A, B, and D as responsive and server C as unresponsive. The server response list 104 may provide an indication as to which of the multiple data servers 202 are determined as responsive according to the responsiveness criteria, and which are determined as unresponsive. For example, the server response list 104 may include a responsiveness flag or bit indicating the categorization of a particular data server.
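The threshold comparison in paragraph [0026] can be worked through with the Figure 3 values. The following Python sketch is illustrative only; the dictionary layout and the `classify` name are assumptions, not structures taken from the specification.

```python
def classify(response_times_us, threshold_us):
    """Flag a server as responsive when its time does not exceed the threshold."""
    return {server: t <= threshold_us for server, t in response_times_us.items()}


# Response times (microseconds) and threshold from the Figure 3 example.
server_response_list = {"A": 500, "B": 450, "C": 950, "D": 300}
flags = classify(server_response_list, threshold_us=700)
responsive = [server for server, ok in flags.items() if ok]
print(responsive)  # ['A', 'B', 'D']
```

The boolean stored per server corresponds to the responsiveness flag or bit that the server response list 104 may carry.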
[0027] The response latency threshold may be set in various ways. In some examples, the server selection circuitry 110 defines a response latency threshold according to a QoS requirement specified by a client application that originates the data query. The application QoS requirement may specify a memory access bandwidth requirement, throughput requirement, resource availability requirement, or any other quality or performance metric. The server selection circuitry 110 may translate the application QoS requirement into a corresponding response latency threshold that satisfies the application QoS requirement. Similarly, for other types of collected response data (e.g., bandwidth requirement), the server selection circuitry 110 may translate the application QoS requirement into a corresponding responsiveness criterion for the response data that the server selection circuitry 110 may apply for the responsive server determination. Accordingly, the server selection circuitry 110 may identify a responsive data server among the multiple data servers 202 on an application-specific basis. As application QoS requirements may vary by application, the server selection circuitry 110 may flexibly support differentiated responsive server determination for different applications with varying quality and/or performance requirements.
[0028] The server selection circuitry 110 may select a responsive server for handling a data query. Thus, in Figure 3, the server selection circuitry 110 may select a particular responsive server among data servers A, B, and D for servicing a data query. In some examples, the server selection circuitry 110 selects among the responsive servers in a round-robin manner. In other examples, the server selection circuitry 110 randomly selects among the identified responsive servers. Selection techniques among multiple responsive servers may be feasible as any of the responsive data servers may be capable of servicing the data query, as the data servers share access to the shared memory 210. As a client application generates data queries for stored data in the shared memory 210, the server selection circuitry 110 may select responsive data servers for servicing the data queries, e.g., data servers with response times that meet the QoS requirements for the client application. Implementing the features described herein, the server selection circuitry 110 may maintain a QoS level for data service in a shared memory system, which may be provided through selection of responsive servers in the selection phase.
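The two selection techniques named in paragraph [0028], round-robin and random, might look as follows in Python. The `ResponsiveSelector` class and its method names are hypothetical and serve only to make the two policies concrete.

```python
import itertools
import random


class ResponsiveSelector:
    """Pick one responsive server per data query (illustrative sketch)."""

    def __init__(self, responsive_servers):
        self._servers = list(responsive_servers)
        self._cycle = itertools.cycle(self._servers)

    def next_round_robin(self):
        # Visit the responsive servers in a fixed rotating order.
        return next(self._cycle)

    def next_random(self, rng=random):
        # Pick uniformly at random among the responsive servers.
        return rng.choice(self._servers)


selector = ResponsiveSelector(["A", "B", "D"])
picks = [selector.next_round_robin() for _ in range(4)]
print(picks)  # ['A', 'B', 'D', 'A']
```

Either policy is workable here because every responsive server can reach the same shared memory, so the choice affects load distribution rather than correctness.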
[0029] While the probe phase and selection phase are distinctly discussed above, the server selection circuitry 110 may implement the features described herein in combination. For example, during the selection phase, the server selection circuitry 110 may continue to monitor response times or otherwise collect response data for the multiple data servers 202, whether for responsive data servers, unresponsive data servers, or both. Even after an initial probe phase (e.g., upon startup of an application or physical device(s) implementing server selection circuitry 110), the server selection circuitry 110 may update the server response list 104 with collected response data and perform subsequent responsive and unresponsive server determinations for the multiple data servers 202.
[0030] In some examples, the server selection circuitry 110 differentiates between different types of data queries. A client application may generate multiple data query types, examples of which include search queries, scan queries, add queries, change queries, delete queries, and more. Different data query types may be characterized with different QoS requirements or response latency thresholds. The server selection circuitry 110 may separately track response times and apply separate responsiveness criteria for the multiple data query types. Thus, for a particular data query type (e.g., web search data query), the server selection circuitry 110 may identify responsive servers specifically using response data relevant to the particular data query type, such as response times for the multiple data servers 202 in servicing the particular data query type. In some examples, the server response list 104 separately tracks response times or other response data for multiple data query types. In other examples, the server selection circuitry 110 may maintain multiple server response lists 104, e.g., individually for a particular data query type. Along similar lines, the server selection circuitry 110 may apply varying response latency thresholds depending on the particular data query type being serviced, which may be preconfigured, customized, or set forth according to an application QoS requirement for the particular data query types.
[0031] Along similar lines, the server selection circuitry 110 may differentiate data queries of different sizes. The data size of returned data for the data query may impact performance, and impact meeting of application QoS requirements. As one example, data queries for selective data searches may return less result data whereas less selective or broader searches may return a greater amount of result data. The server selection circuitry 110 may differentiate between these data queries of differing size by separately collecting and tracking response data for data queries within particular data size tiers, for example. Thus, the server response list 104 may track response data for the multiple data servers with respect to the multiple data size tiers, which may support application of different or individualized response latency thresholds for different data size tiers.
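One way to picture the per-type and per-size tracking of paragraphs [0030] and [0031] is a response list keyed by query type and data size tier. The Python sketch below is an assumption-laden illustration: the tier boundaries, tier names, threshold values, and the `TieredResponseList` class are all hypothetical and do not come from the specification.

```python
from collections import defaultdict

# Hypothetical tier boundaries, in bytes of returned data.
SIZE_TIERS = [(1_024, "small"), (1_048_576, "medium")]


def size_tier(result_bytes):
    """Map a result size to a tier name; anything above the last bound is 'large'."""
    for upper_bound, name in SIZE_TIERS:
        if result_bytes <= upper_bound:
            return name
    return "large"


class TieredResponseList:
    """Track response times separately per (query type, size tier)."""

    def __init__(self, thresholds_us):
        # thresholds_us maps (query_type, tier) to a latency threshold.
        self._thresholds = thresholds_us
        self._times = defaultdict(dict)

    def record(self, query_type, result_bytes, server, response_time_us):
        key = (query_type, size_tier(result_bytes))
        self._times[key][server] = response_time_us

    def responsive(self, query_type, tier):
        key = (query_type, tier)
        limit = self._thresholds[key]
        return [s for s, t in self._times[key].items() if t <= limit]


tiers = TieredResponseList({("search", "small"): 700, ("scan", "large"): 5000})
tiers.record("search", 512, "A", 500)
tiers.record("search", 512, "C", 950)
tiers.record("scan", 5_000_000, "C", 4000)
```

In this usage, server C would be unresponsive for small search queries yet responsive for large scan queries, which is the kind of individualized determination the paragraphs describe.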
[0032] Through responsive server determination and selection, the server selection circuitry 110 may reduce tail latencies. Tail latencies may refer to the latency value at the tail or upper end of a latency distribution (e.g., 95th or 99th percentile). In a data serving system, data queries or other web server operations may be subject to resource contention resulting from other applications or services concurrently utilizing the same physical resources. High server response times may result in shared resource systems, which may result in high tail latency values. The server selection circuitry 110 may reduce the tail latency values for a data serving system, through specifically querying responsive servers among the multiple data servers 202, each of which may be capable of retrieving data from the shared memory. Moreover, by concurrently sending cloned instances of a data query to multiple data servers, the server selection circuitry 110 may take advantage of the access speeds for a shared memory implemented as non-volatile memory (e.g., a memristor array). Doing so may likewise reduce tail latency values.
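To make the notion of a tail latency concrete, the short Python sketch below computes a percentile with the nearest-rank method. The choice of nearest-rank and the sample values are assumptions for illustration; the specification does not prescribe how a percentile is computed.

```python
import math


def percentile(latencies_us, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    samples at or below it. `pct` is an integer between 1 and 100."""
    ordered = sorted(latencies_us)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical sample: 95 fast responses and 5 slow ones (microseconds).
samples = [300] * 95 + [950] * 5
print(percentile(samples, 95))  # 300
print(percentile(samples, 99))  # 950
```

In this made-up sample a handful of slow responses leaves the 95th percentile untouched but dominates the 99th, which is why steering queries away from unresponsive servers targets the tail specifically.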
[0033] Figure 4 shows an example of logic 400 that the server selection circuitry 110 may implement. The server selection circuitry 110 may implement the logic 400 as hardware and/or machine-readable instructions, for example. The server selection circuitry 110 may execute the logic 400 as a process or method to identify a responsive server among multiple data servers linked to a shared memory.
[0034] The server selection circuitry 110 may send a first data query to multiple data servers, the multiple data servers linked to a shared memory storing data requested by the first data query (402). In some examples, each individual server among the multiple data servers can service the first data query by retrieving requested data from the shared memory. For example, the multiple data servers and shared memory may be implemented as part of a rack-scale system. The multiple data servers to which the server selection circuitry 110 sends the first data query may be some or all of the data servers that access the shared memory. In sending the first data query, the server selection circuitry 110 may clone the first data query to send an instance of the first data query to each of the multiple data servers.
[0035] The server selection circuitry 110 may track response times for the multiple data servers (404), and the response times may include a particular response time for each of the multiple data servers to service the first data query. In some examples, the server selection circuitry 110 tracks the response times through a server response list 104. The server selection circuitry 110 may identify a responsive server from among the multiple servers according to the response times (406), such as by applying a response latency threshold corresponding to an application QoS requirement. The server selection circuitry 110 may identify a responsive server by determining that the particular response time for the responsive server to service the first data query does not exceed the response latency threshold. Then, the server selection circuitry 110 may send a second data query to the responsive server (408).
[0036] Figure 5 shows an example of logic 500 that the server selection circuitry 110 may implement. The server selection circuitry 110 may implement the logic 500 as hardware and/or machine-readable instructions, for example. The server selection circuitry 110 may execute the logic 500 as a process or method to support identifying and selecting a responsive server among multiple data servers linked to a shared memory.
[0037] The server selection circuitry 110 may monitor the multiple data servers to determine a previously identified responsive server as unresponsive, for example during the selection phase and after an initial probe phase. When sending a data query to a responsive server (e.g., selected through the server response list 104), the server selection circuitry 110 may monitor and track the response time for the responsive server to service the data query. The data query may be a second data query sent to the responsive server (the first data query being sent as part of the probe phase or as the first data query described in Figure 4). When the server selection circuitry 110 determines that the response time for the responsive server to service the data query exceeds a response latency threshold (502), the server selection circuitry 110 may perform, in response, a responsiveness verification process to determine whether the responsive server has become unresponsive. The responsiveness verification process may occur during normal operation of the server selection circuitry 110 in selecting data servers through which to query data.
[0038] As part of a responsiveness verification process, the server selection circuitry 110 may send a subsequent data query for servicing by the responsive server (506). The subsequent data query may be sent as a third data query and at, for example, a subsequent time when the responsive server is selected by the server selection circuitry 110 for servicing a data query. When the response time for servicing the subsequent data query exceeds the response latency threshold, the server selection circuitry 110 may determine that the responsive server has become unresponsive (510). The server selection circuitry 110 may also identify the responsive server as an unresponsive server instead (512), for example by updating the server response list 104 or otherwise categorizing the responsive server instead as unresponsive.
[0039] When the response time for servicing the subsequent data query does not exceed the response latency threshold, the server selection circuitry 110 may determine that the responsive server has not become unresponsive (514) and continue to select the responsive server for handling data queries to the shared common memory. The server selection circuitry 110 may determine not to send subsequent data queries to an unresponsive server (516), such as the previously responsive server now identified as unresponsive.
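The two-step check of the logic 500 can be pictured as a small state update, sketched below in Python under stated assumptions. The `ServerState` class, its field names, and the `record_response` function are hypothetical and introduced only for illustration; the parenthesized numbers in the comments refer to the steps of Figure 5 as described in the text.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    responsive: bool = True             # current categorization of the server
    pending_verification: bool = False  # a slow response is awaiting a re-check


def record_response(state: ServerState, response_time: float,
                    threshold: float) -> ServerState:
    """Update a server's state after it services one data query."""
    if not state.responsive:
        # No further data queries are sent to an unresponsive server (516).
        return state
    if response_time <= threshold:
        # The server has not become unresponsive (514).
        state.pending_verification = False
        return state
    if state.pending_verification:
        # The subsequent query was also slow: recategorize (510, 512).
        state.responsive = False
        state.pending_verification = False
    else:
        # First slow response (502): start the verification process.
        state.pending_verification = True
    return state
```

In this sketch a single slow response only flags the server; it takes a second slow response on the subsequent data query to recategorize it, which mirrors the second and third data queries discussed above.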
[0040] While the logic 500 shown in Figure 5 provides one example, the server selection circuitry 110 may employ any number or combination of responsiveness verification criteria to determine whether a previously identified responsive server has become unresponsive. As one illustrative example, a responsiveness verification criterion may be met when a threshold number of data queries serviced by the previously identified responsive server have a response time that exceeds the response latency threshold, e.g., when 3 or more data queries serviced by the previously identified responsive server have response times that exceed the response latency threshold.
[0041] Additional examples of responsiveness verification criteria for the previously identified responsive server include: (i) when a threshold number of serviced data queries have a response time exceeding the response latency threshold over a set or predetermined period of time; (ii) when a threshold percentage of the serviced data queries have a response time exceeding the response latency threshold (e.g., as measured over a predetermined number of previous data queries or over a predetermined period of time); and (iii) when a serviced data query exceeds the response latency threshold by more than a threshold time amount. The server selection circuitry 110 may employ any combination of responsiveness verification criteria to determine a previously identified responsive server as unresponsive. In some examples, the server selection circuitry 110 applies the responsiveness verification criteria through normal operation of server selection and data querying, e.g., by performing responsiveness verification as a background process in the selection phase. In some examples, the server selection circuitry 110 employs a dedicated responsiveness verification, e.g., directly in response to or specifically upon detecting that a response time for the previously identified responsive server has exceeded a response latency threshold.
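One way the criteria above could be combined is sketched below in Python. The class name `VerificationCriteria`, its parameter names, and the default values are assumptions made for illustration. The sketch measures criteria (i) and (ii) over a sliding window of the most recent data queries rather than over a period of time, and it applies the percentage criterion only once the window is full so that a single early slow response does not dominate.

```python
from collections import deque
from typing import Deque


class VerificationCriteria:
    """Combine count, percentage, and excess-latency criteria for one server."""

    def __init__(self, threshold: float, max_slow: int = 3,
                 max_slow_fraction: float = 0.5, max_excess: float = 1.0,
                 window: int = 10) -> None:
        self.threshold = threshold                  # response latency threshold
        self.max_slow = max_slow                    # criterion (i): slow-query count
        self.max_slow_fraction = max_slow_fraction  # criterion (ii): slow-query share
        self.max_excess = max_excess                # criterion (iii): excess over threshold
        self.recent: Deque[float] = deque(maxlen=window)

    def observe(self, response_time: float) -> bool:
        """Record one response time; return True if any criterion is met."""
        self.recent.append(response_time)
        slow = sum(1 for t in self.recent if t > self.threshold)
        by_count = slow >= self.max_slow
        window_full = len(self.recent) == self.recent.maxlen
        by_fraction = window_full and slow / len(self.recent) >= self.max_slow_fraction
        by_excess = response_time - self.threshold > self.max_excess
        return by_count or by_fraction or by_excess
```

Here any single criterion suffices to report the server as unresponsive; requiring several criteria at once would be an equally valid combination under the text.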
[0042] In some examples, the server selection circuitry 110 performs a probe phase as part of the responsiveness verification process. Thus, the server selection circuitry 110 may collect response data for some or all of the multiple data servers upon detecting that a particular data server has become unresponsive, including collecting response data for responsive data servers. In doing so, the server selection circuitry 110 may detect whether multiple responsive servers have become unresponsive, e.g., when a newly executed application, job, or task consumes computation resources or bandwidth of multiple servers, compute nodes, or other hardware.
[0043] Once a server is categorized as unresponsive, the server selection circuitry 110 may monitor the unresponsive server to determine whether it has become responsive (518). For example, the server selection circuitry 110 may periodically probe unresponsive data servers to determine whether response data for the unresponsive data servers indicates the servers have become responsive. The server selection circuitry 110 may clone a data query to send to both a selected responsive server (e.g., for servicing the data query) and an unresponsive server (e.g., to probe the unresponsive server). The server selection circuitry 110 may track the response time for the unresponsive server to service the data query, and deem the unresponsive server as responsive instead when the response time does not exceed a response latency threshold. The server selection circuitry 110 may apply various criteria for determining whether an unresponsive server has become responsive, e.g., any of the responsiveness verification criteria discussed above. Upon determining that an unresponsive server has become responsive, the server selection circuitry 110 may update the server response list 104 and select the now responsive server for servicing subsequent data queries.
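The cloned-query probe described above might look like the following Python sketch. The `Send` callable is a hypothetical transport that delivers a query to a named server and returns the result together with the measured response time; it, the function `query_with_probe`, and the dictionary used as the server response list 104 are assumptions for illustration. The two sends are shown sequentially for simplicity, although the clone could equally be sent concurrently.

```python
from typing import Callable, Dict, Tuple

# Hypothetical transport: (server name, query) -> (result, response time in seconds).
Send = Callable[[str, str], Tuple[str, float]]


def query_with_probe(query: str, selected: str, unresponsive: str,
                     send: Send, response_list: Dict[str, float],
                     threshold: float) -> Tuple[str, bool]:
    """Service the query on the selected responsive server and send a clone
    to an unresponsive server to check whether it has recovered."""
    result, elapsed = send(selected, query)
    response_list[selected] = elapsed

    # The clone's result is discarded; only its response time matters here.
    _, probe_elapsed = send(unresponsive, query)
    response_list[unresponsive] = probe_elapsed

    recovered = probe_elapsed <= threshold
    return result, recovered
```

A caller that receives `recovered == True` could then treat the probed server as responsive again for subsequent data queries, or could require several fast probes first by applying any of the verification criteria discussed earlier.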
[0044] Figure 6 shows an example of a computing device 600 that supports responsive server identification among multiple data servers linked to a shared memory. In that regard, the computing device 600 may implement any of the functionality described herein, including according to any of the features described herein for the server selection circuitry 110.
[0045] The computing device 600 may include a processor 610. The processor 610 may include a central processing unit (CPU), microprocessor, and/or any hardware device suitable for executing instructions stored on a machine-readable medium. The computing device 600 may include a machine-readable medium 620. The machine-readable medium 620 may be any electronic, magnetic, optical, or other physical storage device that stores executable instructions, such as the server selection instructions 622 shown in Figure 6. Thus, the machine-readable medium 620 may be, for example, Random Access Memory (RAM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a storage drive, an optical disk, and the like.
[0046] The computing device 600 may execute instructions stored on the machine-readable medium 620 through the processor 610. Executing the instructions may cause the computing device 600 to perform any of the features described herein. One specific example is shown in Figure 6 through the server selection instructions 622. Executing the server selection instructions 622 may cause the computing device 600 to operate according to any combination of features of the server selection circuitry 110.
[0047] As one example, executing the server selection instructions 622 may cause the computing device 600 to populate a server response list 104 in a probe phase, which may include sending a first data query to multiple data servers, the multiple data servers linked to a shared memory accessible to the multiple data servers and recording a response time for the multiple data servers to service the first data query. Executing the server selection instructions 622 may further cause the computing device 600 to identify a selected data server from the server response list 104 for servicing a second data query in a selection phase, which may include defining a response latency threshold according to a quality of service requirement specified by an application; identifying, from the server response list 104, responsive servers with a response time that does not exceed a response latency threshold; and determining the selected data server for servicing the second data query from among the responsive servers.
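A minimal Python sketch of the probe phase and selection phase follows, under several assumptions: each data server is modeled as a callable that services a query, the first data query is sent concurrently through a thread pool, the response latency threshold is taken directly from a QoS latency value in seconds, and the fastest responsive server is selected. The text requires only that the selected server come from among the responsive servers, so the fastest-first choice and all names below are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Optional, Tuple


def probe_phase(servers: Dict[str, Callable[[str], object]],
                query: str) -> Dict[str, float]:
    """Send the same query to every server concurrently and record how long
    each takes to service it (a stand-in for the server response list)."""
    def timed(name: str) -> Tuple[str, float]:
        start = time.perf_counter()
        servers[name](query)
        return name, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        return dict(pool.map(timed, servers))


def selection_phase(response_list: Dict[str, float],
                    qos_latency_seconds: float) -> Optional[str]:
    """Pick a server whose recorded response time meets the QoS-derived
    threshold, or None if no server qualifies."""
    threshold = qos_latency_seconds
    responsive = [name for name, elapsed in response_list.items()
                  if elapsed <= threshold]
    if not responsive:
        return None
    return min(responsive, key=lambda name: response_list[name])
```

For example, probing a server that answers immediately alongside one that sleeps for a few hundred milliseconds, with a threshold of 0.1 seconds, would leave only the first server eligible to service the second data query.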
[0048] As additional examples, executing the server selection instructions 622 may cause the computing device 600 to monitor subsequent response times for the responsive servers in servicing data queries to determine whether a responsive server among the responsive servers has become unresponsive. Determination that a responsive server has become unresponsive may include identifying when a threshold number of the response times for the responsive server exceed the response latency threshold.
[0049] The methods, devices, systems, and logic described above, including the server selection circuitry 110, may be implemented in many different ways in many different combinations of hardware, software or both hardware and software. For example, server selection circuitry 110 may include circuitry in a controller, a microprocessor, or an application specific integrated circuit (ASIC), or may be implemented with discrete logic or components, or a combination of other types of analog or digital circuitry, combined on a single integrated circuit or distributed among multiple integrated circuits. A product, such as a computer program product, may include a storage medium and machine readable instructions stored on the medium, which when executed in an endpoint, computer system, or other device, cause the device to perform operations according to any of the description above.
[0050] The processing capability of the systems, devices, and circuitry described herein, including the server selection circuitry 110, may be distributed among multiple system components, such as among multiple processors and memories, optionally including multiple distributed processing systems. Parameters, databases, and other data structures may be separately stored and managed, may be incorporated into a single memory or database, may be logically and physically organized in many different ways, and may be implemented in many ways, including data structures such as linked lists, hash tables, or implicit storage mechanisms. Programs may be parts (e.g., subroutines) of a single program, separate programs, distributed across several memories and processors, or implemented in many different ways, such as in a library, such as a shared library (e.g., a dynamic link library (DLL)). The DLL, for example, may store code that performs any of the system processing described above.
[0051] While various examples have been described above, many more implementations are possible.
Claims
1. A method comprising:
through server selection circuitry of a device:
sending a first data query to multiple data servers, the multiple data servers linked to a shared memory storing data requested by the first data query;
tracking response times for the multiple data servers, the response times comprising a particular response time for each of the multiple data servers to service the first data query;
identifying a responsive server from among the multiple data servers according to the response times; and
sending a second data query to the responsive server.
2. The method of claim 1, wherein sending the first data query to the multiple data servers comprises cloning the first data query to send an instance of the first data query to each of the multiple data servers.
3. The method of claim 1, comprising concurrently sending the first data query to the multiple data servers.
4. The method of claim 1, wherein identifying the responsive server comprises determining that the particular response time for the responsive server in servicing the first data query does not exceed a response latency threshold.
5. The method of claim 1, further comprising:
tracking response times for the multiple data servers in servicing multiple data query types; and
identifying a responsive server for servicing a particular data query type according to the response times for the multiple data servers in servicing the particular data query type.
6. The method of claim 1, further comprising:
determining that a response time for the responsive server in servicing the second data query exceeds a response latency threshold, and in response:
performing a responsiveness verification process to determine whether the responsive server has become unresponsive.
7. The method of claim 6, wherein performing the responsiveness verification process comprises:
sending a third data query to the responsive server, and when a response time for the responsive server in servicing the third data query also exceeds the response latency threshold:
determining that the responsive server has become unresponsive;
identifying the responsive server as an unresponsive server instead; and
further comprising:
determining not to send subsequent data queries to the unresponsive server.
8. The method of claim 6, wherein performing the responsiveness verification process comprises:
sending multiple subsequent data queries to the responsive server, and when a threshold number of the multiple subsequent data queries have a response time that exceeds the response latency threshold:
determining that the responsive server has become unresponsive; and
identifying the responsive server as an unresponsive server instead.
9. A device comprising:
a memory to store a server response list, the server response list including a listing of multiple data servers linked to a shared memory; and
server selection circuitry to:
maintain the server response list by determining response times for the multiple data servers to service a data query, including
concurrently sending the data query to the multiple data servers to track a response time for each of the multiple servers to service the data query;
identify responsive servers among the multiple data servers listed in the server response list according to the response times to service the data query; and
select one of the responsive servers to send a subsequent data query.
10. The device of claim 9, wherein the server selection circuitry is to maintain the server response list further to:
track response times for the multiple data servers in servicing multiple data query types; and
determine a responsive server for servicing a data query of a particular data query type according to the response times for the multiple data servers in servicing the particular data query type.
11. The device of claim 9, wherein the server selection circuitry is to identify the responsive servers by determining which of the multiple servers have a response time in servicing the data query that does not exceed a response latency threshold.
12. The device of claim 11, wherein the server selection circuitry is further to obtain the response latency threshold according to an application quality of service requirement.
13. A device comprising:
a non-transitory machine readable medium storing executable instructions to:
populate a server response list in a probe phase, including:
sending a first data query to multiple data servers, the multiple data servers linked to a shared memory accessible to the multiple data servers;
recording a response time for the multiple data servers to service the first data query; and
identify a selected data server from the server response list for servicing a second data query in a selection phase, including:
defining a response latency threshold according to a quality of service requirement specified by a client application;
identifying, from the server response list, responsive servers with a response time that does not exceed a response latency threshold; and
determining the selected data server for servicing the second data query from among the responsive servers.
14. The device of claim 13, wherein the executable instructions are further to monitor subsequent response times for the responsive servers in servicing data queries to determine whether a responsive server among the responsive servers has become unresponsive.
15. The device of claim 14, wherein the executable instructions are to determine that a responsive server has become unresponsive when a threshold number of the response times for the responsive server exceed the response latency threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2015/018019 WO2016137496A1 (en) | 2015-02-27 | 2015-02-27 | Responsive server identification among multiple data servers linked to a shared memory |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016137496A1 (en) | 2016-09-01 |
Family
ID=56789515
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2015/018019 WO2016137496A1 (en) | 2015-02-27 | 2015-02-27 | Responsive server identification among multiple data servers linked to a shared memory |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2016137496A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090276520A1 (en) * | 2008-05-05 | 2009-11-05 | Lockheed Martin Corporation | Method and apparatus for server election, discovery and selection in mobile ad hoc networks |
US7937473B2 (en) * | 2005-09-20 | 2011-05-03 | Nec Corporation | Resource-amount calculation system, and method and program thereof |
US8037197B2 (en) * | 2007-10-26 | 2011-10-11 | International Business Machines Corporation | Client-side selection of a server |
US8244873B2 (en) * | 2008-07-03 | 2012-08-14 | International Business Machines Corporation | Method, system and computer program product for server selection, application placement and consolidation planning of information technology systems |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220353320A1 (en) * | 2019-09-23 | 2022-11-03 | Institute Of Acoustics, Chinese Academy Of Sciences | System for providing exact communication delay guarantee of request response for distributed service |
US12010164B2 (en) * | 2019-09-23 | 2024-06-11 | Institute Of Acoustics, Chinese Academy Of Sciences | System for providing exact communication delay guarantee of request response for distributed service |
EP4440041A1 (en) * | 2023-03-31 | 2024-10-02 | Juniper Networks, Inc. | Dynamic load balancing of radius requests from network access server device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15883589 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 15883589 Country of ref document: EP Kind code of ref document: A1 |