US20190342380A1 - Adaptive resource-governed services for performance-compliant distributed workloads - Google Patents

Adaptive resource-governed services for performance-compliant distributed workloads

Info

Publication number
US20190342380A1
Authority
US
United States
Prior art keywords
server
workload
performance
task
downstream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/991,953
Inventor
Shireesh Kumar Thota
Momin Mahmoud Al-Ghosien
Rajeev Sudhakar BHOPI
Samer Boshra
Madhan Gajendran
Atul Katiyar
Abhijit Padmanabh PAI
Karthik Raman
Ankur Savailal SHAH
Pankaj Sharma
Dharma Shukla
Shreshth Singhal
Hari Sudan SUNDAR
Lalitha Manjapara VISWANATHAN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC filed Critical Microsoft Technology Licensing LLC
Priority to US15/991,953 priority Critical patent/US20190342380A1/en
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC reassignment MICROSOFT TECHNOLOGY LICENSING, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: PAI, ABHIJIT PADMANABH, AL-GHOSIEN, MOMIN MAHMOUD, RAMAN, KARTHIK, BHOPI, RAJEEV SUDHAKAR, GAJENDRAN, MADHAN, SHARMA, PANKAJ, SUNDAR, HARI SUDAN, SHUKLA, DHARMA, THOTA, SHIREESH KUMAR, SHAH, ANKUR SAVAILAL, BOSHRA, SAMER, KATIYAR, Atul, VISWANATHAN, LALITHA MANJAPARA, SINGHAL, SHRESHTH
Publication of US20190342380A1 publication Critical patent/US20190342380A1/en
Abandoned legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2365Ensuring data consistency and integrity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2002Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where interconnections or communication control functionality are redundant
    • G06F11/2007Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where interconnections or communication control functionality are redundant using redundant communication media
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023Failover techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2056Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • G06F16/184Distributed file systems implemented as replicated file system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/211Schema design and management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/219Managing data history or versioning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2246Trees, e.g. B+trees
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2255Hash tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2272Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2308Concurrency control
    • G06F16/2315Optimistic concurrency control
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2308Concurrency control
    • G06F16/2315Optimistic concurrency control
    • G06F16/2322Optimistic concurrency control using timestamps
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2308Concurrency control
    • G06F16/2315Optimistic concurrency control
    • G06F16/2329Optimistic concurrency control using versioning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2308Concurrency control
    • G06F16/2336Pessimistic concurrency control approaches, e.g. locking or multiple versions without time stamps
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2452Query translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/252Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/258Data format conversion from or to a database
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/273Asynchronous replication or reconciliation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9027Trees
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0803Configuration setting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0896Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003Managing SLA; Interaction between SLA and QoS
    • H04L41/5009Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003Managing SLA; Interaction between SLA and QoS
    • H04L41/5019Ensuring fulfilment of SLA
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003Managing SLA; Interaction between SLA and QoS
    • H04L41/5019Ensuring fulfilment of SLA
    • H04L41/5022Ensuring fulfilment of SLA by giving priorities, e.g. assigning classes of service
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5032Generating service level reports
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00Traffic control in data switching networks
    • H04L47/70Admission control; Resource allocation
    • H04L47/72Admission control; Resource allocation using reservation actions during connection setup
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00Traffic control in data switching networks
    • H04L47/70Admission control; Resource allocation
    • H04L47/76Admission control; Resource allocation using dynamic resource allocation, e.g. in-call renegotiation requested by the user or requested by the network in response to changing network conditions
    • H04L47/762Admission control; Resource allocation using dynamic resource allocation, e.g. in-call renegotiation requested by the user or requested by the network in response to changing network conditions triggered by the network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00Traffic control in data switching networks
    • H04L47/70Admission control; Resource allocation
    • H04L47/83Admission control; Resource allocation based on usage prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004Server selection for load balancing
    • H04L67/1008Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004Server selection for load balancing
    • H04L67/1012Server selection for load balancing based on compliance of requirements or conditions with available server resources
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1034Reaction to server failures by a load balancer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/56Provisioning of proxy services
    • H04L67/565Conversion or adaptation of application format or content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/24Negotiation of communication capabilities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80Database-specific techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/505Clust
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/468Specific access rights for resources, e.g. using capability register
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/52Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1029Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers using data related to the state of servers by a load balancer

Definitions

  • a database service may provide a distributed set of servers with various capabilities, such as a query intake server that receives a query; a query processing server that parses the query; and a storage server that applies the logical operations of the parsed query over a data set.
  • a large-scale, distributed server set may involve a significant number of servers that perform a large number of distinct workloads for a variety of applications and/or clients. Moreover, the workloads of various applications and clients may utilize different process paths through the server set. For example, a process path for a first workload may involve a first sequence of tasks to be performed by a corresponding first sequence of servers, such as a first intake server in a first region; a query processing server; and a first storage server that stores records involved in the first workload.
  • a process path for a second workload may involve a different sequence of tasks to be performed by a corresponding second sequence of servers, such as a second intake server in a second region; the same query processing server; and a second storage server that stores records involved in the second workload.
  • a workload may be sensitive to latency (e.g., a realtime application in which users or devices have to receive a result of the workload within a limited time, and in which delays may be perceptible and/or problematic).
  • a workload may be sensitive to scalability and throughput (e.g., demand for the workload may fluctuate over time, and the inability of the server set to scale up to handle an influx of volume may be problematic).
  • a workload may be sensitive to consistency and/or concurrency issues (e.g., a strictly deterministic workload may have to receive the same result across multiple instances, where inconsistent results may be problematic).
  • a workload may be sensitive to replication and/or resiliency (e.g., downtime, data loss, or the failure of the workload may be problematic).
  • it may be desirable to enable the server set to provide a performance guarantee for a workload, e.g., a guarantee that the server set is capable of handling a surge of volume up to a particular amount while maintaining latency below a particular level.
  • workloads for different applications and/or clients may share a process path.
  • Other workloads may take different process paths through the server set, but may both include one or more servers of the server set that are to be shared by the workloads.
  • Servers may be shared among workloads that are associated with different clients and/or applications; in some scenarios, a server may concurrently handle workloads on behalf of hundreds or even thousands of different applications or clients.
  • a variety of multi-tenancy techniques may be utilized to ensure that a first workload on behalf of a first client or application does not interfere with a second workload on behalf of a second client or application.
  • the server may utilize process and data isolation techniques to ensure that a first workload on behalf of a first client cannot achieve unauthorized access to a second workload on behalf of a second client, including accessing data owned by the second workload or even identifying the presence of the second workload, including the second client or application for which the second workload is processed.
  • the server may protect against resource overutilization. For instance, if a first workload through the server begins exhibiting a surge of volume that exceeds the share of computing resources allocated to the first workload, the use of a resource-sharing technique may enable the server to confine the consequences of the excessive volume to the first workload and to avoid impacting the processing of the second workload, such that a performance guarantee extended to the second workload remains fulfilled. In this manner, servers may allocate and regulate computing resource utilization to promote fulfillment of performance guarantees and allocate computational resources fairly over all of the workloads handled by the server.
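
As one possible illustration of such resource-sharing techniques, the following minimal Python sketch applies a per-workload token-bucket budget so that a surge by one workload is confined to that workload; the workload names, budgets, and class are hypothetical and not part of the original disclosure:

```python
import time

class WorkloadGovernor:
    """Hypothetical per-workload token bucket: each workload is granted a fixed
    budget of request units per second, so a surge by one workload is throttled
    without affecting the budgets reserved for other workloads on the server."""

    def __init__(self, units_per_second: float):
        self.rate = units_per_second
        self.tokens = units_per_second
        self.last_refill = time.monotonic()

    def try_consume(self, units: float) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at one second of budget.
        self.tokens = min(self.rate, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= units:
            self.tokens -= units
            return True
        return False  # Request exceeds this workload's share; defer or reject it.

# Each workload gets its own governor, so overuse by "workload_a"
# cannot exhaust the budget reserved for "workload_b".
governors = {"workload_a": WorkloadGovernor(1000), "workload_b": WorkloadGovernor(1000)}
```
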
  • a workload may involve a process path that involves a sequence of tasks to be performed by an intake server, a query processing server, and a storage server, and to fulfill a performance guarantee.
  • while the respective servers may utilize computational resource allocation to handle the individual tasks of the workload, a problem may arise if the storage server begins to experience an excessive computational load.
  • Such excessive computational load may arise, e.g., due to an over-allocation of tasks onto the storage server; a shortage of computational resources, such as a reduction of network bandwidth; or an unanticipated surge of processing, such as a failure of a process on the storage server that necessitates the invocation of a recovery process.
  • the excessive computational load of the storage server may create a processing jam between the storage server and the query processing server that is upstream of the storage server in the process path.
  • the query processing server may continue to handle the query processing task of the workload at the same rate, but the rate at which the query processing server is able to pass the workload to the storage server may be diminished.
  • the query processing server may address the discrepancy in various ways, such as utilizing an outbound queue for the completed workload of processed queries; however, if the reduced processing rate of the storage server persists, the outbound queue may overflow. Moreover, the additional processing burden placed upon the query processing server may propagate the processing jam upward, even to the point of potentially affecting other workloads that have a processing path through the query processing server but do not utilize the storage server. The resulting gridlock may cause a widespread failure of performance guarantees for a variety of workloads due to the processing jam between the storage server and the upstream query processing server.
  • a server of a process path may estimate, measure, and/or monitor its processing capabilities, and compare such processing capabilities with the performance guarantees of the workloads utilizing a process path through the server. If the server detects a risk of failing the performance guarantee (e.g., due to an overprovisioning of the server or a computational resource shortage), the server may transmit a performance capability alert to an upstream server of the server path, as an indication that the workload being passed to the server may be too large to ensure that the performance guarantees are met.
  • the upstream server that receives the performance capability alert may respond by rate-limiting its processing of the workload, within the performance guarantee, thereby downscaling the processing rate of the upstream server upon the workload to match the diminished processing rate of the downstream server.
  • the upstream server may propagate the performance capability alert further upstream. If the performance capability alert reaches a first server of the server path, the first server may refuse or slow a workload acceptance rate. In this manner, the server set may adapt its acceptance of the workload to the volume for which the process path is capable of fulfilling the performance guarantee, in response to fluctuating processing capabilities (including an unexpected loss of processing capability) of the servers of the server path.
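
As one possible illustration of this alerting and rate-limiting behavior, the following minimal Python sketch models servers of a process path that compare an observed task latency against a budgeted share of the performance guarantee and propagate a performance capability alert upstream; the class, method names, and figures are hypothetical and not taken from the disclosure:

```python
class PipelineServer:
    """Hypothetical pipeline stage: measures its own performance capability,
    compares it with its share of the performance guarantee, and, when at risk,
    sends a performance capability alert to its upstream neighbor, which
    rate-limits its task and may propagate the alert further upstream."""

    def __init__(self, name: str, latency_budget_ms: float, upstream: "PipelineServer" = None):
        self.name = name
        self.latency_budget_ms = latency_budget_ms   # this server's share of the guarantee
        self.upstream = upstream
        self.rate_limited = False

    def check_capability(self, observed_task_latency_ms: float) -> None:
        # Compare the measured capability with the budgeted share of the guarantee;
        # alert the upstream server if the guarantee is at risk.
        if observed_task_latency_ms > self.latency_budget_ms and self.upstream is not None:
            self.upstream.on_capability_alert(source=self.name)

    def on_capability_alert(self, source: str) -> None:
        # Rate-limit the local task to reduce the load delivered downstream.
        self.rate_limited = True
        # This simple sketch always propagates the alert further upstream; an
        # implementation might propagate only if the local rate limit is insufficient.
        if self.upstream is not None:
            self.upstream.on_capability_alert(source=self.name)

# A three-server process path: intake -> query processing -> storage.
intake = PipelineServer("intake", 2.5)
query = PipelineServer("query-processing", 2.5, upstream=intake)
storage = PipelineServer("storage", 2.5, upstream=query)
storage.check_capability(observed_task_latency_ms=4.0)   # rate-limits query, then intake
```
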
  • a first embodiment of the presented techniques involves a server of a server set that performs workloads according to a performance guarantee.
  • the server comprises a processor and a memory storing instructions that, when executed by the processor, cause the server to operate in accordance with the techniques presented herein.
  • the server performs a task of the workload according to a performance guarantee, wherein the workload is processed through the server set according to a process path.
  • responsive to receiving, from a downstream server of the process path, a performance capability alert indicating that a computational load of the downstream server risks failing the performance guarantee, the server may rate-limit the task of the workload to reduce the computational load of the downstream server. After completing the task (to which the rate limit may have been applied), the server delivers the workload to the downstream server.
  • a second embodiment of the presented techniques involves a method of configuring a server of a server set to participate in workloads.
  • the method involves executing, by a processor of the server, instructions that cause the server to operate in accordance with the techniques presented herein.
  • the server receives a workload from an upstream server of a process path of the workload, wherein the workload is associated with a performance guarantee.
  • the server performs a task on the workload, and further identifies a performance capability of the server and compares the performance capability with the performance guarantee of the workload. Responsive to determining that the performance capability risks failing the performance guarantee, the server transmits a performance capability alert to the upstream server.
  • the server rate limits a receipt of additional workloads from the upstream server.
  • a third embodiment of the presented techniques involves a method of configuring a server set to perform a workload according to a performance guarantee.
  • the method involves configuring a server within a process path of the workload through the server set to operate in accordance with the techniques presented herein.
  • the method further involves configuring the server to perform a task on the workload according to the performance guarantee.
  • the method further involves configuring the server to receive a performance capability alert from a downstream server of the process path, wherein the performance capability alert indicates that a computational load of the downstream server risks failing the performance guarantee for the workload.
  • the method further involves configuring the server to rate-limit the task of the server to reduce the workload delivered to the downstream server.
  • the method further involves configuring the server to, after performing the task on the workload, deliver the workload to a downstream server of the process path.
  • FIG. 1 is an illustration of an example scenario featuring a processing of workloads through a server set.
  • FIG. 2 is an illustration of an example scenario featuring a processing of a workload through a server set in accordance with the techniques presented herein.
  • FIG. 3 is a component block diagram illustrating an example server featuring an example system for configuring a server set to process a workload in accordance with the techniques presented herein.
  • FIG. 4 is a flow diagram illustrating an exemplary method of configuring a server to process a workload through a process path of a server set in accordance with the techniques presented herein.
  • FIG. 5 is a flow diagram illustrating an exemplary method of configuring a server set to process a workload through a process path in accordance with the techniques presented herein.
  • FIG. 6 is an illustration of an example computer-readable medium storing instructions that provide an embodiment of the techniques presented herein.
  • FIG. 7 is an illustration of a set of example scenarios featuring a variety of rate-limiting mechanisms for a task in accordance with the techniques presented herein.
  • FIG. 8 is an illustration of a set of example scenarios featuring a variety of process path modifications for a workflow in accordance with the techniques presented herein.
  • FIG. 9 illustrates an exemplary computing environment wherein one or more of the provisions set forth herein may be implemented.
  • FIG. 1 is an illustration of an example scenario 100 featuring a server set 106 that processes workloads 104 on behalf of clients 102 .
  • the server set 106 comprises a distributed query processing system that accepts queries from clients 102 and processes the queries over a data set.
  • the example scenario 100 may similarly apply to a variety of workloads 104 , such as content generation, media rendering and presentation, communication exchange, simulation, supercomputing, etc.
  • the server set 106 comprises a set of servers 108 that respectively perform a task 110 .
  • for example, a pair of intake servers 108 may serve as a front-end, client-facing interface that accepts queries to be processed on behalf of the clients 102; a query processing server 108 may parse the queries, such as translating a query from a query language into a sequence of logical relationships to be applied over the data set; and a pair of storage servers 108 may store a replica or a portion of the data set over which the queries are to be applied.
  • the servers 108 may be arranged such that the workloads 104 of the clients 102 are processed according to a process path 112 , e.g., a sequence of servers 108 that respectively apply a task 110 to the workload 104 and deliver the partially processed workload 104 to the next, downstream server 108 in the process path 112 .
  • the process path 112 may enable a pipelined evaluation of workloads 104 that enable the servers 108 to apply the tasks 110 in the manner of an assembly line, thereby reducing idle processing capacity and promoting the scalability of the server set 106 to handle a significant volume of workloads 104 in a concurrent manner.
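
As one possible illustration of such a process path, the following minimal Python sketch models a pipeline of hypothetical intake, query-processing, and storage tasks applied in sequence to a workload; the task bodies are toy stand-ins, not the actual tasks of the disclosure:

```python
from typing import Callable, Dict, List

# A process path modeled as an ordered list of tasks; each stage applies its
# task to the workload and hands the partial result to the next, downstream stage.
ProcessPath = List[Callable[[Dict], Dict]]

def intake_task(workload: Dict) -> Dict:
    workload["accepted"] = True                     # front-end acceptance of the query
    return workload

def query_processing_task(workload: Dict) -> Dict:
    workload["parsed"] = workload["query"].split()  # toy stand-in for query parsing
    return workload

def storage_task(workload: Dict) -> Dict:
    workload["result"] = []                         # toy stand-in for applying the query to data
    return workload

def run_process_path(path: ProcessPath, workload: Dict) -> Dict:
    for task in path:                               # assembly-line hand-off between servers
        workload = task(workload)
    return workload

result = run_process_path([intake_task, query_processing_task, storage_task],
                          {"query": "SELECT * FROM records"})
```
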
  • different process paths 112 may be utilized for different workloads 104; e.g., the first intake server 108 may receive workloads 104 from a first set of clients 102 and/or geographic regions while the second intake server 108 receives workloads 104 from a second set of clients 102 and/or geographic regions. Similarly, the first workload 104 may present a first query over the portion of the database stored by the first storage server 108, while the second workload 104 may present a second query over the portion of the database stored by the second storage server 108. Conversely, the process paths 112 of different workloads 104 may coincide through one or more servers 108; e.g., both workloads 104 may utilize the same query processing server 108.
  • the workloads 104 may be associated with various kinds of performance constraints.
  • the first workload 104 may be particularly sensitive to latency; e.g., the first workload 104 may comprise time-sensitive data that the client 102 seeks to process in an expedited manner, and delays in the completion of the workload 104 may be highly visible, and may reduce or negate the value of the completed workload 104 .
  • the second workload 104 may involve fluctuating volume, such as a data-driven service that is sometimes heavily patronized by users and other times is used by only a few users.
  • the second workload 104 may therefore be sensitive to scalability, and may depend upon comparatively consistent processing behavior of the server set 106 to handle the second workload 104 even as demand scales upward.
  • the performance dependencies may be driven, e.g., by the computational constraints of the workload 104 ; the intended uses of the results of the processing; the circumstances of an application for which the workload 104 is performed; and/or the preferences of the client 102 submitting the workload 104 .
  • the sensitivities and/or tolerances of different clients 102 and workloads 104 may vary; e.g., the first workload 104 may present a highly consistent and regular volume such that scalability is not a concern, while the second client 102 is able to tolerate reasonable variations in latency 114 , such as marginally delayed completion of the workloads 104, as long as processing is completed correctly at peak volume.
  • the server set 106 may extend to each client 102 a performance guarantee 114 of a performance capability of the server set 106 to handle the workload 104 .
  • the server set 106 may extend to the first client 102 a performance guarantee 114 that processing of the majority of workloads 104 (e.g., 95% of workloads 104 ) will complete within 10 milliseconds.
  • the server set 106 may offer a performance guarantee 114 of correct processing of the second workload 104 up to a defined volume, such as 1,000 requests per second.
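
As one possible illustration of evaluating a guarantee of the first kind, the following minimal Python sketch checks whether a set of observed completion times satisfies a "95% within 10 milliseconds" style guarantee; the nearest-rank percentile method and the sample latencies are assumptions for illustration only:

```python
import math

def meets_latency_guarantee(latencies_ms, percentile=95.0, limit_ms=10.0) -> bool:
    """Hypothetical check of a guarantee of the form 'the given percentile of
    workloads completes within limit_ms' (e.g., 95% within 10 milliseconds)."""
    if not latencies_ms:
        return True
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the value below which `percentile` percent
    # of the observations fall.
    rank = max(0, math.ceil(percentile / 100.0 * len(ordered)) - 1)
    return ordered[rank] <= limit_ms

print(meets_latency_guarantee([3.2, 4.1, 8.9, 9.7, 12.4]))  # False: p95 exceeds 10 ms
```
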
  • the server set 106 may be arranged to fulfill the performance guarantees 114 of the workloads 104 , e.g., by supplementing, adapting, and/or optimizing the configuration of servers 108 comprising the data set.
  • the server 108 may have to secure and isolate the workloads 104 of the first client 102 from the workloads 104 of the second client 102, e.g., to prevent the second workload 104 from accessing proprietary information of the first client 102 and tampering with the operation of the first workload 104.
  • the server 108 may comprise a limited set of computational resources, such as processor capacity, storage capacity, and network bandwidth.
  • An overconsumption of the computational resources of the query processing server 108 by the first workload 104 may create a shortage of computational resources of the server 108 for the second workload 104 , such as limited processing capacity; an exhaustion of available storage; and/or constrained network bandwidth.
  • Such overconsumption by the first workload 104 may lead to delays or even a failure in the processing of the second workload 104 , including a failure of the performance guarantee 114 of the second workload 104 , such as an inability of the query processing server 108 to scale up in order to handle peak volume of the second client 102 .
  • techniques may be utilized to allocate and compartmentalize the computational resources of the server 108 for each workload 104 , such as processor time-slicing and per-workload quotas or caps on storage and network capacity.
  • a server 108 may limit the adverse effects of an overconsumption of resources to the workload 104 responsible for the overconsumption; e.g., increased processing demand by the first workload 104 may result in delayed completion of the first workload 104 without impacting the processing of the second workload 104 .
  • such resource-limitation techniques may themselves consume computational resources (e.g., processor time-slicing among a set of workloads 104 may impose overhead due to context-switching).
  • Such inefficiency may grow with the scale of the server set 106, such as when a particular server 108 is used to process hundreds or even thousands of workloads 104, so the isolation and allocation techniques may have to be implemented with careful attention to efficiency.
  • a performance guarantee 114 of a particular workload 104 may be limited by systemic factors, such as a shortage of storage capacity due to a failure of a hard disk drive in a storage array, or a surge in a computational process, such as a background maintenance process.
  • An introduction of line noise into a network connection, such as due to electromagnetic interference or a faulty cable, may lead to diminished throughput and increased latency.
  • the server set 106 may experience a failure of a particular server 108 and may have to re-route the workload 104 of the failed server to other servers 108 of the server set 106 , thereby increasing the computational load of the individual servers 108 . Any such change in the performance of the server set 106 may interfere with the process path 112 of a workload 104 , which may risk failing the performance guarantee 114 .
  • the respective servers 108 of the server set 106 may include a monitoring process of a performance capability, such as available processor capacity, storage, network throughput, and latency.
  • the performance capabilities may also include considerations such as resiliency to data loss, e.g., the volume of data stored by the server 108 that has not yet been transmitted to a replica, as a measure of the risk of data loss in the event of a failure of the server 108 .
  • the server 108 may track the performance capabilities and, if detecting a potential shortage that risks failing a performance guarantee 114, may invoke a variety of “self-help” measures to alleviate the shortage.
  • the server 108 may place the processor into a “boost” mode; awaken and utilize dormant processing capacity, such as additional processing cores; and/or reduce or suspend some deferrable processing, such as maintenance tasks.
  • the server 108 may delete or compress data that is not currently in use, including significant data that may later be restored from a replica.
  • the server 108 may suspend processes that are consuming network capacity, or shift network bandwidth allocation from processes that are tolerant of reduced network bandwidth and latency to processes that are sensitive to constrained network bandwidth or latency.
  • the server 108 may report the performance capability shortage to a network administrator or network monitoring process, which may intercede to reconfigure the server set 106 .
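
As one possible illustration, the “self-help” measures described above might be dispatched according to the kind of shortage the server detects; the following minimal Python sketch uses hypothetical shortage categories and mitigation descriptions:

```python
def apply_self_help(shortage: str) -> str:
    """Hypothetical dispatch of the 'self-help' measures described above to the
    kind of performance capability shortage detected by the server."""
    mitigations = {
        "cpu":     "boost processor mode, wake dormant cores, defer maintenance tasks",
        "storage": "delete or compress data not in use that can be restored from a replica",
        "network": "suspend bandwidth-hungry processes, shift allocation to latency-sensitive ones",
    }
    # Anything the server cannot mitigate locally is escalated for reconfiguration.
    return mitigations.get(shortage, "report shortage to administrator or monitoring process")

print(apply_self_help("cpu"))
```
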
  • the computational load of a storage server 108 may be alleviated by provisioning a new server 108 , replicating the data set onto the new server 108 , and altering process paths 112 to utilize the new server 108 .
  • reconfiguration of the architecture of the server set 106 may be a comparatively expensive step, and/or may involve a delay to implement, during which time the performance guarantee 114 of a workload 104 may fail.
  • a storage server 108 may encounter a processing capacity shortage 116 that delays the processing of a workload 104 through the storage server 108 .
  • Such delay by the storage server 108 may lead to a lag in the acceptance by the storage server 108 of the workload 104 delivered by the upstream query processing server 108 . That is, the query processing server 108 may complete the task 110 of parsing a number of queries that are to be applied by the second storage server 108 , but the second storage server 108 may not be ready to accept the parsed queries.
  • in some cases, the acceptance rate of the second storage server 108 may be diminished; in other cases, it may be reduced to zero, such as upon an overflow of an input queue that the query processing server 108 uses to record parsed queries for processing by the second storage server 108.
  • the interface between the query processing server 108 and the storage server 108 may therefore experience a processing jam 118 that interferes with the delivery of the partially processed workload 104 from the query processing server 108 to the storage server 108 .
  • the query processing server 108 may respond to the processing jam 118 in numerous ways. For example, the query processing server 108 may retry the delivery of the workload 104 for a period of time, in case the processing jam 118 is ephemeral and is momentarily alleviated. The query processing server 108 may utilize an outbound queue for the workload 104 that the storage server 108 may be able to work through and empty when the processing capacity shortage 116 is alleviated, or that may be transferred to a replica of the storage server 108 following a reconfiguration of the server set 106 . However, these techniques may also fail if the processing jam 118 is prolonged and a substitute for the storage server 108 is unavailable.
  • the outbound queue of the query processing server 108 may also overflow, or the workloads 104 allocated to the query processing server 108 may begin to starve, inducing a failure of the performance guarantee 114 .
  • the source of the fault may be misattributed to the query processing server 108 , since the performance guarantees 114 failed while the query processing server 108 retained the workloads 104 for a prolonged period.
  • an automated diagnostic process may identify the query processing server 108 as a processing bottleneck, and may initiate a failover of the query processing server 108 that fails to resolve the actual limitation of the performance of the server set 106 .
  • the processing capacity shortage 116 of the storage server 108 spills over to create a processing capacity shortage 116 of the upstream query processing server 108.
  • the volume of completed workloads 104 that the query processing server 108 holds pending delivery to the storage server 108 may cause delays in the handling of other workloads 104 by the query processing server 108.
  • This backward propagation of the processing capacity shortage 116 may create a processing jam 118 in the interfaces of the query processing server 108, and not only with the second intake server 108 along the same process path 112 of the second workload 104.
  • the processing capacity shortage 116 may create a processing jam in the interface with the first intake server 108 , leading to delayed processing and completion of the first workload 104 , even though the process path 112 of the first workload 104 does not include the second storage server 108 .
  • the processing capacity shortage 116 of the second storage server 108 may induce delays in other servers 108 and process paths 112 of the server set 106 , and the failure of performance guarantees 114 even of workloads 104 that do not utilize the second storage server 108 .
  • a server set 106 that handles a variety of workloads 104 and process paths 112 , such as multitenant distributed server sets 106 , may benefit from the use of techniques to detect and alleviate processing jams 118 that occur between servers 108 , wherein a processing capacity shortage 116 of a downstream server 108 impacts the performance capabilities of an upstream server 108 .
  • FIG. 2 is an illustration of an example scenario 200 featuring a server set 106 that operates in accordance with the techniques presented herein.
  • a server set 106 processes a workload 104 as a sequence of servers 108 that apply respective tasks 110 as a process path 112 .
  • the workload 104 is associated with a performance guarantee 114, such as a maximum total processing duration of the workload 104; a scalability guarantee that the process path 112 will remain capable of handling the workload 104 at a higher volume; and/or a resiliency of the server set 106 to data loss, such as a maximum volume of data of the workload 104 that is not replicated over at least two replicas and that is therefore subject to data loss.
  • the servers 108 of the server set may apply the tasks 110 to the workload 104 , where each server 108 completes the task 110 on a portion of the workload 104 and delivers the partially completed workload 104 to the next downstream server 108 of the process path 112 .
  • the servers 108 may individually monitor the performance capabilities 202 , and compare the performance capabilities 202 with the performance guarantee 114 . For example, if the performance guarantee 114 comprises a maximum latency, such as 10 milliseconds, the respective servers 108 may monitor the duration of completing the task 110 over a selected portion of the workload 104 to ensure that the task 110 is completed within 2.5 milliseconds on each server 108 .
  • as another example, the server 108 may monitor and manage a queue of unreplicated data that is awaiting synchronization with a replica. In this manner, the respective servers 108 may ensure that the performance capabilities 202 of the individual servers 108 are sufficient to satisfy the performance guarantee 114, such that maintaining adequate individual performance capabilities 202 of all servers 108 in the process path 112 results in satisfaction of the performance guarantee 114.
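
As one possible illustration of budgeting a latency guarantee across a process path, the following minimal Python sketch evenly divides an end-to-end guarantee among the servers of the path (e.g., 10 milliseconds over four servers yields 2.5 milliseconds each); the even split and the sample figures are assumptions, and a real deployment might weight the split by each task's typical cost:

```python
def per_server_budget_ms(guarantee_ms: float, path_length: int) -> float:
    """Hypothetical even split of an end-to-end latency guarantee across the
    servers of a process path (e.g., 10 ms over 4 servers -> 2.5 ms each)."""
    return guarantee_ms / path_length

budget = per_server_budget_ms(10.0, 4)    # 2.5 ms per server
recent_task_ms = 3.1                      # measured over a selected portion of the workload
at_risk = recent_task_ms > budget         # True: trigger a performance capability alert
```
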
  • the third server 108 in the process path 112 may detect a diminished performance capability 202, such as limited processing capacity, storage capacity, or network bandwidth. Comparison of the diminished performance capability 202 with the performance guarantee 114 may reveal a processing capacity shortage 116 that introduces a risk 204 of failing the performance guarantee 114 for the workload 104.
  • the third server 108 may be capable of utilizing “self-help” measures to restore the performance capability 202 .
  • the processing capacity shortage 116 may rapidly be identified as severe and unresolvable, such as a complete failure of a storage device that necessitates substitution of the third server 108 .
  • the diminished performance capability 202 may be resolved by temporarily reducing the workload 104 handled by the third server 108 . Such reduction of the workload 104 may be achieved by reducing the delivery of the workload 104 to the third server 108 by the upstream servers 108 of the process path 112 .
  • Such reduction may provide a window of opportunity in which the third server 108 may apply the available performance capabilities 202 to a workload 104 of reduced volume, which may enable the third server 108 to catch up with the volume of the workload 104 .
  • the third server 108 may utilize an input buffer of workloads 104 delivered by the upstream server 108 . If the rate at which the workload 104 is delivered into the input buffer exceeds the rate at which the third server 108 removes and completes the workload 104 from the input buffer, the input buffer may steadily grow to reflect a deepening processing queue with a growing latency.
  • Reducing the input rate of delivery of the workload 104 into the input buffer below the rate at which the third server 108 takes the workload 104 out of the input buffer may shrink the input buffer and enable the third server 108 to catch up with the backlog of the workload 104.
  • when the input buffer is depleted, or at least reduced to an acceptable latency, and/or the cause of the diminished performance capability 202 and processing capacity shortage 116 is resolved, the input rate to the input buffer may be restored.
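
As one possible illustration of this input-buffer behavior, the following minimal Python sketch signals backpressure when the buffer depth (a proxy for queuing latency) crosses a high-water mark and releases it once the buffer drains below a low-water mark; the thresholds and class are hypothetical:

```python
from collections import deque

class InputBuffer:
    """Hypothetical input buffer for workload delivered by the upstream server.
    When the depth crosses a high-water mark, the server asks upstream to slow
    delivery; when it drains below a low-water mark, normal delivery resumes."""

    def __init__(self, high_water: int = 1000, low_water: int = 200):
        self.queue = deque()
        self.high_water = high_water
        self.low_water = low_water
        self.backpressure = False

    def deliver(self, item) -> None:
        self.queue.append(item)
        if len(self.queue) > self.high_water:
            self.backpressure = True     # signal upstream to reduce the input rate

    def take(self):
        item = self.queue.popleft() if self.queue else None
        if self.backpressure and len(self.queue) < self.low_water:
            self.backpressure = False    # buffer has shrunk: restore the input rate
        return item
```
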
  • the reduction of the delivery rate of the workload to the third server 108 may be achieved through coordination with the upstream servers 108 .
  • the third server 108 may transmit a performance capability alert 206 to the upstream server 108 .
  • the second server 108 may receive the performance capability alert 206 and respond by applying a rate limit 208 to the task 110 performed on the workload 104 by the second server 108.
  • the rate limit 208 may comprise, e.g., slowing the rate at which the task 110 is performed on the workload 104 , such as by reducing the processing priority of the task 110 , a processor rate or core count of a processor that handles the task 110 , or an allocation of network bandwidth used by the task 110 .
  • the rate limit 208 may also comprise slowing the acceptance rate of the workload 104 by the second server 108 from the upstream first server 108 and thereby reducing the rate of the completed workload 104 delivered to the third server 108 .
  • the rate limit 208 may also comprise enqueuing the workload 104 received from the upstream first server 108 for a delay period; and/or enqueuing the workload 104 over which the task 110 has been completed for a delay period before attempting delivery to the third server 108 .
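
As one possible illustration of the acceptance-rate form of the rate limit 208 described above, the following minimal Python sketch spaces out the acceptance of workload items from the upstream server, which in turn reduces the rate of completed workload delivered downstream; the rate parameter and class are hypothetical:

```python
import time

class AcceptanceThrottle:
    """Hypothetical rate limit: rather than slowing the task itself, the server
    spaces out its acceptance of workload from the upstream server, reducing the
    rate of completed workload delivered to the downstream server."""

    def __init__(self, max_items_per_second: float):
        self.min_interval = 1.0 / max_items_per_second
        self.last_accept = 0.0

    def accept(self, item):
        wait = self.min_interval - (time.monotonic() - self.last_accept)
        if wait > 0:
            time.sleep(wait)             # delay period before taking the next item
        self.last_accept = time.monotonic()
        return item
```
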
  • the second server 108 may continue to apply the rate limit 208 to the task 110 for the duration of the processing capacity shortage 116 of the third server 108.
  • the third server 108 may eventually report an abatement of the processing capacity shortage 116, or the second server 108 may detect such abatement, e.g., by detecting an emptying of the outbound queue to the third server 108, at which point the second server 108 may remove the rate limit 208 and resume non-rate-limited processing of the task 110 for the workload 104.
  • the second server 108 may propagate the performance capability alert 206 to the next upstream server 108 of the process path 112 , i.e., the first server 108 .
  • the first server 108 may similarly respond to the performance capability alert 206 by applying a rate limit 208 to the task 110 of the first server 108 over the workload 104 .
  • the first server 108 may reduce the commitment of the entire process path 112 to a smaller workload volume over which the performance capability 202 may be guaranteed even while afflicted with the processing capacity shortage 116 .
  • the backward propagation of performance capability alerts 206 and the application of a rate limit 208 to the task 110 of the second server 108 operate as a form of “backpressure” on the upstream servers 108 of the process path 112, which reduces the computational overload of the third server 108 and promotes the satisfaction of the performance guarantee 114 of the server set 106 over the workload 104 in accordance with the techniques presented herein.
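  • As a rough illustration of this backpressure flow, the following Python sketch (class and attribute names are hypothetical, not drawn from the specification) shows an upstream server absorbing part of a downstream delay when it has slack within the performance guarantee, and relaying the performance capability alert further upstream when its own margin is exhausted.

```python
# Minimal sketch (illustrative only, not the claimed implementation) of
# backpressure: a downstream server emits a performance capability alert; each
# upstream server applies a rate limit if it has slack within the guarantee,
# and otherwise relays the alert further upstream.

from dataclasses import dataclass
from typing import Optional

@dataclass
class PerformanceCapabilityAlert:
    origin: str        # server reporting the processing capacity shortage
    severity: float    # e.g., milliseconds of delay the downstream server must shed

@dataclass
class Server:
    name: str
    task_budget_ms: float                 # time allotted to this task by the guarantee
    task_actual_ms: float                 # time the server actually needs for the task
    upstream: Optional["Server"] = None
    rate_limit_delay_ms: float = 0.0

    def on_performance_capability_alert(self, alert: PerformanceCapabilityAlert):
        slack_ms = self.task_budget_ms - self.task_actual_ms
        if slack_ms > 0:
            # Absorb part of the downstream delay without risking the guarantee.
            self.rate_limit_delay_ms = min(slack_ms, alert.severity)
            print(f"{self.name}: rate-limiting task by {self.rate_limit_delay_ms} ms")
        if (slack_ms <= 0 or alert.severity > slack_ms) and self.upstream is not None:
            # Margin is thin or exhausted: propagate the alert upstream.
            remaining = max(0.0, alert.severity - max(slack_ms, 0.0))
            self.upstream.on_performance_capability_alert(
                PerformanceCapabilityAlert(alert.origin, remaining))

first = Server("first", task_budget_ms=2.0, task_actual_ms=1.0)
second = Server("second", task_budget_ms=2.5, task_actual_ms=0.7, upstream=first)
# The third server detects a shortage and alerts its upstream (second) server.
second.on_performance_capability_alert(PerformanceCapabilityAlert("third", severity=3.0))
```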
  • a first technical effect that may arise from the techniques presented herein involves the resolution of the processing capacity shortage 116 of a downstream server 108 and the risk 204 of failing the performance guarantee 114 of the workload 104 through the application of backpressure on upstream servers 108 of the process path 112 .
  • the application of the rate limit 208 by the second server 108 may effectively address the processing capacity shortage 116 and the risk 204 of failure of the performance guarantee 114 of the workload 104 in numerous ways.
  • it may be feasible for the second server 108 to apply the rate limit 208 to the task 110 without the introduction of the rate limit 208 exacerbating the risk 204 of failing the performance guarantee 114.
  • the second server 108 may have a surplus performance capability 202 , and may be capable of working through the workload 104 significantly faster than required by the performance guarantee 114 (e.g., the second server 108 may have a maximum allocation of 2.5 milliseconds to perform the task 110 over the workload 104 within the performance guarantee 114 , but may be capable of completing the task 110 in only 0.7 milliseconds). That is, the application of the rate limit 208 to the task 110 may offload some of the delay caused by the processing capacity shortage 116 from the third server 108 to the second server 108 , thus enabling the third server 108 to work through a backlog of the workload 104 and restore the performance capability 202 .
  • the second server 108 and third server 108 may share two workloads 104 , wherein the processing capacity shortage 116 may introduce a risk 204 of failing the performance guarantee 114 of the first workload 104 , but may pose no risk 204 of failing the performance guarantee 114 of the second workload 104 (e.g., the first workload 104 may be sensitive to latency, while the second workload 104 may be relatively tolerant of latency).
  • the application of the rate limit 208 to the task 110 of the second server 108 may reduce the rate of delivery to the third server 108 of both the first workload 104 and the second workload 104 .
  • the reduced volume of the second workload 104 may enable the third server 108 to apply the performance capability 202 to work through a backlog of the first workload 104 and therefore alleviate the processing capacity shortage 116 , without introducing a risk 204 of failing a performance guarantee 114 for the second workload 104 that is not significantly affected by increased latency.
  • the use of a performance capability alert 206 and a rate limit 208 may also be extended to servers 108 further upstream in the process path 112.
  • the second server 108 may be unable to apply a rate limit 208 to the task 110 without creating a further risk 204 of failing the performance guarantee 114 (e.g., the margin between the performance capability 202 of the second server 108 and the performance guarantee 114 may already be thin).
  • the rate limit 208 may initially be applied to the task 110 by the second server 108 , but a protracted and/or unresolvable processing capacity shortage 116 by the third server 108 may eventually render the rate limit 208 insufficient, such as an overflow of the outbound queue of the second server 108 , or where the application of the rate limit 208 to the task 110 introduces a risk 204 of failing a performance guarantee 114 of another workload 104 over which the server 108 applies the task 110 .
  • the “backpressure” induced by the backward propagation of the performance capability alert 206 and the application of the rate limit 208 to a task 110 of an upstream server 108 may effectively alleviate the processing capacity shortage 116 of the downstream server 108 .
  • a second technical effect that may arise from the techniques presented herein involves the capability of the server set 106 to respond to performance capacity shortages in an efficient, rapid, and automated manner.
  • the techniques presented herein may be applied without conducting a holistic, extensive analysis of the capacity of the server set 106, such as may be performed by a network monitoring system or a network administrator, to determine the root cause of the processing capacity shortage 116 and to assess the available options. Rather, the server 108 afflicted by diminished performance capability 202 may simply detect the processing capacity shortage 116 and transmit the performance capability alert 206 to the upstream server 108.
  • the techniques presented herein do not involve a significant and potentially expensive reconfiguration of the server set 106 or a commitment of resources, such as provisioning a substitute server for the afflicted server 108 , which may involve remapping associations within the server set 106 and/or introduce a delay in the recovery process.
  • the delay involved in applying the recovery may outlast the duration of the processing capacity shortage 116 .
  • the performance guarantee 114 for the workload 104 may fail during the delay involved in applying such heavy recovery techniques. In some circumstances, such recovery may impose additional computational load on the afflicted server 108 , thus hastening the failure of the performance guarantee 114 .
  • the comparatively simple techniques presented herein, which merely involve transmitting the performance capability alert 206 to the upstream server 108 and causing the second server 108 to apply the rate limit 208 to the task 110, may be applied rapidly and with a negligible expenditure of resources, and may therefore be effective at resolving some processing capacity shortages 116, particularly serious but ephemeral shortages, that other techniques may not adequately address.
  • the transmission of the performance capability alert 206 and the application of the rate limit 208 to the task 110 utilize currently existing and available resources and capabilities of the downstream and upstream servers (e.g., processor clock rate adjustment, adjustment of thread and process priorities, and/or the use of queues), and do not depend upon the introduction of complex new process management machinery or protocols.
  • a third technical effect that may arise from the techniques presented herein involves the extension of the process to reduce or avoid the risk 204 of failing the performance guarantee 114 altogether.
  • the first server 108 is positioned at the top of the process path 112 and serves as an intake point for the workload 104 . If the server set 106 propagates the performance capability alert 206 all the way to the first server 108 at the top of the process path 112 , the first server 108 may respond by reducing the acceptance rate of the workload 104 into the process path 112 .
  • the first server 108 may reduce the performance guarantee 114 that is offered for the workload 104 (e.g., raising a latency performance guarantee 114 of the workload 104 from 10 milliseconds to 50 milliseconds), and/or may altogether refrain from offering a performance guarantee 114 or accepting new workloads 104 until the processing capacity shortage 116 is alleviated.
  • the “backpressure” techniques presented herein may enable the process path 112 to respond to processing capacity shortages 116 by reducing the initial commitment of the server set 106 to the workload, thus avoiding problems of overcommitment of the server set 106 by only offering performance guarantees 114 that the process path 112 is capable of fulfilling.
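  • A minimal sketch of this intake-side adjustment follows (Python; the rates and the relaxed guarantee are illustrative assumptions, not part of the specification): while an alert is in effect, the first server offers a reduced intake rate and a relaxed latency guarantee.

```python
# Minimal sketch (assumed names and figures) of the first server at the top of
# the process path reducing its commitment while a downstream shortage persists.

class IntakeServer:
    def __init__(self, normal_rate_per_s=1000, normal_latency_ms=10):
        self.normal_rate_per_s = normal_rate_per_s
        self.normal_latency_ms = normal_latency_ms
        self.shortage_active = False

    def on_performance_capability_alert(self, alert_active: bool):
        self.shortage_active = alert_active

    def offer(self):
        """Return the (intake rate, latency guarantee) currently offered to clients."""
        if self.shortage_active:
            # Only commit to what the afflicted process path can still fulfill.
            return self.normal_rate_per_s // 2, self.normal_latency_ms * 5
        return self.normal_rate_per_s, self.normal_latency_ms

intake = IntakeServer()
print(intake.offer())                        # (1000, 10) while the path is healthy
intake.on_performance_capability_alert(True)
print(intake.offer())                        # (500, 50) while the shortage persists
```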
  • Many such technical effects may arise from the processing of the workload 104 by the server set 106 in accordance with the techniques presented herein.
  • FIG. 3 is an illustration of an example scenario 300 featuring some example embodiments of the techniques presented herein, including an example server 302 that processes a workload 104 as part of a server set 106 .
  • the example server 302 comprises a processor 304 and a memory 306 (e.g., a memory circuit, a platter of a hard disk drive, a solid-state storage device, or a magnetic or optical disc) encoding instructions that, when executed by the processor 304 of the example server 302 , cause the example server 302 to process the workload 104 in accordance with the techniques presented herein.
  • the instructions encode components of example system 308 that perform various portions of the presented techniques. The interoperation of the components of the example system 308 enables the example server 302 to process the workload 104 in accordance with the techniques presented herein.
  • the example system 308 comprises a task processor 310, which performs a task 110 of the workload 104 according to a performance guarantee 114, wherein the workload 104 is processed through the server set 106 according to a process path 112 that includes an upstream server 108 and a downstream server 108 relative to the example server 302.
  • the example system 308 also includes a task rate limiter 314, which receives a performance capability alert 206 from a downstream server 108 of the process path 112, e.g., in response to a comparison of the performance capability 202 of the downstream server 108 with the performance guarantee 114 of the workload 104, which indicates a processing capacity shortage 116 and a risk 204 of failing the performance guarantee 114 of the workload 104.
  • the task rate limiter 314 applies a rate limit 208 to the task 110 performed on the workload 104 to reduce the computational load of the downstream server 108.
  • the example system 308 also includes a workload streamer 312 , which, after completion of the task 110 on the workload 104 , delivers the workload 104 to the downstream server 108 of the process path 112 . In this manner, the example system 308 enables the example server 302 to apply the task 110 to the workload 104 as part of the process path 112 in accordance with the techniques presented herein.
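  • The following Python sketch (hypothetical names; merely one way such components might be wired together, not the patented system) illustrates the interoperation of components analogous to the task processor 310, the task rate limiter 314, and the workload streamer 312 on a single server.

```python
# Minimal sketch (assumed names) of the component interoperation on one server:
# the rate limiter holds the delay imposed in response to downstream alerts, the
# task processor honors that delay while performing the task, and the workload
# streamer delivers the completed workload toward the downstream server.

import time
from collections import deque

class TaskRateLimiter:
    """Holds the delay currently imposed on the task in response to alerts."""
    def __init__(self):
        self.delay_s = 0.0
    def on_alert(self, delay_s):
        self.delay_s = delay_s     # set by a downstream performance capability alert
    def on_abatement(self):
        self.delay_s = 0.0         # shortage resolved: remove the rate limit

class TaskProcessor:
    """Performs the server's task on each workload item, honoring the limiter."""
    def __init__(self, limiter):
        self.limiter = limiter
    def perform(self, item):
        time.sleep(self.limiter.delay_s)   # rate limit: slow the task when asked
        return f"processed({item})"

class WorkloadStreamer:
    """Delivers completed work to the downstream server (here, a simple queue)."""
    def __init__(self):
        self.outbound = deque()
    def deliver(self, item):
        self.outbound.append(item)

limiter, streamer = TaskRateLimiter(), WorkloadStreamer()
processor = TaskProcessor(limiter)
limiter.on_alert(0.01)                     # downstream server reported a shortage
for item in ["q1", "q2", "q3"]:
    streamer.deliver(processor.perform(item))
print(list(streamer.outbound))
```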
  • FIG. 4 is an illustration of an example scenario featuring a third example embodiment of the techniques presented herein, wherein the example embodiment comprises an example method 400 of configuring a server 108 to process a workload 104 in accordance with techniques presented herein.
  • the example method 400 involves a server 108 comprising a processor 304 , and may be implemented, e.g., as a set of instructions stored in a memory 306 of the server 108 , such as firmware, system memory, a hard disk drive, a solid-state storage component, or a magnetic or optical medium, wherein the execution of the instructions by the processor 304 causes the server 108 to operate in accordance with the techniques presented herein.
  • the example method 400 begins at 402 and involves executing 404 , by the server, instructions that cause the server to perform in the following manner.
  • the execution of the instructions causes the server 108 to receive 406 a workload 104 from an upstream server 108 of a process path 112 , wherein the workload 104 is associated with a performance guarantee 114 .
  • the execution of the instructions also causes the server 108 to perform 408 a task 110 on the workload 104 .
  • the execution of the instructions also causes the server 108 to identify 410 a performance capability 202 of the server 108 .
  • the execution of the instructions also causes the server 108 to compare 412 the performance capability 202 with the performance guarantee 114 of the workload 104 .
  • the execution of the instructions also causes the server 108 to respond to determining that the performance capability 202 risks failing the performance guarantee 114 by transmitting 414 a performance capability alert 206 to the upstream server 108.
  • the execution of the instructions also causes the server 108 to respond to a performance capability alert 206 received from a downstream server 108 of the process path 112 by rate-limiting 416 the task 110 performed on the workload 104 to reduce the computational load of the downstream server 108 .
  • the example method 400 may enable the server 108 to process the workload 104 as part of the process path 112 in accordance with the techniques presented herein, and so ends at 418 .
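  • The measure/compare/alert portion of such a method might resemble the following Python sketch (function and parameter names are assumptions): the server times its own task, compares the observed capability with the workload's per-task budget, and notifies the upstream server only when the guarantee is at risk.

```python
# Minimal sketch (illustrative assumptions) of the measure/compare/alert steps:
# time the task, compare the observed capability with the per-task share of the
# performance guarantee, and alert upstream only when the guarantee is at risk.

import time

def process_item(item, per_task_budget_ms, notify_upstream):
    start = time.perf_counter()
    result = item.upper()                          # stand-in for the real task
    observed_ms = (time.perf_counter() - start) * 1000.0
    if observed_ms > per_task_budget_ms:           # performance capability vs. guarantee
        notify_upstream({"observed_ms": observed_ms, "budget_ms": per_task_budget_ms})
    return result

alerts = []
print(process_item("workload-item", per_task_budget_ms=2.5, notify_upstream=alerts.append))
print(alerts)    # empty unless the task exceeded its share of the guarantee
```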
  • FIG. 5 is an illustration of an example scenario featuring a fourth example embodiment of the techniques presented herein, wherein the example embodiment comprises an example method 500 of configuring a server set 106 to process a workload 104 that is associated with a performance guarantee 114 in accordance with techniques presented herein.
  • the example method 500 involves a server set 106 comprising a collection of servers 108 respectively comprising a processor 304, and may be implemented, e.g., as a set of instructions stored in a memory 306 of the server 108, such as firmware, system memory, a hard disk drive, a solid-state storage component, or a magnetic or optical medium, wherein the execution of the instructions by the processor 304 causes the server 108 to operate as a member of the server set 106 in accordance with the techniques presented herein.
  • the example method 500 begins at 502 and involves configuring a server 108 of the server set 106 that is within the process path 112 to process the workload 104 in the following manner.
  • the server 108 performs 506 a task 110 on the workload 104 according to the performance guarantee 114 .
  • the server 108 further receives 508 a performance capability alert 206 from the downstream server 108 , wherein the performance capability alert 206 indicates that a computational load of the downstream server 108 risks failing the performance guarantee 114 for the workload 104 .
  • the server 108, responsive to the performance capability alert 206, further rate-limits 510 the task 110 of the server 108 to reduce the workload delivered to the downstream server 108.
  • the server 108, after performing the task 110 on the workload 104, further delivers 512 the workload 104 to a downstream server 108 of the process path 112.
  • the example method 500 may enable the server 108 to operate as part of a server set 106 to participate in the processing of the workload 104 in accordance with the techniques presented herein, and so ends at 514 .
  • Still another embodiment involves a computer-readable medium comprising processor-executable instructions configured to apply the techniques presented herein.
  • Such computer-readable media may include various types of communications media, such as a signal that may be propagated through various physical phenomena (e.g., an electromagnetic signal, a sound wave signal, or an optical signal) and in various wired scenarios (e.g., via an Ethernet or fiber optic cable) and/or wireless scenarios (e.g., a wireless local area network (WLAN) such as WiFi, a personal area network (PAN) such as Bluetooth, or a cellular or radio network), and which encodes a set of computer-readable instructions that, when executed by a processor of a device, cause the device to implement the techniques presented herein.
  • Such computer-readable media may also include (as a class of technologies that excludes communications media) computer-readable memory devices, such as a memory semiconductor (e.g., a semiconductor utilizing static random access memory (SRAM), dynamic random access memory (DRAM), and/or synchronous dynamic random access memory (SDRAM) technologies), a platter of a hard disk drive, a flash memory device, or a magnetic or optical disc (such as a CD-R, DVD-R, or floppy disc), encoding a set of computer-readable instructions that, when executed by a processor of a device, cause the device to implement the techniques presented herein.
  • An example computer-readable medium that may be devised in these ways is illustrated in FIG. 6, wherein the implementation 600 comprises a computer-readable memory device 602 (e.g., a CD-R, DVD-R, or a platter of a hard disk drive), on which is encoded computer-readable data 604.
  • This computer-readable data 604 in turn comprises a set of computer instructions 606 that, when executed on a processor 304 of a server, cause the server to operate according to the principles set forth herein.
  • the processor-executable instructions 606 may encode a system that processes a workload 104 as part of a server set 106 , such as the example system 308 of FIG. 3 .
  • the processor-executable instructions 606 may encode a method of configuring a server 108 to process a workload 104 as part of a server set 106 , such as the example method 400 of FIG. 4 .
  • the processor-executable instructions 606 may encode a method of configuring a server set 106 to process a workload 104 , such as the example method 500 of FIG. 5 .
  • Many such computer-readable media may be devised by those of ordinary skill in the art that are configured to operate in accordance with the techniques presented herein.
  • the techniques discussed herein may be devised with variations in many aspects, and some variations may present additional advantages and/or reduce disadvantages with respect to other variations of these and other techniques. Moreover, some variations may be implemented in combination, and some combinations may feature additional advantages and/or reduced disadvantages through synergistic cooperation.
  • the variations may be incorporated in various embodiments (e.g., the first example server 302 and/or the example system 308 of FIG. 3 ; the example method 400 of FIG. 4 ; the example method 500 of FIG. 5 ; and the example device 602 and/or example method 608 of FIG. 6 ) to confer individual and/or synergistic advantages upon such embodiments.
  • a first aspect that may vary among implementations of these techniques relates to scenarios in which the presented techniques may be utilized.
  • the presented techniques may be utilized with a variety of servers 108 and server sets 106 , such as workstations, laptops, consoles, tablets, phones, portable media and/or game players, embedded systems, appliances, vehicles, and wearable devices.
  • the server may also comprise a collection of server units, such as a collection of server processes executing on a device; a personal group of interoperating devices of a user; a local collection of server units comprising a computing cluster; and/or a geographically distributed collection of server units that span a region, including a global-scale distributed database.
  • Such servers 108 may be interconnected in a variety of ways, such as locally wired connections (e.g., a bus architecture such as Universal Serial Bus (USB) or a locally wired network such as Ethernet); locally wireless connections (e.g., Bluetooth connections or a WiFi network); remote wired connections (e.g., long-distance fiber optic connections comprising the Internet); and/or remote wireless connections (e.g., cellular communication). Additionally, such servers 108 may serve a variety of clients 102, such as a client process on one or more of the servers 108; other servers 108 within a different server set 106; and/or various client devices that utilize the server 108 and/or server group on behalf of one or more clients 102 and/or other devices.
  • the server set 106 may present a variety of services that involve applying tasks 110 to workloads 104.
  • the service may comprise a distributed database or data storage system, involving tasks 110 such as receiving the data; storing the data; replicating and/or auditing the data; evaluating queries over the data; and/or running reports or user-defined functions over the data.
  • the service may comprise a content presentation system, such as a news service, a social network service, or social media service, which may involve tasks 110 such as retrieving and storing content items; generating new content items; aggregating content items into a digest or collage; and transmitting or communicating the content items to clients 102 .
  • the service may comprise a media presentation system, which may involve tasks 110 such as acquiring, storing, cataloging, and archiving the media objects; rendering and presenting media objects to clients 102 ; and/or tracking engagement of the clients 102 with the media objects.
  • the service may comprise a software repository, which may involve tasks 110 such as storing and cataloging software; deploying software to various clients 102; and receiving and applying updates such as patches and upgrades to the software deployed on the clients 102.
  • the service may comprise a gaming system, which may involve tasks 110 such as initiating game sessions; running game sessions; and compiling the results of game sessions among various clients 102 .
  • the service may comprise an enterprise operational service that provides operational computing for an enterprise, which may involve tasks 110 such as providing a directory of entities such as individuals and operating units; exchanging communication among the entities; controlling and managing various processes; monitoring and logging various processes, such as machine sensors; and generating alerts.
  • Those of ordinary skill in the art may devise a range of scenarios in which a server set 106 configured in accordance with the techniques presented herein may be utilized.
  • a second aspect that may vary among embodiments of the techniques presented herein involves the performance capabilities 202 monitored by the servers 108 and the comparison with performance guarantees 114 over the workload 104 to identify a processing capacity shortage 116 and a risk 204 of failing the performance guarantee 114 .
  • the performance capabilities 202 may include, e.g., processor capacity; storage capacity; network bandwidth; availability of the server set 106 ; scalability to handle fluctuations in the volume of a workload 104 ; resiliency to address faults such as the failure of a server 108 ; latency of processing the workload 104 through the server set 106 ; and/or adaptability to handle new types of workloads 104 .
  • the performance guarantees 114 of the workloads 104 may involve, e.g., a processing latency, such as a maximum end-to-end processing duration for processing the workload 104 to completion; a processing throughput of the workload 104 , such as a sustainable rate of completed items; a processing consistency of the workload 104 , such as a guarantee of consistency among portions of the workload 104 processed at different times and/or by different servers 108 ; scalability to handle a peak volume of the workload 104 to a defined level; a processing replication of the workload 104 , such as a maximum volume of unreplicated data that may be subject to data loss; and/or a minimum availability of the server set 106 , such as a “sigma” level.
  • a server 108 may identify the performance capabilities 202 in various ways.
  • a server 108 may predict the performance capability 202 of the server 108 over the workload 104, such as an estimate of the amount of time involved in applying the task 110 to the workload 104 or a realistically achievable throughput of the server 108.
  • Such predictions may be based, e.g., upon an analysis of the workload 104 , a set of typical performance characteristics or heuristics of the server 108 , or previous assessments of processing the task 110 over similar workloads 104 .
  • the server 108 may measure the performance capability 202 of the server while performing the workload 104 .
  • Such measurement may occur with various granularity and/or periodicity, and may involve techniques such as low-level hardware monitors (e.g., hardware timers or rate meters) and/or high-level software monitors (e.g., a timer placed upon a thread executing the task 110 ).
  • a server 108 may not actively monitor the performance capability 202 but may receive an alert if an apparent processing capacity shortage 116 arises (e.g., a message from a downstream server 108 of a reduced delivery of the completed workload 104 ).
  • a server 108 may compare such performance capabilities 202 with the performance guarantees 114 of the workload 104 in various ways.
  • the server 108 may compare an instantaneous measurement of the performance capability 202 with an instantaneous performance guarantee 114 , such as a current data transfer rate compared with a minimum acceptable data transfer rate, and/or periodic measurements, such as a number of completed tasks 110 over a workload 104 in a given period vs. a quota of completed tasks 110 .
  • the server 108 may compare a trend in the performance capability 202 , e.g., detecting a gradual reduction of processing capacity over time that, while currently satisfying the performance guarantee 114 , may indicate an imminent or eventual risk 204 of failing the performance guarantee 114 , such as a gradually diminishing rate of completed tasks 110 .
  • a workload 104 may be associated with a set of at least two performance guarantees 114 for at least two performance capabilities 202 and a priority order of the performance guarantees 114 (e.g., a first priority of a maximum latency of processing individual tasks 110 over the workload 104 at a typical rate of 10 milliseconds, but in the event of an ephemeral failure of the first performance guarantee 114 , a second priority of a maximum throughput of processing tasks 110 of the workload 104 within a given period, such as at least 100 tasks completed per second).
  • Such prioritization may enable the performance guarantees 114 to be specified in a layered or more nuanced manner.
  • the server 108 may compare the respective performance capabilities 202 of the server 108 according to the priority order to evaluate the risk 204 of failing the collection of performance guarantees 114 for the workload 104.
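  • One possible evaluation of such a prioritized set of guarantees is sketched below (Python; metric names and thresholds are illustrative assumptions): the guarantees are checked in priority order, and the first guarantee found to be at risk is reported.

```python
# Minimal sketch (illustrative assumptions) of evaluating a workload's
# performance guarantees in priority order against measured capabilities.

def at_risk(capabilities, guarantees):
    """
    capabilities: dict mapping metric name -> measured capability
    guarantees:   list of (metric, limit, comparison) tuples in priority order,
                  e.g. ("latency_ms", 10, "max") or ("tasks_per_s", 100, "min")
    Returns the first guarantee whose measured capability risks failing it,
    or None if every guarantee is satisfied.
    """
    for metric, limit, comparison in guarantees:
        value = capabilities[metric]
        if comparison == "max" and value > limit:
            return (metric, limit, value)
        if comparison == "min" and value < limit:
            return (metric, limit, value)
    return None

measured = {"latency_ms": 14.0, "tasks_per_s": 120.0}
guarantees = [("latency_ms", 10, "max"),     # first priority: 10 ms per task
              ("tasks_per_s", 100, "min")]   # second priority: 100 tasks/second
print(at_risk(measured, guarantees))         # the latency guarantee is at risk
```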
  • a performance capability alert 206 may be relayed from a downstream server 108 to an upstream server 108 in a variety of ways.
  • the performance capability alert 206 may comprise a message initiated by the downstream server 108 and transmitted to the upstream server 108 in response to the identification of a risk 204 of failing the performance guarantee 114.
  • the message may be delivered in-band (e.g., as part of an ordinary communication stream) or out-of-band (e.g., using a separate and dedicated communication channel).
  • the performance capability alert 206 may comprise a performance metric that is continuously and/or periodically reported by the downstream server 108 to the upstream server 108 (e.g., an instantaneous measurement of processing capacity), where the upstream server 108 may construe a fluctuation of the metric as a performance capability alert 206 (e.g., the downstream server 108 may periodically report its latency in completing the task 110 over the workload 104 , and the metric may reveal an excessive latency that is approaching a maximum latency specified by the performance guarantee 114 ).
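  • A minimal sketch of this metric-style variation follows (Python; the 80% threshold is an assumption, not part of the specification): the upstream server construes a reported latency approaching the guaranteed maximum as an implicit performance capability alert.

```python
# Minimal sketch (assumed thresholds): the downstream server periodically reports
# its completion latency, and the upstream server treats an approach toward the
# guaranteed maximum as an implicit performance capability alert.

MAX_LATENCY_MS = 10.0        # latency ceiling from the performance guarantee
ALERT_FRACTION = 0.8         # treat >= 80% of the ceiling as an implicit alert

def construe_alert(reported_latencies_ms):
    """Return True once the reported metric approaches the guaranteed maximum."""
    return any(latency >= ALERT_FRACTION * MAX_LATENCY_MS
               for latency in reported_latencies_ms)

print(construe_alert([4.2, 5.1, 6.0]))   # False: comfortably within the guarantee
print(construe_alert([6.5, 7.9, 8.4]))   # True: downstream latency is nearing 10 ms
```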
  • the performance capability alert 206 may comprise part of a data structure shared by the downstream server 108 and the upstream server 108 , such as a flag of a status field or a queue count of a workload queue provided at the interface between the downstream server 108 and the upstream server 108 .
  • the performance capability alert 206 may comprise a function of the upstream server 108 that is invoked by the downstream server 108 , such as an API call, a remote procedure call, a delegate function, or an interrupt that the downstream server 108 initiates on the upstream server 108 .
  • Many such techniques may be utilized to compare the performance capability 202 to the performance guarantee 114 to identify a risk 204 of failing the performance guarantee 114 of the workload 104 in accordance with the techniques presented herein.
  • FIG. 7 is an illustration of a set 700 of example scenarios featuring various techniques for rate-limiting a task 110 applied to a workload 104 of a server 302 .
  • a server 302 may rate limit a task 110, responsive to receiving a performance capability alert 206 from a downstream server 108, by reducing the performance capabilities 202 that the server 302 applies to the task 110.
  • the server 302 may reduce a processor speed 702 of a processor 304, such as reducing the clock speed or core count that is applied to perform the task 110 over the workload 104.
  • the server 302 may reduce a thread priority of the task 110 , such that a multiprocessing processor 304 performs increments of the task 110 less frequently, or even suspends the task 110 temporarily if other tasks 110 are of higher priority.
  • performance capabilities 202 that may be reduced for the workload 104 include volatile or nonvolatile memory allocation; network bandwidth; and/or access to a peripheral device such as a rendering pipeline.
  • the server 302 may rate limit the task 110 , relative to a severity of the performance capability alert 206 , such as the degree of constraint on the network capacity of the downstream server 108 or the pending volume of unprocessed work that the downstream server 108 has to work through to alleviate the performance capability alert 206 .
  • a server 302 may rate limit a task 110 , responsive to receiving a performance capability alert 206 from a downstream server 108 , by temporarily refusing to accept the workload 104 from an upstream server 108 , e.g., by initiating a processing jam 118 .
  • the processing jam 118 may be initiated in increments, such that the upstream server 108 is only capable of sending batches of the workload 104 to the server 302 in intervals that are interspersed by a cessation of the workload 104 arriving at the server 302 .
  • the server 302 may reduce an acceptance rate of the workload 104 from the upstream server 108 ; e.g., the upstream server 108 may utilize an output queue of workload 104 to deliver to the server 302 , and the server 302 may only check the output queue at an interval, or at a reduced interval, thereby slowing the rate at which the server 302 accepts workload 104 from the upstream server 108 and delivers the workload 104 to the downstream server 108 .
  • a server 108 may rate limit a task 110 , responsive to receiving a performance capability alert 206 from a downstream server 108 , by utilizing one or more queues that slow the intake and/or delivery of the workload 104 to the downstream server 108 .
  • the server 302 may implement an input queue 704 that enqueues the task 110 for the workload 104 for a delay period, and withdraw the task 110 from the input queue to perform the task 110 on the workload 104 only after the delay period.
  • the server 302 may implement an output queue 706 with a delivery delay that slows the rate at which processed work is delivered to the downstream server 108 .
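  • The queue-based variations might be sketched as follows (Python; the delay periods are illustrative assumptions): workload items wait in an input queue before the task is applied, and completed items wait in an output queue before delivery to the downstream server.

```python
# Minimal sketch (assumed structure) of queue-based rate limiting with an input
# queue delay before the task and an output queue delay before delivery.

import heapq

class DelayQueue:
    """Items become available only after a fixed delay from enqueue time."""
    def __init__(self, delay):
        self.delay = delay
        self._heap = []
    def put(self, item, now):
        heapq.heappush(self._heap, (now + self.delay, item))
    def get_ready(self, now):
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[1])
        return ready

input_queue = DelayQueue(delay=2)    # hold received workload for 2 time units
output_queue = DelayQueue(delay=3)   # hold completed workload for 3 time units

input_queue.put("item-1", now=0)
for t in range(0, 8):
    for item in input_queue.get_ready(now=t):
        output_queue.put(f"done:{item}", now=t)          # task performed at time t
    for item in output_queue.get_ready(now=t):
        print(f"t={t}: delivered {item} to downstream server")
```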
  • a server 302 may rate limit the task 110 over the workload 104 only within the performance guarantee 114 of the workload 104 .
  • the performance guarantee 114 may comprise a maximum 10-millisecond latency of processing the workload 104 through the process path 112 , and a particular server 302 may be permitted to expend up to 2.5 milliseconds per task 110 while the task 110 remains in conformity with the performance guarantee 114 .
  • the server 302 may rate limit the task 110 for up to or close to an additional 1.8 milliseconds to reduce the rate at which the workload 104 is delivered to the downstream server 108 .
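  • The latency-budget arithmetic of this example can be expressed directly (Python; variable names are assumptions, the figures are those of the example): with a 2.5-millisecond allocation and a 0.7-millisecond actual task time, up to 1.8 milliseconds of rate-limiting delay remains within the performance guarantee.

```python
# Minimal sketch of the latency-budget arithmetic in the preceding example.

end_to_end_guarantee_ms = 10.0    # maximum latency through the process path
task_allocation_ms = 2.5          # share of the budget allotted to this server
task_actual_ms = 0.7              # time this server actually needs for the task

available_rate_limit_ms = task_allocation_ms - task_actual_ms
print(available_rate_limit_ms)    # 1.8 ms of delay can be added while staying
                                  # within the workload's performance guarantee
```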
  • the server 302 may refrain from further rate-limiting the task 110. Instead, as shown in the fourth example scenario 716 of FIG. 7, the server 302 may propagate the performance capability alert 206 to an upstream server 108. Additionally, if the server 302 comprises a first server in the process path 112 that is assigned the task 110 of intake of new workload 708 from one or more clients 102, the server 302 may rate limit the workload by reducing an intake rate of the new workload 708 to the entire process path 112. That is, the server 302 may only agree to accept a diminished volume of the new workload 708 for which the performance guarantee 114 is assigned.
  • the first server 302 may apply the rate limit 208 to the performance guarantee 114 in an offer 706 provided to the client 102 to extend the new workload 708 , such as extending a maximum latency of the performance guarantee from 10 milliseconds to 20 milliseconds.
  • the server 302 may adapt the commitment offered by the server set 106 toward a performance guarantee 114 that the process path 112, including the server 108 afflicted by a processing capacity shortage 116, is currently able to guarantee.
  • Many such techniques may be utilized to rate limit the task 110 of a server 108 in response to a performance capability alert 206 in accordance with the techniques presented herein.
  • a fourth aspect that may vary among embodiments of the techniques presented herein involves adjustment of the process path 112 to adapt to a processing capacity shortage 116 .
  • rate-limiting the tasks 110 of upstream servers 108 may be adequate to resolve a processing capacity shortage 116 .
  • the processing capacity shortage 116 may be severe, prolonged, and/or of indefinite duration, such that in addition to rate-limiting a task 110 of an upstream server 108 , the server set 106 may implement more significant steps to maintain the satisfaction of the performance guarantee 114 .
  • FIG. 8 is an illustration of a set 800 of example scenarios that illustrate some of the variations of this fourth aspect.
  • a server 302 may respond to a performance capability alert 206 of a downstream server 108 by redirecting the process path 112 through the server set 106 to provide a substitute server 802 in lieu of the server 108 exhibiting the processing capacity shortage 116.
  • the substitute server 802 may be previously allocated and ready for designation as a substitute server 802, or may be newly provisioned and inserted into the process path 112.
  • the substitute server 802 may already exist in the process path 112 of the workload 104 or in another process path 112 of the server set 106, and the task 110 performed by the server 108 may be transferred to the substitute server 802 along with the workload 104.
  • a server 302 may respond to a performance capability alert 206 by expanding a computational resource set of the server 108 exhibiting the processing capacity shortage 116 .
  • the server 108 may comprise a virtual machine, and the processing resources allocated to the virtual machine may be increased (e.g., raising a thread priority and/or processing core usage of the virtual machine).
  • the server 108 exhibiting the processing capacity shortage 116 may be supplemented by the addition of an auxiliary server 804 that expands the processing capacity of the server 108 .
  • the workload 104 may be shared between the server 108 and the auxiliary server 804 until the processing capacity shortage 116 of the server 108 is alleviated.
  • a server 108 may experience a processing capacity shortage 116 that risks failing a performance guarantee 114 of a first workload 104, but that presents lower or no risk of failing the performance guarantees 114 of other workloads 104 of the server 108.
  • the server 108 may therefore prioritize the processing of the first workload 104 over the other workloads 104 to alleviate the processing capacity shortage 116.
  • the server 108 may adjust by reducing a process priority 806 of another workload 104 that the server 108 processes, e.g., a workload 104 that involves no performance guarantee 114 , or may involve a second performance guarantee 114 that is amply satisfied (e.g., a dependency upon a different type of performance capability of the server 108 , such as a CPU-bound workload as compared with a network-bound workload).
  • the relative adjustment of the process priorities 806 may enable the server 108 to work through a backlog and resolve the processing capacity shortage 116 .
  • the server 108 may redirect a second process path 112 of another workload 104 through a substitute server 802.
  • the server 108 may therefore reserve a greater proportion of computational resources to address the processing capacity shortage 116 .
  • a server 108 that implements rate-limiting of a task 110 in order to alleviate a processing capacity shortage 116 of a downstream server 108 may curtail or end the rate-limiting of the task 110 based upon an alleviation of the processing capacity shortage 116 of the downstream server 108.
  • a downstream server 108 initiating the performance capability alert 206 may send a notification to an upstream server 302 applying the rate-limiting to indicate an abatement of the processing capacity shortage 116 .
  • an upstream server 302 applying rate-limiting to a task 110 may detect an abatement of the processing capacity shortage 116 , e.g., as a depletion of an output queue of workloads 104 to deliver to the downstream server 108 .
  • the upstream server 302 may apply the rate-limiting only for a set interval, such as one second, and may then remove the rate-limiting, such that a persistence of the processing capacity shortage 116 at the downstream server 108 may result in a second performance capability alert 206 and a reapplication of the rate limit to the task 110.
  • the reapplication may occur at an increasing interval (e.g., first one second, then two seconds, etc.) to reduce an inefficiency of the transmission and receipt of multiple performance capability alerts 206 , which may reduce the ability of the downstream server 108 to alleviate the processing capacity shortage 116 .
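  • One way such an increasing reapplication interval might be realized is sketched below (Python; the base interval, growth factor, and cap are assumptions): the rate limit is applied for a growing interval while the downstream shortage persists, and left off once it abates.

```python
# Minimal sketch (illustrative assumptions) of reapplying a rate limit at a
# growing interval while the downstream shortage persists, so that repeated
# performance capability alerts do not themselves burden the downstream server.

def rate_limit_intervals(base_s=1.0, factor=2.0, cap_s=30.0):
    """Yield successive rate-limit intervals: 1 s, 2 s, 4 s, ... up to a cap."""
    interval = base_s
    while True:
        yield interval
        interval = min(interval * factor, cap_s)

intervals = rate_limit_intervals()
shortage_persists = [True, True, True, False]     # downstream status per check
for persists in shortage_persists:
    if not persists:
        print("shortage abated: leave the rate limit off")
        break
    print(f"re-apply rate limit for {next(intervals)} s")
```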
  • the adjustments of the process paths 112 may be requested and/or implemented by the server 108 experiencing the processing capacity shortage 116 .
  • the adjustments of the processing paths 112 may be requested and/or implemented by the upstream server 302 , e.g., upon determining that the rate-limiting of the task 110 by the upstream server 302 is insufficient to resolve a processing capacity shortage 116 that is prolonged, indefinite, overly frequent, and/or unresolvable by rate-limiting.
  • the adjustments of the processing paths 112 may be implemented at the request of an automated network monitor or network administrator. Many such techniques may be utilized to provide further adaptations of the server set 106 , in conjunction with the rate-limiting of the task 110 by the upstream server 302 , in accordance with the techniques presented herein.
  • FIG. 9 and the following discussion provide a brief, general description of a suitable computing environment to implement embodiments of one or more of the provisions set forth herein.
  • the operating environment of FIG. 9 is only one example of a suitable operating environment and is not intended to suggest any limitation as to the scope of use or functionality of the operating environment.
  • Example computing devices include, but are not limited to, personal computers, server computers, hand-held or laptop devices, mobile devices (such as mobile phones, Personal Digital Assistants (PDAs), media players, and the like), multiprocessor systems, consumer electronics, mini computers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
  • Computer readable instructions may be distributed via computer readable media (discussed below).
  • Computer readable instructions may be implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types.
  • the functionality of the computer readable instructions may be combined or distributed as desired in various environments.
  • FIG. 9 illustrates an example of a system comprising a computing device 902 configured to implement one or more embodiments provided herein.
  • computing device 902 includes at least one processing unit 906 and memory 908 .
  • memory 908 may be volatile (such as RAM, for example), non-volatile (such as ROM, flash memory, etc., for example) or some combination of the two. This configuration is illustrated in FIG. 9 by dashed line 904 .
  • device 902 may include additional features and/or functionality.
  • device 902 may also include additional storage (e.g., removable and/or non-removable) including, but not limited to, magnetic storage, optical storage, and the like.
  • Such additional storage is illustrated in FIG. 9 by storage 910.
  • computer readable instructions to implement one or more embodiments provided herein may be in storage 910 .
  • Storage 910 may also store other computer readable instructions to implement an operating system, an application program, and the like. Computer readable instructions may be loaded in memory 908 for execution by processing unit 906 , for example.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions or other data.
  • Memory 908 and storage 910 are examples of computer storage media.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVDs) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by device 902 . Any such computer storage media may be part of device 902 .
  • Device 902 may also include communication connection(s) 916 that allows device 902 to communicate with other devices.
  • Communication connection(s) 916 may include, but is not limited to, a modem, a Network Interface Card (NIC), an integrated network interface, a radio frequency transmitter/receiver, an infrared port, a USB connection, or other interfaces for connecting computing device 902 to other computing devices.
  • Communication connection(s) 916 may include a wired connection or a wireless connection. Communication connection(s) 916 may transmit and/or receive communication media.
  • Computer readable media may include communication media.
  • Communication media typically embodies computer readable instructions or other data in a “modulated data signal” such as a carrier wave or other transport mechanism and includes any information delivery media.
  • a modulated data signal may include a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • Device 902 may include input device(s) 914 such as keyboard, mouse, pen, voice input device, touch input device, infrared cameras, video input devices, and/or any other input device.
  • Output device(s) 912 such as one or more displays, speakers, printers, and/or any other output device may also be included in device 902 .
  • Input device(s) 914 and output device(s) 912 may be connected to device 902 via a wired connection, wireless connection, or any combination thereof.
  • an input device or an output device from another computing device may be used as input device(s) 914 or output device(s) 912 for computing device 902 .
  • Components of computing device 902 may be connected by various interconnects, such as a bus.
  • Such interconnects may include a Peripheral Component Interconnect (PCI), such as PCI Express, a Universal Serial Bus (USB), Firewire (IEEE 1394), an optical bus structure, and the like.
  • components of computing device 902 may be interconnected by a network.
  • memory 908 may comprise multiple physical memory units located in different physical locations and interconnected by a network.
  • a computing device 920 accessible via network 918 may store computer readable instructions to implement one or more embodiments provided herein.
  • Computing device 902 may access computing device 920 and download a part or all of the computer readable instructions for execution.
  • computing device 902 may download pieces of the computer readable instructions, as needed, or some instructions may be executed at computing device 902 and some at computing device 920 .
  • the terms “component,” “module,” “system”, “interface”, and the like are generally intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution.
  • One or more components may be localized on one computer and/or distributed between two or more computers.
  • the claimed subject matter may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to implement the disclosed subject matter.
  • article of manufacture as used herein is intended to encompass a computer program accessible from any computer-readable device, carrier, or media.
  • one or more of the operations described may constitute computer readable instructions stored on one or more computer readable media, which if executed by a computing device, will cause the computing device to perform the operations described.
  • the order in which some or all of the operations are described should not be construed as to imply that these operations are necessarily order dependent. Alternative ordering will be appreciated by one skilled in the art having the benefit of this description. Further, it will be understood that not all operations are necessarily present in each embodiment provided herein.
  • any aspect or design described herein as an “example” is not necessarily to be construed as advantageous over other aspects or designs. Rather, use of the word “example” is intended to present one possible aspect and/or implementation that may pertain to the techniques presented herein. Such examples are not necessary for such techniques or intended to be limiting. Various embodiments of such techniques may include such an example, alone or in combination with other features, and/or may vary and/or omit the illustrated example.
  • the term “or” is intended to mean an inclusive “or” rather than an exclusive “or”. That is, unless specified otherwise, or clear from context, “X employs A or B” is intended to mean any of the natural inclusive permutations. That is, if X employs A; X employs B; or X employs both A and B, then “X employs A or B” is satisfied under any of the foregoing instances.
  • the articles “a” and “an” as used in this application and the appended claims may generally be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form.

Abstract

Processing services are often provisioned by defining and adjusting the performance capabilities of individual servers, and in multitenancy scenarios, servers may allocate computational resources to ensure that a first client workload does not impact a second client workload. However, a reduced performance capability of a server may create a processing jam with respect to an upstream server of the process path of the workload, where the processing rate mismatch creates a risk of failing to fulfill the performance guarantee for the workload. Instead, the downstream server may monitor and compare its performance capability with the performance guarantee. If a performance guarantee failure risk arises, the server may transmit a performance capability alert to the upstream server, which may rate-limit the processing of the workload. Rate-limiting by the first server in the server path may limit workload intake to a volume for which the process path can fulfill the performance guarantee.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of, and claims priority under 35 U.S.C. §§ 119-120 to, U.S. Patent Application No. 62/668,226, entitled “DISTRIBUTED DATABASES,” filed on May 7, 2018, the entirety of which is hereby incorporated by reference as if fully rewritten herein.
  • BACKGROUND
  • Within the field of computing, many scenarios involve a processing service that performs a set of workloads using a server set. For example, a database service may provide a distributed set of servers with various capabilities, such as a query intake server that receives a query; a query processing server that parses the query; and a storage server that applies the logical operations of the parsed query over a data set.
  • A large-scale, distributed server set may involve a significant number of servers that perform a large number of distinct workloads for a variety of applications and/or clients. Moreover, the workloads of various applications and clients may utilize different process paths through the server set. For example, a process path for a first workload may involve a first sequence of tasks to be performed by a corresponding first sequence of servers, such as a first intake server in a first region; a query processing server; and a first storage server that stores records involved in the first workload. A process path for a second workload may involve a different sequence of tasks to be performed by a corresponding second sequence of servers, such as a second intake server in a second region; the same query processing server; a second storage server that stores records involved in the second workload.
  • In such scenarios, workloads may be subject to various forms of performance sensitivities. As a first such example, a workload may be sensitive to latency (e.g., a realtime application in which users or devices have to receive a result of the workload within a limited time, and in which delays may be perceptible and/or problematic). As a second such example, a workload may be sensitive to scalability and throughput (e.g., demand for the workload may fluctuate over time, and the inability of the server set to scale up to handle an influx of volume may be problematic). As a third such example, a workload may be sensitive to consistency and/or concurrency issues (e.g., a strictly deterministic workload may have to receive the same result across multiple instances, where inconsistent results may be problematic). As a fourth such example, a workload may be sensitive to replication and/or resiliency (e.g., downtime, data loss, or the failure of the workload may be problematic). In view of such sensitivities, it may be desirable to enable the server set to provide a performance guarantee for a workload, e.g., a guarantee that the server set is capable of handling a surge of volume up to a particular amount while maintaining latency below a particular level.
  • In multitenant scenarios, workloads for different applications and/or clients may share a process path. Other workloads may take different process paths through the server set, but may both include one or more servers of the server set that are to be shared by the workloads. Servers may be shared among workloads that are associated with different clients and/or applications; in some scenarios, a server may concurrently handle workloads on behalf of hundreds or even thousands of different applications or clients.
  • A variety of multi-tenancy techniques may be utilized to ensure that a first workload on behalf of a first client or application does not interfere with a second workload on behalf of a second client or application. As a first such example, the server may utilize process and data isolation techniques to ensure that a first workload on behalf of a first client cannot achieve unauthorized access to a second workload on behalf of a second client, including accessing data owned by the second workload or even identifying the presence of the second workload, including the second client or application for which the second workload is processed.
  • As a second such example, the server may protect against resource overutilization. For instance, if a first workload through the server begins exhibiting a surge of volume that exceeds the share of computing resources allocated to the first workload, the use of a resource-sharing technique may enable the server to confine the consequences of the excessive volume to the first workload and to avoid impacting the processing of the second workload, such that a performance guarantee extended to the second workload remains fulfilled. In this manner, servers may allocate and regulate computing resource utilization to promote fulfillment of performance guarantees and allocate computational resources fairly over all of the workloads handled by the server.
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key factors or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
  • The processing of workloads through a large-scale server set, in view of performance guarantees, may encounter difficulties due to the sequential nature of the workloads and interactions of servers along various process paths. For example, a workload may involve a process path comprising a sequence of tasks to be performed by an intake server, a query processing server, and a storage server, subject to a performance guarantee. While the respective servers may utilize computational resource allocation to handle the individual tasks of the workload, a problem may arise if the storage server begins to experience an excessive computational load. Such excessive computational load may arise, e.g., due to an over-allocation of tasks onto the storage server; a shortage of computational resources, such as a reduction of network bandwidth; or an unanticipated surge of processing, such as a failure of a process on the storage server that necessitates the invocation of a recovery process. In addition to slowing the processing of the workload through the storage server, the excessive computational load of the storage server may create a processing jam between the storage server and the query processing server that is upstream of the storage server in the process path. For example, the query processing server may continue to handle the query processing task of the workload at the same rate, but the rate at which the query processing server is able to pass the workload to the storage server may be diminished. The query processing server may address the discrepancy in various ways, such as utilizing an outbound queue for the completed workload of processed queries; however, if the reduced processing rate of the storage server persists, the outbound queue may overflow. Moreover, the additional processing burden placed upon the query processing server may propagate the processing jam upward, even to the point of potentially affecting other workloads that have a processing path through the query processing server but that do not utilize the storage server. The resulting gridlock may cause a widespread failure of performance guarantees for a variety of workloads due to the processing jam between the storage server and the upstream query processing server.
  • In view of such problems, it may be desirable to configure the server set to evaluate the processing paths of the various workloads, and to provide techniques for mitigating a processing jam that may arise between a particular server and a downstream server. In particular, a server of a process path may estimate, measure, and/or monitor its processing capabilities, and compare such processing capabilities with the performance guarantees of the workloads utilizing a process path through the server. If the server detects a risk of failing the performance guarantee (e.g., due to an overprovisioning of the server or a computational resource shortage), the server may transmit a performance capability alert to an upstream server of the server path, as an indication that the workload being passed to the server may be too large to ensure that the performance guarantees are met. The upstream server that receives the performance capability alert may respond by rate-limiting its processing of the workload, within the performance guarantee, thereby downscaling the processing rate of the upstream server upon the workload to match the diminished processing rate of the downstream server. In some scenarios, the upstream server may propagate the performance capability alert further upstream. If the performance capability alert reaches a first server of the server path, the first server may refuse new workloads or slow the workload acceptance rate. In this manner, the server set may adapt its acceptance of the workload to a volume for which the process path is capable of fulfilling the performance guarantee, in response to fluctuating processing capabilities (including an unexpected loss of processing capability) of the servers of the server path.
  • A first embodiment of the presented techniques involves a server of a server set that performs workloads according to a performance guarantee. The server comprises a processor and a memory storing instructions that, when executed by the processor, cause the server to operate in accordance with the techniques presented herein. In particular, the server performs a task of the workload according to a performance guarantee, wherein the workload is processed through the server set according to a process path. On condition of receiving, from a downstream server of the process path, a performance capability alert indicating that a computational load of the downstream server risks failing the performance guarantee for the workload, the server may rate limit the task of the workload to reduce the computational load of the downstream server. After completing the task (to which the rate-limit may have been applied), the server delivers the workload to the downstream server.
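  • By way of illustration only, a minimal sketch of such a server follows in Python; the names (ServerNode, perform_task, on_performance_capability_alert) are hypothetical and are not drawn from the claims. The sketch shows a server that performs its task, slows delivery while a downstream performance capability alert is in effect, and then delivers the workload to the downstream server.

        import time
        from queue import Queue

        class ServerNode:
            """Hypothetical sketch of one server in a process path."""
            def __init__(self, downstream_queue):
                self.downstream_queue = downstream_queue  # delivery point toward the downstream server
                self.rate_limited = False                 # set while a performance capability alert is in effect

            def on_performance_capability_alert(self, alert):
                # The downstream server reports a risk of failing the performance guarantee.
                self.rate_limited = bool(alert.get("at_risk", True))

            def perform_task(self, work_item):
                # Placeholder for the actual task (intake, query parsing, storage, etc.).
                work_item["processed"] = True
                return work_item

            def handle(self, work_item):
                item = self.perform_task(work_item)
                if self.rate_limited:
                    time.sleep(0.01)                      # crude rate limit: slow delivery downstream
                self.downstream_queue.put(item)           # deliver the partially completed workload

        downstream = Queue()
        node = ServerNode(downstream)
        node.on_performance_capability_alert({"at_risk": True})
        node.handle({"query": "SELECT 1"})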
  • A second embodiment of the presented techniques involves a method of configuring a server of a server set to participate in workloads. The method involves executing, by a processor of the server, instructions that cause the server to operate in accordance with the techniques presented herein. In particular, the server receives a workload from an upstream server of a process path of the workload, wherein the workload is associated with a performance guarantee. The server performs a task on the workload, and further identifies a performance capability of the server and compares the performance capability with the performance guarantee of the workload. Responsive to determining that the performance capability risks failing the performance guarantee, the server transmits a performance capability alert to the upstream server. Additionally, responsive to receiving a performance capability alert from a downstream server of the process path, the server rate limits a receipt of additional workloads from the upstream server.
  • A third embodiment of the presented techniques involves a method of configuring a server set to perform a workload according to a performance guarantee. The method involves configuring a server within a process path of the workload through the server set to operate in accordance with the techniques presented herein. The method further involves configuring the server to perform a task on the workload according to the performance guarantee. The method further involves configuring the server to receive a performance capability alert from a downstream server of the process path, wherein the performance capability alert indicates that a computational load of the downstream server risks failing the performance guarantee for the workload. The method further involves configuring the server to rate-limit the task of the server to reduce the workload delivered to the downstream server. The method further involves configuring the server to, after performing the task on the workload, deliver the workload to the downstream server of the process path.
  • To the accomplishment of the foregoing and related ends, the following description and annexed drawings set forth certain illustrative aspects and implementations. These are indicative of but a few of the various ways in which one or more aspects may be employed. Other aspects, advantages, and novel features of the disclosure will become apparent from the following detailed description when considered in conjunction with the annexed drawings.
  • DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is an illustration of an example scenario featuring a processing of workloads through a server set.
  • FIG. 2 is an illustration of an example scenario featuring a processing of a workload through a server set in accordance with the techniques presented herein.
  • FIG. 3 is a component block diagram illustrating an example server featuring an example system for configuring a server set to process a workload in accordance with the techniques presented herein.
  • FIG. 4 is a flow diagram illustrating an exemplary method of configuring a server to process a workload through a process path of a server set in accordance with the techniques presented herein.
  • FIG. 5 is a flow diagram illustrating an exemplary method of configuring a server set to process a workload through a process path in accordance with the techniques presented herein.
  • FIG. 6 is an illustration of an example computer-readable medium storing instructions that provide an embodiment of the techniques presented herein.
  • FIG. 7 is an illustration of a set of example scenarios featuring a variety of rate-limiting mechanisms for a task in accordance with the techniques presented herein.
  • FIG. 8 is an illustration of a set of example scenarios featuring a variety of process path modifications for a workflow in accordance with the techniques presented herein.
  • FIG. 9 illustrates an exemplary computing environment wherein one or more of the provisions set forth herein may be implemented.
  • DETAILED DESCRIPTION
  • The claimed subject matter is now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the claimed subject matter. It may be evident, however, that the claimed subject matter may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the claimed subject matter.
  • A. Introduction
  • FIG. 1 is an illustration of an example scenario 100 featuring a server set 106 that processes workloads 104 on behalf of clients 102. In this example scenario 100, the server set 106 comprises a distributed query processing system that accepts queries from clients 102 and processes the queries over a data set. However, the example scenario 100 may similarly apply to a variety of workloads 104, such as content generation, media rendering and presentation, communication exchange, simulation, supercomputing, etc.
  • In this example scenario 100, the server set 106 comprises a set of servers 108 that respectively perform a task 110. For example, a pair of intake servers 108 may serve as a front-end, client-facing interface that accepts queries to be processed on behalf of the clients 102; a query processing server 108 may parse queries, such as translating the query from a query language into a sequence of logical relationships to be applied over the data set; and a pair of storage servers 108 may store a replica or a portion of the data set over which the queries are to be applied. The servers 108 may be arranged such that the workloads 104 of the clients 102 are processed according to a process path 112, e.g., a sequence of servers 108 that respectively apply a task 110 to the workload 104 and deliver the partially processed workload 104 to the next, downstream server 108 in the process path 112. The process path 112 may enable a pipelined evaluation of workloads 104 that enables the servers 108 to apply the tasks 110 in the manner of an assembly line, thereby reducing idle processing capacity and promoting the scalability of the server set 106 to handle a significant volume of workloads 104 in a concurrent manner.
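  • A minimal, single-process sketch of such a process path follows (Python, illustrative only; the stage names intake, parse_query, and apply_to_storage are hypothetical stand-ins for the servers of FIG. 1). Each stage applies its task and hands the partially processed workload to the next stage.

        def intake(workload):
            workload["accepted"] = True                    # front-end acceptance of the query
            return workload

        def parse_query(workload):
            workload["plan"] = workload["query"].split()   # translate the query into a simple plan
            return workload

        def apply_to_storage(workload):
            workload["result"] = ["row1", "row2"]          # evaluate the plan over stored data
            return workload

        PROCESS_PATH = [intake, parse_query, apply_to_storage]

        def run_process_path(workload):
            for task in PROCESS_PATH:
                workload = task(workload)
            return workload

        print(run_process_path({"query": "SELECT * FROM t"}))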
  • As further shown in the example scenario 100 of FIG. 1, different process paths 112 may be utilized for different workloads 104; e.g., the first intake server 108 may receive workloads 104 from a first set of clients 102 and/or geographic regions while the second intake server 108 receives workloads 104 from a second set of clients 102 and/or geographic regions. Similarly, the first workload 104 may present a first query over the portion of the database stored by the first storage server 108, while the second workload 104 may present a second query over the portion of the database stored by the second storage server 108. Conversely, the process paths 112 of different workloads 104 may coincide through one or more servers 108; e.g., both workloads 104 may utilize the same query processing server 108.
  • As further shown in the example scenario 100 of FIG. 1, the workloads 104 may be associated with various kinds of performance constraints. As a first such example, the first workload 104 may be particularly sensitive to latency; e.g., the first workload 104 may comprise time-sensitive data that the client 102 seeks to process in an expedited manner, and delays in the completion of the workload 104 may be highly visible, and may reduce or negate the value of the completed workload 104. As a second such example, the second workload 104 may involve fluctuating volume, such as a data-driven service that is sometimes heavily patronized by users and other times is used by only a few users. The second workload 104 may therefore be sensitive to scalability, and may depend upon comparatively consistent processing behavior of the server set 106 to handle the second workload 104 even as demand scales upward. The performance dependencies may be driven, e.g., by the computational constraints of the workload 104; the intended uses of the results of the processing; the circumstances of an application for which the workload 104 is performed; and/or the preferences of the client 102 submitting the workload 104. Moreover, the sensitivities and/or tolerances of different clients 102 and workloads 104 may vary; e.g., the first workload 104 may present a highly consistent and regular volume such that scalability is not a concern, while the second client 102 is able to tolerate reasonable variations in latency, such as marginally delayed completion of the workloads 104, as long as processing is completed correctly at peak volume.
  • Due to the performance dependencies of the workloads 104, the server set 106 may extend to each client 102 a performance guarantee 114 of a performance capability of the server set 106 to handle the workload 104. For example, the server set 106 may extend to the first client 102 a performance guarantee 114 that processing of the majority of workloads 104 (e.g., 95% of workloads 104) will complete within 10 milliseconds. For the second client 102, the server set 106 may offer a performance guarantee 114 of correct processing of the second workload 104 up to a defined volume, such as 1,000 requests per second. The server set 106 may be arranged to fulfill the performance guarantees 114 of the workloads 104, e.g., by supplementing, adapting, and/or optimizing the configuration of servers 108 comprising the server set 106.
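  • As a concrete sketch (illustrative only, with fabricated sample latencies), a latency-style performance guarantee 114 of the form "95% of workloads complete within 10 milliseconds" might be checked as follows.

        def meets_latency_guarantee(latencies_ms, percentile=95, limit_ms=10.0):
            # Returns True if the given percentile of observed latencies is within the limit.
            if not latencies_ms:
                return True
            ranked = sorted(latencies_ms)
            index = min(len(ranked) - 1, int(len(ranked) * percentile / 100))
            return ranked[index] <= limit_ms

        samples = [3.2, 4.8, 6.1, 7.0, 9.5, 12.4]
        print(meets_latency_guarantee(samples))   # False: the 95th-percentile sample exceeds 10 ms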
  • When a server set 106 is arranged such that a particular server 108 processes different workloads 104, particularly for different clients 102, problems may arise due to the concurrent and/or consecutive sharing of the server 108. As a first example, the server 108 may have to secure and isolate the workloads 104 of the first client 102 from the workloads 104 of the second client 102, e.g., to prevent the second workload 104 from accessing proprietary information of the first client 102 and tampering with the operation of the first workload 104. The significance of isolation may grow with the scalability of the server set 106; e.g., a particular server 108 may process hundreds or even thousands of workloads 104 of various clients 102, and safeguarding the information of each client 102 may be a high priority. As a second example, the server 108 may comprise a limited set of computational resources, such as processor capacity, storage capacity, and network bandwidth. An overconsumption of the computational resources of the query processing server 108 by the first workload 104 may create a shortage of computational resources of the server 108 for the second workload 104, such as limited processing capacity; an exhaustion of available storage; and/or constrained network bandwidth. Such overconsumption by the first workload 104 may lead to delays or even a failure in the processing of the second workload 104, including a failure of the performance guarantee 114 of the second workload 104, such as an inability of the query processing server 108 to scale up in order to handle peak volume of the second client 102. In view of such concerns, techniques may be utilized to allocate and compartmentalize the computational resources of the server 108 for each workload 104, such as processor time-slicing and per-workload quotas or caps on storage and network capacity. Using such techniques, a server 108 may limit the adverse effects of an overconsumption of resources to the workload 104 responsible for the overconsumption; e.g., increased processing demand by the first workload 104 may result in delayed completion of the first workload 104 without impacting the processing of the second workload 104. However, resource limitations may also consume computational resources (e.g., processor time-slicing among a set of workloads 104 may present overhead due to context-switching). Such inefficiency may scale with the scalability of the server set 106, such as using a particular server 108 to process hundreds or even thousands of workloads 104, so the isolation and allocation techniques may have to be implemented with careful attention to efficiency.
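  • One possible form of such a per-workload allocation is a token bucket that caps the operations per second chargeable to each workload, so that overconsumption by one workload is confined to that workload. The sketch below is illustrative only; the class name and the rates are assumptions rather than details of the disclosed embodiments.

        import time

        class WorkloadQuota:
            def __init__(self, ops_per_second):
                self.rate = ops_per_second
                self.tokens = float(ops_per_second)
                self.last = time.monotonic()

            def try_consume(self):
                now = time.monotonic()
                # Refill tokens in proportion to elapsed time, up to the per-second cap.
                self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return True
                return False   # this workload is over its share; other workloads are unaffected

        quota = WorkloadQuota(ops_per_second=1000)
        print(quota.try_consume())   # True while the workload stays within its allocation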
  • Other developments may also present a potential source of failure of a performance guarantee 114 of a particular workload 104. For example, computational resources may be limited by systemic factors, such as a shortage of storage capacity due to a failure of a hard disk drive in a storage array, or a surge in a computational process, such as a background maintenance process. An introduction of line noise into a network connection, such as due to electromagnetic interference or a faulty cable, may lead to diminished throughput and increased latency. The server set 106 may experience a failure of a particular server 108 and may have to re-route the workload 104 of the failed server to other servers 108 of the server set 106, thereby increasing the computational load of the individual servers 108. Any such change in the performance of the server set 106 may interfere with the process path 112 of a workload 104, which may risk failing the performance guarantee 114.
  • In view of such risks, load-balancing techniques are often utilized to detect and mitigate computational overload of a particular server 108. For example, the respective servers 108 of the server set 106 may include a monitoring process of a performance capability, such as available processor capacity, storage, network throughput, and latency. The performance capabilities may also include considerations such as resiliency to data loss, e.g., the volume of data stored by the server 108 that has not yet been transmitted to a replica, as a measure of the risk of data loss in the event of a failure of the server 108. The server 108 may track the performance capabilities and, if detecting a potential shortage that risks failing a performance guarantee 114, may invoke a variety of "self-help" measures to alleviate the shortage. For example, in the event of a shortage of processing capacity, the server 108 may place the processor into a "boost" mode; awaken and utilize dormant processing capacity, such as additional processing cores; and/or reduce or suspend some deferrable processing, such as maintenance tasks. In the event of a shortage of storage capacity, the server 108 may delete or compress data that is not currently in use, including significant data that may later be restored from a replica. In the event of a shortage of network capacity, the server 108 may suspend processes that are consuming network capacity, or shift network bandwidth allocation from processes that are tolerant of reduced network bandwidth and latency to processes that are sensitive to constrained network bandwidth or latency.
  • Alternatively or additionally to such “self-help” techniques, the server 108 may report the performance capability shortage to a network administrator or network monitoring process, which may intercede to reconfigure the server set 106. For example, the computational load of a storage server 108 may be alleviated by provisioning a new server 108, replicating the data set onto the new server 108, and altering process paths 112 to utilize the new server 108. However, reconfiguration of the architecture of the server set 106 may be a comparatively expensive step, and/or may involve a delay to implement, during which time the performance guarantee 114 of a workload 104 may fail.
  • However, these and other techniques may be inadequate to address a particular source of interference with the processing of the server set 106 that may jeopardize performance guarantees 114.
  • As further illustrated in the example scenario 100 of FIG. 1, at a second time 122, a storage server 108 may encounter a processing capacity shortage 116 that delays the processing of a workload 104 through the storage server 108. Such delay by the storage server 108 may lead to a lag in the acceptance by the storage server 108 of the workload 104 delivered by the upstream query processing server 108. That is, the query processing server 108 may complete the task 110 of parsing a number of queries that are to be applied by the second storage server 108, but the second storage server 108 may not be ready to accept the parsed queries. In some cases, the acceptance rate of the second storage server 108 may be diminished; in other cases, the acceptance rate of the second storage server 108 may be reduced to zero, such as an overflow of an input queue that the query processing server 108 uses to record parsed queries for processing by the second storage server 108. The interface between the query processing server 108 and the storage server 108 may therefore experience a processing jam 118 that interferes with the delivery of the partially processed workload 104 from the query processing server 108 to the storage server 108.
  • The query processing server 108 may respond to the processing jam 118 in numerous ways. For example, the query processing server 108 may retry the delivery of the workload 104 for a period of time, in case the processing jam 118 is ephemeral and is momentarily alleviated. The query processing server 108 may utilize an outbound queue for the workload 104 that the storage server 108 may be able to work through and empty when the processing capacity shortage 116 is alleviated, or that may be transferred to a replica of the storage server 108 following a reconfiguration of the server set 106. However, these techniques may also fail if the processing jam 118 is prolonged and a substitute for the storage server 108 is unavailable. The outbound queue of the query processing server 108 may also overflow, or the workloads 104 allocated to the query processing server 108 may begin to starve, inducing a failure of the performance guarantee 114. In some cases, the source of the fault may be misattributed to the query processing server 108, since the performance guarantees 114 failed while the query processing server 108 retained the workloads 104 for a prolonged period. For example, an automated diagnostic process may identify the query processing server 108 as a processing bottleneck, and may initiate a failover of the query processing server 108 that fails to resolve the actual limitation of the performance of the server set 106.
  • As further illustrated in the example scenario 100 of FIG. 1, at a third time 124, even more significant problems may arise when the processing capacity shortage 116 of the storage server 108 spills over to create a processing capacity shortage 116 of the upstream server. For example, the volume of completed workloads 104 of the query processing server 108 that are pending delivery to the storage server 108 may cause delays in the handling of other workloads 104 by the query processing server 108. This backward propagation of the processing capacity shortage 116 may create a processing jam 118 in the interface of the query processing server 108 with the second intake server 108 along the same process path 112 of the second workload 104. Moreover, the processing capacity shortage 116 may create a processing jam in the interface with the first intake server 108, leading to delayed processing and completion of the first workload 104, even though the process path 112 of the first workload 104 does not include the second storage server 108. In this manner, the processing capacity shortage 116 of the second storage server 108 may induce delays in other servers 108 and process paths 112 of the server set 106, and the failure of performance guarantees 114 even of workloads 104 that do not utilize the second storage server 108.
  • B. Presented Techniques
  • In view of the problems depicted in the example scenario 100 of FIG. 1, a server set 106 that handles a variety of workloads 104 and process paths 112, such as multitenant distributed server sets 106, may benefit from the use of techniques to detect and alleviate processing jams 118 that occur between servers 108, wherein a processing capacity shortage 116 of a downstream server 108 impacts the performance capabilities of an upstream server 108.
  • Additionally, because such incidents may occur suddenly, may quickly present risks to the failure of a performance guarantee 114, and may often be only transient, it may be advantageous to utilize techniques that may be applied rapidly and without involving a significant and potentially expensive allocation of resources, such as inducing failover of the afflicted server 108 to a substitute server 108. It may also be advantageous to utilize techniques that may be applied automatically in the locality of the afflicted server 108, without necessarily resorting to a centralized manager that holistically evaluates the server set 106 to identify potential solutions, and/or without involving a human administrator who may not be able to respond to the processing capacity shortage 116 in due time.
  • Additionally, because the effects of a processing jam 118 may spill over onto other servers 108 in a process path 112, it may be advantageous to provide techniques that may be easily propagated to a broader neighborhood of the afflicted server 108, and therefore expand to incorporate the other servers 108 in the resolution of the processing capacity shortage 116.
  • FIG. 2 is an illustration of an example scenario 200 featuring a server set 106 that operates in accordance with the techniques presented herein. In this example scenario 200, a server set 106 processes a workload 104 as a sequence of servers 108 that apply respective tasks 110 as a process path 112. The workload 104 is associated with a performance guarantee 114, such as a maximum total processing duration of the workload 104; a scalability guarantee that the process path 112 will remain capable of handling the workload 104 at a higher volume; and/or a resiliency of the server set 106 to data loss, such as a maximum volume of data of the workload 104 that is not replicated over at least two replicas and that is therefore subject to data loss.
  • At a first time 210, the servers 108 of the server set 106 may apply the tasks 110 to the workload 104, where each server 108 completes the task 110 on a portion of the workload 104 and delivers the partially completed workload 104 to the next downstream server 108 of the process path 112. Additionally, the servers 108 may individually monitor the performance capabilities 202, and compare the performance capabilities 202 with the performance guarantee 114. For example, if the performance guarantee 114 comprises a maximum latency, such as 10 milliseconds, the respective servers 108 may monitor the duration of completing the task 110 over a selected portion of the workload 104 to ensure that the task 110 is completed within 2.5 milliseconds on each server 108. If the performance guarantee 114 comprises a maximum volume of unreplicated data that is subject to data loss, the server 108 may monitor and manage a queue of unreplicated data that is awaiting synchronization with a replica. In this manner, the respective servers 108 may ensure that the performance capabilities 202 of the individual servers 108 are sufficient to satisfy the performance guarantee 114, such that maintaining adequate individual performance capabilities 202 of all servers 108 in the server path 112 results in satisfaction of the performance guarantee 114.
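  • For instance, using the illustrative figures above, an end-to-end latency guarantee of 10 milliseconds split evenly across a four-server process path leaves each server a 2.5 millisecond budget; a minimal sketch of that per-server check follows (the even split and the sample measurements are assumptions, not limitations).

        def per_server_budget_ms(end_to_end_ms, path_length):
            # Evenly divide the end-to-end latency guarantee across the servers of the path.
            return end_to_end_ms / path_length

        def within_budget(measured_ms, end_to_end_ms=10.0, path_length=4):
            return measured_ms <= per_server_budget_ms(end_to_end_ms, path_length)

        print(within_budget(1.8))   # True: 1.8 ms is inside the 2.5 ms per-server budget
        print(within_budget(3.1))   # False: this server risks the end-to-end guarantee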
  • However, at a second time 212, the third server 108 in the process path 112 may detect a diminished performance capability 202, such as limited processing capacity, storage capacity, or network bandwidth. Comparison of the diminished performance capability 202 with the performance guarantee 114 may reveal a processing capacity shortage 116 that introduces a risk 204 of failing the performance guarantee 114 for the workload 104.
  • In some circumstances, the third server 108 may be capable of utilizing "self-help" measures to restore the performance capability 202. In other circumstances, the processing capacity shortage 116 may rapidly be identified as severe and unresolvable, such as a complete failure of a storage device that necessitates substitution of the third server 108. However, in some circumstances, the diminished performance capability 202 may be resolved by temporarily reducing the workload 104 handled by the third server 108. Such reduction of the workload 104 may be achieved by reducing the delivery of the workload 104 to the third server 108 by the upstream servers 108 of the process path 112. Such reduction may provide a window of opportunity in which the third server 108 may apply the available performance capabilities 202 to a workload 104 of reduced volume, which may enable the third server 108 to catch up with the volume of the workload 104. For instance, the third server 108 may utilize an input buffer of workloads 104 delivered by the upstream server 108. If the rate at which the workload 104 is delivered into the input buffer exceeds the rate at which the third server 108 removes and completes the workload 104 from the input buffer, the input buffer may steadily grow to reflect a deepening processing queue with a growing latency. Reducing the input rate of delivery of the workload 104 into the input buffer below the rate at which the third server 108 takes the workload 104 out of the input buffer may shrink the input buffer and enable the third server 108 to work through the backlog. When the input buffer is depleted or at least reduced to an acceptable latency, and/or the cause of the diminished performance capability 202 and processing capacity shortage 116 is resolved, the input rate to the input buffer may be restored.
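  • The arithmetic behind this window of opportunity can be sketched as follows (illustrative only; steady arrival and service rates are assumed): once the delivery rate into the input buffer drops below the rate at which the server drains it, the backlog shrinks at the difference of the two rates.

        def seconds_to_drain(backlog_items, arrival_rate, service_rate):
            # Time for the input buffer to empty, assuming steady per-second rates.
            if arrival_rate >= service_rate:
                return float("inf")          # the buffer never shrinks; further backpressure is needed
            return backlog_items / (service_rate - arrival_rate)

        # 500 queued items, deliveries throttled to 800/s against a 1,000/s service rate:
        print(seconds_to_drain(500, arrival_rate=800, service_rate=1000))   # 2.5 seconds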
  • As further shown in the example scenario 200 of FIG. 2, the reduction of the delivery rate of the workload 104 to the third server 108 may be achieved through coordination with the upstream servers 108. At a third time point 214, responsive to detecting the processing capacity shortage 116 and identifying the risk 204 of failing the performance guarantee 114, the third server 108 may transmit a performance capability alert 206 to the upstream server 108. The second server 108 may receive the performance capability alert 206 and respond by applying a rate limit 208 to the task 110 performed on the workload 104 by the second server 108. The rate limit 208 may comprise, e.g., slowing the rate at which the task 110 is performed on the workload 104, such as by reducing the processing priority of the task 110, a processor rate or core count of a processor that handles the task 110, or an allocation of network bandwidth used by the task 110. The rate limit 208 may also comprise slowing the acceptance rate of the workload 104 by the second server 108 from the upstream first server 108 and thereby reducing the rate of the completed workload 104 delivered to the third server 108. The rate limit 208 may also comprise enqueuing the workload 104 received from the upstream first server 108 for a delay period; and/or enqueuing the workload 104 over which the task 110 has been completed for a delay period before attempting delivery to the third server 108.
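  • One of the enqueue-and-delay variants noted above may be sketched as follows (illustrative only; the class name, the delay value, and the queue structure are assumptions): completed items are held in an outbound queue for a delay period before they become deliverable to the downstream server.

        import time
        from collections import deque

        class DelayedDelivery:
            def __init__(self, delay_seconds):
                self.delay = delay_seconds
                self.pending = deque()              # (ready_time, item) pairs, oldest first

            def enqueue(self, item):
                self.pending.append((time.monotonic() + self.delay, item))

            def deliverable(self):
                # Release only the items whose delay period has elapsed.
                now = time.monotonic()
                ready = []
                while self.pending and self.pending[0][0] <= now:
                    ready.append(self.pending.popleft()[1])
                return ready

        outbound = DelayedDelivery(delay_seconds=0.05)
        outbound.enqueue({"id": 1})
        time.sleep(0.06)
        print(outbound.deliverable())               # [{'id': 1}] once the delay has elapsed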
  • The second server 108 may continue to apply the rate limit 208 to the task 110 for the duration of the processing capacity shortage 116 of the third server 108. For example, the third server 108 may eventually report an abatement of the processing capacity shortage 116, or the second server 108 may detect such abatement, e.g., by detecting an emptying of the outbound queue to the third server 108, at which point the second server 108 may remove the rate limit 208 and resume processing the task 110 for the workload 104 without the rate limit 208. Alternatively, if the processing capacity shortage 116 is prolonged or indefinite, or if the second server 108 identifies that applying the rate limit 208 to the task 110 may impose a new risk 204 of failing the performance guarantee 114, the second server 108 may propagate the performance capability alert 206 to the next upstream server 108 of the process path 112, i.e., the first server 108. The first server 108 may similarly respond to the performance capability alert 206 by applying a rate limit 208 to the task 110 of the first server 108 over the workload 104. The first server 108, as the intake point of the workload 104, may reduce the commitment of the entire process path 112 to a smaller workload volume over which the performance capability 202 may be guaranteed even while afflicted with the processing capacity shortage 116. In this manner, the backward propagation of performance capability alerts 206 and the application of a rate limit 208 to the task 110 of the second server 108 operate as a form of "backpressure" on the upstream servers 108 of the process path 112, which reduces the computational overload of the third server 108 and promotes the satisfaction of the performance guarantee 114 of the server set 106 over the workload 104 in accordance with the techniques presented herein.
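  • The propagation decision may be sketched as follows (illustrative only; the node names, margins, and threshold are assumptions): a server absorbs the alert with a local rate limit when it has slack to spare, and otherwise escalates the alert to its own upstream server.

        class PathNode:
            def __init__(self, name, margin_ms, upstream=None):
                self.name = name
                self.margin_ms = margin_ms      # slack between this server's capability and the guarantee
                self.upstream = upstream
                self.rate_limited = False

            def receive_alert(self, alert):
                if self.margin_ms >= alert["needed_slack_ms"]:
                    self.rate_limited = True    # absorb the backpressure locally
                elif self.upstream is not None:
                    self.upstream.receive_alert(alert)   # propagate the alert further upstream

        intake = PathNode("intake", margin_ms=5.0)
        query = PathNode("query", margin_ms=0.2, upstream=intake)
        query.receive_alert({"needed_slack_ms": 1.0})
        print(intake.rate_limited, query.rate_limited)   # True False: the alert escalated to intake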
  • C. Technical Effects
  • A first technical effect that may arise from the techniques presented herein involves the resolution of the processing capacity shortage 116 of a downstream server 108 and the risk 204 of failing the performance guarantee 114 of the workload 104 through the application of backpressure on upstream servers 108 of the process path 112.
  • The use of a rate limit 208 by the second server 108 in accordance with the techniques presented herein may effectively address the processing capacity shortage 116 and the risk 204 of failure of the performance guarantee 114 of the workload 104 in numerous ways. As a first such example, it may be feasible for the second server 108 to apply the rate limit 208 to the task 110 without the introduction of the rate limit 208 exacerbating the risk 204 of failing the performance guarantee 114. For example, the second server 108 may have a surplus performance capability 202, and may be capable of working through the workload 104 significantly faster than required by the performance guarantee 114 (e.g., the second server 108 may have a maximum allocation of 2.5 milliseconds to perform the task 110 over the workload 104 within the performance guarantee 114, but may be capable of completing the task 110 in only 0.7 milliseconds). That is, the application of the rate limit 208 to the task 110 may offload some of the delay caused by the processing capacity shortage 116 from the third server 108 to the second server 108, thus enabling the third server 108 to work through a backlog of the workload 104 and restore the performance capability 202.
  • As a second such example, the second server 108 and third server 108 may share two workloads 104, wherein the processing capacity shortage 116 may introduce a risk 204 of failing the performance guarantee 114 of the first workload 104, but may pose no risk 204 of failing the performance guarantee 114 of the second workload 104 (e.g., the first workload 104 may be sensitive to latency, while the second workload 104 may be relatively tolerant of latency). The application of the rate limit 208 to the task 110 of the second server 108 may reduce the rate of delivery to the third server 108 of both the first workload 104 and the second workload 104. The reduced volume of the second workload 104 may enable the third server 108 to apply the performance capability 202 to work through a backlog of the first workload 104 and therefore alleviate the processing capacity shortage 116, without introducing a risk 204 of failing a performance guarantee 114 for the second workload 104 that is not significantly affected by increased latency.
  • As a third such example, the use of a performance capability alert 206 and rate limit 208 may be applied to further upstream servers 108. For example, in the example scenario 200 of FIG. 2, the second server 108 may be unable to apply a rate limit 208 to the task 110 without creating a further risk 204 of failing the performance guarantee 114 (e.g., the margin between the performance capability 202 of the second server 108 and the performance guarantee 114 may already be thin). Alternatively, the rate limit 208 may initially be applied to the task 110 by the second server 108, but a protracted and/or unresolvable processing capacity shortage 116 by the third server 108 may eventually render the rate limit 208 insufficient, such as an overflow of the outbound queue of the second server 108, or where the application of the rate limit 208 to the task 110 introduces a risk 204 of failing a performance guarantee 114 of another workload 104 over which the server 108 applies the task 110. In these and other ways, the “backpressure” induced by the backward propagation of the performance capability alert 206 and the application of the rate limit 208 to a task 110 of an upstream server 108 may effectively alleviate the processing capacity shortage 116 of the downstream server 108.
  • A second technical effect that may arise from the techniques presented herein involves the capability of the server set 106 to respond to processing capacity shortages in an efficient, rapid, and automated manner. As a first such example, the techniques presented herein may be applied without conducting a holistic, extensive analysis of the capacity of the server set 106, such as may be performed by a network monitoring process or network administrator, to determine the root cause of the processing capacity shortage 116 and to assess the available options. Rather, the server 108 afflicted by diminished performance capability 202 may simply detect the processing capacity shortage 116 and transmit the performance capability alert 206 to the upstream server 108. As a second such example, the techniques presented herein do not involve a significant and potentially expensive reconfiguration of the server set 106 or a commitment of resources, such as provisioning a substitute server for the afflicted server 108, which may involve remapping associations within the server set 106 and/or introduce a delay in the recovery process. In some cases, the delay involved in applying the recovery may outlast the duration of the processing capacity shortage 116. In other cases, the performance guarantee 114 for the workload 104 may fail during the delay involved in applying such heavy recovery techniques. In some circumstances, such recovery may impose additional computational load on the afflicted server 108, thus hastening the failure of the performance guarantee 114. By contrast, the comparatively simple techniques presented herein, which merely involve transmitting the performance capability alert 206 to the upstream server 108 and causing the second server 108 to apply the rate limit 208 to the task 110, may be applied rapidly and with a negligible expenditure of resources, and may therefore be effective at resolving some processing capacity shortages 116, particularly serious but ephemeral shortages, that other techniques may not adequately address. Moreover, the transmission of the performance capability alert 206 and the application of the rate limit 208 to the task 110 utilize currently existing and available resources and capabilities of the downstream and upstream servers (e.g., processor clock rate adjustment, adjustment of thread and process priorities, and/or the use of queues), and do not depend upon the introduction of complex new process management machinery or protocols.
  • A third technical effect that may arise from the techniques presented herein involves the extension of the process to reduce or avoid the risk 204 of failing the performance guarantee 114 altogether. In the example scenario 200 of FIG. 2, the first server 108 is positioned at the top of the process path 112 and serves as an intake point for the workload 104. If the server set 106 propagates the performance capability alert 206 all the way to the first server 108 at the top of the process path 112, the first server 108 may respond by reducing the acceptance rate of the workload 104 into the process path 112. That is, rather than imposing a risk 204 of failing the performance guarantee 114 of a previously accepted workload 104, reducing the acceptance rate of the workload 104 into the process path 112 may alleviate the risk 204 altogether by reducing the volume of the workload 104 over which the performance guarantee 114 is offered. In more significant cases, such as a protracted or indefinite processing capacity shortage 116, the first server 108 may reduce the performance guarantee 114 that is offered for the workload 104 (e.g., raising a latency performance guarantee 114 of the workload 104 from 10 milliseconds to 50 milliseconds), and/or may altogether refrain from offering a performance guarantee 114 or accepting new workloads 104 until the processing capacity shortage 116 is alleviated. In this manner, the "backpressure" techniques presented herein may enable the process path 112 to respond to processing capacity shortages 116 by reducing the initial commitment of the server set 106 to the workload 104, thus avoiding problems of overcommitment of the server set 106 by only offering performance guarantees 114 that the process path 112 is capable of fulfilling. Many such technical effects may arise from the processing of the workload 104 by the server set 106 in accordance with the techniques presented herein.
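  • Admission control at the intake point may be sketched as follows (illustrative only; the rates, the relaxed guarantee, and the class name are assumptions): while an alert is outstanding, the intake server accepts a smaller volume and offers a relaxed guarantee rather than overcommitting the process path.

        class IntakeServer:
            def __init__(self, normal_rate, degraded_rate):
                self.normal_rate = normal_rate
                self.degraded_rate = degraded_rate
                self.alert_active = False
                self.accepted = 0                 # per-second counter; the periodic reset is omitted for brevity

            def offered_guarantee_ms(self):
                # Offer a relaxed latency guarantee while the process path is constrained.
                return 50.0 if self.alert_active else 10.0

            def try_accept(self):
                limit = self.degraded_rate if self.alert_active else self.normal_rate
                if self.accepted < limit:
                    self.accepted += 1
                    return True
                return False                      # refuse rather than risk the guarantee

        intake = IntakeServer(normal_rate=1000, degraded_rate=200)
        intake.alert_active = True
        print(intake.try_accept(), intake.offered_guarantee_ms())   # True 50.0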
  • D. Example Embodiments
  • FIG. 3 is an illustration of an example scenario 300 featuring some example embodiments of the techniques presented herein, including an example server 302 that processes a workload 104 as part of a server set 106. The example server 302 comprises a processor 304 and a memory 306 (e.g., a memory circuit, a platter of a hard disk drive, a solid-state storage device, or a magnetic or optical disc) encoding instructions that, when executed by the processor 304 of the example server 302, cause the example server 302 to process the workload 104 in accordance with the techniques presented herein. More particularly, in this example scenario 300, the instructions encode components of example system 308 that perform various portions of the presented techniques. The interoperation of the components of the example system 308 enables the example server 302 to process the workload 104 in accordance with the techniques presented herein.
  • The example system 308 comprises a task processor 310, which performs a task 110 of the workload 104 according to a performance guarantee 114, wherein the workload 104 is processed through the server set 106 according to a process path 112 that includes an upstream server 108 and a downstream server 108 relative to the example server 302. The example system 308 also includes a task rate limiter 314, which receives a performance capability alert 206 from a downstream server 108 of the process path 112, e.g., in response to a comparison of the performance capability 202 of the downstream server 108 with the performance guarantee 114 of the workload 104, which indicates a processing capacity shortage 116 and a risk 204 of failing the performance guarantee 114 of the workload 104. Responsive to the performance capability alert 206, the task rate limiter 314 applies a rate limit 208 to the task 110 performed on the workload 104 to reduce the computational load of the downstream server 108. The example system 308 also includes a workload streamer 312, which, after completion of the task 110 on the workload 104, delivers the workload 104 to the downstream server 108 of the process path 112. In this manner, the example system 308 enables the example server 302 to apply the task 110 to the workload 104 as part of the process path 112 in accordance with the techniques presented herein.
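  • A minimal sketch of these three cooperating components follows (illustrative only; the Python class names mirror, but are not drawn from, the task processor 310, task rate limiter 314, and workload streamer 312 described above).

        class TaskProcessor:
            def perform(self, workload):
                workload["done"] = True           # placeholder for the actual task
                return workload

        class TaskRateLimiter:
            def __init__(self):
                self.active = False
            def on_alert(self, alert):
                # Engage the rate limit while the downstream server reports a risk.
                self.active = alert.get("at_risk", False)
            def delay_seconds(self):
                return 0.01 if self.active else 0.0

        class WorkloadStreamer:
            def __init__(self, downstream):
                self.downstream = downstream
            def deliver(self, workload):
                self.downstream.append(workload)  # hand the completed workload downstream

        downstream_queue = []
        processor, limiter, streamer = TaskProcessor(), TaskRateLimiter(), WorkloadStreamer(downstream_queue)
        limiter.on_alert({"at_risk": True})
        item = processor.perform({"id": 7})
        # A host loop would pause for limiter.delay_seconds() here before delivering.
        streamer.deliver(item)
        print(downstream_queue)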
  • FIG. 4 is an illustration of an example scenario featuring a second example embodiment of the techniques presented herein, wherein the example embodiment comprises an example method 400 of configuring a server 108 to process a workload 104 in accordance with techniques presented herein. The example method 400 involves a server 108 comprising a processor 304, and may be implemented, e.g., as a set of instructions stored in a memory 306 of the server 108, such as firmware, system memory, a hard disk drive, a solid-state storage component, or a magnetic or optical medium, wherein the execution of the instructions by the processor 304 causes the server 108 to operate in accordance with the techniques presented herein.
  • The example method 400 begins at 402 and involves executing 404, by the server, instructions that cause the server to perform in the following manner. The execution of the instructions causes the server 108 to receive 406 a workload 104 from an upstream server 108 of a process path 112, wherein the workload 104 is associated with a performance guarantee 114. The execution of the instructions also causes the server 108 to perform 408 a task 110 on the workload 104. The execution of the instructions also causes the server 108 to identify 410 a performance capability 202 of the server 108. The execution of the instructions also causes the server 108 to compare 412 the performance capability 202 with the performance guarantee 114 of the workload 104. The execution of the instructions also causes the server 108 to respond to determining that the performance capability 202 risks failing the performance guarantee 114 by transmitting 414 a performance capability alert 206 to the upstream server 108. The execution of the instructions also causes the server 108 to respond to a performance capability alert 206 received from a downstream server 108 of the process path 112 by rate-limiting 416 the task 110 performed on the workload 104 to reduce the computational load of the downstream server 108. In this manner, the example method 400 may enable the server 108 to process the workload 104 as part of the process path 112 in accordance with the techniques presented herein, and so ends at 418.
  • FIG. 5 is an illustration of an example scenario featuring a third example embodiment of the techniques presented herein, wherein the example embodiment comprises an example method 500 of configuring a server set 106 to process a workload 104 that is associated with a performance guarantee 114 in accordance with techniques presented herein. The example method 500 involves a server set 106 comprising a collection of servers 108 respectively comprising a processor 304, and may be implemented, e.g., as a set of instructions stored in a memory 306 of the server 108, such as firmware, system memory, a hard disk drive, a solid-state storage component, or a magnetic or optical medium, wherein the execution of the instructions by the processor 304 causes the server 108 to operate as a member of the server set 106 in accordance with the techniques presented herein.
  • The example method 500 begins at 502 and involves configuring a server 108 of the server set 106 that is within the process path 112 to process the workload 104 in the following manner. The server 108 performs 506 a task 110 on the workload 104 according to the performance guarantee 114. The server 108 further receives 508 a performance capability alert 206 from the downstream server 108, wherein the performance capability alert 206 indicates that a computational load of the downstream server 108 risks failing the performance guarantee 114 for the workload 104. Responsive to the performance capability alert 206, the server 108 further rate-limits 510 the task 110 of the server 108 to reduce the workload 104 delivered to the downstream server 108. After performing the task 110 on the workload 104, the server 108 further delivers 512 the workload 104 to the downstream server 108 of the process path 112. In this manner, the example method 500 may enable the server 108 to operate as part of a server set 106 to participate in the processing of the workload 104 in accordance with the techniques presented herein, and so ends at 514.
  • Still another embodiment involves a computer-readable medium comprising processor-executable instructions configured to apply the techniques presented herein. Such computer-readable media may include various types of communications media, such as a signal that may be propagated through various physical phenomena (e.g., an electromagnetic signal, a sound wave signal, or an optical signal) and in various wired scenarios (e.g., via an Ethernet or fiber optic cable) and/or wireless scenarios (e.g., a wireless local area network (WLAN) such as WiFi, a personal area network (PAN) such as Bluetooth, or a cellular or radio network), and which encodes a set of computer-readable instructions that, when executed by a processor of a device, cause the device to implement the techniques presented herein. Such computer-readable media may also include (as a class of technologies that excludes communications media) computer-readable memory devices, such as a memory semiconductor (e.g., a semiconductor utilizing static random access memory (SRAM), dynamic random access memory (DRAM), and/or synchronous dynamic random access memory (SDRAM) technologies), a platter of a hard disk drive, a flash memory device, or a magnetic or optical disc (such as a CD-R, DVD-R, or floppy disc), encoding a set of computer-readable instructions that, when executed by a processor of a device, cause the device to implement the techniques presented herein.
  • An example computer-readable medium that may be devised in these ways is illustrated in FIG. 6, wherein the implementation 600 comprises a computer-readable memory device 602 (e.g., a CD-R, DVD-R, or a platter of a hard disk drive), on which is encoded computer-readable data 604. This computer-readable data 604 in turn comprises a set of computer instructions 606 that, when executed on a processor 304 of a server, cause the server to operate according to the principles set forth herein. For example, the processor-executable instructions 606 may encode a system that processes a workload 104 as part of a server set 106, such as the example system 308 of FIG. 3. As another example, the processor-executable instructions 606 may encode a method of configuring a server 108 to process a workload 104 as part of a server set 106, such as the example method 400 of FIG. 4. As yet another example, the processor-executable instructions 606 may encode a method of configuring a server set 106 to process a workload 104, such as the example method 500 of FIG. 5. Many such computer-readable media may be devised by those of ordinary skill in the art that are configured to operate in accordance with the techniques presented herein.
  • E. Variations
  • The techniques discussed herein may be devised with variations in many aspects, and some variations may present additional advantages and/or reduce disadvantages with respect to other variations of these and other techniques. Moreover, some variations may be implemented in combination, and some combinations may feature additional advantages and/or reduced disadvantages through synergistic cooperation. The variations may be incorporated in various embodiments (e.g., the example server 302 and/or the example system 308 of FIG. 3; the example method 400 of FIG. 4; the example method 500 of FIG. 5; and the example computer-readable memory device 602 of FIG. 6) to confer individual and/or synergistic advantages upon such embodiments.
  • E1. Scenarios
  • A first aspect that may vary among implementations of these techniques relates to scenarios in which the presented techniques may be utilized.
  • As a first variation of this first aspect, the presented techniques may be utilized with a variety of servers 108 and server sets 106, such as workstations, laptops, consoles, tablets, phones, portable media and/or game players, embedded systems, appliances, vehicles, and wearable devices. The server 108 may also comprise a collection of server units, such as a collection of server processes executing on a device; a personal group of interoperating devices of a user; a local collection of server units comprising a computing cluster; and/or a geographically distributed collection of server units that span a region, including a global-scale distributed database. Such servers 108 may be interconnected in a variety of ways, such as locally wired connections (e.g., a bus architecture such as Universal Serial Bus (USB) or a locally wired network such as Ethernet); locally wireless connections (e.g., Bluetooth connections or a WiFi network); remote wired connections (e.g., long-distance fiber optic connections comprising the Internet); and/or remote wireless connections (e.g., cellular communication). Additionally, such servers 108 may serve a variety of clients 102, such as a client process on one or more of the servers 108; other servers 108 within a different server set 106; and/or various client devices that utilize the server 108 and/or server group on behalf of one or more clients 102 and/or other devices.
  • As a second variation of this first aspect, the server set 106 may present a variety of services that involve applying tasks 110 to workloads 104. As a first such example, the service may comprise a distributed database or data storage system, involving tasks 110 such as receiving the data; storing the data; replicating and/or auditing the data; evaluating queries over the data; and/or running reports or user-defined functions over the data. As a second such example, the service may comprise a content presentation system, such as a news service, a social network service, or social media service, which may involve tasks 110 such as retrieving and storing content items; generating new content items; aggregating content items into a digest or collage; and transmitting or communicating the content items to clients 102. As a third such example, the service may comprise a media presentation system, which may involve tasks 110 such as acquiring, storing, cataloging, and archiving the media objects; rendering and presenting media objects to clients 102; and/or tracking engagement of the clients 102 with the media objects. As a fourth such example, the service may comprise a software repository, which may involve tasks 110 such as storing and cataloging software; deploying software to various clients 102; and receiving and applying updates such as patches and upgrades to the software deployed to the clients 102. As a fifth such example, the service may comprise a gaming system, which may involve tasks 110 such as initiating game sessions; running game sessions; and compiling the results of game sessions among various clients 102. As a sixth such example, the service may comprise an enterprise operational service that provides operational computing for an enterprise, which may involve tasks 110 such as providing a directory of entities such as individuals and operating units; exchanging communication among the entities; controlling and managing various processes; monitoring and logging various processes, such as machine sensors; and generating alerts. Those of ordinary skill in the art may devise a range of scenarios in which a server set 106 configured in accordance with the techniques presented herein may be utilized.
  • E2. Performance Capabilities and Performance Guarantees
  • A second aspect that may vary among embodiments of the techniques presented herein involves the performance capabilities 202 monitored by the servers 108 and the comparison with performance guarantees 114 over the workload 104 to identify a processing capacity shortage 116 and a risk 204 of failing the performance guarantee 114.
  • As a first variation of this second aspect, the performance capabilities 202 may include, e.g., processor capacity; storage capacity; network bandwidth; availability of the server set 106; scalability to handle fluctuations in the volume of a workload 104; resiliency to address faults such as the failure of a server 108; latency of processing the workload 104 through the server set 106; and/or adaptability to handle new types of workloads 104.
  • As a second variation of this second aspect, the performance guarantees 114 of the workloads 104 may involve, e.g., a processing latency, such as a maximum end-to-end processing duration for processing the workload 104 to completion; a processing throughput of the workload 104, such as a sustainable rate of completed items; a processing consistency of the workload 104, such as a guarantee of consistency among portions of the workload 104 processed at different times and/or by different servers 108; scalability to handle a peak volume of the workload 104 to a defined level; a processing replication of the workload 104, such as a maximum volume of unreplicated data that may be subject to data loss; and/or a minimum availability of the server set 106, such as a “sigma” level.
  • As a third variation of this second aspect, a server 108 may identify the performance capabilities 202 in various ways. As a first such example, a server 108 may predict the performance capability 202 of the server 108 over the workload 104, such as an estimate of the amount of time involved in applying the task 110 to the workload 104 or a realistically achievable throughput of the server 108. Such predictions may be based, e.g., upon an analysis of the workload 104, a set of typical performance characteristics or heuristics of the server 108, or previous assessments of processing the task 110 over similar workloads 104. As a second such example, the server 108 may measure the performance capability 202 of the server 108 while performing the workload 104. Such measurement may occur with various granularity and/or periodicity, and may involve techniques such as low-level hardware monitors (e.g., hardware timers or rate meters) and/or high-level software monitors (e.g., a timer placed upon a thread executing the task 110). As a third such example, a server 108 may not actively monitor the performance capability 202 but may receive an alert if an apparent processing capacity shortage 116 arises (e.g., a message from a downstream server 108 of a reduced delivery of the completed workload 104).
  • As a fourth variation of this second aspect, a server 108 may compare such performance capabilities 202 with the performance guarantees 114 of the workload 104 in various ways. As a first such example, the server 108 may compare an instantaneous measurement of the performance capability 202 with an instantaneous performance guarantee 114, such as a current data transfer rate compared with a minimum acceptable data transfer rate, and/or periodic measurements, such as a number of completed tasks 110 over a workload 104 in a given period vs. a quota of completed tasks 110. As a second such example, the server 108 may evaluate a trend in the performance capability 202, e.g., detecting a gradual reduction of processing capacity over time that, while currently satisfying the performance guarantee 114, may indicate an imminent or eventual risk 204 of failing the performance guarantee 114, such as a gradually diminishing rate of completed tasks 110. As a third such example, a workload 104 may be associated with a set of at least two performance guarantees 114 for at least two performance capabilities 202 and a priority order of the performance guarantees 114 (e.g., a first priority of a maximum latency of processing individual tasks 110 over the workload 104, such as 10 milliseconds, but in the event of an ephemeral failure of the first performance guarantee 114, a second priority of a minimum throughput of processing tasks 110 of the workload 104 within a given period, such as at least 100 tasks completed per second). Such prioritization may enable the performance guarantees 114 to be specified in a layered or more nuanced manner. The server 108 may compare the respective performance capabilities 202 of the server 108 according to the priority order to evaluate the risk 204 of failing the collection of performance guarantees 114 for the workload 104.
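  • The trend-based comparison in the second such example above may be illustrated by the following hypothetical sketch, which fits a simple slope to recent latency samples and flags a risk 204 if the projected latency would exceed the guaranteed maximum; the least-squares fit and the projection horizon are assumptions for illustration.

```python
def trend_risk(latency_samples_ms, max_latency_ms, horizon=50):
    """Flag a risk of failing a latency guarantee based on the current value and a simple
    linear trend projected `horizon` samples into the future (illustrative heuristic)."""
    if len(latency_samples_ms) < 2:
        return False
    n = len(latency_samples_ms)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(latency_samples_ms) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, latency_samples_ms))
             / sum((x - mean_x) ** 2 for x in xs))
    projected = latency_samples_ms[-1] + slope * horizon
    return latency_samples_ms[-1] > max_latency_ms or projected > max_latency_ms
```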
  • As a fifth variation of this second aspect, a performance capability alert 206 may be relayed from a downstream server 108 to an upstream server 108 in a variety of ways. As a first such example, the performance capability alert 206 may comprise a message initiated by the downstream server 108 and transmitted to the upstream server 108 in response to the identification of a risk 204 of failing the performance guarantee 114. The message may be delivered in-band (e.g., as part of an ordinary communication stream) or out-of-band (e.g., using a separate and dedicated communication channel). As a second such example, the performance capability alert 206 may comprise a performance metric that is continuously and/or periodically reported by the downstream server 108 to the upstream server 108 (e.g., an instantaneous measurement of processing capacity), where the upstream server 108 may construe a fluctuation of the metric as a performance capability alert 206 (e.g., the downstream server 108 may periodically report its latency in completing the task 110 over the workload 104, and the metric may reveal an excessive latency that is approaching a maximum latency specified by the performance guarantee 114). As a third such example, the performance capability alert 206 may comprise part of a data structure shared by the downstream server 108 and the upstream server 108, such as a flag of a status field or a queue count of a workload queue provided at the interface between the downstream server 108 and the upstream server 108. As a fourth such example, the performance capability alert 206 may comprise a function of the upstream server 108 that is invoked by the downstream server 108, such as an API call, a remote procedure call, a delegate function, or an interrupt that the downstream server 108 initiates on the upstream server 108. Many such techniques may be utilized to compare the performance capability 202 to the performance guarantee 114 to identify a risk 204 of failing the performance guarantee 114 of the workload 104 in accordance with the techniques presented herein.
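  • As one hypothetical realization of the first such example above (an out-of-band message), a downstream server might transmit a small structured alert to its upstream server over a dedicated channel; the message fields, transport, and endpoint below are assumptions made solely for illustration.

```python
import json
import socket

def send_performance_capability_alert(upstream_host: str, upstream_port: int,
                                      metric: str, measured: float, bound: float) -> None:
    """Illustrative out-of-band alert: report which metric is at risk, its measured value,
    and the bound stated by the performance guarantee."""
    alert = {
        "type": "performance_capability_alert",
        "metric": metric,       # e.g., "latency_ms" (hypothetical field name)
        "measured": measured,   # current measurement at the downstream server
        "bound": bound,         # value required by the performance guarantee
    }
    with socket.create_connection((upstream_host, upstream_port), timeout=1.0) as sock:
        sock.sendall(json.dumps(alert).encode("utf-8"))
```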
  • E3. Task Rate-Limiting
  • A third aspect that may vary among embodiments of the presented techniques involves the manner of applying a rate limit 208 to a task 110 over the workload 104 in accordance with the techniques presented herein. FIG. 7 is an illustration of a set 700 of example scenarios featuring various techniques for rate-limiting a task 110 applied to a workload 104 of a server 302.
  • As a first variation of this third aspect, illustrated in a first example scenario 710, a server 302 may rate limit a task 110, responsive to receiving a performance capability alert 206 from a downstream server 108, by reducing the performance capabilities 202 of the server 302 that are devoted to the task 110. As a first example, the server 302 may reduce a processor speed 702 of a processor 304, such as reducing the clock speed or core count that is applied to perform the task 110 over the workload 104. As a second example, the server 302 may reduce a thread priority of the task 110, such that a multiprocessing processor 304 performs increments of the task 110 less frequently, or even suspends the task 110 temporarily if other tasks 110 are of higher priority. Other types of performance capabilities 202 that may be reduced for the workload 104 include volatile or nonvolatile memory allocation; network bandwidth; and/or access to a peripheral device such as a rendering pipeline. In some scenarios, the server 302 may rate limit the task 110 relative to a severity of the performance capability alert 206, such as the degree of constraint on the network capacity of the downstream server 108 or the pending volume of unprocessed work that the downstream server 108 has to work through to alleviate the performance capability alert 206.
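  • A minimal sketch of severity-proportional rate-limiting, under the assumption that the performance capability alert 206 carries a severity value between 0.0 (unconstrained) and 1.0 (fully constrained), might look as follows; the delay scaling is an assumption for illustration.

```python
import time

def throttled_task(task, item, severity: float, max_delay_ms: float = 5.0):
    """Apply the task to one workload item, then pause for a delay proportional to the
    severity reported in the performance capability alert (illustrative scaling only)."""
    result = task(item)
    delay_s = min(max(severity, 0.0), 1.0) * max_delay_ms / 1000.0
    time.sleep(delay_s)
    return result
```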
  • As a second variation of this third aspect, illustrated in a second example scenario 712, a server 302 may rate limit a task 110, responsive to receiving a performance capability alert 206 from a downstream server 108, by temporarily refusing to accept the workload 104 from an upstream server 108, e.g., by initiating a processing jam 118. The processing jam 118 may be initiated in increments, such that the upstream server 108 is only capable of sending batches of the workload 104 to the server 302 in intervals that are interspersed by a cessation of the workload 104 arriving at the server 302. Alternatively or additionally, the server 302 may reduce an acceptance rate of the workload 104 from the upstream server 108; e.g., the upstream server 108 may utilize an output queue of workload 104 to deliver to the server 302, and the server 302 may only check the output queue at an interval, or at a reduced interval, thereby slowing the rate at which the server 302 accepts workload 104 from the upstream server 108 and delivers the workload 104 to the downstream server 108.
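  • The reduced acceptance rate described above might be sketched as follows, where the server checks the upstream output queue only once per polling interval and drains at most a fixed batch of items per check; the interval, batch size, and stop signal are assumptions for illustration.

```python
import time
from queue import Queue, Empty
from threading import Event

def accept_at_reduced_rate(upstream_queue: Queue, handle, poll_interval_s: float,
                           batch: int, stop: Event) -> None:
    """Slow the acceptance rate: poll the upstream output queue once per interval and
    process at most `batch` items per poll, without rejecting work outright."""
    while not stop.is_set():
        next_poll = time.monotonic() + poll_interval_s
        for _ in range(batch):
            try:
                handle(upstream_queue.get_nowait())
            except Empty:
                break
        time.sleep(max(0.0, next_poll - time.monotonic()))
```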
  • As a third variation of this third aspect, illustrated in a third example scenario 714, a server 302 may rate limit a task 110, responsive to receiving a performance capability alert 206 from a downstream server 108, by utilizing one or more queues that slow the intake and/or delivery of the workload 104 to the downstream server 108. As a first such example, the server 302 may implement an input queue 704 that enqueues the task 110 for the workload 104 for a delay period, and may withdraw the task 110 from the input queue 704 to perform the task 110 on the workload 104 only after the delay period. As a second such example, the server 302 may implement an output queue 706 with a delivery delay that slows the rate at which processed work is delivered to the downstream server 108.
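  • A minimal sketch of such an input queue with a delay period, in which an enqueued item becomes eligible for the task 110 only after the delay has elapsed, is shown below; the data structure and the delay value are assumptions made for illustration.

```python
import heapq
import threading
import time
from itertools import count

class DelayedInputQueue:
    """Items become eligible for processing only after `delay_s` seconds have passed."""

    def __init__(self, delay_s: float):
        self.delay_s = delay_s
        self._heap = []                 # (eligible_time, sequence_number, item)
        self._counter = count()         # tie-breaker so items never compare directly
        self._lock = threading.Lock()

    def enqueue(self, item) -> None:
        with self._lock:
            heapq.heappush(self._heap,
                           (time.monotonic() + self.delay_s, next(self._counter), item))

    def dequeue(self):
        """Return the next eligible item, or None if nothing has aged past the delay yet."""
        with self._lock:
            if self._heap and self._heap[0][0] <= time.monotonic():
                return heapq.heappop(self._heap)[2]
        return None
```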
  • As a fourth variation of this third aspect, a server 302 may rate limit the task 110 over the workload 104 only within the performance guarantee 114 of the workload 104. For example, the performance guarantee 114 may comprise a maximum 10-millisecond latency of processing the workload 104 through the process path 112, and a particular server 302 may be permitted to expend up to 2.5 milliseconds per task 110 while the task 110 remains in conformity with the performance guarantee 114. If the server 302 typically performs the task 110 in 0.7 milliseconds, the server 302 may rate limit the task 110 by up to an additional 1.8 milliseconds to reduce the rate at which the workload 104 is delivered to the downstream server 108. If further rate-limiting would introduce a new risk 204 of failing the performance guarantee 114, the server 302 may refrain from further rate-limiting the task 110. Instead, as shown in the fourth example scenario 716 of FIG. 7, the server 302 may propagate the performance capability alert 206 to an upstream server 108. Additionally, if the server 302 comprises a first server in the process path 112 that is assigned the task 110 of intake of new workload 708 from one or more clients 102, the server 302 may rate limit the workload by reducing an intake rate of the new workload 708 into the entire process path 112. That is, the server 302 may only agree to accept a diminished volume of the new workload 708 for which the performance guarantee 114 is assigned. Alternatively or additionally, the first server 302 may apply the rate limit 208 to the performance guarantee 114 in an offer 706 provided to the client 102 to extend the new workload 708, such as extending a maximum latency of the performance guarantee 114 from 10 milliseconds to 20 milliseconds. In this manner, the server 302 may adapt the commitment offered by the server set 106 toward a performance guarantee 114 that the process path 112, including the server 108 afflicted by a processing capacity shortage 116, is currently able to guarantee. Many such techniques may be utilized to rate limit the task 110 of a server 108 in response to a performance capability alert 206 in accordance with the techniques presented herein.
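  • The budget arithmetic of this example can be made explicit with a short sketch: given a per-server share of the end-to-end latency guarantee and the server's typical task time, the remaining headroom bounds how much additional delay rate-limiting may introduce without risking the guarantee. The safety margin parameter is an assumption added for illustration.

```python
def allowable_delay_ms(per_server_budget_ms: float, typical_task_ms: float,
                       safety_margin_ms: float = 0.0) -> float:
    """Headroom the server may spend on rate-limiting while staying within its share of the
    performance guarantee; with a 2.5 ms budget and a 0.7 ms typical task this yields
    roughly 1.8 ms, matching the example above."""
    return max(0.0, per_server_budget_ms - typical_task_ms - safety_margin_ms)

# Example (matching the figures above): allowable_delay_ms(2.5, 0.7) -> 1.8
```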
  • E4. Process Path Adaptation
  • A fourth aspect that may vary among embodiments of the techniques presented herein involves adjustment of the process path 112 to adapt to a processing capacity shortage 116. In some scenarios, rate-limiting the tasks 110 of upstream servers 108 may be adequate to resolve a processing capacity shortage 116. However, in other scenarios, the processing capacity shortage 116 may be severe, prolonged, and/or of indefinite duration, such that in addition to rate-limiting a task 110 of an upstream server 108, the server set 106 may implement more significant steps to maintain the satisfaction of the performance guarantee 114. FIG. 8 is an illustration of a set 800 of example scenarios that illustrate some of the variations of this fourth aspect.
  • As a first variation of this fourth aspect, illustrated in a first example scenario 808, a server 302 may respond to a performance capability alert 206 of a downstream server 108 by redirecting the process path 112 through the server set 106 to provide a substitute server 802 in lieu of the server 108 exhibiting the processing capacity shortage 116. The substitute server 802 may be previously allocated and ready for designation as a substitute server 802, or may be newly provisioned and inserted into the process path 112. Alternatively, the substitute server 802 may already exist in the process path 112 of the workload 104 or in another process path 112 of the server set 106, and the task 110 performed by the server 108 may be transferred to the substitute server 802 along with the workload 104.
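  • As an illustrative aside, the redirection of a process path may be viewed as replacing one entry in an ordered list of servers; the sketch below identifies servers by name, which is an assumption made only for illustration.

```python
from typing import List

def redirect_process_path(process_path: List[str], failing_server: str,
                          substitute_server: str) -> List[str]:
    """Return a new process path in which the server exhibiting the processing capacity
    shortage is replaced by a substitute server."""
    return [substitute_server if server == failing_server else server
            for server in process_path]

# Hypothetical usage: redirect_process_path(["intake", "transform", "store"],
#                                           "transform", "transform-substitute")
```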
  • As a second variation of this fourth aspect, a server 302 may respond to a performance capability alert 206 by expanding a computational resource set of the server 108 exhibiting the processing capacity shortage 116. As a first such example, the server 108 may comprise a virtual machine, and the processing resources allocated to the virtual machine may be increased (e.g., raising a thread priority and/or processing core usage of the virtual machine). As a second such example, illustrated in a second example scenario 810, the server 108 exhibiting the processing capacity shortage 116 may be supplemented by the addition of an auxiliary server 804 that expands the processing capacity of the server 108. For example, the workload 104 may be shared between the server 108 and the auxiliary server 804 until the processing capacity shortage 116 of the server 108 is alleviated.
  • As a third variation of this fourth aspect, a server 108 may experience a processing capacity shortage 116 that risks failing a performance guarantee 114 of a first workload 104, but that presents lower or no risk of failing performance guarantees 114 of other workloads 104 of the server 108. The server 108 may therefore prioritize the processing of the first workload 104 over the other workloads 104 to alleviate the processing capacity shortage 116. As a first such example, illustrated in a third example scenario 812, the server 108 may adjust by reducing a process priority 806 of another workload 104 that the server 108 processes, e.g., a workload 104 that involves no performance guarantee 114, or that involves a second performance guarantee 114 that is amply satisfied (e.g., a dependency upon a different type of performance capability of the server 108, such as a CPU-bound workload as compared with a network-bound workload). The relative adjustment of the process priorities 806 may enable the server 108 to work through a backlog and resolve the processing capacity shortage 116. As a second such example, where the server 108 processes a third task 110 for a third workload 104 according to a third process path 112, the server 108 may redirect the third process path 112 for the third workload 104 through a substitute server 802. The server 108 may thereby reserve a greater proportion of computational resources to address the processing capacity shortage 116.
  • As a fourth variation of this fourth aspect, a server 108 that implements rate-limiting of a task 110 in order to alleviate a processing capacity shortage 116 of a downstream server 108 may curtail or end the rate-limiting of the task 110 based upon an alleviation of the processing capacity shortage 116 of the downstream server 108. As a first such example, a downstream server 108 initiating the performance capability alert 206 may send a notification to an upstream server 302 applying the rate-limiting to indicate an abatement of the processing capacity shortage 116. As a second such example, an upstream server 302 applying rate-limiting to a task 110 may detect an abatement of the processing capacity shortage 116, e.g., as a depletion of an output queue of workloads 104 to deliver to the downstream server 108. As a third such example, the upstream server 302 may apply the rate-limiting only for a set interval, such as one second, and may then remove the rate-limiting, such that a persistence of the processing capacity shortage 116 at the downstream server 108 may result in a second performance capability alert 206 and a reapplication of the rate limit 208 to the task 110. In some scenarios, the reapplication may occur at an increasing interval (e.g., first one second, then two seconds, etc.) to reduce an inefficiency of the transmission and receipt of multiple performance capability alerts 206, which may reduce the ability of the downstream server 108 to alleviate the processing capacity shortage 116.
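  • The increasing-interval reapplication described in this example may be illustrated by the following hypothetical sketch; the initial interval, growth factor, cap, and callback names are assumptions made for illustration only.

```python
import time

def rate_limit_with_backoff(apply_rate_limit, remove_rate_limit, shortage_persists,
                            initial_interval_s: float = 1.0, factor: float = 2.0,
                            max_interval_s: float = 30.0) -> None:
    """Apply the rate limit for a set interval, remove it, and reapply at an increasing
    interval while the downstream shortage persists (illustrative backoff policy)."""
    interval = initial_interval_s
    while shortage_persists():
        apply_rate_limit()
        time.sleep(interval)            # hold the rate limit for the current interval
        remove_rate_limit()
        interval = min(interval * factor, max_interval_s)
```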
  • As a fifth variation of this fourth aspect, the adjustments of the process paths 112 may be requested and/or implemented by the server 108 experiencing the processing capacity shortage 116. As another example, the adjustments of the processing paths 112 may be requested and/or implemented by the upstream server 302, e.g., upon determining that the rate-limiting of the task 110 by the upstream server 302 is insufficient to resolve a processing capacity shortage 116 that is prolonged, indefinite, overly frequent, and/or unresolvable by rate-limiting. As yet another example, the adjustments of the processing paths 112 may be implemented at the request of an automated network monitor or network administrator. Many such techniques may be utilized to provide further adaptations of the server set 106, in conjunction with the rate-limiting of the task 110 by the upstream server 302, in accordance with the techniques presented herein.
  • F. Computing Environment
  • FIG. 9 and the following discussion provide a brief, general description of a suitable computing environment to implement embodiments of one or more of the provisions set forth herein. The operating environment of FIG. 9 is only one example of a suitable operating environment and is not intended to suggest any limitation as to the scope of use or functionality of the operating environment. Example computing devices include, but are not limited to, personal computers, server computers, hand-held or laptop devices, mobile devices (such as mobile phones, Personal Digital Assistants (PDAs), media players, and the like), multiprocessor systems, consumer electronics, mini computers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
  • Although not required, embodiments are described in the general context of “computer readable instructions” being executed by one or more computing devices. Computer readable instructions may be distributed via computer readable media (discussed below). Computer readable instructions may be implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types. Typically, the functionality of the computer readable instructions may be combined or distributed as desired in various environments.
  • FIG. 9 illustrates an example of a system comprising a computing device 902 configured to implement one or more embodiments provided herein. In one configuration, computing device 902 includes at least one processing unit 906 and memory 908. Depending on the exact configuration and type of computing device, memory 908 may be volatile (such as RAM, for example), non-volatile (such as ROM, flash memory, etc., for example) or some combination of the two. This configuration is illustrated in FIG. 9 by dashed line 904.
  • In other embodiments, device 902 may include additional features and/or functionality. For example, device 902 may also include additional storage (e.g., removable and/or non-removable) including, but not limited to, magnetic storage, optical storage, and the like. Such additional storage is illustrated in FIG. 9 by storage 910. In one embodiment, computer readable instructions to implement one or more embodiments provided herein may be in storage 910. Storage 910 may also store other computer readable instructions to implement an operating system, an application program, and the like. Computer readable instructions may be loaded in memory 908 for execution by processing unit 906, for example.
  • The term “computer readable media” as used herein includes computer storage media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions or other data. Memory 908 and storage 910 are examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVDs) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by device 902. Any such computer storage media may be part of device 902.
  • Device 902 may also include communication connection(s) 916 that allows device 902 to communicate with other devices. Communication connection(s) 916 may include, but is not limited to, a modem, a Network Interface Card (NIC), an integrated network interface, a radio frequency transmitter/receiver, an infrared port, a USB connection, or other interfaces for connecting computing device 902 to other computing devices. Communication connection(s) 916 may include a wired connection or a wireless connection. Communication connection(s) 916 may transmit and/or receive communication media.
  • The term “computer readable media” may include communication media. Communication media typically embodies computer readable instructions or other data in a “modulated data signal” such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” may include a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • Device 902 may include input device(s) 914 such as keyboard, mouse, pen, voice input device, touch input device, infrared cameras, video input devices, and/or any other input device. Output device(s) 912 such as one or more displays, speakers, printers, and/or any other output device may also be included in device 902. Input device(s) 914 and output device(s) 912 may be connected to device 902 via a wired connection, wireless connection, or any combination thereof. In one embodiment, an input device or an output device from another computing device may be used as input device(s) 914 or output device(s) 912 for computing device 902.
  • Components of computing device 902 may be connected by various interconnects, such as a bus. Such interconnects may include a Peripheral Component Interconnect (PCI), such as PCI Express, a Universal Serial Bus (USB), Firewire (IEEE 1394), an optical bus structure, and the like. In another embodiment, components of computing device 902 may be interconnected by a network. For example, memory 908 may be comprised of multiple physical memory units located in different physical locations interconnected by a network.
  • Those skilled in the art will realize that storage devices utilized to store computer readable instructions may be distributed across a network. For example, a computing device 920 accessible via network 918 may store computer readable instructions to implement one or more embodiments provided herein. Computing device 902 may access computing device 920 and download a part or all of the computer readable instructions for execution. Alternatively, computing device 902 may download pieces of the computer readable instructions, as needed, or some instructions may be executed at computing device 902 and some at computing device 920.
  • G. Usage of Terms
  • Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
  • As used in this application, the terms “component,” “module,” “system”, “interface”, and the like are generally intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. One or more components may be localized on one computer and/or distributed between two or more computers.
  • Furthermore, the claimed subject matter may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to implement the disclosed subject matter. The term “article of manufacture” as used herein is intended to encompass a computer program accessible from any computer-readable device, carrier, or media. Of course, those skilled in the art will recognize many modifications may be made to this configuration without departing from the scope or spirit of the claimed subject matter.
  • Various operations of embodiments are provided herein. In one embodiment, one or more of the operations described may constitute computer readable instructions stored on one or more computer readable media, which, if executed by a computing device, will cause the computing device to perform the operations described. The order in which some or all of the operations are described should not be construed as to imply that these operations are necessarily order dependent. Alternative ordering will be appreciated by one skilled in the art having the benefit of this description. Further, it will be understood that not all operations are necessarily present in each embodiment provided herein.
  • Any aspect or design described herein as an “example” is not necessarily to be construed as advantageous over other aspects or designs. Rather, use of the word “example” is intended to present one possible aspect and/or implementation that may pertain to the techniques presented herein. Such examples are not necessary for such techniques or intended to be limiting. Various embodiments of such techniques may include such an example, alone or in combination with other features, and/or may vary and/or omit the illustrated example.
  • As used in this application, the term “or” is intended to mean an inclusive “or” rather than an exclusive “or”. That is, unless specified otherwise, or clear from context, “X employs A or B” is intended to mean any of the natural inclusive permutations. That is, if X employs A; X employs B; or X employs both A and B, then “X employs A or B” is satisfied under any of the foregoing instances. In addition, the articles “a” and “an” as used in this application and the appended claims may generally be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form.
  • Also, although the disclosure has been shown and described with respect to one or more implementations, equivalent alterations and modifications will occur to others skilled in the art based upon a reading and understanding of this specification and the annexed drawings. The disclosure includes all such modifications and alterations and is limited only by the scope of the following claims. In particular regard to the various functions performed by the above described components (e.g., elements, resources, etc.), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component which performs the specified function of the described component (e.g., that is functionally equivalent), even though not structurally equivalent to the disclosed structure which performs the function in the herein illustrated example implementations of the disclosure. In addition, while a particular feature of the disclosure may have been disclosed with respect to only one of several implementations, such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for any given or particular application. Furthermore, to the extent that the terms “includes”, “having”, “has”, “with”, or variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising.”

Claims (20)

What is claimed is:
1. A server of a server set that performs workloads according to a performance guarantee, the server comprising:
a processor; and
a memory storing instructions that, when executed by the processor, cause the server to:
perform a task of the workload according to a performance guarantee, wherein the workload is processed through the server set according to a process path;
receive a performance capability alert from a downstream server of the process path, wherein the performance capability alert indicates that a computational load of the downstream server risks failing the performance guarantee for the workload;
rate-limit the task performed on the workload to reduce the computational load of the downstream server; and
after completing the task, deliver the workload to the downstream server of the process path.
2. The server of claim 1, wherein rate-limiting the task for the workload further comprises: refusing to accept the workload from an upstream server.
3. The server of claim 1, wherein rate-limiting the task for the workload further comprises: slowing an acceptance rate of the workload from an upstream server.
4. The server of claim 1, wherein rate-limiting the task for the workload further comprises:
enqueuing the task for the workload in an input queue for a delay period; and
withdrawing the task from the input queue to perform the task of the workload after the delay period.
5. The server of claim 1, wherein rate-limiting the task for the workload further comprises: rate-limiting a processing rate of the task within the performance guarantee.
6. The server of claim 1, further comprising: responsive to receiving the performance capability alert from the downstream server, propagating the performance capability alert to an upstream server.
7. The server of claim 1, wherein:
the server further comprises an intake server that accepts the workload from a client into the process path; and
rate-limiting the task for the workload further comprises: refusing to accept the workload into the process path.
8. The server of claim 1, wherein the performance guarantee involves a performance capability selected from a performance capability set comprising:
a processing latency of the process path for the workload;
a processing throughput of the process path for the workload;
a processing consistency of the process path for the workload; and
a processing replication of the workload within the server set.
9. A method of configuring a server of a server set to participate in workloads, the method comprising:
executing, by a processor of the server, instructions that cause the server to:
receive a workload from an upstream server of a process path, wherein the workload is associated with a performance guarantee;
perform a task on the workload;
identify a performance capability of the server;
compare the performance capability with the performance guarantee of the workload;
responsive to determining that the performance capability risks failing the performance guarantee, transmit a performance capability alert to the upstream server; and
responsive to receiving a performance capability alert from a downstream server of the process path, rate-limit the task performed on the workload to reduce the computational load of the downstream server.
10. The method of claim 9, wherein identifying the performance capability of the server further comprises: predicting the performance capability of the server performing the workload.
11. The method of claim 9, wherein identifying the performance capability of the server further comprises: measuring the performance capability of the server while performing the workload.
12. The method of claim 9, wherein:
the performance guarantee further comprises a set of performance guarantees for at least two performance capabilities and a priority order; and
comparing the performance capability with the performance guarantee further comprises: comparing the performance capabilities according to the priority order.
13. The method of claim 9, wherein executing the instructions further causes the server to, responsive to an alleviation of the performance capability alert, reduce the rate-limiting of the task for the workload.
14. A method of configuring a server set to perform a workload according to a performance guarantee, the method comprising:
configuring a server of a process path of the server set to process a workload by:
performing a task on the workload according to the performance guarantee;
receiving a performance capability alert from a downstream server of the process path, wherein the performance capability alert indicates that a computational load of the downstream server risks failing the performance guarantee for the workload;
rate-limiting the task of the server to reduce the workload delivered to the downstream server; and
after performing the task, delivering the workload to the downstream server.
15. The method of claim 14, further comprising: responsive to the performance capability alert, redirecting the process path through the server set to provide a substitute server for the downstream server.
16. The method of claim 14, wherein:
the downstream server is further processing a second task for a second workload according to a second process path; and
the method further comprises: redirecting the second process path for the second workload to provide a substitute server for the downstream server.
17. The method of claim 14, further comprising, responsive to the performance capability alert:
identifying a second workload of the process path for which the server set is satisfying a second performance guarantee; and
increasing a processing priority of the workload relative to the second workload.
18. The method of claim 14, wherein:
the downstream server is processing the workload using a computational resource set; and
the method further comprises: responsive to the performance capability alert, expanding the computational resource set of the downstream server.
19. The method of claim 18, wherein expanding the computational resource set of the downstream server further comprises:
selecting an auxiliary server to supplement the downstream server; and
sharing the workload between the downstream server and the auxiliary server.
20. The method of claim 14, further comprising: responsive to the performance capability alert, transferring a computational task from the downstream server to a substitute server.
US15/991,953 2018-05-07 2018-05-29 Adaptive resource-governed services for performance-compliant distributed workloads Abandoned US20190342380A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/991,953 US20190342380A1 (en) 2018-05-07 2018-05-29 Adaptive resource-governed services for performance-compliant distributed workloads

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862668226P 2018-05-07 2018-05-07
US15/991,953 US20190342380A1 (en) 2018-05-07 2018-05-29 Adaptive resource-governed services for performance-compliant distributed workloads

Publications (1)

Publication Number Publication Date
US20190342380A1 true US20190342380A1 (en) 2019-11-07

Family

ID=68383805

Family Applications (10)

Application Number Title Priority Date Filing Date
US15/991,953 Abandoned US20190342380A1 (en) 2018-05-07 2018-05-29 Adaptive resource-governed services for performance-compliant distributed workloads
US15/991,632 Active 2038-07-14 US10970269B2 (en) 2018-05-07 2018-05-29 Intermediate consistency levels for database configuration
US15/991,786 Active 2039-03-09 US11030185B2 (en) 2018-05-07 2018-05-29 Schema-agnostic indexing of distributed databases
US15/991,062 Active 2038-10-17 US10817506B2 (en) 2018-05-07 2018-05-29 Data service provisioning, metering, and load-balancing via service units
US15/991,223 Active 2038-12-08 US10885018B2 (en) 2018-05-07 2018-05-29 Containerization for elastic and scalable databases
US15/991,880 Active 2038-12-25 US10970270B2 (en) 2018-05-07 2018-05-29 Unified data organization for multi-model distributed databases
US16/207,170 Active 2040-07-06 US11321303B2 (en) 2018-05-07 2018-12-02 Conflict resolution for multi-master distributed databases
US16/207,176 Active 2039-12-15 US11379461B2 (en) 2018-05-07 2018-12-02 Multi-master architectures for distributed databases
US16/209,647 Active 2040-06-30 US11397721B2 (en) 2018-05-07 2018-12-04 Merging conflict resolution for multi-master distributed databases
US17/855,306 Pending US20220335034A1 (en) 2018-05-07 2022-06-30 Multi-master architectures for distributed databases

Family Applications After (9)

Application Number Title Priority Date Filing Date
US15/991,632 Active 2038-07-14 US10970269B2 (en) 2018-05-07 2018-05-29 Intermediate consistency levels for database configuration
US15/991,786 Active 2039-03-09 US11030185B2 (en) 2018-05-07 2018-05-29 Schema-agnostic indexing of distributed databases
US15/991,062 Active 2038-10-17 US10817506B2 (en) 2018-05-07 2018-05-29 Data service provisioning, metering, and load-balancing via service units
US15/991,223 Active 2038-12-08 US10885018B2 (en) 2018-05-07 2018-05-29 Containerization for elastic and scalable databases
US15/991,880 Active 2038-12-25 US10970270B2 (en) 2018-05-07 2018-05-29 Unified data organization for multi-model distributed databases
US16/207,170 Active 2040-07-06 US11321303B2 (en) 2018-05-07 2018-12-02 Conflict resolution for multi-master distributed databases
US16/207,176 Active 2039-12-15 US11379461B2 (en) 2018-05-07 2018-12-02 Multi-master architectures for distributed databases
US16/209,647 Active 2040-06-30 US11397721B2 (en) 2018-05-07 2018-12-04 Merging conflict resolution for multi-master distributed databases
US17/855,306 Pending US20220335034A1 (en) 2018-05-07 2022-06-30 Multi-master architectures for distributed databases

Country Status (3)

Country Link
US (10) US20190342380A1 (en)
EP (3) EP3791285A1 (en)
WO (3) WO2019217479A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190044860A1 (en) * 2018-06-18 2019-02-07 Intel Corporation Technologies for providing adaptive polling of packet queues
US20200167189A1 (en) * 2018-11-28 2020-05-28 International Business Machines Corporation Elastic load balancing prioritization
US10817506B2 (en) 2018-05-07 2020-10-27 Microsoft Technology Licensing, Llc Data service provisioning, metering, and load-balancing via service units
US10983836B2 (en) * 2018-08-13 2021-04-20 International Business Machines Corporation Transaction optimization during periods of peak activity
US20210173942A1 (en) * 2018-07-11 2021-06-10 Green Market Square Limited Data privacy awareness in workload provisioning
US11089081B1 (en) * 2018-09-26 2021-08-10 Amazon Technologies, Inc. Inter-process rendering pipeline for shared process remote web content rendering
US11138077B2 (en) * 2019-01-24 2021-10-05 Walmart Apollo, Llc System and method for bootstrapping replicas from active partitions
US20220138037A1 (en) * 2020-11-05 2022-05-05 International Business Machines Corporation Resource manager for transaction processing systems
US11334390B2 (en) * 2019-06-28 2022-05-17 Dell Products L.P. Hyper-converged infrastructure (HCI) resource reservation system
US11899670B1 (en) 2022-01-06 2024-02-13 Splunk Inc. Generation of queries for execution at a separate system

Families Citing this family (75)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018119417A1 (en) * 2016-12-22 2018-06-28 Nissan North America, Inc. Autonomous vehicle service system
US10970302B2 (en) 2017-06-22 2021-04-06 Adobe Inc. Component-based synchronization of digital assets
US11635908B2 (en) 2017-06-22 2023-04-25 Adobe Inc. Managing digital assets stored as components and packaged files
US10595363B2 (en) * 2018-05-11 2020-03-17 At&T Intellectual Property I, L.P. Autonomous topology management for wireless radio user equipment
US11226854B2 (en) * 2018-06-28 2022-01-18 Atlassian Pty Ltd. Automatic integration of multiple graph data structures
US10901864B2 (en) * 2018-07-03 2021-01-26 Pivotal Software, Inc. Light-weight mirror container
US10824512B2 (en) * 2018-07-31 2020-11-03 EMC IP Holding Company LLC Managing journaling resources with copies stored in multiple locations
US20220101619A1 (en) * 2018-08-10 2022-03-31 Nvidia Corporation Cloud-centric platform for collaboration and connectivity on 3d virtual environments
CN109325201A (en) * 2018-08-15 2019-02-12 北京百度网讯科技有限公司 Generation method, device, equipment and the storage medium of entity relationship data
US11303726B2 (en) * 2018-08-24 2022-04-12 Yahoo Assets Llc Method and system for detecting and preventing abuse of an application interface
CN110896404B (en) * 2018-09-12 2021-09-14 华为技术有限公司 Data processing method and device and computing node
US11321012B2 (en) * 2018-10-12 2022-05-03 Adobe Inc. Conflict resolution within synchronized composite-part-based digital assets
US11030242B1 (en) * 2018-10-15 2021-06-08 Rockset, Inc. Indexing and querying semi-structured documents using a key-value store
US10963353B2 (en) * 2018-10-23 2021-03-30 Capital One Services, Llc Systems and methods for cross-regional back up of distributed databases on a cloud service
US11204940B2 (en) * 2018-11-16 2021-12-21 International Business Machines Corporation Data replication conflict processing after structural changes to a database
US11321200B1 (en) * 2019-01-21 2022-05-03 Harmonic, Inc. High availability and software upgrades in a virtual cable modem termination system
US11775402B2 (en) 2019-01-21 2023-10-03 Harmonic, Inc. High availability and software upgrades in network software
US11132270B2 (en) * 2019-05-13 2021-09-28 Saudi Arabian Oil Company Planned zero downtime server switching for web applications
EP3745761A1 (en) * 2019-05-28 2020-12-02 Samsung Electronics Co., Ltd. Virtualization of ran functions based on load of the base stations
US11567923B2 (en) * 2019-06-05 2023-01-31 Oracle International Corporation Application driven data change conflict handling system
US11966788B2 (en) * 2019-06-12 2024-04-23 Snyk Limited Predictive autoscaling and resource optimization
US20210064614A1 (en) * 2019-08-30 2021-03-04 Oracle International Corporation Database environments for guest languages
US11924060B2 (en) * 2019-09-13 2024-03-05 Intel Corporation Multi-access edge computing (MEC) service contract formation and workload execution
US11556821B2 (en) * 2019-09-17 2023-01-17 International Business Machines Corporation Intelligent framework updater to incorporate framework changes into data analysis models
US11620171B2 (en) * 2019-09-27 2023-04-04 Atlassian Pty Ltd. Systems and methods for generating schema notifications
US11003436B2 (en) * 2019-10-15 2021-05-11 Dell Products L.P. Composable infrastructure update system
US11556512B2 (en) * 2019-11-01 2023-01-17 Palantir Technologies Inc. Systems and methods for artifact peering within a multi-master collaborative environment
US11645265B2 (en) 2019-11-04 2023-05-09 Oracle International Corporation Model for handling object-level database transactions in scalable computing applications
US11567925B2 (en) * 2019-11-07 2023-01-31 International Business Machines Corporation Concurrent update management
KR102172607B1 (en) 2019-11-15 2020-11-02 한국전자기술연구원 Method for balanced scale-out of resource on distributed and collaborative container platform environment
US11442960B2 (en) * 2019-12-17 2022-09-13 Verizon Patent And Licensing Inc. Edge key value store for a distributed platform
US11537440B2 (en) * 2019-12-19 2022-12-27 Hewlett Packard Enterprise Development Lp Infrastructure adaptive consistency level mechanism
US11238037B2 (en) * 2020-01-06 2022-02-01 International Business Machines Corporation Data segment-based indexing
US11327962B1 (en) * 2020-01-23 2022-05-10 Rockset, Inc. Real-time analytical database system for querying data of transactional systems
US11546420B2 (en) 2020-02-24 2023-01-03 Netapp, Inc. Quality of service (QoS) settings of volumes in a distributed storage system
WO2022015773A1 (en) * 2020-07-13 2022-01-20 Journey Mobile, Inc. Synchronization of source code under development in multiple concurrent instances of an integrated development environment
JP7458259B2 (en) * 2020-07-15 2024-03-29 株式会社日立製作所 Data management device and data management method
US11409726B2 (en) * 2020-07-20 2022-08-09 Home Depot Product Authority, Llc Methods and system for concurrent updates of a customer order
US11561672B1 (en) * 2020-07-24 2023-01-24 Tableau Software, LLC Compatibility-based feature management for data prep applications
US11645119B2 (en) * 2020-07-28 2023-05-09 Optum Services (Ireland) Limited Dynamic allocation of resources in surge demand
CN114064262A (en) * 2020-08-07 2022-02-18 伊姆西Ip控股有限责任公司 Method, apparatus and program product for managing computing resources in a storage system
US11470037B2 (en) 2020-09-09 2022-10-11 Self Financial, Inc. Navigation pathway generation
US11475010B2 (en) * 2020-09-09 2022-10-18 Self Financial, Inc. Asynchronous database caching
US11641665B2 (en) 2020-09-09 2023-05-02 Self Financial, Inc. Resource utilization retrieval and modification
US20220075877A1 (en) 2020-09-09 2022-03-10 Self Financial, Inc. Interface and system for updating isolated repositories
US11436212B2 (en) * 2020-09-22 2022-09-06 Snowflake Inc. Concurrent transaction processing in a database system
US11468032B2 (en) 2020-09-22 2022-10-11 Snowflake Inc. Concurrent transaction processing in a database system
US11671484B2 (en) * 2020-09-25 2023-06-06 Verizon Patent And Licensing Inc. Methods and systems for orchestrating a distributed computing service based on latency performance levels
US11550800B1 (en) * 2020-09-30 2023-01-10 Amazon Technologies, Inc. Low latency query processing and data retrieval at the edge
US11809404B1 (en) * 2020-09-30 2023-11-07 Amazon Technologies, Inc. Mixed-mode replication for sharded database systems
US11700178B2 (en) 2020-10-30 2023-07-11 Nutanix, Inc. System and method for managing clusters in an edge network
US11153163B1 (en) 2020-10-30 2021-10-19 Nutanix, Inc. Cloud-controlled configuration of edge processing units
US20220134222A1 (en) * 2020-11-03 2022-05-05 Nvidia Corporation Delta propagation in cloud-centric platforms for collaboration and connectivity
CN112463862A (en) * 2020-11-05 2021-03-09 深圳市和讯华谷信息技术有限公司 Data acquisition method and device based on configuration permission
US11595319B2 (en) * 2020-12-21 2023-02-28 Microsoft Technology Licensing, Llc Differential overbooking in a cloud computing environment
CN112632190A (en) * 2020-12-26 2021-04-09 中国农业银行股份有限公司 Data synchronization method and device
WO2022147124A1 (en) * 2020-12-30 2022-07-07 Level 3 Communications, Llc Multi- network management system and method
US11531653B2 (en) 2021-03-29 2022-12-20 PlanetScale, Inc. Database schema branching workflow, with support for data, keyspaces and VSchemas
US11868805B2 (en) * 2021-04-13 2024-01-09 Red Hat, Inc. Scheduling workloads on partitioned resources of a host system in a container-orchestration system
CN113296759B (en) * 2021-05-12 2023-11-28 广州博冠信息科技有限公司 User interface processing method, user interface processing system, device and storage medium
US11860860B2 (en) * 2021-07-09 2024-01-02 Cockroach Labs, Inc. Methods and systems for non-blocking transactions
US11741134B2 (en) * 2021-09-07 2023-08-29 Oracle International Corporation Conversion and migration of key-value store to relational model
US11537599B1 (en) * 2021-09-30 2022-12-27 Bmc Software, Inc. Fast database loading with time-stamped records
US11663189B1 (en) 2021-12-01 2023-05-30 Oracle International Corporation Generating relational table structures from NoSQL datastore and migrating data
US20230185941A1 (en) * 2021-12-14 2023-06-15 International Business Machines Corporation Multi-partitioned global data system
US20230259505A1 (en) * 2022-01-26 2023-08-17 Oracle International Corporation Future transaction processing
US11522948B1 (en) * 2022-02-04 2022-12-06 International Business Machines Corporation Dynamic handling of service mesh loads using sliced replicas and cloud functions
US11765065B1 (en) 2022-03-23 2023-09-19 Nutanix, Inc. System and method for scalable telemetry
US11663096B1 (en) * 2022-03-30 2023-05-30 Dell Products L.P. Managing storage domains, service tiers and failed storage domain
US20230315422A1 (en) * 2022-03-30 2023-10-05 Confluent, Inc. Automated upgrade in distributed computing environments
US20230342342A1 (en) * 2022-04-26 2023-10-26 Meta Platforms, Inc. Methods, Apparatuses and Computer Program Products for Stable Identifier Assignment for Evolving Data Structures
US20240045602A1 (en) * 2022-08-03 2024-02-08 Capital One Services, Llc Systems and methods for adaptive data partitioning within cluster systems
CN116028434B (en) * 2023-03-23 2023-07-07 中科星图测控技术股份有限公司 File coding method and system for describing space analysis scene
US11768834B1 (en) * 2023-05-03 2023-09-26 Newday Database Technology, Inc. Storing and querying general data type documents in SQL relational databases
CN116720490A (en) * 2023-08-11 2023-09-08 北京久其金建科技有限公司 Data importing method and device

Family Cites Families (128)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5446880A (en) 1992-08-31 1995-08-29 At&T Corp. Database communication system that provides automatic format translation and transmission of records when the owner identified for the record is changed
US5581753A (en) 1994-09-28 1996-12-03 Xerox Corporation Method for providing session consistency guarantees
US5581754A (en) 1994-12-07 1996-12-03 Xerox Corporation Methodology for managing weakly consistent replicated databases
US5806074A (en) 1996-03-19 1998-09-08 Oracle Corporation Configurable conflict resolution in a computer implemented distributed database
US5787262A (en) 1996-06-26 1998-07-28 Microsoft Corporation System and method for distributed conflict resolution between data objects replicated across a computer network
US5923850A (en) 1996-06-28 1999-07-13 Sun Microsystems, Inc. Historical asset information data storage schema
US6233584B1 (en) 1997-09-09 2001-05-15 International Business Machines Corporation Technique for providing a universal query for multiple different databases
US20030046396A1 (en) 2000-03-03 2003-03-06 Richter Roger K. Systems and methods for managing resource utilization in information management environments
US6523032B1 (en) 2000-05-12 2003-02-18 Oracle Corporation Servicing database requests using read-only database servers coupled to a master database server
US7237034B2 (en) 2000-09-18 2007-06-26 Openwave Systems Inc. Method and apparatus for controlling network traffic
US20020161757A1 (en) 2001-03-16 2002-10-31 Jeffrey Mock Simultaneous searching across multiple data sets
US6925457B2 (en) 2001-07-27 2005-08-02 Metatomix, Inc. Methods and apparatus for querying a relational data store using schema-less queries
US6889338B2 (en) 2001-08-15 2005-05-03 Nortel Networks Limited Electing a master server using election periodic timer in fault-tolerant distributed dynamic network systems
US7269648B1 (en) 2001-09-27 2007-09-11 Emc Corporation Resolving multiple master node conflict in a DDB
US20030135643A1 (en) 2002-01-11 2003-07-17 Chaucer Chiu Data transmission scheduling system and method
US20030220966A1 (en) 2002-05-24 2003-11-27 International Business Machines Corporation System and method for dynamic content dependent conflict resolution
US7774473B2 (en) 2002-07-31 2010-08-10 Oracle America, Inc. System and method for sticky routing of requests within a server farm
US9460129B2 (en) 2013-10-01 2016-10-04 Vmware, Inc. Method for tracking a schema in a schema-less database
US7117221B2 (en) 2003-01-09 2006-10-03 International Business Machines Corporation Replication of changed information in a multi-master environment
US20040230571A1 (en) 2003-04-22 2004-11-18 Gavin Robertson Index and query processor for data and information retrieval, integration and sharing from multiple disparate data sources
US7406499B2 (en) 2003-05-09 2008-07-29 Microsoft Corporation Architecture for partition computation and propagation of changes in data replication
GB2402297B (en) * 2003-05-15 2005-08-17 Sun Microsystems Inc Update dependency control for multi-master replication
US7203711B2 (en) 2003-05-22 2007-04-10 Einstein's Elephant, Inc. Systems and methods for distributed content storage and management
US7483923B2 (en) 2003-08-21 2009-01-27 Microsoft Corporation Systems and methods for providing relational and hierarchical synchronization services for units of information manageable by a hardware/software interface system
US8145731B2 (en) 2003-12-17 2012-03-27 Hewlett-Packard Development Company, L.P. System and method for determining how many servers of at least one server configuration to be included at a service provider's site for supporting an expected workload
US7693991B2 (en) 2004-01-16 2010-04-06 International Business Machines Corporation Virtual clustering and load balancing servers
US7613703B2 (en) 2004-09-30 2009-11-03 Microsoft Corporation Organizing resources into collections to facilitate more efficient and reliable resource access
US20060106879A1 (en) 2004-11-16 2006-05-18 International Business Machines Corporation Conflict resolution in a synchronization framework
US8275804B2 (en) 2004-12-15 2012-09-25 Applied Minds, Llc Distributed data store with a designated master to ensure consistency
US7509354B2 (en) 2005-01-07 2009-03-24 International Business Machines Corporation System, method, and computer program product for multi-master replication conflict resolution
US7689599B1 (en) 2005-01-31 2010-03-30 Symantec Operating Corporation Repair of inconsistencies between data and metadata stored on a temporal volume using transaction log replay
US20060224773A1 (en) 2005-03-31 2006-10-05 International Business Machines Corporation Systems and methods for content-aware load balancing
US8072901B1 (en) 2005-05-09 2011-12-06 Cisco Technology, Inc. Technique for efficient probing to verify policy conformance
US7788668B2 (en) 2005-06-09 2010-08-31 Lockheed Martin Corporation System and method for implementing distributed priority inheritance
US8943180B1 (en) 2005-07-29 2015-01-27 8X8, Inc. Server-based service configuration system and approach
US20070073675A1 (en) 2005-09-24 2007-03-29 International Business Machines Corporation Database query translation
US7529780B1 (en) 2005-12-30 2009-05-05 Google Inc. Conflict management during data object synchronization between client and server
US7606838B2 (en) 2006-02-22 2009-10-20 Microsoft Corporation Distributed conflict resolution for replicated databases
US9219686B2 (en) 2006-03-31 2015-12-22 Alcatel Lucent Network load balancing and overload control
US8571882B1 (en) 2006-07-05 2013-10-29 Ronald J. Teitelbaum Peer to peer database
US7844608B2 (en) * 2006-12-15 2010-11-30 Yahoo! Inc. Clustered query support for a database query engine
US7620659B2 (en) 2007-02-09 2009-11-17 Microsoft Corporation Efficient knowledge representation in data synchronization systems
US7877644B2 (en) 2007-04-19 2011-01-25 International Business Machines Corporation Computer application performance optimization system
US20080301025A1 (en) 2007-05-31 2008-12-04 Boss Gregory J Application of brokering methods to availability characteristics
US20090248737A1 (en) 2008-03-27 2009-10-01 Microsoft Corporation Computing environment representation
US8392482B1 (en) * 2008-03-31 2013-03-05 Amazon Technologies, Inc. Versioning of database partition maps
US8745127B2 (en) 2008-05-13 2014-06-03 Microsoft Corporation Blending single-master and multi-master data synchronization techniques
JP4612715B2 (en) 2008-09-05 2011-01-12 株式会社日立製作所 Information processing system, data update method, and data update program
US8239389B2 (en) 2008-09-29 2012-08-07 International Business Machines Corporation Persisting external index data in a database
US20100094838A1 (en) 2008-10-10 2010-04-15 Ants Software Inc. Compatibility Server for Database Rehosting
US9996572B2 (en) 2008-10-24 2018-06-12 Microsoft Technology Licensing, Llc Partition management in a partitioned, scalable, and available structured storage
US8326807B2 (en) 2009-01-23 2012-12-04 Hewlett-Packard Development Company, L.P. Methods of measuring consistability of a distributed storage system
US9888067B1 (en) 2014-11-10 2018-02-06 Turbonomic, Inc. Managing resources in container systems
US8473543B2 (en) 2009-07-06 2013-06-25 Microsoft Corporation Automatic conflict resolution when synchronizing data objects between two or more devices
US8369211B2 (en) 2009-12-17 2013-02-05 Juniper Networks, Inc. Network distribution prevention when virtual chassis system undergoes splits and merges
US8572022B2 (en) 2010-03-02 2013-10-29 Microsoft Corporation Automatic synchronization conflict resolution
US9141580B2 (en) 2010-03-23 2015-09-22 Citrix Systems, Inc. Systems and methods for monitoring and maintaining consistency of a configuration
US9454441B2 (en) * 2010-04-19 2016-09-27 Microsoft Technology Licensing, Llc Data layout for recovery and durability
US8386421B2 (en) 2010-06-28 2013-02-26 Microsoft Corporation Concurrency control for confluent trees
US8694639B1 (en) 2010-09-21 2014-04-08 Amazon Technologies, Inc. Determining maximum amount of resource allowed to be allocated to client in distributed system
US8824286B2 (en) 2010-10-29 2014-09-02 Futurewei Technologies, Inc. Network aware global load balancing system and method
US20120136839A1 (en) 2010-11-30 2012-05-31 Peter Eberlein User-Driven Conflict Resolution Of Concurrent Updates In Snapshot Isolation
US8880508B2 (en) 2010-12-30 2014-11-04 Sap Se Processing database queries using format conversion
US20120185444A1 (en) * 2011-01-14 2012-07-19 Sparkes Andrew Clock Monitoring in a Data-Retention Storage System
US9026493B1 (en) 2011-02-28 2015-05-05 Google Inc. Multi-master RDBMS improvements for distributed computing environment
US8595267B2 (en) 2011-06-27 2013-11-26 Amazon Technologies, Inc. System and method for implementing a scalable data storage service
US10708148B2 (en) 2011-09-12 2020-07-07 Microsoft Technology Licensing, Llc Activity- and dependency-based service quality monitoring
US8862588B1 (en) 2011-11-30 2014-10-14 Google Inc. Generating an empirically-determined schema for a schemaless database
CN102497410B (en) 2011-12-08 2014-08-27 Dawning Information Industry (Beijing) Co., Ltd. Method for dynamically partitioning computing resources of a cloud computing system
US20130159253A1 (en) 2011-12-15 2013-06-20 Sybase, Inc. Directing a data replication environment through policy declaration
US8930312B1 (en) 2012-01-17 2015-01-06 Amazon Technologies, Inc. System and method for splitting a replicated data partition
US8782004B2 (en) * 2012-01-23 2014-07-15 Palantir Technologies, Inc. Cross-ACL multi-master replication
US9146810B2 (en) 2012-01-31 2015-09-29 Cleversafe, Inc. Identifying a potentially compromised encoded data slice
US9356793B1 (en) 2012-02-09 2016-05-31 Google Inc. System and method for managing load on a downstream server in a distributed storage system
US9171031B2 (en) 2012-03-02 2015-10-27 Cleversafe, Inc. Merging index nodes of a hierarchical dispersed storage index
CN104255011B (en) 2012-03-09 2017-12-08 Empire Technology Development LLC Secure data storage for cloud computing
US9195725B2 (en) 2012-07-23 2015-11-24 International Business Machines Corporation Resolving database integration conflicts using data provenance
US9292566B2 (en) 2012-07-30 2016-03-22 Hewlett Packard Enterprise Development Lp Providing a measure representing an instantaneous data consistency level
US9632828B1 (en) 2012-09-24 2017-04-25 Amazon Technologies, Inc. Computing and tracking client staleness using transaction responses
US9405474B2 (en) 2012-10-03 2016-08-02 Microsoft Technology Licensing, Llc Configurable and tunable data store tradeoffs
US8972491B2 (en) 2012-10-05 2015-03-03 Microsoft Technology Licensing, Llc Consistency-based service-level agreements in cloud storage environments
US20140101298A1 (en) 2012-10-05 2014-04-10 Microsoft Corporation Service level agreements for a configurable distributed storage system
US9189531B2 (en) 2012-11-30 2015-11-17 Orbis Technologies, Inc. Ontology harmonization and mediation systems and methods
US20140195514A1 (en) 2013-01-09 2014-07-10 Dropbox, Inc. Unified interface for querying data in legacy databases and current databases
US9230040B2 (en) 2013-03-14 2016-01-05 Microsoft Technology Licensing, Llc Scalable, schemaless document query model
US9712608B2 (en) 2013-03-14 2017-07-18 Microsoft Technology Licensing, Llc Elastically scalable document-oriented storage services
US10417284B2 (en) 2013-03-14 2019-09-17 Microsoft Technology Licensing, Llc Available, scalable, and tunable document-oriented storage services
US9858052B2 (en) 2013-03-21 2018-01-02 Razer (Asia-Pacific) Pte. Ltd. Decentralized operating system
US10075523B2 (en) * 2013-04-01 2018-09-11 International Business Machines Corporation Efficient storage of data in a dispersed storage network
US9596245B2 (en) * 2013-04-04 2017-03-14 Owl Computing Technologies, Inc. Secure one-way interface for a network device
US9424132B2 (en) * 2013-05-30 2016-08-23 International Business Machines Corporation Adjusting dispersed storage network traffic due to rebuilding
US9053167B1 (en) * 2013-06-19 2015-06-09 Amazon Technologies, Inc. Storage device selection for database partition replicas
CN104298690B (en) 2013-07-19 2017-12-29 International Business Machines Corporation Method and apparatus for establishing an index structure for a relational database table and querying it
US9569513B1 (en) 2013-09-10 2017-02-14 Amazon Technologies, Inc. Conditional master election in distributed databases
US9471711B2 (en) 2013-09-23 2016-10-18 Teradata Us, Inc. Schema-less access to stored data
US20150195162A1 (en) 2014-01-06 2015-07-09 Google Inc. Multi-Master Selection in a Software Defined Network
US20150199134A1 (en) 2014-01-10 2015-07-16 Qualcomm Incorporated System and method for resolving dram page conflicts based on memory access patterns
US9578620B2 (en) 2014-04-22 2017-02-21 Comcast Cable Communications, Llc Mapping and bridging wireless networks to provide better service
US20170199770A1 (en) 2014-06-23 2017-07-13 Getclouder Ltd. Cloud hosting systems featuring scaling and load balancing with containers
US9779073B2 (en) 2014-07-29 2017-10-03 Microsoft Technology Licensing, Llc Digital document change conflict resolution
US20160179840A1 (en) * 2014-12-17 2016-06-23 Openwave Mobility Inc. Cloud bursting a database
US9462427B2 (en) 2015-01-14 2016-10-04 Kodiak Networks, Inc. System and method for elastic scaling using a container-based platform
US9984140B1 (en) * 2015-02-05 2018-05-29 Amazon Technologies, Inc. Lease based leader election system
US10410155B2 (en) 2015-05-01 2019-09-10 Microsoft Technology Licensing, Llc Automatic demand-driven resource scaling for relational database-as-a-service
US10073899B2 (en) 2015-05-18 2018-09-11 Oracle International Corporation Efficient storage using automatic data translation
US9781124B2 (en) 2015-06-11 2017-10-03 International Business Machines Corporation Container-based system administration
CN104935672B (en) 2015-06-29 2018-05-11 New H3C Technologies Co., Ltd. Method and device for implementing high availability of a load balancing service
US9619261B2 (en) 2015-06-29 2017-04-11 Vmware, Inc. Method and system for anticipating demand for a computational resource by containers running above guest operating systems within a distributed, virtualized computer system
US10169147B2 (en) 2015-10-30 2019-01-01 International Business Machines Corporation End-to-end secure data storage in a dispersed storage network
US10367914B2 (en) 2016-01-12 2019-07-30 Cisco Technology, Inc. Attaching service level agreements to application containers and enabling service assurance
US10235431B2 (en) 2016-01-29 2019-03-19 Splunk Inc. Optimizing index file sizes based on indexed data storage conditions
US9940175B2 (en) 2016-03-31 2018-04-10 International Business Machines Corporation Joint network and task scheduling
WO2017173828A1 (en) 2016-04-06 2017-10-12 Huawei Technologies Co., Ltd. System and method for multi-master synchronous replication optimization
US20170293540A1 (en) 2016-04-08 2017-10-12 Facebook, Inc. Failover of application services
US10176241B2 (en) 2016-04-26 2019-01-08 Servicenow, Inc. Identification and reconciliation of network resource information
US10768920B2 (en) 2016-06-15 2020-09-08 Microsoft Technology Licensing, Llc Update coordination in a multi-tenant cloud computing environment
US10521311B1 (en) * 2016-06-30 2019-12-31 Amazon Technologies, Inc. Prioritized leadership for data replication groups
EP3270536B1 (en) 2016-07-14 2019-03-06 Huawei Technologies Co., Ltd. Sdn controller and method for task scheduling, resource provisioning and service providing
US10552443B1 (en) 2016-08-11 2020-02-04 MuleSoft, Inc. Schemaless to relational representation conversion
CN106385329B (en) 2016-08-31 2019-11-26 Huawei Digital Technologies (Chengdu) Co., Ltd. Resource pool processing method, apparatus, and device
US20180150331A1 (en) 2016-11-30 2018-05-31 International Business Machines Corporation Computing resource estimation in response to restarting a set of logical partitions
US11526533B2 (en) 2016-12-30 2022-12-13 Dropbox, Inc. Version history management
US10244048B2 (en) 2017-04-28 2019-03-26 International Business Machines Corporation Sender system status-aware load balancing
US10372436B2 (en) * 2017-08-10 2019-08-06 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Systems and methods for maintaining operating consistency for multiple users during firmware updates
US10698629B2 (en) 2017-11-28 2020-06-30 Facebook, Inc. Systems and methods for locality management
US11429581B2 (en) * 2017-12-01 2022-08-30 International Business Machines Corporation Spatial-temporal query for cognitive IoT contexts
US20190342380A1 (en) 2018-05-07 2019-11-07 Microsoft Technology Licensing, Llc Adaptive resource-governed services for performance-compliant distributed workloads
US10795913B2 (en) * 2018-10-11 2020-10-06 Capital One Services, Llc Synching and reading arrangements for multi-regional active/active databases

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11379461B2 (en) 2018-05-07 2022-07-05 Microsoft Technology Licensing, Llc Multi-master architectures for distributed databases
US11397721B2 (en) 2018-05-07 2022-07-26 Microsoft Technology Licensing, Llc Merging conflict resolution for multi-master distributed databases
US10885018B2 (en) 2018-05-07 2021-01-05 Microsoft Technology Licensing, Llc Containerization for elastic and scalable databases
US11321303B2 (en) 2018-05-07 2022-05-03 Microsoft Technology Licensing, Llc Conflict resolution for multi-master distributed databases
US10817506B2 (en) 2018-05-07 2020-10-27 Microsoft Technology Licensing, Llc Data service provisioning, metering, and load-balancing via service units
US10970270B2 (en) 2018-05-07 2021-04-06 Microsoft Technology Licensing, Llc Unified data organization for multi-model distributed databases
US10970269B2 (en) 2018-05-07 2021-04-06 Microsoft Technology Licensing, Llc Intermediate consistency levels for database configuration
US11030185B2 (en) 2018-05-07 2021-06-08 Microsoft Technology Licensing, Llc Schema-agnostic indexing of distributed databases
US20190044860A1 (en) * 2018-06-18 2019-02-07 Intel Corporation Technologies for providing adaptive polling of packet queues
US20210173942A1 (en) * 2018-07-11 2021-06-10 Green Market Square Limited Data privacy awareness in workload provisioning
US11610002B2 (en) * 2018-07-11 2023-03-21 Green Market Square Limited Data privacy awareness in workload provisioning
US10983836B2 (en) * 2018-08-13 2021-04-20 International Business Machines Corporation Transaction optimization during periods of peak activity
US11089081B1 (en) * 2018-09-26 2021-08-10 Amazon Technologies, Inc. Inter-process rendering pipeline for shared process remote web content rendering
US10942769B2 (en) * 2018-11-28 2021-03-09 International Business Machines Corporation Elastic load balancing prioritization
US20200167189A1 (en) * 2018-11-28 2020-05-28 International Business Machines Corporation Elastic load balancing prioritization
US11138077B2 (en) * 2019-01-24 2021-10-05 Walmart Apollo, Llc System and method for bootstrapping replicas from active partitions
US11334390B2 (en) * 2019-06-28 2022-05-17 Dell Products L.P. Hyper-converged infrastructure (HCI) resource reservation system
US20220138037A1 (en) * 2020-11-05 2022-05-05 International Business Machines Corporation Resource manager for transaction processing systems
US11645130B2 (en) * 2020-11-05 2023-05-09 International Business Machines Corporation Resource manager for transaction processing systems
US11899670B1 (en) 2022-01-06 2024-02-13 Splunk Inc. Generation of queries for execution at a separate system
US11947528B1 (en) * 2022-01-06 2024-04-02 Splunk Inc. Automatic generation of queries using non-textual input

Also Published As

Publication number Publication date
WO2019217482A1 (en) 2019-11-14
EP3791285A1 (en) 2021-03-17
US11397721B2 (en) 2022-07-26
EP3791276A1 (en) 2021-03-17
US10885018B2 (en) 2021-01-05
US20190342188A1 (en) 2019-11-07
US20190340167A1 (en) 2019-11-07
US20220335034A1 (en) 2022-10-20
EP3791284A1 (en) 2021-03-17
US20190340168A1 (en) 2019-11-07
US20190342379A1 (en) 2019-11-07
US11379461B2 (en) 2022-07-05
US20190340273A1 (en) 2019-11-07
US11030185B2 (en) 2021-06-08
US20190340265A1 (en) 2019-11-07
US10817506B2 (en) 2020-10-27
WO2019217481A1 (en) 2019-11-14
US11321303B2 (en) 2022-05-03
US10970269B2 (en) 2021-04-06
US20190340166A1 (en) 2019-11-07
US10970270B2 (en) 2021-04-06
WO2019217479A1 (en) 2019-11-14
US20190340291A1 (en) 2019-11-07

Similar Documents

Publication Publication Date Title
US20190342380A1 (en) Adaptive resource-governed services for performance-compliant distributed workloads
US11762693B1 (en) Dynamically modifying program execution capacity
JP7189997B2 (en) Rolling resource credits for scheduling virtual computer resources
US10810052B2 (en) Methods and systems to proactively manage usage of computational resources of a distributed computing system
US10360083B2 (en) Attributing causality to program execution capacity modifications
US9471585B1 (en) Decentralized de-duplication techniques for largescale data streams
US20120109852A1 (en) Reactive load balancing for distributed systems
EP3671462B1 (en) System and method for consumption based tagging of resources
WO2018108001A1 (en) System and method to handle events using historical data in serverless systems
US10148531B1 (en) Partitioned performance: adaptive predicted impact
US10142195B1 (en) Partitioned performance tracking core resource consumption independently
US20180165693A1 (en) Methods and systems to determine correlated-extreme behavior consumers of data center resources
US9229839B2 (en) Implementing rate controls to limit timeout-based faults
Sousa et al. Predictive elastic replication for multi‐tenant databases in the cloud
US10033620B1 (en) Partitioned performance adaptive policies and leases
US10348814B1 (en) Efficient storage reclamation for system components managing storage
US20230244687A1 (en) Optimization of Virtual Warehouse Computing Resource Allocation
Ogden et al. Layercake: Efficient Inference Serving with Cloud and Mobile Resources
US10148588B1 (en) Partitioned performance: using resource account aggregates to throttle at the granular level
Thu Dynamic replication management scheme for effective cloud storage
US11972287B2 (en) Data transfer prioritization for services in a service chain
US20230244538A1 (en) Optimization of Virtual Warehouse Computing Resource Allocation
US20230125503A1 (en) Coordinated microservices
US20230124885A1 (en) Data transfer prioritization for services in a service chain
WO2023150039A1 (en) Optimization of virtual warehouse computing resource allocation

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AL-GHOSIEN, MOMIN MAHMOUD;BHOPI, RAJEEV SUDHAKAR;BOSHRA, SAMER;AND OTHERS;SIGNING DATES FROM 20180523 TO 20181013;REEL/FRAME:047950/0509

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION