CN111444183B - Distributed self-adaptive user request scheduling method in key value storage system - Google Patents

Info

Publication number
CN111444183B
CN111444183B CN202010217985.9A CN202010217985A CN111444183B CN 111444183 B CN111444183 B CN 111444183B CN 202010217985 A CN202010217985 A CN 202010217985A CN 111444183 B CN111444183 B CN 111444183B
Authority
CN
China
Prior art keywords
server
user request
key value
key
keyset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010217985.9A
Other languages
Chinese (zh)
Other versions
CN111444183A (en)
Inventor
蒋万春
严瑜龙
汲发
李昊阳
蒋铭
王建新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN202010217985.9A
Publication of CN111444183A
Application granted
Publication of CN111444183B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Abstract

The patent discloses a Distributed Adaptive Scheduling (DAS) method for a key-value storage system. The method deploys an improved 2-approximation sorting method at the client of the key-value storage system and a shortest-remaining-processing-time-first (SRPT) scheduling method at the server, combining the advantages of both. It introduces an information feedback mechanism to adapt to server performance and load that vary over time, and adds a timeout mechanism to alleviate starvation. In this way it schedules the service order of key-value access operations on the servers so as to reduce the average completion time of user requests, each of which corresponds to a large number of key-value access operations. Experimental results show that the method reduces the average completion time of user requests more than the existing First Come First Served (FCFS) method, the SRPT method, and the Rein-SBF method.

Description

Distributed self-adaptive user request scheduling method in key value storage system
Technical Field
The invention relates to user request scheduling in a key value storage system, in particular to a distributed self-adaptive user request scheduling method in the key value storage system.
Background
Currently, the key-value storage system is an important component of distributed applications such as web search, social networking, and e-commerce, and has been adopted by many well-known companies, such as Amazon, LinkedIn, and Facebook. Generally, a user request generates hundreds of key-value access operations on a client of the key-value storage system. These operations are sent to different servers and processed in parallel, so the completion time of the entire user request is determined by the key-value access operation that is served last. The completion time of a user request is also closely tied to user experience: long completion times degrade the experience, reduce website traffic, and ultimately affect a company's revenue. Reducing the completion time of user requests is therefore crucial.
As shown in fig. 1, as the number of users grows, many front-end servers are deployed in the key-value storage system to act as clients, which receive user requests and resolve them into a large number of key-value access operations; at the same time, many servers are used to store data and process key-value access operations. One characteristic of the key-value storage system is its all-to-all communication pattern: a client may receive many user requests and generate a large number of key-value access operations in a short time, and a server must process key-value access operations sent by many different clients. Traffic therefore tends to be bursty and concurrent, so the load on a server may vary over time. Moreover, load is not balanced across servers: so-called "hot spot" data exists on some servers, which are accessed frequently and are heavily loaded. In addition, a small number of clients may generate most of the key-value access operations. Finally, because background activity such as shared-resource contention and garbage collection also runs on the server, the service rate of the server, that is, its performance, may change over time. These factors make the completion times of key-value access operations differ greatly across servers, and in turn make the completion times of different user requests differ greatly. Arranging the service order of key-value access operations on the server with a scheduling method is therefore one effective means of reducing the average completion time of user requests.
However, each key-value access operation is lightweight and usually completes very quickly, so the time needed for clients and servers to coordinate is large relative to the operation itself, which makes it difficult to schedule key-value access operations centrally. On the other hand, because the network load is light and the link speed is high, the network delay is very small compared with the queuing time and service time of a key-value access operation and can generally be ignored.
Many studies have been devoted to improving the latency of key-value storage systems, but few address user request scheduling. For scheduling individual key-value access operations, some researchers argue that a suitable replica selection algorithm is critical to improving the tail latency of all key-value access operations, and therefore developed the adaptive replica selection algorithm C3. The algorithm adds information such as the server's queue length and service rate to the return value of each key-value access operation, and the client uses this feedback to guide replica selection, with good results. Later work such as TAP and On-Off improves on C3 and further reduces the tail latency of key-value access operations. However, these efforts focus on single key-value access operations and on replica selection, and are not designed for scheduling user requests that contain a large number of key-value access operations. Beyond replica selection for single operations, researchers have designed Rein, a scheduling method for multiget requests in key-value storage systems, which reduces the average latency and tail latency of all multigets. Like a user request, a multiget typically contains multiple key-value access operations that are processed in parallel on different servers. Rein first divides the key-value access operations in a multiget into several Opsets on the client according to their target servers, assigns priorities to the Opsets according to the sizes of all Opsets, and finally schedules the service order of Opsets on the server using the SBF (shortest bottleneck first) and SDS (slack-driven scheduling) algorithms, which significantly reduces the average and tail latency of multigets.
However, Rein does not take into account the varying load and performance of the servers over time, so the scheduling it makes may be suboptimal in some cases.
On the other hand, the problem of reducing the average completion time of user requests in a distributed key-value storage system is similar to the Coflow scheduling problem in data centers. For Coflow scheduling, papers such as Varys and Sincronia, published at the SIGCOMM conference, each propose a scheduling method that effectively reduces the average Coflow completion time. However, both Varys and Sincronia require flow information to be collected centrally for scheduling decisions. In a key-value storage system this would cause excessive time overhead and lengthen the completion time of key-value access operations, so these methods are not well suited.
In summary, in a key value storage system, a distributed user request scheduling algorithm capable of adapting to the load and performance of a server changing with time needs to be designed to reasonably arrange the service sequence of key value access operations at the server end, so as to achieve the goal of reducing the average completion time of all user requests.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a distributed adaptive user request scheduling method in a key-value storage system that can quickly and dynamically adjust the service order of key-value access operations, thereby reducing the average completion time of user requests, improving the response speed of the key-value storage system, and improving the user access experience.
To achieve this technical purpose, the invention provides a distributed adaptive user request scheduling method, DAS (Distributed Adaptive Scheduling), in a key-value storage system, which is completed cooperatively by the client and the server:
the working steps of the client comprise:
A1: Initialization: record the service rate of each server as the reciprocal of the time needed to process a single key-value access operation, and record the consumption rate as the service rate divided by the total number of clients; set to 0 the number OSK of key-value access operations that each server has received but not yet processed; let |M| denote the total number of servers that can provide service, and number the servers from 1 to |M| in increments of 1; define a variable flag and set it to 0;
A2: When a user request arrives at the client, the client divides all key-value access operations in the request into different operation sets, called keySets, according to their target servers; the user request is put into the set R of unsent user requests, flag is set to 0, and step A3 is executed;
A3: Each user request in R is given a weight according to its waiting time on the client, and an empty list σ is initialized; |R| denotes the number of user requests currently in R; a variable k = |R| is defined, and step A4 is executed;
A4: While k is greater than 0, the client uses the number of user requests in R and their target-server information to select the most heavily loaded server b; among the user requests that send a keySet to server b, it selects the one with the largest weight, places it at position k of the list σ (denoted σ(k)), deletes it from R, and decreases k by 1; it then adjusts the weight of each remaining user request in R using the ratio of that request's size to the size of the request just selected; if k is still greater than 0, step A4 is executed again; otherwise, a variable m denoting the current server number is defined and set to 1 for the subsequent traversal, and step A5 is executed;
A5: Judge whether m is less than or equal to |M|; if yes, execute step A6; otherwise, execute step A2;
A6: Judge whether the recorded OSK of server m is 0; if not, increase m by 1 and return to step A5; otherwise, execute step A7;
A7: Let |σ| denote the number of user requests currently in the list σ, set a variable p to 1, and execute step A8;
A8: Judge whether p is less than or equal to |σ|; if yes, execute step A9; otherwise, execute step A10;
A9: Judge whether the user request σ(p) contains a keySet destined for server m; if yes, estimate the completion time of σ(p) from the number of key-value access operations in σ(p) and the recorded server service rates, assign a priority to the keySet destined for server m according to the estimated completion time, send the keySet and record the sending time, and increase the OSK of server m by the number of key-value access operations in the keySet; after sending, if σ(p) still has keySets destined for other servers, processing of σ(p) is suspended and step A10 is executed; if it has no other keySets, σ(p) is deleted from σ and step A10 is executed; if σ(p) contains no keySet destined for server m, p is increased by 1 and step A8 is executed;
A10: Judge whether flag is 0; if yes, increase m by 1 and execute step A5; otherwise, execute step A11;
A11: When the client receives the return value of a key-value access operation, it decreases by 1 the OSK of the server that sent the return value and updates the service rate of that server; if the OSK is now 0, it also updates the consumption rate of the server, sets flag to 1, sets m to the number of that server, and executes step A12; if the OSK is not 0, step A11 is executed again;
A12: Judge whether the user request list σ is empty; if empty, execute step A2; otherwise, execute step A7;
the working steps of the server side comprise:
B1: Initialization: set to 0 the number num of key-value access operations currently being processed by the server;
B2: When a keySet is received, insert it into the service sequence S in descending order of priority; then execute step B3;
B3: Traverse each keySet in S and judge whether it has timed out; if no keySet has timed out, judge whether num equals 0: if so, execute step B4, otherwise execute step B6; if any keySet has timed out, execute step B5;
B4: Take the highest-priority key-value access operation out of S, increase num by 1, and have the server process it to obtain the required data; then execute step B6;
B5: Give the highest priority to all timed-out keySets and increase num by the total number of key-value access operations they contain; the server processes those key-value access operations in sequence; then execute step B6;
B6: Each time a key-value access operation finishes processing, decrease num by 1; add the service time of the operation to the value to be returned, return the value to the corresponding client, and execute step B7;
B7: Judge whether the service sequence S is empty; if it is empty, judge whether num is 0: if so, execute step B2, otherwise execute step B6; if S is not empty, execute step B3.
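The server-side ordering in steps B2 to B5 can be sketched as a priority queue with timeout promotion. This is a minimal illustration, not the patented procedure itself: it works at keySet granularity rather than per operation, the class and method names are hypothetical, and the timeout test is passed in as a caller-supplied predicate because the timeout formula is not reproduced in this text. The 5.0 threshold in the usage lines is an arbitrary illustrative value.

```python
import heapq
import itertools


class ServerQueue:
    """Priority-ordered service sequence S with timeout promotion (illustrative)."""

    def __init__(self, is_timed_out):
        self._heap = []                    # entries: (-priority, seq, keyset)
        self._seq = itertools.count()      # tie-breaker, so keySets are never compared
        self._is_timed_out = is_timed_out  # hypothetical predicate: (keyset, now) -> bool

    def add(self, keyset, priority):
        # Step B2: insert by priority (negated, since heapq is a min-heap).
        heapq.heappush(self._heap, (-priority, next(self._seq), keyset))

    def next_keyset(self, now):
        """Return a timed-out keySet if one exists, else the highest-priority one."""
        for i, (_, _, keyset) in enumerate(self._heap):
            if self._is_timed_out(keyset, now):
                self._heap.pop(i)
                heapq.heapify(self._heap)  # restore the heap after the removal
                return keyset
        return heapq.heappop(self._heap)[2] if self._heap else None


# Hypothetical timeout rule: a keySet times out after waiting more than 5.0 time units.
queue = ServerQueue(lambda ks, now: now - ks["arrival"] > 5.0)
queue.add({"id": "K11", "arrival": 0.0}, priority=0.2)
queue.add({"id": "K21", "arrival": 4.0}, priority=0.9)
first = queue.next_keyset(now=4.5)    # nothing has timed out: highest priority wins
queue.add({"id": "K31", "arrival": 4.6}, priority=0.9)
second = queue.next_keyset(now=6.0)   # K11 has waited 6.0 > 5.0 and is promoted
```

Without the promotion, K11 would keep being overtaken by newly arriving high-priority keySets, which is the starvation case the timeout mechanism is meant to relieve.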
In the method, in the step A3, the weight initialization formula for a user request is as follows:
Figure BDA0002425084330000071
where r represents a user request in the set R, W_r represents the weight of r, WT_r represents the waiting time of r on the client, c is a constant coefficient greater than 0 and less than 1, and the symbol
Figure BDA0002425084330000072
means "for any".
In the method, in the step a4, the selection formula of the server with the heaviest load is as follows:
Figure BDA0002425084330000073
where the load is the sum of the service times of all key-value access operations sent to one server, b represents the most heavily loaded server, |R| represents the size of the user request set R, i represents the user request number, M represents the server set, m represents a single server,
Figure BDA0002425084330000074
indicates the number of key-value access operations in the keySet that user request i sends to server m, and S_mc represents the consumption rate of server m as seen by the client.
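The load computation described above can be sketched as follows. The formula image is not reproduced in this text, so the sketch assumes the form implied by the definitions: the load of server m is the sum, over all unsent user requests, of the keySet size for m divided by the consumption rate S_mc. The function name and the data layout (one dict of keySet sizes per request) are hypothetical.

```python
def heaviest_server(requests, consumption_rate):
    """Return the server whose pending load (estimated service time) is largest.

    requests: list of dicts, each mapping server number -> keySet size.
    consumption_rate: dict mapping server number -> S_mc (operations per unit time).
    Assumed load of server m: sum over requests of keySet size / S_mc.
    """
    load = {m: 0.0 for m in consumption_rate}
    for keyset_sizes in requests:
        for m, size in keyset_sizes.items():
            load[m] += size / consumption_rate[m]
    return max(load, key=load.get)


# Three unsent user requests spread over servers 1..3 (illustrative numbers).
requests = [{1: 4, 2: 1}, {2: 6}, {1: 2, 3: 3}]
rates = {1: 2.0, 2: 1.0, 3: 3.0}
b = heaviest_server(requests, rates)  # loads: server 1 -> 3.0, server 2 -> 7.0, server 3 -> 1.0
```

Dividing by the consumption rate rather than counting operations means that a slow server with few pending operations can still be selected as the bottleneck.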
In the method, in the step a4, the user request with the highest weight is selected according to the following formula:
Figure BDA0002425084330000075
where σ(k) denotes the user request with the largest weight, which is placed at position k of the list σ, R denotes the set of user requests, W_r represents the weight of the user request r,
Figure BDA0002425084330000076
indicates the number of key-value access operations in the keySet that r sends to server b, and b is the most heavily loaded server.
In the method, in the step A4, the weight adjustment formula for the remaining user requests is as follows:
Figure BDA0002425084330000081
where W_σ(k) represents the weight of the user request σ(k), and
Figure BDA0002425084330000082
indicates the number of key-value access operations in the keySet that σ(k) sends to server b.
In the method, in the step a9, the estimation formula of the user request completion time is as follows:
Figure BDA0002425084330000083
where Tc represents the estimated completion time of a user request, m represents a single server, M represents the set of all servers in the system, KS_m represents the number of key-value access operations in the keySet that the user request sends to server m, and S_m represents the service rate of server m.
In the method, in the step a9, the Priority formula is defined as follows:
Priority=1/Tc
where Tc is the estimated completion time of a user request.
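The estimate and the priority can be illustrated with a short sketch. The Tc formula appears only as an image in this text, so the form used here is an assumption: because keySets are served in parallel, the request is taken to finish when its slowest keySet finishes, Tc = max over m of KS_m / S_m. The function names are hypothetical; Priority = 1/Tc is taken from the text.

```python
def estimate_completion_time(keyset_sizes, service_rate):
    """Assumed form: the slowest keySet bounds the request, Tc = max_m KS_m / S_m.

    keyset_sizes: dict mapping server number -> KS_m for one user request.
    service_rate: dict mapping server number -> recorded S_m.
    """
    return max(size / service_rate[m] for m, size in keyset_sizes.items())


def priority(tc):
    """Priority = 1 / Tc, as defined in the text."""
    return 1.0 / tc


service_rate = {1: 2.0, 2: 1.0}
tc = estimate_completion_time({1: 4, 2: 3}, service_rate)  # max(4/2.0, 3/1.0)
prio = priority(tc)
```

Every keySet of a request carries the same priority under this definition, so a short request is favoured on all of its target servers at once.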
In the method, in the step A11, the return value of a key-value access operation contains the following information: the data that the operation was issued to retrieve, and the time the server took to process the operation.
In the step a11, the service rate and the consumption rate are updated by an exponential smoothing method, and the formula is as follows:
S_ms_new = α * S_ms_old + (1 - α) * St
Figure BDA0002425084330000084
where S_ms_new represents the new service rate of server m, S_ms_old represents the previous service rate, St represents the service time of the key-value access operation that has just finished service, and α is an exponential smoothing parameter, a positive number less than 1; S_mc_new represents the new consumption rate of key-value access operations sent by the client to server m, S_mc_old represents the previous consumption rate, KS_m represents the number of key-value access operations in the keySet sent to server m, T_s represents the sending time of the keySet, T_n represents the time at which service of the keySet completed, and β is an exponential smoothing parameter, a positive number less than 1.
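Both updates are exponential smoothing steps and can share one helper, as in the sketch below. The service-rate update follows the formula as written in the text, feeding St in directly. The consumption-rate formula appears only as an image here, so the sample KS_m / (T_n - T_s) is an assumption based on the symbol definitions. All names and numbers are illustrative.

```python
def ewma(old, sample, smoothing):
    """Exponential smoothing: new = smoothing * old + (1 - smoothing) * sample."""
    return smoothing * old + (1.0 - smoothing) * sample


def consumption_sample(keyset_size, sent_at, done_at):
    """Assumed sample: operations of one keySet divided by the time it took to drain."""
    return keyset_size / (done_at - sent_at)


# Service-rate update as written in the text, with alpha = 0.75.
s_ms_new = ewma(old=2.0, sample=4.0, smoothing=0.75)

# Consumption-rate update under the assumed sample, with beta = 0.5.
sample = consumption_sample(keyset_size=10, sent_at=1.0, done_at=3.5)
s_mc_new = ewma(old=3.0, sample=sample, smoothing=0.5)
```

A smoothing parameter close to 1 makes the estimate slow to react but robust to a single unusually slow operation; a value close to 0 tracks recent feedback more aggressively.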
In the method, in the step B3, the timeout determining formula is as follows:
Figure BDA0002425084330000091
where T_now represents the time at which the timeout check is made, T_a represents the time at which the user request reached the client, that is, the time at which the keySet was generated, Priority represents the priority of the keySet, and h is a constant coefficient greater than 0.
The invention can rapidly schedule the user request, can adapt to the load and performance change of the server end, and dynamically adjusts the service sequence of the key value access operation, thereby reducing the average completion time of the whole user request, improving the response speed of the key value storage system, and improving the user access experience. Simulation experiment results show that the method can achieve better effects than the conventional FCFS, SRPT and Rein-SBF in the aspect of reducing the average completion time of the user request.
Drawings
FIG. 1 is a distributed key value storage system framework.
Fig. 2 is a schematic diagram of a DAS method.
Fig. 3 is a flow chart of the present invention.
Fig. 4 is a delay comparison diagram of different scheduling methods in a static scenario.
Fig. 5 is a delay comparison diagram of different scheduling methods in a dynamic scenario.
Fig. 6 is a graph of the average delay of three scheduling methods under different utilization rates.
Fig. 7 is a delay comparison diagram of three methods in a time-varying scenario of server performance.
Fig. 8 is a delay comparison diagram of three methods in a server heterogeneous and performance time-varying scenario.
Fig. 9 is a delay contrast diagram of three methods in a large scale scenario.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The Distributed Adaptive Scheduling (DAS) method of the invention is designed for key-value storage systems and reduces the overall average delay of user requests by scheduling the service order of key-value access operations. Fig. 1 is a framework diagram of a key-value storage system. As shown in fig. 1, after a user request reaches a client of the key-value storage system, the client divides the key-value access operations contained in the request into several sets, called keySets, according to their target servers. The keySets are sent to different servers, and the key-value access operations they contain are processed by the respective servers in parallel, so the completion time of the whole user request is determined by the key-value access operation that completes last. At the same time, one server often receives multiple keySets from different user requests. Designing a suitable scheduling method to arrange the service order of key-value access operations on the server is therefore a means of reducing the overall average completion time of user requests.
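The division of a user request into keySets can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and `server_of` stands in for whatever placement rule maps a key to its target server, which the text does not specify.

```python
from collections import defaultdict


def split_into_keysets(keys, server_of):
    """Group the keys of one user request by target server.

    server_of is a hypothetical placement function (key -> server number).
    Returns a dict mapping server number -> list of keys (one keySet per server).
    """
    keysets = defaultdict(list)
    for key in keys:
        keysets[server_of(key)].append(key)
    return dict(keysets)


# Hypothetical placement over three servers numbered 1..3, chosen by key length.
request_keys = ["a", "bb", "ccc", "dddd", "ee"]
keysets = split_into_keysets(request_keys, lambda key: len(key) % 3 + 1)
```

The size of each resulting list is the per-server operation count that the scheduling steps use when estimating load and completion time.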
However, the characteristics of the key-value storage system itself pose the following challenges for the design of the scheduling method. First, the service time of a key-value access operation is short, so collecting information centrally for scheduling decisions would add a large extra delay to each operation and lengthen the completion time of user requests; a distributed scheduling method is therefore the better choice. Second, traffic in the key-value storage system tends to be bursty and concurrent, so the load of a server changes over time. Furthermore, data on some servers may be requested more frequently by many key-value access operations, that is, "hot" data exists, which makes the load unbalanced among servers. In addition, background activity on the server, such as resource contention and garbage collection, makes the service performance of the server change over time. In summary, the scheduling method must adapt to the time-varying performance and load of the servers and dynamically adjust the service order of key-value access operations in order to reduce the completion time of user requests.
A study of existing scheduling methods shows the following. With a single client, the centralized 2-approximation scheduling method obtains a good scheduling order and reduces the average completion time of user requests because it takes the loads on different servers into account. With a single server, the shortest remaining processing time first (SRPT) scheduling method largely avoids blocking by long requests and reduces the average completion time of user requests, but it may cause starvation. Therefore, for a key-value storage system with N clients and M servers, the inventors treat the system as N one-to-M subsystems: an improved 2-approximation scheduling method is deployed on the client, an SRPT scheduling method is deployed on the server, an information feedback mechanism is introduced to adapt to server performance and load that vary over time, and a timeout mechanism is added to alleviate starvation. The result is the heuristically designed user request scheduling method DAS shown in fig. 2. In fig. 2, symbols such as R1 represent user requests and symbols such as K11 represent keySets; K11 is the keySet destined for server 1 in user request R1.
The DAS is a distributed self-adaptive user request scheduling method suitable for a key value storage system and is mainly completed by the cooperation of a client and a server. Fig. 3 is a schematic workflow diagram of the DAS method.
As shown in fig. 3, the client initially records the service rate of each server as the reciprocal of the time for processing a single key-value access operation, and the consumption rate as the service rate divided by the total number of clients. The number OSK of key-value access operations that each server has received but not yet processed is set to 0, |M| denotes the total number of servers that can provide service, the servers are numbered from 1 to |M|, and a variable flag is defined and set to 0. When the client receives a user request, it divides the key-value access operations in the request into different keySets according to their target servers, adds the user request to the set R of unsent user requests, and sets flag to 0. Next, the client performs the 2-approximation sorting operation on R: it initializes an empty list σ to store the sorted user requests and defines a variable k equal to the number of user requests in R. To alleviate starvation during sorting, DAS first initializes the weight of each user request based on its waiting time on the client, using the following formula:
Figure BDA0002425084330000121
where R represents the set of user requests, W_r represents the weight of user request r, WT_r represents the waiting time of r, c is a constant coefficient greater than 0 and less than 1, and the symbol
Figure BDA0002425084330000122
means "for any". Because the waiting time on the client is taken into account, the longer a user request has waited, the more likely it is to be sent first during sorting, which alleviates starvation to some extent. The client then computes the load of each server according to the following formula and selects the most heavily loaded server:
Figure BDA0002425084330000131
where the load is the sum of the service times of all key-value access operations sent to one server, b represents the most heavily loaded server, |R| represents the size of the user request set R, i represents the user request number, M represents the server set, m represents a single server,
Figure BDA0002425084330000136
indicates the number of key-value access operations in the keySet that user request i sends to server m, and S_mc represents the consumption rate of server m as seen by the client. After selecting the most heavily loaded server b, the client selects the request with the largest weight among all user requests that send a keySet to server b and adds it to the list σ; the selection formula is as follows:
Figure BDA0002425084330000132
where σ(k) represents the user request with the largest weight, which is placed at position k of the list σ, R represents the set of user requests, W_r represents the weight of the user request r, and
Figure BDA0002425084330000133
indicates the number of key-value access operations in the keySet that r sends to server b. Next, the user request σ(k) is removed from the set R, k is decreased by 1, and the weights of the remaining user requests in R are adjusted using the following formula:
Figure BDA0002425084330000134
where W_σ(k) represents the weight of the user request σ(k), and
Figure BDA0002425084330000135
indicates the number of key-value access operations in the keySet that σ(k) sends to server b. If k is still greater than 0, unsorted user requests remain in R, and the steps of selecting the most heavily loaded server, selecting the user request with the largest weight, and adjusting the remaining weights are repeated until k equals 0, at which point all user requests are sorted and the result is stored in the list σ. Next, the client defines a variable m to represent the current server number, sets it to 1, and checks whether the OSK of server m is 0. If it is 0, the client checks the user requests in σ in order for one that has a keySet destined for server m; when one is found, the client estimates the request's completion time from the recorded service rates and the number of key-value access operations in the request, assigns a priority to the keySet destined for server m, sends that keySet to server m, and increases the OSK of server m by the number of key-value access operations in the keySet. The estimation formula for the user request completion time is as follows:
Figure BDA0002425084330000141
where Tc denotes the estimated completion time of the user request, KS_m denotes the number of key-value access operations contained in the keySet sent to server m, S_m denotes the service rate of server m, and M denotes the set of servers. The priority is defined as follows:
Priority=1/Tc
Thus, user requests with a longer estimated completion time receive a lower priority and are served later at the server. After sending the keySet, the client checks whether the variable flag is 0. If it is, the current pass was triggered by the arrival of a new user request, and the OSK of the next server still has to be checked, so m is increased by 1 and the OSK check and the subsequent steps are repeated until all servers have been checked. If flag is not 0, the pass was triggered by a keySet finishing service on some server, and it is only necessary to send a keySet to that server.
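The completion-time estimate and priority assignment described above can be sketched in Python. The formula for Tc appears in the original only as a figure that is not reproduced in this text, so the sketch assumes that Tc is the largest value of KS_m / S_m over the servers a request touches (the request finishes when its slowest keySet finishes), with S_m expressed in operations per microsecond. The function names are hypothetical.

```python
from typing import Dict


def estimate_completion_time(keyset_sizes: Dict[int, int],
                             service_rates: Dict[int, float]) -> float:
    """Estimate Tc for one user request.

    keyset_sizes maps server number m -> KS_m (operations in the keySet for m).
    service_rates maps server number m -> S_m (operations per microsecond).
    Assumption (not confirmed by the text): Tc = max over m of KS_m / S_m.
    """
    return max(size / service_rates[m] for m, size in keyset_sizes.items())


def keyset_priority(tc: float) -> float:
    """Priority = 1 / Tc, as defined in the text."""
    return 1.0 / tc


# Both servers initially serve one operation every 20 microseconds.
rates = {1: 1 / 20, 2: 1 / 20}
small = estimate_completion_time({1: 10, 2: 5}, rates)
large = estimate_completion_time({1: 40}, rates)

# The request expected to finish sooner gets the higher priority.
assert keyset_priority(small) > keyset_priority(large)
```

Under this assumption every keySet of a request carries the same priority, because the priority is derived from the request-level estimate rather than from the individual keySet.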
When the client receives the return value of a key-value access operation, it first updates the service rate of the server using an exponentially weighted moving average (EWMA):
S_ms_new = α * S_ms_old + (1 - α) * St
where S_ms_new denotes the new service rate of server m, S_ms_old denotes the previous service rate, St denotes the service time carried in the return value of the key-value access operation, and α is an exponential smoothing parameter, a positive number less than 1. The OSK of the server that sent the return value is then reduced by 1. If the OSK now equals 0, the keySet sent to that server has been fully served, and the consumption rate of the server is updated according to the following formula:
Figure BDA0002425084330000151
where S_mc_new denotes the new rate at which server m consumes requests from the client, S_mc_old denotes the previous consumption rate, KS_m denotes the size of the keySet sent to server m, T_s denotes the time at which the keySet was sent, and T_n denotes the time at which service of the keySet completed; β is an exponential smoothing parameter, a positive number less than 1. Finally, flag is set to 1 and the variable m is set to the number of that server. The client then checks whether the list σ is empty. If it is, all user requests have been sent, and the client waits for a new user request to arrive. If not, the client traverses the requests in σ in order to find a user request containing a keySet addressed to server m and, as before, estimates the completion time, assigns the keySet a priority, sends the keySet, and increases the OSK value.
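The two smoothing updates can be sketched as follows. The service-rate update follows the formula given in the text literally. The consumption-rate formula appears only as a figure, so the sketch assumes the analogous form β * S_mc_old + (1 - β) * KS_m / (T_n - T_s); that form and the function names are assumptions for illustration.

```python
def update_service_rate(s_old: float, st: float, alpha: float) -> float:
    """S_ms_new = alpha * S_ms_old + (1 - alpha) * St, as written in the text."""
    return alpha * s_old + (1.0 - alpha) * st


def update_consumption_rate(s_old: float, ks_m: int, t_s: float,
                            t_n: float, beta: float) -> float:
    """Assumed form: beta * S_mc_old + (1 - beta) * KS_m / (T_n - T_s)."""
    observed = ks_m / (t_n - t_s)  # operations consumed per unit of time
    return beta * s_old + (1.0 - beta) * observed


# Smooth a previous value of 20 toward a new measurement of 10.
rate = update_service_rate(20.0, 10.0, alpha=0.9)

# A keySet of 30 operations sent at t = 0 us and finished at t = 1000 us.
consumption = update_consumption_rate(0.005, ks_m=30, t_s=0.0,
                                      t_n=1000.0, beta=0.5)
```

Note that the text blends the measured service time St into the quantity it calls the service rate; the sketch keeps that as written rather than converting St to a rate.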
When the server is initialized, it records the number of key-value access operations it is currently processing, represented by the variable num, as 0. When the server receives a keySet from a client, it inserts the keySet into its service sequence S in order of priority from high to low. The server then checks whether S contains any timed-out keySet, using the following timeout criterion:
Figure BDA0002425084330000152
where T_now is the time at which the timeout check is made, T_a is the time at which the request reached the client, i.e., the time at which the keySet was generated, Priority is the priority of the keySet, and h is a constant coefficient greater than 0. By taking into account the time a user request has waited at the client and performing the timeout check at the server, the DAS method effectively alleviates starvation, reduces the average completion time of user requests, and keeps their tail delay from becoming too high. If S contains no timed-out keySet, the server checks whether it is idle, i.e., whether num is 0. If num is 0, the server takes the highest-priority key-value access operation from the head of S and processes it. If num is not 0, the server still has key-value access operations to process, so it takes nothing from S and simply continues processing. If S does contain timed-out keySets, the server gives all of them the highest priority, processes the key-value access operations they contain in order, and increases num by the total number of key-value access operations in those keySets. Finally, each time the server completes a key-value access operation and returns the requested value to the client, num is reduced by 1. The server then checks whether S is empty. If S is not empty, the preceding steps are repeated: the timeout check is applied to every keySet in S and the subsequent steps are executed. If S is empty, the server checks whether num is 0. If num is 0, the server has no key-value access operations left to process and waits for a new keySet to arrive; otherwise, it continues processing the remaining key-value access operations.
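The server-side service sequence can be sketched as a small Python class. The timeout criterion appears in the original only as a figure, so the sketch takes the criterion as a caller-supplied function; the rule used in the example (waiting time greater than h / Priority) is a hypothetical stand-in, as are all class and method names.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class KeySet:
    priority: float                 # Priority = 1 / Tc, assigned by the client
    arrival_time: float             # T_a: when the request reached the client
    operations: List[str] = field(default_factory=list)


class ServiceSequence:
    """Service sequence S, kept in order of priority from high to low."""

    def __init__(self, timed_out: Callable[[KeySet, float], bool]) -> None:
        self.items: List[KeySet] = []
        self.timed_out = timed_out  # timeout criterion supplied by the caller

    def insert(self, keyset: KeySet) -> None:
        # Place the keySet before the first entry with strictly lower priority.
        position = len(self.items)
        for index, other in enumerate(self.items):
            if other.priority < keyset.priority:
                position = index
                break
        self.items.insert(position, keyset)

    def take_timed_out(self, now: float) -> List[KeySet]:
        # Remove and return every timed-out keySet so it can be served first.
        expired, remaining = [], []
        for keyset in self.items:
            (expired if self.timed_out(keyset, now) else remaining).append(keyset)
        self.items = remaining
        return expired

    def take_next_operation(self) -> str:
        # Take one operation from the keySet at the head of S.
        head = self.items[0]
        operation = head.operations.pop(0)
        if not head.operations:
            self.items.pop(0)
        return operation


def example_rule(keyset: KeySet, now: float, h: float = 2.0) -> bool:
    # Hypothetical criterion, not taken from the source.
    return now - keyset.arrival_time > h / keyset.priority


sequence = ServiceSequence(example_rule)
sequence.insert(KeySet(priority=0.01, arrival_time=0.0, operations=["a1", "a2"]))
sequence.insert(KeySet(priority=0.05, arrival_time=100.0, operations=["b1"]))
first = sequence.take_next_operation()        # comes from the higher-priority keySet
expired = sequence.take_timed_out(now=250.0)  # the remaining keySet has waited too long
```

A linear scan keeps the insertion logic easy to follow; a heap or a bisect-based insert would be a natural replacement if S grows large.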
To further verify the performance of the DAS scheduling method, this embodiment builds a discrete-event simulator on the SimPy library for Python. In the simulator, the Workload node generates user requests; the Client node divides user requests into keySets, performs the 2-approximation ordering, and estimates user request completion times, among other functions; and the Server node processes key-value access operations. To simulate a real key-value storage system more accurately, this embodiment draws the size of each user request from a Pareto distribution with a mean of 300, i.e., a user request contains 300 key-value access operations on average. User requests arrive according to a Poisson process, and the arrival interval is set so that the system utilization is 70%. To simulate hot-spot data, the target server of each keySet in a user request is selected using a Zipf distribution, so that some servers are more likely to be selected by keySets containing more key-value access operations; these servers act as the servers holding the hot-spot data. The size of a keySet (i.e., the number of key-value access operations it contains) is also drawn from a Pareto distribution, whose mean is the mean user request size (300 key-value access operations) divided by the number of servers in the system. The service time of a key-value access operation follows an exponential distribution with a mean of 20 μs. Finally, the FCFS (first come, first served) method, the SRPT (shortest remaining processing time first) method, and the Rein-SBF method (priority set according to the size of the largest keySet in the request) are each deployed as baselines for comparison with the DAS method.
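The workload parameters above can be illustrated with a short sketch that uses only Python's standard `random` module rather than SimPy. The Pareto shape (2.5), the Zipf exponent (1.0), and the definition of utilization as offered work divided by total server capacity are assumptions, since the text does not state them; the function names are hypothetical.

```python
import random
from typing import List


def pareto_scale(mean: float, shape: float) -> float:
    """Scale x_m that gives a Pareto(shape) distribution the requested mean.

    For shape > 1 the mean is shape * x_m / (shape - 1).
    """
    if shape <= 1:
        raise ValueError("the mean is finite only for shape > 1")
    return mean * (shape - 1) / shape


def sample_request_size(rng: random.Random, mean: float = 300.0,
                        shape: float = 2.5) -> int:
    """Draw the number of key-value access operations in one user request."""
    # random.paretovariate samples with x_m = 1, so rescale the result.
    return max(1, round(pareto_scale(mean, shape) * rng.paretovariate(shape)))


def zipf_weights(n_servers: int, exponent: float = 1.0) -> List[float]:
    """Normalized Zipf weights: the server of rank i gets 1 / i**exponent."""
    raw = [1.0 / (rank ** exponent) for rank in range(1, n_servers + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def mean_interarrival_us(utilization: float, n_servers: int,
                         mean_ops: float, mean_service_us: float) -> float:
    """Mean gap between requests such that offered work / capacity = utilization."""
    work_per_request_us = mean_ops * mean_service_us
    return work_per_request_us / (utilization * n_servers)


rng = random.Random(1)
gap = mean_interarrival_us(0.7, 10, 300, 20)   # about 857 us between requests
next_arrival = rng.expovariate(1.0 / gap)      # exponential gap of a Poisson process
size = sample_request_size(rng)
server = rng.choices(range(1, 11), weights=zipf_weights(10))[0]
service_time = rng.expovariate(1.0 / 20.0)     # exponential, mean 20 us
```

With 10 servers, a mean request size of 300 operations, and a mean service time of 20 μs, each request carries 6000 μs of work on average, so a 70% utilization corresponds to one request roughly every 857 μs under the stated assumption.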
First, this embodiment verifies the performance of the scheduling methods in a static scenario (i.e., all user requests arrive at the same time). The numbers of clients and servers are set to 1 and 10, 10 and 1, and 10 and 10, and the delay (i.e., completion time) of user requests under the different scheduling methods is compared for these three settings. As shown in Fig. 4(a), with 1 client and 10 servers, DAS is best in both average delay and tail delay, because in this configuration the scheduling problem reduces to the standard concurrent open shop problem, for which the 2-approximation scheduling method yields a good schedule. In (b), with 10 clients and 1 server, SRPT and Rein-SBF are the best known scheduling methods; because all servers have the same performance and the completion time of a request is determined by its size, SRPT and Rein-SBF perform identically. The 2-approximation algorithm reduces to the SRPT scheduling algorithm in the single-server case, so DAS achieves performance close to that of SRPT and Rein-SBF. Finally, in configuration (c), with 10 clients and 10 servers, DAS combines the advantages of the 2-approximation scheduling method and the SRPT scheduling method, so its performance is the best. In summary, in the static scenario the DAS method improves the average delay by 17-50% relative to the FCFS method without sacrificing tail delay.
Next, this embodiment verifies the performance of the scheduling methods in a dynamic scenario (i.e., requests arrive according to a Poisson process and the system utilization is 70%). As shown in Fig. 5, the DAS method is still the best when the numbers of clients and servers are 1 and 10, and 10 and 10. With 10 clients and 1 server, however, DAS sends keySets one at a time, i.e., a client has at most one keySet outstanding at a given server at any time, so that the client's 2-approximation schedule is not disturbed by the server's SRPT schedule; as a result, the DAS method loses some performance. It is also worth noting that, because the DAS method adds a timeout mechanism, the tail delay of user requests under DAS is not as high as under SRPT and Rein-SBF.
Subsequently, this embodiment verifies the performance of FCFS, Rein-SBF, and DAS under different system utilizations. As shown in Fig. 6, when the system utilization is low there are few opportunities to schedule key-value access operations, so the average delay differs little between methods. Once the system utilization exceeds 50%, the average delay of user requests rises sharply as utilization increases, and the scheduling method becomes very important for reducing delay. According to the data in Fig. 6, DAS improves the average delay by 16-29% relative to FCFS and outperforms Rein-SBF at every utilization. Next, consider a scenario in which server performance changes over time: in the original configuration of 10 clients and 10 servers, every 1000 μs the performance of a server doubles with 50% probability, i.e., the average time to serve one key-value access operation changes from 20 μs to 10 μs. As shown in Fig. 7, because server performance is better with some probability, both the average delay and the tail delay are lower than in Fig. 5(c). The DAS method still performs best in this configuration, improving the average completion time of requests by 26% and 13.5%, respectively, compared with the FCFS method. Next, on top of the time-varying server performance, the initial performance of half of the servers is degraded by a factor of two, i.e., the service time of a key-value access operation changes from 20 μs to 40 μs. In this case the performance of Rein-SBF degrades, because when it assigns priorities to the keySets of a request it considers only the request size and ignores differences in server performance. As shown in Fig. 8, DAS now improves the average completion time of user requests by 16.3% compared with Rein-SBF.
Finally, the numbers of clients and servers are both set to 128, with a system utilization of 70% and server performance that is uniform and does not change over time, to verify the performance of the DAS method at large scale. The simulation results are shown in Fig. 9: the DAS method still achieves the best performance.
Taken together, the simulation results show that the DAS method provided by the invention can be deployed in a distributed key-value storage system, adapts to server load and performance that change over time, and reduces the average completion time of user requests more effectively than the existing FCFS, SRPT, and Rein-SBF methods.

Claims (10)

1. A distributed self-adaptive user request scheduling method in a key-value storage system, characterized in that the method is carried out cooperatively by a client and a server:
the working steps of the client comprise:
A1: initialization: recording the initial service rate of each server as the reciprocal of the time taken to process a single key-value access operation, the initial consumption rate being the service rate divided by the total number of clients; recording as 0 the number OSK of key-value access operations that each server has received but not yet processed; defining the symbol |M| to represent the total number of servers able to provide service, and numbering the servers from 1 to |M| in increments of 1; and defining a variable flag with value 0;
A2: when a user request reaches the client, the client divides all key-value access operations in the user request into different operation sets according to their target servers, the operation sets being called keySets; the user request is put into the set R of unsent user requests, flag is set to 0, and step A3 is then executed;
A3: assigning a weight to each user request in the set R according to its waiting time on the client, and initializing an empty list σ; defining |R| to represent the number of user requests currently contained in the set R; defining a variable k = |R|, and then executing step A4;
A4: when k is greater than 0, the client selects the most heavily loaded server b using the number of user requests in R and their target server information, then selects, among the user requests to be sent to server b, the one with the largest weight as σ(k), adds σ(k) to the list σ, deletes the user request σ(k) from the set R, reduces k by 1, and then adjusts the weight of each user request in R using the ratio of the size of that user request to the size of the user request σ(k); if k is still greater than 0 at this point, step A4 is executed again; otherwise, a variable m is defined to represent the current server number and is set to 1 for the subsequent traversal, and step A5 is executed;
A5: judging whether m is less than or equal to |M|; if so, executing step A6; otherwise, executing step A2;
A6: judging whether the recorded OSK of server m is 0; if not, m is increased by 1 and step A5 is executed again; otherwise, step A7 is executed;
A7: defining |σ| to represent the number of user requests currently contained in the list σ, setting a variable p to 1, and then executing step A8;
A8: judging whether p is less than or equal to |σ|; if so, executing step A9; otherwise, executing step A10;
A9: judging whether the user request σ(p) contains a keySet addressed to server m; if so, estimating the completion time of σ(p) from the number of key-value access operations in σ(p) and the recorded server service rates, assigning a priority to the keySet addressed to server m according to the estimated completion time, sending the keySet and recording the sending time, and increasing the OSK of server m by the number of key-value access operations contained in the keySet; if, after sending, σ(p) still has keySets to be sent to other servers, σ(p) is left in place for now and step A10 is executed; if σ(p) has no other keySets, σ(p) is deleted from σ and step A10 is executed; if not, p is increased by 1 and step A8 is executed;
A10: judging whether flag is 0; if so, m is increased by 1 and step A5 is executed; otherwise, step A11 is executed;
A11: when the client receives the return value of a key-value access operation, the OSK of the server that sent the return value is reduced by 1 and the service rate of that server is updated; if the OSK now equals 0, the consumption rate of the server is also updated, flag is set to 1, m is set to the number of that server, and step A12 is executed; if the OSK is not 0, step A11 is executed again;
A12: judging whether the user request list σ is empty; if it is empty, executing step A2; otherwise, executing step A7;
the working steps of the server side comprise:
B1: initialization: recording as 0 the number num of key-value access operations currently being processed by the server;
B2: when a keySet is received, inserting it into the service sequence S in order of priority from high to low; then executing step B3;
B3: traversing the keySets in S and judging whether each has timed out; if no keySet has timed out, judging whether num equals 0, and if so executing step B4, otherwise executing step B6; if a timed-out keySet exists, executing step B5;
B4: taking the highest-priority key-value access operation from S and increasing num by 1; the server processes the key-value access operation to obtain the required data; then executing step B6;
B5: giving the highest priority to all timed-out keySets and increasing num by the total number of key-value access operations they contain; the server processes the key-value access operations contained in these keySets in order; then executing step B6;
B6: each time a key-value access operation has been processed, reducing num by 1; adding the service time of the key-value access operation to the value to be returned, returning the value to the corresponding client, and then executing step B7;
B7: judging whether the service sequence S is empty; if it is empty, judging whether num is 0, and if so executing step B2, otherwise executing step B6; if S is not empty, executing step B3.
2. The method of claim 1, wherein in step A3 the weight of a user request is initialized by the following formula:
W_r = 1.0 + c * WT_r, r ∈ R
where r denotes a user request in the set R, W_r denotes the weight of r, WT_r denotes the waiting time of r on the client, and c is a constant coefficient greater than 0 and less than 1.
3. The method of claim 1, wherein in step A4 the most heavily loaded server is selected by the following formula:
Figure FDA0003627934000000041
where the load is the sum of the service times of all key-value access operations sent to a server, b denotes the most heavily loaded server, |R| denotes the size of the user request set R, i denotes the user request number, M denotes the server set, m denotes a single server,
Figure FDA0003627934000000042
denotes the number of key-value access operations contained in the keySet sent to server m by the user request numbered i, and S_mc denotes the consumption rate of server m on the client.
4. The method according to claim 1, wherein in step A4 the user request with the largest weight is selected by the following formula:
Figure FDA0003627934000000043
where σ(k) denotes the user request with the largest weight, placed at position k of the list σ, R denotes the set of user requests, W_r denotes the weight of the user request r, and
Figure FDA0003627934000000044
denotes the number of key-value access operations contained in the keySet that r sends to server b, b being the most heavily loaded server.
5. The method according to claim 4, wherein in step A4 the request weights are adjusted by the following formula:
Figure FDA0003627934000000045
where W_σ(k) denotes the weight of the user request σ(k), and
Figure FDA0003627934000000046
denotes the number of key-value access operations contained in the keySet of σ(k) addressed to server b.
6. The method as claimed in claim 1, wherein in step A9 the completion time of a user request is estimated by the following formula:
Figure FDA0003627934000000051
where Tc denotes the estimated completion time of a user request, m denotes a single server, M denotes the set of all servers in the system, KS_m denotes the number of key-value access operations contained in the keySet sent to server m by the user request, and S_m denotes the service rate of server m.
7. The method of claim 1, wherein in step A9 the priority is defined as follows:
Priority=1/Tc
where Tc is the estimated completion time of a user request.
8. The method according to claim 1, wherein in step A11 the return value of the key-value access operation contains the following information: the data that the key-value access operation was to obtain, and the time the server took to process the key-value access operation.
9. The method as claimed in claim 1, wherein in step A11 the service rate and the consumption rate are updated by exponential smoothing, according to the following formulas:
S_ms_new = α * S_ms_old + (1 - α) * St
Figure FDA0003627934000000052
where S_ms_new denotes the new service rate of server m, S_ms_old denotes the previous service rate, St denotes the service time of the key-value access operation whose service has just completed, and α is an exponential smoothing parameter, a positive number less than 1; S_mc_new denotes the new rate at which server m consumes key-value access operations sent by the client, S_mc_old denotes the previous consumption rate, KS_m denotes the number of key-value access operations contained in the keySet sent to server m, T_s denotes the time at which the keySet was sent, T_n denotes the time at which service of the keySet completed, and β is an exponential smoothing parameter, a positive number less than 1.
10. The method as claimed in claim 1, wherein in step B3 the timeout criterion is as follows:
Figure FDA0003627934000000061
where T_now is the time at which the timeout check is made, T_a is the time at which the user request reached the client, i.e., the time at which the keySet was generated, Priority is the priority of the keySet, and h is a constant coefficient greater than 0.
CN202010217985.9A 2020-03-25 2020-03-25 Distributed self-adaptive user request scheduling method in key value storage system Active CN111444183B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010217985.9A CN111444183B (en) 2020-03-25 2020-03-25 Distributed self-adaptive user request scheduling method in key value storage system

Publications (2)

Publication Number Publication Date
CN111444183A CN111444183A (en) 2020-07-24
CN111444183B true CN111444183B (en) 2022-08-16

Family

ID=71650805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010217985.9A Active CN111444183B (en) 2020-03-25 2020-03-25 Distributed self-adaptive user request scheduling method in key value storage system

Country Status (1)

Country Link
CN (1) CN111444183B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113259439B (en) * 2021-05-18 2022-05-06 中南大学 Key value scheduling method based on receiving end drive

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107241442A (en) * 2017-07-28 2017-10-10 中南大学 A kind of key assignments data storage storehouse copy selection method based on prediction

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6895585B2 (en) * 2001-03-30 2005-05-17 Hewlett-Packard Development Company, L.P. Method of mixed workload high performance scheduling
AU2013214801B2 (en) * 2012-02-02 2018-06-21 Visa International Service Association Multi-source, multi-dimensional, cross-entity, multimedia database platform apparatuses, methods and systems
US9798745B2 (en) * 2014-09-13 2017-10-24 Samsung Electronics Co., Ltd. Methods, devices and systems for caching data items

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant