CN113132266A

CN113132266A - IO request scheduling method and device

Info

Publication number: CN113132266A
Application number: CN201911398049.6A
Authority: CN
Inventors: 柴云鹏; 刘朝洋; 鲍宁; 王传雯; 赵伟
Original assignee: Renmin University of China; Shenzhen Sensetime Technology Co Ltd
Current assignee: Renmin University of China; Shenzhen Sensetime Technology Co Ltd
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2021-07-16

Abstract

The embodiment of the specification provides an IO request scheduling method and device, and the method comprises the steps of determining label information of each IO request in a plurality of received IO requests, wherein the label information at least comprises a reserved time label and a weight time label, and the plurality of IO requests comprise a first IO request of a first user and a second IO request of the first user; performing first scheduling on the first IO request according to the reserved time tag of the first IO request; and performing second scheduling on the second IO request according to the weight time tag of the second IO request.

Description

IO request scheduling method and device

Technical Field

The present disclosure relates to the field of artificial intelligence technologies, and in particular, to an IO request scheduling method and apparatus.

Background

With the continuous development of software-defined storage technology, the management of storage bandwidth resources has become increasingly important. In practice, users' IO scenarios vary widely, and scheduling their IO requests so that each user's IO needs are met is difficult. Traditional IO request scheduling methods have low scheduling accuracy and coarse bandwidth allocation granularity, so IO request scheduling needs to be improved.

Disclosure of Invention

The present disclosure provides an IO request scheduling scheme.

According to a first aspect of the embodiments of the present disclosure, there is provided an IO request scheduling method, including: determining label information of each IO request in a plurality of received IO requests, wherein the label information at least comprises a reserved time label and a weight time label, and the plurality of IO requests comprise a first IO request of a first user and a second IO request of the first user; performing first scheduling on the first IO request according to the reserved time tag of the first IO request; and performing second scheduling on the second IO request according to the weight time tag of the second IO request.

In the embodiments of the disclosure, the scheduling of the IO requests of the same user is divided into a first scheduling and a second scheduling, wherein the first scheduling is performed based on reserved time tags and the second scheduling is performed based on weight time tags.

In some embodiments, performing the second scheduling on the second IO request according to the weight time tag of the second IO request includes: in response to the cutoff condition of the first scheduling of the first user being satisfied, performing the second scheduling on the second IO request according to the weight time tag of the second IO request.

In some embodiments, the cutoff condition of the first scheduling of the first user comprises: the number of IO requests of the first user scheduled in the first scheduling reaches a first threshold corresponding to the first user, where the first threshold corresponding to the first user depends on the bandwidth lower limit of the first user.

In some embodiments, the first threshold is a product of a lower bandwidth limit of the first user and a length of time of a time window to which the first schedule belongs.

In some embodiments, the method further comprises: counting the IO requests scheduled by the first user in the first scheduling through a first counter; determining that a cutoff condition for a first schedule of the first user is satisfied in response to the count value of the first counter reaching the first threshold or zero.

In some embodiments, the tag information further includes an upper limit time tag, and the method further comprises: stopping scheduling of the IO requests of the first user in response to the cutoff condition of the second scheduling of the first user being satisfied, wherein the cutoff condition of the second scheduling of the first user comprises: the upper limit time tag of a third IO request of the first user that is currently to be scheduled is greater than or equal to the current time, or the time window to which the first scheduling and the second scheduling belong has ended. In some embodiments, the first scheduling and the second scheduling are executed within the same time window.

By setting the time window, the first scheduling and the second scheduling for different users may be performed within the same time period, wherein different users may have different cutoff conditions.

In some embodiments, the method further comprises: if the cutoff condition of the first scheduling of the first user was not met at the end of the previous time window, compensating the first scheduling of the first user in the current time window.

If the first scheduling of a user has not ended when a time window is over, the number of IO requests actually scheduled through the first scheduling in that time window did not reach the bandwidth lower limit of the user. In this case, the first scheduling can be compensated in the next time window, which improves the performance stability of the system and the user experience.

In some embodiments, the method further comprises: in response to the total number of IO requests scheduled in the first scheduling of the first user in the previous time window being less than the first threshold of the first user, compensating the first threshold of the first scheduling of the first user in the current time window.

In some embodiments, the method further comprises: sending, to the first user, feedback information for each IO request of the first user whose scheduling is completed, wherein the feedback information indicates whether the scheduling of that IO request was completed through the first scheduling or the second scheduling.

In some embodiments, the method is performed by a first scheduling node of a plurality of scheduling nodes, and the first IO request carries at least one of a first quantity and a second quantity, where the first quantity is the number of IO requests of the first user successfully scheduled by other scheduling nodes, and the second quantity is the number of IO requests of the first user successfully scheduled by other scheduling nodes through the first scheduling, during the period from when the first user sends a fourth IO request to when the first user sends the first IO request, the fourth IO request being the previous IO request sent by the first user to the first scheduling node before the first IO request; and determining the tag information of each IO request of the received plurality of IO requests includes: determining the tag information of the first IO request according to at least one of the first quantity and the second quantity.

The embodiment of the disclosure is applicable to centralized scheduling or distributed scheduling, wherein in distributed scheduling, in order to facilitate scheduling, a user carries scheduling information of other scheduling nodes in an IO request, so that the overall distribution of first scheduling and second scheduling performed by the user on a plurality of scheduling nodes meets the above conditions.

In some embodiments, the method further comprises: when a burst IO request of the first user is received, performing the second scheduling on the burst IO request, wherein the scheduling cutoff condition of the burst IO request comprises: the bandwidth of the burst IO request reaches the burst bandwidth upper limit, or the time window to which the first scheduling and the second scheduling belong has ended.

In some embodiments, the method further comprises: based on the first IO request scheduled through the first scheduling, adjusting the weight time tags of the not-yet-scheduled IO requests of the first user.

In some embodiments, the method further comprises: in response to receiving an access request of the first user, judging whether the sum of the bandwidth lower limit of the first user and the bandwidth lower limits of the currently admitted users exceeds a preset admission bandwidth upper limit; and returning an access response to the access request to the first user based on the judgment result.

In some embodiments, the admission bandwidth upper limit is determined based on an average of the total bandwidth of the system over a particular time period.
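The admission check described in the two paragraphs above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, rejecting the request when the sum exceeds the limit is an assumed policy (the text only says the response is based on the judgment result), and using the plain average as the limit is one assumed way of basing it on the average total bandwidth.

```python
def admission_limit_from_samples(total_bandwidth_samples):
    """Assumed policy: use the plain average of sampled total system bandwidth."""
    return sum(total_bandwidth_samples) / len(total_bandwidth_samples)


def admit(new_lower_limit, admitted_lower_limits, admission_limit):
    """Return True if the new user's bandwidth lower limit still fits.

    The sum of the new user's lower limit and the lower limits of the
    already admitted users must not exceed the admission bandwidth limit.
    """
    return new_lower_limit + sum(admitted_lower_limits) <= admission_limit


# Hypothetical numbers, in IO requests per second.
limit = admission_limit_from_samples([600, 800])
accepted = admit(100, [200, 300], limit)
rejected = admit(300, [200, 300], limit)
```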

According to a second aspect of the embodiments of the present disclosure, there is provided an IO request scheduling method, including: sending a first IO request to a first scheduling node; and receiving feedback information sent by the first scheduling node, wherein the feedback information is used for indicating that the first IO request is completed through first scheduling or second scheduling, the first scheduling is performed based on a reserved time label, and the second scheduling is performed based on a weighted time label.

In some embodiments, the method further comprises: determining a first quantity and/or a second quantity according to the feedback information, wherein the first quantity is the number of IO requests successfully scheduled by other scheduling nodes, and the second quantity is the number of IO requests successfully scheduled by other scheduling nodes through the first scheduling, in the period from the time of sending a second IO request to the time of sending a third IO request, the second IO request is the IO request sent immediately before the third IO request, and the other scheduling nodes are at least one scheduling node of the plurality of scheduling nodes other than the second scheduling node; and sending the third IO request carrying the first quantity and/or the second quantity to the second scheduling node.
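A user-side bookkeeping structure for the first quantity and the second quantity might look like the sketch below. The class name `ClientTracker`, its methods, and the `"first"`/`"second"` phase labels are hypothetical; the sketch also assumes that, for the first request sent to a node, the period is counted from the start of tracking.

```python
from collections import defaultdict


class ClientTracker:
    """Tracks, per scheduling node, what other nodes completed between sends."""

    def __init__(self):
        # node -> [all completed requests, requests completed via first scheduling]
        self.done = defaultdict(lambda: [0, 0])
        # node -> totals completed by *other* nodes at the last send to this node
        self.seen = defaultdict(lambda: (0, 0))

    def on_feedback(self, node, phase):
        """Record feedback that `node` completed one request in `phase`."""
        self.done[node][0] += 1
        if phase == "first":
            self.done[node][1] += 1

    def quantities_for(self, node):
        """Return (first quantity, second quantity) to attach to the next request."""
        others_all = sum(c[0] for n, c in self.done.items() if n != node)
        others_first = sum(c[1] for n, c in self.done.items() if n != node)
        prev_all, prev_first = self.seen[node]
        self.seen[node] = (others_all, others_first)
        return others_all - prev_all, others_first - prev_first
```

Because the totals for a node always exclude that node's own completions, the difference between two snapshots counts only requests completed by other scheduling nodes during the period.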

According to a third aspect of the embodiments of the present disclosure, an IO request scheduling apparatus is provided, the apparatus including: a first determining module, configured to determine tag information of each IO request in a plurality of received IO requests, wherein the tag information at least comprises a reserved time tag and a weight time tag, and the plurality of IO requests comprise a first IO request of a first user and a second IO request of the first user; a first scheduling module, configured to perform first scheduling on the first IO request according to the reserved time tag of the first IO request; and a second scheduling module, configured to perform second scheduling on the second IO request according to the weight time tag of the second IO request.

In some embodiments, the second scheduling module is configured to: in response to the cutoff condition of the first scheduling of the first user being satisfied, perform second scheduling on the second IO request according to the weight time tag of the second IO request.

According to a fourth aspect of the embodiments of the present disclosure, there is provided an IO request scheduling apparatus, including: the first sending module is used for sending a first IO request to the first scheduling node; a receiving module, configured to receive feedback information sent by the first scheduling node, where the feedback information is used to indicate that the first IO request is completed through first scheduling or second scheduling, where the first scheduling is performed based on a reserved time tag, and the second scheduling is performed based on a weighted time tag.

In some embodiments, the apparatus further comprises: a second determining module, configured to determine a first quantity and/or a second quantity according to the feedback information, where the first quantity is the number of IO requests successfully scheduled by other scheduling nodes, and the second quantity is the number of IO requests successfully scheduled by other scheduling nodes through the first scheduling, in the period from when a second IO request is sent to when a third IO request is sent, the second IO request is the IO request sent immediately before the third IO request, and the other scheduling nodes are at least one scheduling node of the plurality of scheduling nodes other than the second scheduling node; and a second sending module, configured to send the third IO request carrying the first quantity and/or the second quantity to the second scheduling node.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the method of any of the embodiments.

According to a sixth aspect of the embodiments of the present disclosure, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to any of the embodiments when executing the program.

In the embodiments of the present disclosure, tag information of each of a plurality of received IO requests is determined, wherein the tag information at least comprises a reserved time tag and a weight time tag, and the plurality of IO requests comprise a first IO request of a first user and a second IO request of the first user; first scheduling is performed on the first IO request according to the reserved time tag of the first IO request; and second scheduling is performed on the second IO request according to the weight time tag of the second IO request. Because the IO requests of the same user are scheduled according to both the reserved time tag and the weight time tag, the bandwidth allocation granularity is fine and the scheduling reliability of the IO requests is improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.

Fig. 1 is a flowchart of an IO request scheduling method according to an embodiment of the present disclosure.

Fig. 2 is a schematic diagram of an IO request scheduling process according to an embodiment of the present disclosure.

Fig. 3 is a schematic diagram of adjustment of a time window according to an embodiment of the disclosure.

Fig. 4 is an alignment diagram of weight time labels of an embodiment of the present disclosure.

Fig. 5 is a schematic diagram of bandwidth allocation of users under different available bandwidths of the system according to an embodiment of the present disclosure.

Fig. 6 is a flowchart of an IO request scheduling method according to another embodiment of the disclosure.

Fig. 7 is a block diagram of an IO request scheduling apparatus according to an embodiment of the present disclosure.

Fig. 8 is a block diagram of an IO request scheduling apparatus according to another embodiment of the present disclosure.

FIG. 9 is a schematic diagram of a computer device for implementing the disclosed method, in accordance with an embodiment of the present disclosure.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality.

It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.

In order to make the technical solutions in the embodiments of the present disclosure better understood and make the above objects, features and advantages of the embodiments of the present disclosure more comprehensible, the technical solutions in the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.

An application scenario of the embodiments of the present disclosure is first explained below. In the application scenario, the scheduling system comprises one or more servers, and each server can mount one or more disks. Each disk is a scheduling node and corresponds to a scheduling process, and the scheduling processes are installed in and run by the servers. The system can comprise a plurality of users. Each user can send an IO request to the server to access a scheduling node mounted under the server, and the same user can send IO requests aimed at the same or different scheduling nodes at different times. Each scheduling node corresponds to an IO request queue, and the server may add the received IO requests of each user to the IO request queue of the corresponding scheduling node and then schedule the IO requests in each IO request queue respectively. The scheduling process needs to meet certain QoS (Quality of Service) requirements.

Those skilled in the art will appreciate that the foregoing embodiments are merely illustrative and that embodiments of the present disclosure are not limited thereto. The method of the embodiments of the present disclosure may be performed by a scheduling process on a server in the above application scenario. The following description will first be made of a case where only a single scheduling node is included in the system.

As shown in fig. 1, an embodiment of the present disclosure provides an IO request scheduling method, where the method includes:

step S101: determining label information of each IO request in a plurality of received IO requests, wherein the label information at least comprises a reserved time label and a weight time label, and the plurality of IO requests comprise a first IO request of a first user and a second IO request of the first user;

step S102: performing first scheduling on the first IO request according to the reserved time tag of the first IO request;

step S103: and performing second scheduling on the second IO request according to the weight time tag of the second IO request.

For ease of understanding, the scheduling process of the first user is taken as an example. In step S101, tag information may be determined for each received IO request of the first user. The tag information at least comprises a reserved time tag and a weight time tag. In some embodiments, the tag information further includes an upper limit time tag; the specific implementation of the tag information is not limited in the embodiments of the present disclosure.

The reserved time tag R of an IO request of the first user is used to guarantee the bandwidth lower limit (i.e., the reserved bandwidth) of the first user. The upper limit time tag L of an IO request of the first user is used to limit the bandwidth upper limit of the first user. In the first scheduling, the IO requests of the first user are scheduled based on the reserved time tag. Optionally, the phase of the first scheduling may also be referred to as the reservation scheduling phase. In some embodiments, in the reservation scheduling phase, the bandwidth allocated to the first user does not exceed the reserved bandwidth of the first user.

In the second scheduling, some or all of the remaining IO requests, among the plurality of IO requests, that were not scheduled in the first scheduling are scheduled based on the weight time tag W. Optionally, the phase of the second scheduling may also be referred to as the weight scheduling phase. In some embodiments, in the weight scheduling phase, the remaining bandwidth, that is, the difference between the available bandwidth of the system and the total bandwidth allocated in the reservation scheduling phase, is allocated according to the ratio of the weight of the first user to the total weight of all users. The ratio of the number of IO requests of the first user scheduled through the second scheduling to the number of IO requests corresponding to the remaining bandwidth is equal to the ratio of the weight of the first user to the total weight of all users, but the sum of the numbers of IO requests of the first user scheduled through the first scheduling and the second scheduling does not exceed the number corresponding to the bandwidth upper limit of the first user.

In the embodiments of the present disclosure, the bandwidth of a user denotes the number of IO requests of that user scheduled per unit time. In some embodiments, the tag information of the IO requests of the first user may be calculated according to the QoS parameters of the first user, which optionally include a reserved bandwidth, a bandwidth upper limit, and a bandwidth allocation weight. The QoS parameters may be preset or determined in advance on the server side. Assuming that the first user is the ith user in the system, an exemplary calculation formula of the tag information of the IO requests of the first user is as follows:

In the formula, R_i^k, L_i^k and W_i^k respectively represent the reserved time tag, the upper limit time tag and the weight time tag of the IO request sent by the ith user at the kth time; R_i^{k-1}, L_i^{k-1} and W_i^{k-1} respectively represent the reserved time tag, the upper limit time tag and the weight time tag of the IO request sent by the ith user at the (k-1)th time; r_i and l_i respectively represent the reserved bandwidth and the bandwidth upper limit of the ith user; w_i represents the weight of the ith user; t represents the current time; and k is greater than or equal to 2. R_i^1, L_i^1 and W_i^1 are each the arrival time of the 1st IO request of the ith user. For example, the reserved time tag, the upper limit time tag, and the weight time tag of the first IO request sent by a user after the user changes from the idle state to the active state are all the arrival time of that IO request.
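The formula itself is not legible in this text, so the following is only a sketch of one tag update that is consistent with the symbol definitions above. It assumes the update commonly used in tag-based schedulers: each tag advances by the reciprocal of its parameter (1/r_i, 1/l_i, 1/w_i) and is never allowed to fall behind the current time t. The names `Tags` and `next_tags` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tags:
    reserved: float  # R_i^k
    limit: float     # L_i^k
    weight: float    # W_i^k


def next_tags(prev: Optional[Tags], r_i: float, l_i: float, w_i: float,
              now: float) -> Tags:
    """Compute the tags of the k-th request from the tags of the (k-1)-th.

    Assumed rule (not confirmed by the source): tag = max(previous + 1/rate, now).
    For the first request (prev is None), every tag is the arrival time.
    """
    if prev is None:
        return Tags(now, now, now)
    return Tags(
        reserved=max(prev.reserved + 1.0 / r_i, now),
        limit=max(prev.limit + 1.0 / l_i, now),
        weight=max(prev.weight + 1.0 / w_i, now),
    )


# Hypothetical parameters: r_i = 4, l_i = 8, w_i = 2.
first = next_tags(None, 4, 8, 2, now=10.0)
second = next_tags(first, 4, 8, 2, now=10.0)
after_idle = next_tags(second, 4, 8, 2, now=20.0)
```

Under this assumed rule, a user that sends requests faster than its reserved bandwidth accumulates reserved time tags ahead of the current time, which matches the statement that a reserved time tag greater than or equal to the current time means the reserved bandwidth has been reached.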

If the reserved time tag of an IO request of the first user is greater than or equal to the current time, the bandwidth of the first user has reached the reserved bandwidth; if the upper limit time tag of an IO request of the first user is greater than or equal to the current time, the bandwidth of the first user has reached the bandwidth upper limit. The weight time tag is a relative value: the smaller the weight time tag of an IO request of the first user, the higher the priority of that IO request in the weight scheduling; the larger the weight of the first user, the more bandwidth is allocated to the first user in the weight scheduling.

Assuming that the first user is the ith user in the system, the final bandwidth size of the first user can be calculated by the following formula:

In the formula, IOPS_i is the total bandwidth of the ith user, r_i is the reserved bandwidth of the ith user, w_i is the weight of the ith user, S is the total bandwidth allocable by the system, n is the total number of users sending IO requests, and l_i is the bandwidth upper limit of the ith user.
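The formula is not legible in this text. A form consistent with the symbol definitions and with the description of the weight scheduling phase would be IOPS_i = min(r_i + (S - (r_1 + … + r_n)) * w_i / (w_1 + … + w_n), l_i). The sketch below evaluates that assumed form; the function name is hypothetical, and it further assumes that S is at least the sum of the reserved bandwidths and that bandwidth freed by a capped user is not redistributed.

```python
def final_bandwidth(i, reserved, weights, limits, total):
    """Assumed form: reserved share plus weighted share of the remainder, capped.

    reserved[j], weights[j], limits[j] are r_j, w_j, l_j of user j;
    total is S, the total bandwidth allocable by the system.
    """
    remaining = total - sum(reserved)            # bandwidth left after reservations
    weighted_share = remaining * weights[i] / sum(weights)
    return min(reserved[i] + weighted_share, limits[i])


# Hypothetical two-user system with S = 700 IO requests per second.
reserved = [100, 200]
weights = [1, 3]
limits = [1000, 400]
user0 = final_bandwidth(0, reserved, weights, limits, 700)
user1 = final_bandwidth(1, reserved, weights, limits, 700)
```

In this example the remaining bandwidth is 400, split 1:3. User 0 receives 100 + 100, while user 1 would receive 200 + 300 but is held at its upper limit of 400.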

In this way, the IO requests of the same user can be scheduled according to both the bandwidth lower limit and the weight. Compared with a method that schedules the IO requests of the same user according to only one of the weight and the bandwidth lower limit, the IO request scheduling method of the embodiments of the disclosure has finer scheduling control granularity and higher reliability.

In step S102, the system may perform the first scheduling, that is, the scheduling of the reservation scheduling phase, on the IO requests of the first user according to the size of the reserved time tag. In the reservation scheduling phase, the IO requests of the users are scheduled in a specific order according to their reserved time tags until the cutoff condition of the reservation scheduling phase is reached. For example, the specific order may be the order of the reserved time tags of the IO requests of the respective users from large to small. In some embodiments, the cutoff condition of the first scheduling of the first user comprises: the number of IO requests of the first user scheduled in the first scheduling reaches a first threshold corresponding to the first user, where the first threshold corresponding to the first user depends on the bandwidth lower limit of the first user.

In some embodiments, each time an IO request is scheduled in the reservation scheduling phase, the IO request is removed from the IO request queue.

In step S103, when the cutoff condition of the first scheduling is satisfied, the first scheduling ends and the second scheduling, i.e., the scheduling of the weight scheduling phase, starts. In the weight scheduling phase, the IO requests of the users may be scheduled in a specific order according to their weight time tags. For example, the specific order may be the order of the weight time tags of the IO requests of the respective users from large to small. When IO requests are scheduled in the weight scheduling phase, the bandwidth allocated to the first user is S_{remaining}*w/w_{total}, where S_{remaining} is the remaining allocable bandwidth of the system in the weight scheduling phase, w is the weight of the first user, and w_{total} is the sum of the weights of all users. In some embodiments, each time an IO request is scheduled in the weight scheduling phase, the IO request is removed from the IO request queue. The upper limit time tag of an IO request of the first user is not used as a scheduling basis; it is only used to determine whether the bandwidth allocated to the first user has reached the bandwidth upper limit.

In some embodiments, a time window for the IO request scheduling may be set, and the first scheduling and the second scheduling for each user may be performed in the same time window, where the first scheduling for each user is performed first in the time window, and after the first scheduling for the first user is completed, the second scheduling is performed for the first user. The time window strictly ensures that the first scheduling of the first user is completed preferentially, which guarantees the reserved bandwidth of the first user. In this case, optionally, the first threshold corresponding to the first user may be the product of the bandwidth lower limit of the first user and the time length of the time window to which the first scheduling belongs.

Fig. 2 shows a schematic diagram of an IO request scheduling process of one embodiment. When the system includes a plurality of users, each user sends IO requests to a server to access a scheduling node mounted under the server. After receiving the IO requests, the server adds each received IO request to an IO request queue and then calculates the time tags of the IO requests in the queue, including the reserved time tag, the weight time tag, and the upper limit time tag. After the time tags are calculated, the IO requests of all users are scheduled within the same time window. First, all users enter the first scheduling, in which the scheduling node schedules the IO requests in the queue according to their reserved time tags. When a user meets the cutoff condition of the first scheduling, that user enters the second scheduling; a user that does not meet the cutoff condition of the first scheduling stays in the first scheduling within the time window. When the time window ends, bandwidth is allocated to each user according to the weight of that user so as to schedule the IO requests of the user through the second scheduling. For each user, the total bandwidth allocated to the user does not exceed the bandwidth upper limit of the user. For each user, scheduling according to the reserved bandwidth of the user is referred to as the first scheduling, and scheduling according to the weight of the user is referred to as the second scheduling.

In some embodiments, when performing the first scheduling for the first user, the IO requests of the first user scheduled in the first scheduling may be counted by a first counter. In response to the count value of the first counter reaching the first threshold or zero, it is determined that the cutoff condition of the first scheduling of the first user is satisfied. The counter may count up or count down. If it counts up, counting starts from 0, and the first scheduling is stopped when the count value of the counter reaches the first threshold. If it counts down, counting starts from the first threshold, and the first scheduling is stopped when the count value of the counter reaches 0.
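The counting scheme above can be sketched as follows. The class name and its methods are hypothetical, and the first threshold is taken as the product of the bandwidth lower limit and the window length, as described for the time-window embodiment.

```python
class FirstSchedulingCounter:
    """Counts requests scheduled in the first scheduling for one user."""

    def __init__(self, threshold, count_down=False):
        self.threshold = threshold
        self.count_down = count_down
        # Count-up starts at 0; count-down starts at the first threshold.
        self.value = threshold if count_down else 0

    def record(self):
        """Call once for every request scheduled through the first scheduling."""
        self.value += -1 if self.count_down else 1

    def cutoff_reached(self):
        """True once the count reaches the threshold (up) or zero (down)."""
        if self.count_down:
            return self.value <= 0
        return self.value >= self.threshold


# Hypothetical user: lower limit of 5 requests per second, 2-second window.
threshold = 5 * 2
up = FirstSchedulingCounter(threshold)
down = FirstSchedulingCounter(threshold, count_down=True)
for _ in range(threshold - 1):
    up.record()
    down.record()
before = (up.cutoff_reached(), down.cutoff_reached())
up.record()
down.record()
after = (up.cutoff_reached(), down.cutoff_reached())
```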

In an embodiment where the tag information further includes an upper limit time tag, scheduling of the IO requests of the first user is stopped in response to the cutoff condition of the second scheduling of the first user being satisfied, where the cutoff condition of the second scheduling of the first user includes: the upper limit time tag of a third IO request of the first user that is currently to be scheduled is greater than or equal to the current time, or the time window to which the first scheduling and the second scheduling belong has ended. During the second scheduling, it is only necessary to judge whether the upper limit time tag of the IO request of the first user is smaller than the current time. If so, the first user has not reached the bandwidth upper limit; otherwise, the first user has reached the bandwidth upper limit, and scheduling of the IO requests of the first user is stopped.
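The cutoff test for the second scheduling reduces to two comparisons, sketched below. The function and parameter names are hypothetical, and treating the window as ended once the current time reaches `window_end` is an assumption.

```python
def second_scheduling_cutoff(limit_tag, now, window_end):
    """Return True when scheduling of the user's IO requests should stop.

    limit_tag is the upper limit time tag of the user's next pending request.
    """
    reached_upper_limit = limit_tag >= now   # bandwidth upper limit reached
    window_over = now >= window_end          # assumed end-of-window test
    return reached_upper_limit or window_over


still_running = second_scheduling_cutoff(limit_tag=9.5, now=10.0, window_end=12.0)
capped = second_scheduling_cutoff(limit_tag=10.0, now=10.0, window_end=12.0)
window_ended = second_scheduling_cutoff(limit_tag=9.5, now=12.0, window_end=12.0)
```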

In some embodiments, the method further comprises: if the cutoff condition of the first scheduling of the first user is not met when the previous time window ends, compensating the first scheduling of the first user in the current time window. Where the cutoff condition of the first scheduling of the first user includes the number of IO requests of the first user scheduled in the first scheduling reaching the first threshold corresponding to the first user, the first threshold of the first scheduling of the first user in the current time window is compensated in response to the total number of IO requests of the first user scheduled in the first scheduling of the previous time window being smaller than the first threshold of the first user. Setting time windows makes such a compensation mechanism possible.

When compensatory scheduling is performed for the first user, the difference between the first threshold n of the first user and the number n1 of IO requests of the first user successfully scheduled in a time window may be calculated, and the initial count value of the counter of the first user in the next time window may be adjusted according to the difference. If the counter counts up, the difference is subtracted from the initial count value (generally 0) of the counter in the next time window; if the counter counts down, the difference is added to the initial count value of the counter in the next time window (i.e., the first threshold corresponding to the first user). When the counter of the first user finishes counting, the adjusted first scheduling of the first user ends.

Further, if the number of IO requests of the first user scheduled through the first scheduling in each of two or more consecutive time windows before the current time window is less than the first threshold of the first user, the first threshold of the first scheduling in the current time window is compensated according to the shortfalls in those consecutive time windows. Assume that the first threshold of the first user is N₀, that the numbers of IO requests of the first user scheduled in the j consecutive time windows before the current time window are N₁, N₂, …, N_j respectively, and that N₁, N₂, …, N_j are all less than N₀. Then, in the time window following the j consecutive time windows, the initial count value of the counter corresponding to the first user is adjusted by N₀*j-(N₁+N₂+…+N_j): this value is subtracted if the counter counts up, and added if the counter counts down.
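The compensation over one or several consecutive under-served time windows can be sketched as a single function. The name `compensated_initial_count` and the list `recent_counts` (the numbers of requests scheduled through the first scheduling in past windows, oldest first) are hypothetical.

```python
def compensated_initial_count(threshold, recent_counts, count_up=True):
    """Initial counter value for the next time window.

    Walks back over the most recent consecutive windows whose count stayed
    below `threshold` and accumulates N0*j - (N1 + ... + Nj).
    """
    shortfall = 0
    for served in reversed(recent_counts):
        if served >= threshold:
            break  # the run of under-served windows ends here
        shortfall += threshold - served
    # Counting up starts lower by the shortfall; counting down starts higher.
    return -shortfall if count_up else threshold + shortfall
```

For a threshold of 100 and window counts of 100, 80, and 90, the last two windows fall short by 20 and 10, so a count-up counter starts the next window at -30 and a count-down counter starts at 130. In both cases 130 requests must be scheduled before the cutoff condition is met.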

Fig. 3 is a schematic diagram of time window adjustment, showing two consecutive time windows. In the first time window, the first scheduling is performed from the start point T0 of the time window and finishes after time T1, and the second scheduling is performed after time T2. In the second time window, the first scheduling is performed from the start point T0+SW of the time window; assuming that the reservation of the first time window needs to be compensated, the first scheduling finishes after T1+ΔT, and the second scheduling starts after T2.

In some embodiments, before the second scheduling is performed on at least one remaining IO request of the plurality of IO requests, the method further comprises: aligning the weight time tags of the IO requests of the users. Because weight time tags are relative values, the IO requests of all users are expected to be scheduled at intervals of the reciprocal of the weight, starting from their initial weight time tags. However, when a user changes from idle to active, the weight time tags of the IO requests of the user no longer follow this interval, that is, they no longer have a linear relationship with time. In addition, when a new user arrives, the weight time tags of the IO requests of the new user and those of the users that have already sent IO requests in the system no longer share the same reference point. Therefore, the weight time tags of the IO requests of idle users, of new users, and of active users in the system need to be aligned.

One way to align is to use a global virtual time. For example, the weight time tag of the IO request of user 1 at time t1 of the global virtual time is aligned with the weight time tag of the IO request of user 2 at time t1 of the global virtual time, the weight time tag of the IO request of user 1 at time t2 of the global virtual time is aligned with the weight time tag of the IO request of user 2 at time t2 of the global virtual time, and so on. Another way is to obtain, for each user, the time difference Δt between the minimum weight time tag of the IO requests of the user and the current time, and to align the weight time tags of the IO requests of the users according to the time difference corresponding to each user. As shown in fig. 4, the weight time tag of the IO request of user 1 at the time when Δt1 is 1 may be aligned with the weight time tag of the IO request of user 2 at the time when Δt2 is 1, the weight time tag of the IO request of user 1 at the time when Δt1 is 2 may be aligned with the weight time tag of the IO request of user 2 at the time when Δt2 is 2, and so on. Here, Δt1 is the time difference between the current time and the minimum weight time tag t1 of the IO requests of user 1, Δt2 is the time difference between the current time and the minimum weight time tag t2 of the IO requests of user 2, and "idle" in the figure indicates that the user is in an idle state, i.e., is not sending IO requests.
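One possible reading of the second alignment mode is sketched below: each user's weight time tags are shifted by that user's own time difference, so that the smallest tag of every user lands on the current time and requests at equal offsets from their user's minimum receive equal aligned tags. This interpretation, the function name `align_weight_tags`, and the dictionary layout are assumptions made for illustration.

```python
def align_weight_tags(tags_by_user, now):
    """Shift each user's weight time tags onto a common reference point.

    `tags_by_user` maps a user to the weight time tags of its pending
    IO requests. For each user, delta_t = now - min(tags) is added to
    every tag, which preserves the spacing between that user's tags.
    """
    aligned = {}
    for user, tags in tags_by_user.items():
        if not tags:
            aligned[user] = []  # idle user: nothing to align
            continue
        delta_t = now - min(tags)
        aligned[user] = [tag + delta_t for tag in tags]
    return aligned
```

After alignment, a newly arrived user and a long-running user compete from the same starting point, while the relative order of each user's own requests is unchanged.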

Fig. 5 is a schematic diagram of the bandwidth allocation of users under different total available bandwidths of the system according to an embodiment of the present disclosure, where the reserved bandwidths of user 1, user 2, and user 3 are 250, 250, and 0, respectively, the weights of user 1, user 2, and user 3 are 1, 2, and 2, respectively, and the bandwidth upper limits of user 1, user 2, and user 3 are infinite (inf), infinite (inf), and 1000, respectively. When the total available bandwidth of the system is 500, the reserved bandwidths of user 1 and user 2 can be exactly met, and no bandwidth remains. When the total available bandwidth of the system is 700, after the reserved bandwidths of user 1 and user 2 are satisfied, the remaining bandwidth of 200 is allocated according to weight, and the final bandwidth allocations are 290, 330, and 80. When the total available bandwidth of the system becomes 3500, allocation according to weight alone would give user 3 more bandwidth than its bandwidth upper limit, so the bandwidth of user 3 is set to its bandwidth upper limit, and the remaining bandwidth is divided among the other users according to weight, so that the bandwidths finally allocated to the three users are 917, 1583, and 1000.
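The allocations quoted for fig. 5 can be reproduced with a short calculation. The sketch below assumes reserved bandwidths of 250, 250, and 0, weights of 1, 2, and 2, and bandwidth upper limits of inf, inf, and 1000, which are the values consistent with the stated results. The function name `allocate` is hypothetical, and the total bandwidth is assumed to be at least the sum of the reserved bandwidths.

```python
INF = float("inf")
RESERVED = [250, 250, 0]
WEIGHTS = [1, 2, 2]
LIMITS = [INF, INF, 1000]


def allocate(total, reserved, weights, limits):
    """Reserve first, then share the remainder by weight, respecting limits."""
    alloc = list(reserved)
    remaining = total - sum(reserved)
    active = [i for i in range(len(alloc)) if alloc[i] < limits[i]]
    while remaining > 1e-9 and active:
        weight_sum = sum(weights[i] for i in active)
        share = {i: remaining * weights[i] / weight_sum for i in active}
        capped = [i for i in active if alloc[i] + share[i] > limits[i]]
        if not capped:
            for i in active:
                alloc[i] += share[i]
            break
        # Users that would exceed their limit are pinned to it; what they
        # cannot use is redistributed among the others in the next round.
        for i in capped:
            remaining -= limits[i] - alloc[i]
            alloc[i] = limits[i]
        active = [i for i in active if i not in capped]
    return alloc


for total in (500, 700, 3500):
    print(total, [round(x) for x in allocate(total, RESERVED, WEIGHTS, LIMITS)])
```

For a total of 3500, the unconstrained weighted shares of the remaining 3000 would be 600, 1200, and 1200, so user 3 is pinned to 1000 and the other 2000 is split 1:2 on top of the two reservations, giving approximately 917 and 1583.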

The embodiments of the present disclosure can be used in a distributed system, which can include a plurality of servers, with one or more disks mounted under each server. Each scheduling node may be a disk mounted under a server. The distributed case differs from the non-distributed case in the tag calculation formulas and in the updating of the counters. In a distributed system, assume that the method is executed by a first scheduling node of a plurality of scheduling nodes in the distributed system, and that the first IO request carries at least one of a first quantity and a second quantity. The first quantity and the second quantity are, respectively, the number of IO requests of the first user successfully scheduled by other scheduling nodes and the number of IO requests of the first user successfully scheduled by other scheduling nodes through the first scheduling, during the period from the first user sending a fourth IO request to the first user sending the first IO request, where the fourth IO request is the previous IO request that the first user sent to the first scheduling node before sending the first IO request. Determining the tag information of each IO request of the received plurality of IO requests includes: determining the tag information of the first IO request according to at least one of the first quantity and the second quantity. The following is the new tag calculation formula:

In the formula, ρ_i and δ_i are, respectively, the second quantity and the first quantity corresponding to the i-th user.

In some embodiments, the method further comprises: sending, to the first user, feedback information for each IO request of the first user whose scheduling is completed, where the feedback information indicates whether the scheduling of the IO request was completed through the first scheduling or the second scheduling. By means of the feedback information, the user side can count the number of IO requests scheduled through the first scheduling and the total number of IO requests whose scheduling is completed, which makes it convenient to count the number of IO requests completed by each scheduling node in the distributed system.

In the distributed system, in addition to the counters set at the server side for implementing the compensation mechanism, the user side also sets two counters for each scheduling node. After receiving feedback information sent by a scheduling node, the user updates (increments by 1) the counters corresponding to that scheduling node. If the IO request was completed through the first scheduling, the counter for counting the first quantity and the counter for counting the second quantity are both updated; if the IO request was completed through the second scheduling, only the counter for counting the first quantity is updated. By counting the number of requests of each user completed on each scheduling node, decentralized scheduling is realized in the distributed system. In a scenario of multi-user distributed storage resource management, the embodiments of the present disclosure allow storage resources to be allocated reasonably and provide a good user access experience.
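The user-side bookkeeping can be sketched as follows. The class `ClientCounters`, its method names, and the snapshot used to measure "since the previous request to this node" are assumptions for illustration; the text only specifies two counters per scheduling node and how feedback updates them.

```python
class ClientCounters:
    """Per-node completion counters kept by one user."""

    def __init__(self, nodes):
        self.total = {n: 0 for n in nodes}     # all completed requests
        self.reserved = {n: 0 for n in nodes}  # completed via first scheduling
        # Sums over the other nodes as seen at the previous send to each node.
        self._seen = {n: (0, 0) for n in nodes}

    def on_feedback(self, node, via_first_scheduling):
        """Update the counters of `node` when its feedback arrives."""
        self.total[node] += 1
        if via_first_scheduling:
            self.reserved[node] += 1

    def quantities_for(self, node):
        """Return (first quantity, second quantity) to attach to the next
        request sent to `node`: completions on the other nodes since the
        previous request to `node`, in total and via the first scheduling."""
        others_total = sum(v for n, v in self.total.items() if n != node)
        others_res = sum(v for n, v in self.reserved.items() if n != node)
        prev_total, prev_res = self._seen[node]
        self._seen[node] = (others_total, others_res)
        return others_total - prev_total, others_res - prev_res
```

Because every request carries these two numbers, each scheduling node learns how much service the user received elsewhere without the nodes having to communicate with one another.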

In some embodiments, when a burst IO request of the first user is received, the second scheduling is performed on the burst IO request, where the scheduling cutoff condition of the burst IO request includes: the bandwidth of the burst IO request reaches the burst bandwidth upper limit, or the time window to which the first scheduling and the second scheduling belong ends.

Burst IO refers to a large number of IO requests of a user received within a period of time, and manifests as a rapid increase in the number of IO requests received from the user. Provided that the system meets the reserved bandwidths of all users, the system can process the burst IO requests of users. Burst IO requests are scheduled through the second scheduling, and the burst IO is allowed to be processed even if the user has set a bandwidth upper limit, that is, the burst bandwidth upper limit bi becomes the new bandwidth upper limit. When a burst IO request of the first user is received, whether the IO requests of the first user have reached the burst bandwidth upper limit can be judged according to the upper limit time tag of the IO request of the first user. In both the non-distributed system and the distributed system, assuming that the first user is the i-th user, the upper limit time tag of the IO request of the i-th user may be calculated according to the following formula:

The current access users are the users currently sending IO requests, and there may be one or more of them. When there is a current access user, it may be determined whether the sum of the bandwidth lower limit of the first user and the bandwidth lower limits of the current access users exceeds a preset admission bandwidth upper limit, and an access response to the access request is returned to the first user based on the result of the determination. Admission control can be achieved in this manner. If there are too many users, the reserved bandwidths of most users cannot be met in each time window, so that these users always stay in the reserved scheduling stage. In some embodiments, the first user may send an access request before sending IO requests. In response to receiving the access request of the first user, it is determined whether the sum of the bandwidth lower limit of the first user and the bandwidth lower limits of the current access users exceeds the preset admission bandwidth upper limit. If so, an access response rejecting the access request is returned to the first user to inform the first user that it is not allowed to send IO requests. If not, an access response allowing the access request is returned to the first user to inform the first user that it is allowed to send IO requests.

In other embodiments, if the sum of the bandwidth lower limit of the first user and the bandwidth lower limits of the current access users exceeds the preset admission bandwidth upper limit, the IO request of the first user may be added to a waiting queue, and the IO request of the first user is taken from the waiting queue in response to the sum of the bandwidth lower limit of the first user and the bandwidth lower limits of the current access users being smaller than the admission bandwidth upper limit.

In one embodiment, the average value of the total bandwidth of the system over a specific period of time is used as the admission bandwidth upper limit.
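The admission check and the waiting-queue variant can be sketched together. The class `AdmissionController` and its methods are hypothetical; for brevity the sketch queues users rather than individual IO requests, and it applies the same "does not exceed" test when releasing waiters, which is an assumption.

```python
from collections import deque


class AdmissionController:
    def __init__(self, admission_limit):
        # For example, the average total system bandwidth over some period.
        self.admission_limit = admission_limit
        self.admitted = {}      # user -> bandwidth lower limit
        self.waiting = deque()  # (user, lower_limit) pairs awaiting admission

    def _fits(self, lower_limit):
        """True if adding this lower limit does not exceed the admission limit."""
        return lower_limit + sum(self.admitted.values()) <= self.admission_limit

    def request_access(self, user, lower_limit, queue_if_full=False):
        """Answer an access request with 'allowed', 'waiting', or 'rejected'."""
        if self._fits(lower_limit):
            self.admitted[user] = lower_limit
            return "allowed"
        if queue_if_full:
            self.waiting.append((user, lower_limit))
            return "waiting"
        return "rejected"

    def leave(self, user):
        """Remove a user and admit waiters, in order, while they still fit."""
        self.admitted.pop(user, None)
        admitted_now = []
        while self.waiting and self._fits(self.waiting[0][1]):
            waiter, lower_limit = self.waiting.popleft()
            self.admitted[waiter] = lower_limit
            admitted_now.append(waiter)
        return admitted_now
```

Keeping the sum of admitted lower limits within the admission limit is what prevents the situation described above, in which most users never leave the reserved scheduling stage.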

In some embodiments, the method further comprises: adjusting, based on the first IO request scheduled through the first scheduling, the weight time tags of the IO requests of the first user that have not yet been scheduled. Specifically, each time one IO request of the user is scheduled through the first scheduling, the weight time tag of each IO request of the user that has not yet been scheduled may be adjusted by subtracting the reciprocal of the weight of the user from that weight time tag.
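The adjustment amounts to a uniform shift of the pending weight time tags. The function name `adjust_weight_tags` below is hypothetical.

```python
def adjust_weight_tags(pending_weight_tags, weight):
    """Return the weight time tags of a user's not-yet-scheduled IO requests
    after one of its requests was scheduled through the first scheduling:
    each tag is reduced by the reciprocal of the user's weight."""
    step = 1.0 / weight
    return [tag - step for tag in pending_weight_tags]
```

For a user with weight 2 and pending weight time tags 1.0, 1.5, and 2.0, one request scheduled through the first scheduling moves the tags to 0.5, 1.0, and 1.5, so a request served from the reservation is not also charged against the user's weight-based share.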

As shown in fig. 6, an embodiment of the present disclosure further provides an IO request scheduling method, where the method includes:

step S601: sending a first IO request to a first scheduling node;

step S602: receiving feedback information sent by the first scheduling node, wherein the feedback information is used for indicating that the first IO request is completed through first scheduling or second scheduling, the first scheduling is performed based on a reserved time tag, and the second scheduling is performed based on a weight time tag.

The method of the embodiment of the present disclosure can be executed by a user side. The user side can send one IO request to the server at a time, the IO request being used for accessing one scheduling node mounted under the server. After completing the scheduling of an IO request, the server may return feedback information for the IO request, so that the user side can determine whether the IO request was completed through the first scheduling or the second scheduling.

Other embodiments of the IO request scheduling method are detailed in the above embodiment of the IO request scheduling method, and are not described herein again.

It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.

As shown in fig. 7, the present disclosure also provides an IO request scheduling apparatus, including:

a first determining module 701, configured to determine tag information of each IO request in a plurality of received IO requests, where the tag information at least includes a reserved time tag and a weight time tag, and the plurality of IO requests include a first IO request of a first user and a second IO request of the first user;

a first scheduling module 702, configured to perform first scheduling on the first IO request according to the reserved time tag of the first IO request;

a second scheduling module 703, configured to perform second scheduling on the second IO request according to the weighted time tag of the second IO request.

In some embodiments, the second scheduling module 703 is configured to: in response to the cutoff condition of the first scheduling of the first user being satisfied, perform the second scheduling on the second IO request according to the weight time tag of the second IO request.

In some embodiments, the apparatus further comprises: a counting module, configured to count, by using a first counter, the IO requests of the first user scheduled in the first scheduling; and a third determining module, configured to determine that the cutoff condition of the first scheduling of the first user is satisfied in response to the count value of the first counter reaching the first threshold or zero.

In some embodiments, the tag information further includes an upper limit time tag; the apparatus further comprises: a scheduling stopping module, configured to stop scheduling of the IO requests of the first user in response to the cutoff condition of the second scheduling of the first user being satisfied, where the cutoff condition of the second scheduling of the first user includes: the upper limit time tag of a third IO request of the first user that is currently to be scheduled is greater than or equal to the current time, or the time window to which the first scheduling and the second scheduling belong has ended.

In some embodiments, the first schedule and the second schedule are executed within the same time window.

In some embodiments, the apparatus further comprises: a first scheduling compensation module, configured to compensate the first scheduling of the first user in the current time window if the cutoff condition of the first scheduling of the first user is not met when the previous time window ends.

In some embodiments, the apparatus further comprises: a second scheduling compensation module, configured to compensate the first threshold of the first scheduling in the current time window for the first user in response to that the total number of IO requests scheduled in the first scheduling of the last time window by the first user is smaller than the first threshold of the first user.

In some embodiments, the apparatus further comprises: a third sending module, configured to send feedback information of each IO request of the first user for completing scheduling to the first user, where the feedback information is used to indicate that scheduling of each IO request is completed through the first scheduling or the second scheduling.

In some embodiments, the function of the apparatus is performed by a first scheduling node of a plurality of scheduling nodes, and the first IO request carries at least one of a first quantity and a second quantity, where the first quantity and the second quantity are, respectively, the number of IO requests of the first user successfully scheduled by other scheduling nodes and the number of IO requests of the first user successfully scheduled by other scheduling nodes through the first scheduling, during the period from the first user sending a fourth IO request to the first user sending the first IO request, and the fourth IO request is the previous IO request that the first user sent to the first scheduling node before sending the first IO request; the first determining module is configured to: determine the tag information of the first IO request according to at least one of the first quantity and the second quantity.

In some embodiments, the apparatus further comprises: a burst scheduling module, configured to perform the second scheduling on a burst IO request of the first user when the burst IO request is received, where the scheduling cutoff condition of the burst IO request includes: the bandwidth of the burst IO request reaches the burst bandwidth upper limit, or the time window to which the first scheduling and the second scheduling belong ends.

In some embodiments, the apparatus further comprises: an adjusting module, configured to adjust, based on the first IO request scheduled through the first scheduling, the weight time tags of the IO requests of the first user that have not yet been scheduled.

In some embodiments, the apparatus further comprises: a judging module, configured to judge, in response to receiving an access request of the first user, whether the sum of the bandwidth lower limit of the first user and the bandwidth lower limits of the current access users exceeds a preset admission bandwidth upper limit; and a response module, configured to return an access response to the access request to the first user based on the judgment result.

As shown in fig. 8, an embodiment of the present disclosure further provides an IO request scheduling apparatus, where the apparatus includes:

a first sending module 801, configured to send a first IO request to a first scheduling node;

a receiving module 802, configured to receive feedback information sent by the first scheduling node, where the feedback information is used to indicate that the first IO request is completed through a first scheduling or a second scheduling, where the first scheduling is performed based on a reserved time tag, and the second scheduling is performed based on a weighted time tag.

In some embodiments, the apparatus further comprises: and a third sending module, configured to send the IO requests carrying the first number and/or the second number to the first scheduling node.

In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.

The above-described embodiments of the apparatus are merely illustrative, wherein the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in the specification. One of ordinary skill in the art can understand and implement it without inventive effort.

The apparatus embodiments of the present specification can be applied to a computer device, such as a server or a terminal device. The apparatus embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the apparatus, as an apparatus in a logical sense, is formed by the processor of the computer device in which it is located reading corresponding computer program instructions from the nonvolatile memory into the memory and running them.

The embodiments of the present disclosure also provide a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method of any embodiment when executing the program. In terms of hardware, fig. 9 shows a hardware structure diagram of the computer device in which the apparatus of this specification is located. In addition to the processor 901, the memory 902, the network interface 903, and the nonvolatile memory 904 shown in fig. 9, the server or electronic device in which the apparatus is located may also include other hardware according to the actual function of the computer device, which is not described again.

The embodiments of the present disclosure also provide a computer storage medium on which a computer program is stored, which when executed by a processor implements the method of any of the embodiments.

The present disclosure may take the form of a computer program product embodied on one or more storage media having program code embodied therein, including, but not limited to, disk storage, CD-ROM, optical storage, and the like. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer readable commands, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to: phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technologies, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that may be used to store information accessible by a computing device.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

The above description is only exemplary of the present disclosure and should not be taken as limiting the disclosure, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

The foregoing description of the various embodiments is intended to highlight various differences between the embodiments, and the same or similar parts may be referred to each other, and for brevity, will not be described again herein.

Claims

1. An IO request scheduling method, the method comprising:

determining label information of each IO request in a plurality of received IO requests, wherein the label information at least comprises a reserved time label and a weight time label, and the plurality of IO requests comprise a first IO request of a first user and a second IO request of the first user;

performing first scheduling on the first IO request according to the reserved time tag of the first IO request;

and performing second scheduling on the second IO request according to the weight time tag of the second IO request.

2. The method of claim 1, wherein the performing second scheduling on the second IO request according to the weight time tag of the second IO request comprises:

in response to the cutoff condition of the first scheduling of the first user being satisfied, performing the second scheduling on the second IO request according to the weight time tag of the second IO request.

3. The method of claim 2, wherein the cutoff condition for the first schedule for the first user comprises: the number of IO requests scheduled by the first user in the first scheduling reaches a first threshold corresponding to the first user, where the first threshold corresponding to the first user depends on a lower bandwidth limit of the first user.

4. A method according to claim 2 or 3, characterized in that the method further comprises:

counting the IO requests scheduled by the first user in the first scheduling through a first counter;

determining that a cutoff condition for a first schedule of the first user is satisfied in response to the count value of the first counter reaching the first threshold or zero.

5. The method of any of claims 1 to 4, wherein the tag information further comprises an upper limit time tag;

the method further comprises:

stopping scheduling of the IO requests of the first user in response to the cutoff condition of the second scheduling of the first user being satisfied, wherein the cutoff condition of the second scheduling of the first user comprises: the upper limit time tag of a third IO request of the first user that is currently to be scheduled is greater than or equal to the current time, or the time window to which the first scheduling and the second scheduling belong has ended.

6. The method of any of claims 1-5, wherein the first schedule and the second schedule are performed within a same time window.

7. The method according to any one of claims 1 to 6, further comprising:

in response to the total number of IO requests scheduled in the first schedule of the first user in the last time window being less than a first threshold for the first user, compensating for the first threshold for the first schedule of the first user in the current time window.

8. The method according to any one of claims 1 to 7, further comprising:

sending feedback information of each IO request of the first user for completing scheduling to the first user, wherein the feedback information is used for indicating that the scheduling of each IO request is completed through the first scheduling or the second scheduling.

9. The method according to any one of claims 1 to 8, wherein the method is performed by a first scheduling node of a plurality of scheduling nodes, and the first IO request carries at least one of a first quantity and a second quantity, where the first quantity and the second quantity are, respectively, the number of IO requests of the first user successfully scheduled by other scheduling nodes and the number of IO requests of the first user successfully scheduled by other scheduling nodes through the first scheduling, during the period from the first user sending a fourth IO request to the first user sending the first IO request, and the fourth IO request is the previous IO request that the first user sent to the first scheduling node before sending the first IO request;

the determining tag information of each IO request of the received multiple IO requests includes:

and determining the label information of the first IO request according to at least one of the first quantity and the second quantity.

10. The method according to any one of claims 1 to 9, further comprising:

in a case where a burst IO request of the first user is received, performing the second scheduling on the burst IO request, wherein the scheduling cutoff condition of the burst IO request comprises: the bandwidth of the burst IO request reaches the burst bandwidth upper limit, or the time window to which the first scheduling and the second scheduling belong ends.

11. The method according to any one of claims 1 to 10, further comprising:

adjusting, based on the first IO request scheduled through the first scheduling, the weight time tags of the IO requests of the first user that have not yet been scheduled.

12. The method according to any one of claims 1 to 11, further comprising:

in response to receiving an access request of the first user, determining whether the sum of the bandwidth lower limit of the first user and the bandwidth lower limits of the currently admitted users exceeds a preset admission bandwidth upper limit;

and returning, to the first user, an access response to the access request based on the determination result.
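The admission check of claim 12 can be sketched as follows. The claim states only that the response depends on the comparison; treating "exceeds the upper limit" as a rejection is an assumption, and the function and parameter names are hypothetical.

```python
def admit(new_user_lower_limit, admitted_lower_limits, admission_upper_limit):
    """Decide whether a new user can be admitted.

    The new user's bandwidth lower limit is added to the lower limits of all
    users already admitted. Assumption: the user is admitted only if this sum
    does not exceed the preset admission bandwidth upper limit.
    """
    total = new_user_lower_limit + sum(admitted_lower_limits)
    return total <= admission_upper_limit
```

For example, with admitted users reserving 200 and 300 units, a new user asking for 100 units is admitted under an upper limit of 600 and rejected under an upper limit of 500.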

13. An IO request scheduling method, the method comprising:

sending a first IO request to a first scheduling node;

and receiving feedback information sent by the first scheduling node, wherein the feedback information indicates whether scheduling of the first IO request was completed through first scheduling or second scheduling, the first scheduling being performed based on a reserved time label and the second scheduling being performed based on a weight time label.

14. The method of claim 13, further comprising:

determining a first quantity and/or a second quantity according to the feedback information, wherein the first quantity and the second quantity are, respectively, the quantity of IO requests successfully scheduled by other scheduling nodes and the quantity of IO requests successfully scheduled by other scheduling nodes through the first scheduling, in the period from the sending of a second IO request to the sending of a third IO request, the second IO request being the IO request sent immediately before the third IO request, and the other scheduling nodes being at least one scheduling node, other than a second scheduling node, among a plurality of scheduling nodes;

and sending the third IO request carrying the first quantity and/or the second quantity to the second scheduling node.
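On the user side, the two quantities of claim 14 could be derived from feedback roughly as in the following sketch. The class `FeedbackCounter`, its method names, and the snapshot approach are hypothetical. The sketch also assumes that the counting period for a node starts at the previous request sent to that same node.

```python
from collections import defaultdict


class FeedbackCounter:
    """Track per-node completions reported through feedback information."""

    def __init__(self):
        self._done = defaultdict(int)        # node -> completed requests
        self._done_first = defaultdict(int)  # node -> completed via first scheduling
        self._snapshot = {}                  # target -> totals on other nodes at last send

    def on_feedback(self, node, via_first_scheduling):
        """Record one completed request reported by `node`."""
        self._done[node] += 1
        if via_first_scheduling:
            self._done_first[node] += 1

    def _others(self, target):
        done = sum(n for node, n in self._done.items() if node != target)
        first = sum(n for node, n in self._done_first.items() if node != target)
        return done, first

    def counts_for_send(self, target):
        """Return (first_quantity, second_quantity) to attach to the next
        request sent to `target`, then start a new counting period."""
        done, first = self._others(target)
        prev_done, prev_first = self._snapshot.get(target, (0, 0))
        self._snapshot[target] = (done, first)
        return done - prev_done, first - prev_first
```

Completions reported by the target node itself are excluded, since the quantities cover only the other scheduling nodes.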

15. An IO request scheduling apparatus, the apparatus comprising:

a first determining module, configured to determine label information of each IO request in a plurality of received IO requests, wherein the label information comprises at least a reserved time label and a weight time label, and the plurality of IO requests comprise a first IO request of a first user and a second IO request of the first user;

a first scheduling module, configured to perform first scheduling on the first IO request according to the reserved time tag of the first IO request;

and a second scheduling module, configured to perform second scheduling on the second IO request according to the weight time tag of the second IO request.

16. The apparatus of claim 15, wherein the second scheduling module is configured to:

17. An IO request scheduling apparatus, the apparatus comprising:

a first sending module, configured to send a first IO request to a first scheduling node;

a receiving module, configured to receive feedback information sent by the first scheduling node, where the feedback information is used to indicate that the first IO request is completed through first scheduling or second scheduling, where the first scheduling is performed based on a reserved time tag, and the second scheduling is performed based on a weighted time tag.

18. The apparatus of claim 17, further comprising:

a second determining module, configured to determine a first quantity and/or a second quantity according to the feedback information, where the first quantity and the second quantity are, respectively, the quantity of IO requests successfully scheduled by other scheduling nodes and the quantity of IO requests successfully scheduled by other scheduling nodes through the first scheduling, in the period from the sending of a second IO request to the sending of a third IO request, the second IO request being the IO request sent immediately before the third IO request, and the other scheduling nodes being at least one scheduling node, other than a second scheduling node, among a plurality of scheduling nodes;

a second sending module, configured to send the third IO request carrying the first quantity and/or the second quantity to the second scheduling node.

19. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1 to 14.

20. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any one of claims 1 to 14 when executing the program.