CN109861850B - SLA-based stateless cloud workflow load balancing scheduling method - Google Patents
- Publication number: CN109861850B
- Application number: CN201910028641.0A
- Authority
- CN
- China
- Prior art keywords
- request
- queue
- delay
- service
- rtl
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The invention discloses an SLA-based stateless cloud workflow load balancing scheduling method. Different SLA levels are selected according to each tenant's service scenarios and process models, and through these levels the system provides tenants with different request throughputs and tiered service for different process requests. Real-time engine load monitoring is realized by combining shared memory with the distribution of process models over the engines, which reduces the request peaks on engine services and the overall memory overhead of the engine cluster. The load balancing capability of the cloud workflow under a multi-tenant architecture is thereby improved, and a process service provider can serve more tenants while still meeting different tenants' requirements on request throughput and on the parsing and execution performance of different process definitions.
Description
Technical Field
The invention relates to the technical field of workflow and cloud computing, in particular to a stateless cloud workflow load balancing scheduling method based on SLA.
Background
With the development of distributed computing, particularly grid technology, cloud computing has emerged as a new service computing model. Cloud computing is a mode of resource delivery and usage in which the resources required by an application, including hardware, platforms, and software, are obtained over a network; the network providing the resources is referred to as a "cloud". In cloud computing, everything is offered as a service, generally divided into three levels: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).
The cloud workflow is a PaaS-level service: a distributed system that provides workflow services under the platform-as-a-service cloud computing model. Compared with a traditional workflow system, its main advantages are as follows. First, the cloud workflow offers an on-demand, pay-per-use model, which effectively reduces the cost for enterprises of adopting workflow management software and lowers the barrier to entry. Second, the cloud workflow achieves high resource utilization and service performance: centralized management makes full use of computing power, and flexible resource configuration copes with the request loads of different time periods.
The traditional workflow engine is usually implemented as a stateful service. To better exploit the elasticity of cloud resources and improve the reliability of a cloud workflow system in a cloud environment, a workflow engine implemented as a stateless service better meets the requirements of cloud workflow. For a cloud workflow system built on stateless workflow engines, on the one hand, the nature of workflow services requires that process models be parsed and the parsing results stored, occupying computing and storage resources; on the other hand, in a cloud environment the cloud workflow must support multi-tenant business process execution, and its load scenarios are far more complex than those of a traditional workflow engine. When such a system faces the request load of multiple tenants, multiple process models, and multiple process instances, scheduling that considers only the statelessness of the service cannot fully exploit the characteristics of workflow services, and thus cannot achieve the best request load balancing effect and user experience.
Currently, the system and management architectures of existing workflow cluster systems usually serve only a single user or organization. Under the cloud-service business model, a process service provider wants to serve more tenants with the same hardware resources. Different tenants often have different requirements on the request throughput of engine services depending on their business scenarios, and the same tenant often has different requirements on the parsing and execution performance of different process definitions. The tenant and the process service provider therefore sign an SLA contract, and the system provides the corresponding service level to the tenant according to that contract.
Disclosure of Invention
To address the different service-level requirements of different tenants under cloud workflow, and the process service provider's need to serve more tenants with the same hardware resources, the invention provides an SLA (service level agreement) based stateless cloud workflow load balancing scheduling method. It optimizes the load balancing effect and execution performance of cloud workflow requests while guaranteeing cloud tenants' service experience, so that the cloud workflow system can provide process parsing services for more tenants in a normal service state.
To achieve this purpose, the technical scheme is as follows: an SLA-based stateless cloud workflow load balancing scheduling method, in which, when a process instance request corresponding to a process model uploaded by a tenant is received, the cloud workflow schedules the process instance request to a stateless workflow engine in the cluster; the execution comprises the following steps:
Admission-layer load waveform smoothing:
S101: the admission layer receives a tenant's process instance request, and obtains the tenant's service request arrival rate (RAR) index and the request response time level (RTL) of the process instance request from the tenant SLA warehouse according to the tenant ID or the process instance request information;
S102: judge whether the tenant's service request rate satisfies the RAR index according to the system rate-limiting algorithm; if the rate exceeds that specified by the RAR index, directly filter the request, feed the result back to the tenant, and prompt the tenant to purchase a higher RAR level; otherwise, execute the next step;
S103: judge the RTL level and, according to it, either execute the scheduling layer's balanced request dispatch or obtain the request counts of the current immediate execution queue and the delay queues; using the historical load variable historySize, calculate from the delay queues' request counts the score of the current process instance request for each delay queue, and place the request in the delay queue with the highest score;
Scheduling-layer balanced request dispatch:
S201: the scheduling layer receives a request from the immediate execution queue of the admission layer, and obtains from the shared memory the load information set E = [e_1, …, e_m] of the process engine services published by the process service layer, where e_i = (cpu_i, ram_i); cpu_i denotes the current CPU occupancy of process engine service e_i, and ram_i denotes its current RAM occupancy;
S202: the scheduling layer obtains from the process instance warehouse the distribution set D = [d_1, d_2, …, d_m] of the requested process model's instances over the process engine services, where d_i ∈ {0, 1}; d_i = 0 means the process model is not running on engine e_i, and d_i = 1 means it is;
S203: according to the distribution set D, divide the process engine services into two groups E1 and E2: E1 holds all engines on which the process model has already been executed (d_i = 1); E2 holds the remaining engines;
S204: perform the engine busyness calculation on the elements of E1 and E2 to obtain the least busy engine service e1* of E1 and e2* of E2; judge whether the inequality busyness(e1*) > busyness(e2*) + β holds; if it holds, dispatch the process instance request to e2* and modify the distribution set in the process instance warehouse; otherwise dispatch it to e1*; the scheduling of the process instance request is then complete;
where β is a cost parameter for allocating the process instance request to the new engine, and may be set according to specific hardware resource characteristics.
Preferably, in step S101, the service request arrival rate RAR is used to measure the throughput of the process instance request, and represents the highest number of process instance requests that can be sent by the tenant per second;
the RAR index is divided into three stages, and v is defined0,v1,v2Wherein v is0、v1、v2Is an integer and has v0>v1>v2Then threeThe individual levels are described as follows:
RAR 0 means that the service request arrival rate is at most equal to v0;
RAR 1 means that the service request arrival rate is at most equal to v1;
RAR 2 means that the service request arrival rate is at most equal to v2;
Different levels of RAR correspond to different charging.
Preferably, in step S101, the request response time level RTL is used to measure the processing performance of different process requests, and the RTL is proposed based on the diversity of the execution time ranges of the workflow;
the RTL level is divided into three levels, parameters a, b and t are defined, wherein a, b and t are integers, a is smaller than b, t represents the time required by the engine to process a process instance request and represents the length of a time slice, and the RTL level can be obtained by testing the service of the process engine, and then the RTL level is as follows:
RTL 0: the request of the process example is responded in 1 time slice, namely t;
RTL 1: the request responds at the latest (a +1) time slices, namely (a +1) t;
RLT 2: the request responds at the latest at (b +1) time slices, i.e., (b +1) t.
Further, the system rate-limiting algorithm adopts a sliding window algorithm to guarantee the tenant's RAR index; the RTL levels are implemented by means of request caching, in the following way:
the admission layer maintains b +1 queues for storing process instance requests, each queue corresponds to a delay duration variable representing the delay duration of the process instance requests in the queue, and the values of the delay duration variable are 0t, 1t, 2t, … and bt, respectively, wherein: t is a time slice defined in RTL, a queue with a delay time variable of 0t is an immediate execution queue, a queue with a delay time variable of 1t, 2t, …, bt is a delay queue; the admission layer also needs to update the delay queue and the historical load variable after 1 time slice, and the historical load variable is used for measuring the condition of the number of requests of each past time slice; the admission layer needs to put new process instance requests into corresponding delay queues according to the RTL level set by the tenant for the process tasks and the number of the process instance requests stored in each current queue.
Still further, step S103, specifically, determining an RTL level;
if the RTL level is RTL 0, directly executing the request balance dispatch of the scheduling layer;
If the RTL level is RTL 1, meaning the request may be delayed by at most a time slices, obtain the request counts of the current immediate execution queue and the first a delay queues, forming the set N = [n_0, n_1, …, n_a], where n_i denotes the number of requests in the i-th queue, i = 0, 1, …, a;
If the RTL level is RTL 2, meaning the request may be delayed by at most b time slices, obtain the request counts of the current immediate execution queue and all delay queues, forming the set N = [n_0, n_1, …, n_b], where n_j denotes the number of requests in the j-th queue, j = 0, 1, …, b.
Still further, in step S103, the score of the current request for each delay queue is calculated as follows:
In the formula, n_i ∈ N, and i denotes the position of the current delay queue in the set N.
Still further, the admission layer needs to update the delay queues and the historical load variables after every 1 time slice, including the following steps:
h1: subtracting 1 from the delay time variable of all the delay queues, judging whether the delay time variable is equal to 0, if so, adding all the requests in the queues to the immediate execution queue, and resetting the delay time to be bt;
h2: the immediate execution queue requires a continuously running thread to check whether the queue contains requests, to submit the requests in order to the scheduling layer for dispatch at the rate at which the engine services process requests, and to record the number of requests submitted to the process engines in each time slice;
h3: acquiring the request number requestSize submitted to a scheduling layer by an immediate execution queue for request dispatch in the past 1 time slice, and updating a historical load variable historySize according to the following formula:
historySize=α*historySize+(1-α)*requestSize
wherein: α is a weighting factor that controls how strongly the previous historySize value is attenuated.
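Steps h1 to h3 above can be sketched as the following per-time-slice update. The function and variable names are illustrative assumptions, and α and b are fixed here only for the example.

```python
ALPHA = 0.7   # weighting factor α (assumed value)
B = 3         # number of delay queues, the parameter b of the RTL definition

def tick(immediate, delays, delay_times, history_size, request_size):
    """One time-slice update: h1 rotates the delay queues, h3 smooths the
    historical load variable; request_size is the count recorded in h2."""
    for i in range(len(delays)):
        delay_times[i] -= 1                 # h1: one time slice has elapsed
        if delay_times[i] == 0:
            immediate.extend(delays[i])     # matured requests join the immediate queue
            delays[i].clear()
            delay_times[i] = B              # reset the delay duration to b*t
    # h3: historySize = α*historySize + (1-α)*requestSize
    return ALPHA * history_size + (1 - ALPHA) * request_size
```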
Still further, the engine busyness calculation formula is as follows:
busyness_i = w1*cpu_i + w2*ram_i, w1 + w2 = 1
wherein: the parameters w1 and w2 represent the relative importance of the two load parameters, CPU and RAM, and need to be configured according to the hardware resource characteristics.
In the scheduling layer's balanced request dispatch, the scheduling layer uses the load condition of the process service layer's engine services and the characteristics of the stateless workflow engine to assign the requests corresponding to the same process model to a small number of engines, thereby reducing the computation and memory consumed by repeatedly parsing the same process model and storing the results.
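Putting steps S202 to S204 and the busyness formula together, the grouping and dispatch decision can be sketched as follows. The weights, β, and all names are assumptions, and the inequality is reconstructed from the stated role of β as the cost of allocating the request to a new engine.

```python
W1, W2 = 0.6, 0.4   # importance of CPU vs RAM load (w1 + w2 = 1, assumed values)
BETA = 0.1          # cost parameter β for allocating the request to a new engine

def busyness(load):
    cpu, ram = load
    return W1 * cpu + W2 * ram

def dispatch(loads, distribution):
    """loads[i] = (cpu_i, ram_i); distribution[i] = d_i (1 if the process
    model already runs on engine i). Returns the index of the chosen engine."""
    e1 = [i for i, d in enumerate(distribution) if d == 1]  # model already deployed
    e2 = [i for i, d in enumerate(distribution) if d == 0]  # remaining engines
    if not e2:
        return min(e1, key=lambda i: busyness(loads[i]))
    if not e1:
        return min(e2, key=lambda i: busyness(loads[i]))
    best1 = min(e1, key=lambda i: busyness(loads[i]))       # least busy in E1
    best2 = min(e2, key=lambda i: busyness(loads[i]))       # least busy in E2
    # Open the model on a new engine only when that engine is more than
    # β less busy than the best already-deployed engine.
    if busyness(loads[best1]) > busyness(loads[best2]) + BETA:
        return best2    # the distribution set must then be updated to d = 1
    return best1
```

Keeping best1 unless best2 wins by more than β is what concentrates each process model's requests on a small number of engines.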
The invention has the following beneficial effects. Different SLA levels are selected according to each tenant's service scenarios and process models, and through these levels the cloud workflow system provides tenants with different request throughputs and tiered service for different process requests. Real-time engine load monitoring and tracking of the distribution of process models over the engines are realized by means of shared memory, so that the overall memory overhead of the engine cluster is reduced while the request peaks on the engine services are lowered. The load balancing capability of the cloud workflow under a multi-tenant architecture is thereby improved, and a process service provider can serve more tenants while still meeting different tenants' requirements on request throughput and on the parsing and execution performance of different process definitions.
Drawings
FIG. 1 is a cloud workflow core component diagram.
Detailed Description
The invention is described in detail below with reference to the drawings and the detailed description.
Example 1
As shown in FIG. 1, an SLA-based stateless cloud workflow load balancing scheduling method comprises the steps of admission-layer load waveform smoothing and scheduling-layer balanced request dispatch. Before introducing the two steps in detail, the cloud workflow service SLA defines quantitative indexes for the different request throughputs and for the processing performance of different process requests. Request throughput is measured by the service request arrival rate (RAR), the highest number of process instance requests a tenant may send per second. The processing performance of different process requests is measured by the request response time level (RTL), which is proposed based on the diversity of workflow execution times, ranging from several microseconds to several months.
In this embodiment, according to the tenants' needs for their service scenarios, the RAR index can be divided into three levels. Define v0, v1, v2, where v0, v1, v2 are integers with v0 > v1 > v2; the three levels are described as follows:
RAR 0 is for high-concurrency service scenarios: the service request arrival rate is at most v0;
RAR 1 is for ordinary-concurrency service scenarios: the service request arrival rate is at most v1;
RAR 2 is for low-concurrency service scenarios: the service request arrival rate is at most v2;
Different RAR levels correspond to different charges; levels with higher service request arrival rates are charged correspondingly more.
In this embodiment, according to the different real-time processing requirements of different process requests, the RTL can be divided into three levels. Before stating them, define parameters a, b, and t, where a, b, and t are integers and a is smaller than b; t represents the time required by an engine to process one process instance request and also the length of a time slice, and can be obtained by testing the engine services. The RTL levels are stated as follows:
RTL 0: for process instance requests with high real-time requirements, mostly from automated processes, the request responds within 1 time slice, i.e., t;
RTL 1: for a general flow instance request with real-time requirements, the request responds at (a +1) time slices at the latest, namely (a +1) t;
RTL 2: for process instance requests with low real-time requirements, the request responds at the latest at (b+1) time slices, i.e., (b+1)t.
When uploading a process model, the tenant can select different SLA indexes for different tasks in the model according to the model's actual usage, yielding different charges; the higher the SLA index, the higher the corresponding charge.
In the method for load balancing scheduling of a stateless cloud workflow based on an SLA according to this embodiment, when a process instance request corresponding to a tenant upload process model is received, a cloud workflow schedules the process instance request to a stateless workflow engine in a cluster, and the execution includes the following steps:
and smoothing the load waveform of the admission layer:
S101: the admission layer receives a tenant's process instance request, and obtains the tenant's RAR index and the RTL level of the process instance request from the tenant SLA warehouse according to the tenant ID and the process request information;
S102: judge whether the tenant's service request rate satisfies the RAR index according to the sliding window algorithm; if it exceeds the rate specified by the RAR index, directly filter the request, feed the result back to the tenant, and prompt the tenant to purchase a higher RAR level; otherwise, execute the next step;
S103: judge the RTL level. If the RTL level is RTL 0, directly execute the scheduling layer's balanced request dispatch; if the RTL level is RTL 1, meaning the request may be delayed by at most a time slices, obtain the request counts of the current immediate execution queue and the first a delay queues, forming the set N = [n_0, n_1, …, n_a], where n_i denotes the number of requests in the i-th queue, i = 0, 1, …, a; if the RTL level is RTL 2, meaning the request may be delayed by at most b time slices, obtain the request counts of the current immediate execution queue and all delay queues, forming the set N = [n_0, n_1, …, n_b], where n_j denotes the number of requests in the j-th queue, j = 0, 1, …, b;
S104: using the historical load variable historySize, calculate for each element of the set N the score of the current request for each delay queue as follows:
In the formula, n_i ∈ N, and i denotes the position of the current delay queue in the set N.
The request is then placed in the delay queue with the highest score calculated above.
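The patent gives the scoring formula of S104 only as a figure, so the sketch below substitutes a purely hypothetical score (history_size * i − n_i, preferring later and emptier queues) to show how S104 and the placement step fit together; the real formula from the figure would replace score().

```python
def place_request(history_size, queue_sizes):
    """queue_sizes = [n0, n1, ..., nk], where index 0 is the immediate
    execution queue and 1..k are delay queues (k = a for RTL 1, b for RTL 2).
    Returns the index of the delay queue chosen for the new request."""
    def score(i, n_i):
        # Hypothetical stand-in for the patent's figure-only formula:
        # favour deeper delay slots when recent load has been high, and
        # penalize queues that already hold many requests.
        return history_size * i - n_i

    return max(range(1, len(queue_sizes)),
               key=lambda i: score(i, queue_sizes[i]))
```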
The dispatch layer request balanced dispatch comprises the following steps:
S201: the scheduling layer receives a request from the immediate execution queue of the admission layer, and obtains from the shared memory the load information set E = [e_1, …, e_m] of the process engine services published by the process service layer, where e_i = (cpu_i, ram_i); cpu_i denotes the current CPU occupancy of process engine service e_i, and ram_i denotes its current RAM occupancy;
S202: the scheduling layer obtains from the process instance warehouse the distribution set D = [d_1, d_2, …, d_m] of the requested process model's instances over the process engine services, where d_i ∈ {0, 1}; d_i = 0 means the process model is not running on engine e_i, and d_i = 1 means it is;
S203: according to the distribution set D, divide the process engine services into two groups E1 and E2: E1 holds all engines on which the process model has already been executed (d_i = 1); E2 holds the remaining engines;
S204: for the elements of E1 and E2, the engine busyness is calculated by the following formula:
busyness_i = w1*cpu_i + w2*ram_i, w1 + w2 = 1
wherein: the parameters w1 and w2 represent the relative importance of the two load parameters, CPU and RAM, and need to be configured according to the hardware resource characteristics; the busyness formula yields the least busy engine service e1* of E1 and e2* of E2.
Judge whether the inequality busyness(e1*) > busyness(e2*) + β holds; if it holds, dispatch the request to e2* and modify the distribution set in the process instance warehouse; otherwise dispatch it to e1*;
wherein: beta is a cost parameter for allocating the process instance request to the new engine, and can be set according to the specific hardware resource characteristics;
S205: the request scheduling is complete.
In this embodiment's scheduling-layer balanced request dispatch, the scheduling layer uses the load condition of the process service layer's engine services and the characteristics of the stateless workflow engine to assign the requests corresponding to the same process model to a small number of engines, thereby reducing the computation and memory consumed by repeatedly parsing the same process model and storing the results.
In this embodiment, the system rate-limiting algorithm adopts a sliding window algorithm to guarantee the tenant's RAR index; the RTL levels are implemented by means of request caching, in the following way:
the admission layer maintains b +1 queues for storing process instance requests, each queue corresponds to a delay duration variable representing the delay duration of the process instance requests in the queue, and the values of the delay duration variable are 0t, 1t, 2t, … and bt, respectively, wherein: t is a time slice defined in RTL, a queue with a delay time variable of 0t is an immediate execution queue, a queue with a delay time variable of 1t, 2t, …, bt is a delay queue; the admission layer also needs to update the delay queue and the historical load variable after 1 time slice, and the historical load variable is used for measuring the condition of the number of requests of each past time slice; the admission layer needs to put new process instance requests into corresponding delay queues according to the RTL level set by the tenant for the process tasks and the number of the process instance requests stored in each current queue.
The admission layer needs to update the delay queues and the historical load variables after every 1 time slice, including the following steps:
h1: subtracting 1 from the delay time variable of all the delay queues, judging whether the delay time variable is equal to 0, if so, adding all the requests in the queues to the immediate execution queue, and resetting the delay time to be bt;
h2: the immediate execution queue requires a continuously running thread to check whether the queue contains requests, to submit the requests in order to the scheduling layer for dispatch at the rate at which the engine services process requests, and to record the number of requests submitted to the process engines in each time slice;
h3: acquiring the request number requestSize submitted to a scheduling layer by an immediate execution queue for request dispatch in the past 1 time slice, and updating a historical load variable historySize according to the following formula:
historySize=α*historySize+(1-α)*requestSize
wherein: α is a weighting factor that controls how quickly the previous historySize value decays.
The method is based on the different requirements of different tenants' service scenarios on request throughput and on the parsing and execution performance of different process definitions. It delays requests by exploiting the diversity of execution time ranges in workflows, in particular the fact that some tasks need no real-time response, thereby realizing tenant rate limiting and request load balancing optimization under the rate-limit constraint. Combined with the characteristics of the stateless workflow engine service, it implements a load balancing strategy that assigns each process model's requests to the minimum number of engines, and it uses shared memory to access the engine services' load information. The method optimizes the load balancing effect and execution performance of cloud workflow requests while guaranteeing the cloud tenants' service experience, so that the cloud workflow system can provide process parsing services for more tenants in a normal service state.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.
Claims (8)
1. A method for load balancing scheduling of stateless cloud workflows based on SLA is characterized in that: when a process instance request corresponding to a tenant uploading process model is received, the cloud workflow schedules the process instance request to a stateless workflow engine in a cluster, and the execution comprises the following steps:
and smoothing the load waveform of the admission layer:
s101: an admission layer receives a tenant process instance request, and acquires a service request arrival rate RAR index of a tenant and a request response time level RTL for the process instance request from a tenant SLA warehouse according to a tenant ID or process instance request information;
S102: judge whether the tenant's service request rate satisfies the RAR index according to the system rate-limiting algorithm; if the rate exceeds that specified by the RAR index, directly filter the request, feed the result back to the tenant, and prompt the tenant to purchase a higher RAR level; otherwise, execute the next step;
s103: judging RTL levels, executing the request balanced distribution of a scheduling layer according to different RTL levels, acquiring the request number of a current immediate execution queue and a delay queue, calculating the score of a current process instance request for each delay queue according to the request number of the delay queue by using a historical load variable historySize, and placing the process instance request in the delay queue with the highest score;
and the dispatching layer requests balanced dispatching:
S201: the scheduling layer receives a request from the immediate execution queue of the admission layer, and obtains from the shared memory the load information set E = [e_1, …, e_m] of the process engine services published by the process service layer, where e_i = (cpu_i, ram_i); cpu_i denotes the current CPU occupancy of process engine service e_i, and ram_i denotes its current RAM occupancy;
S202: the scheduling layer obtains from the process instance warehouse the distribution set D = [d_1, d_2, …, d_m] of the requested process model's instances over the process engine services, where d_i ∈ {0, 1}; d_i = 0 means the process model is not running on engine e_i, and d_i = 1 means it is;
S203: according to the distribution set D, divide the process engine services into two groups E1 and E2: E1 holds all engines on which the process model has already been executed (d_i = 1); E2 holds the remaining engines;
S204: perform the engine busyness calculation on the elements of E1 and E2 to obtain the least busy engine service e1* of E1 and e2* of E2; judge whether the inequality busyness(e1*) > busyness(e2*) + β holds; if it holds, dispatch the process instance request to e2* and modify the distribution set in the process instance warehouse; otherwise dispatch it to e1*; the scheduling of the process instance request is then complete;
where β is a cost parameter for allocating the process instance request to the new engine, and may be set according to specific hardware resource characteristics.
2. The SLA-based stateless cloud workflow load balancing scheduling method of claim 1, wherein: step S101, the service request arrival rate RAR is used for measuring the throughput of the process instance request and representing the highest process instance request number which can be sent by the tenant per second;
the RAR index is divided into three stages, and v is defined0,v1,v2Wherein v is0、v1、v2Is an integer and has v0>v1>v2Then the three levels are described as follows:
RAR 0 means that the service request arrival rate is at most equal to v0;
RAR 1 means that the service request arrival rate is at most equal to v1;
RAR 2 means that the service request arrival rate is at most equal to v2;
Different levels of RAR correspond to different charging.
3. The SLA-based stateless cloud workflow load balancing scheduling method of claim 1, wherein: step S101, the request response time level RTL is used for measuring the processing performance of different process requests, and the RTL is proposed based on the diversity of the execution time range of the workflow;
the RTL level is divided into three levels, and parameters a, b and t are defined, wherein a, b and t are integers, a is smaller than b, t represents the time required by the engine to process a flow instance request and represents the length of a time slice, and the RTL level is as follows:
RTL 0: the request of the process example is responded in 1 time slice, namely t;
RTL 1: the process instance request responds at the latest (a +1) time slices, i.e., (a +1) t;
RLT 2: the process instance request responds at the latest (b +1) time slices, i.e., (b +1) t.
4. The SLA-based stateless cloud workflow load balancing scheduling method of claim 3, wherein: a sliding-window algorithm is adopted as the system rate-limiting algorithm to guarantee the tenant's RAR index; the RTL level is implemented by means of a request cache, specifically as follows:

the admission layer maintains b+1 queues for storing process instance requests; each queue corresponds to a delay duration variable representing how long the process instance requests in that queue are delayed, with values 0t, 1t, 2t, …, bt, where t is the time slice defined in the RTL; the queue whose delay duration variable is 0t is the immediate execution queue, and the queues with delay durations 1t, 2t, …, bt are delay queues; after every time slice the admission layer must also update the delay queues and a historical load variable, which measures the number of requests seen in each past time slice; the admission layer places each new process instance request into the corresponding delay queue according to the RTL level the tenant has set for the process task and the number of process instance requests currently stored in each queue.
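The queue structure of the admission layer can be sketched as follows (a minimal model of the b+1 queues; the class layout and method names are ours, and the reset-to-bt behavior follows claim 7):

```python
from collections import deque

class AdmissionLayer:
    """One immediate-execution queue plus b delay queues, each tagged
    with its remaining delay in time slices. Illustrative sketch."""

    def __init__(self, b):
        self.b = b
        self.immediate = deque()                       # delay 0t
        # delayed[k] holds requests that wait k more time slices
        self.delayed = {k: deque() for k in range(1, b + 1)}

    def enqueue(self, request, delay_slices):
        """Place a request in the queue matching its chosen delay."""
        if delay_slices == 0:
            self.immediate.append(request)
        else:
            self.delayed[delay_slices].append(request)

    def tick(self):
        """Advance one time slice: the queue reaching delay 0 empties
        into the immediate-execution queue; the freed queue is reused
        as the new longest-delay (bt) queue."""
        self.immediate.extend(self.delayed.pop(1))
        # every remaining queue moves one slice closer to execution
        self.delayed = {k - 1: q for k, q in self.delayed.items()}
        self.delayed[self.b] = deque()

al = AdmissionLayer(b=3)
al.enqueue("r1", 0)   # RTL 0: execute immediately
al.enqueue("r2", 2)   # delayed by 2 time slices
al.tick()
al.tick()
print(list(al.immediate))  # → ['r1', 'r2']
```

Reusing the expired queue as the new bt queue keeps the update O(b) per time slice without allocating fresh queues.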
5. The SLA-based stateless cloud workflow load balancing scheduling method of claim 4, wherein step S103 specifically judges the RTL level;

if the RTL level is RTL 0, the request is passed directly to the balanced dispatch of the scheduling layer;

if the RTL level is RTL 1, meaning the delay duration is at most a time slices, the request counts of the current immediate execution queue and the first a delay queues are obtained as the set N = [n0, n1, …, na], where ni is the number of requests in the i-th queue, i = 0, 1, …, a;

if the RTL level is RTL 2, meaning the delay duration is at most b time slices, the request counts of the current immediate execution queue and all delay queues are obtained as the set N = [n0, n1, …, nb], where nj is the number of requests in the j-th queue, j = 0, 1, …, b.
6. The SLA-based stateless cloud workflow load balancing scheduling method of claim 5, wherein: in step S103, the score of the current request for each delay queue is calculated by the following formula:

in the formula, ni ∈ N, and i denotes the position of the current delay queue in the set N.
7. The SLA-based stateless cloud workflow load balancing scheduling method of claim 6, wherein the admission layer updates the delay queues and the historical load variable after every time slice, in the following steps:

H1: subtract 1 from the delay duration variable of every delay queue and check whether it equals 0; if so, append all requests of that delay queue to the immediate execution queue and reset its delay duration to bt;

H2: the immediate execution queue keeps a thread running to check whether the queue holds requests, submits them in order to the scheduling layer for dispatch at the rate at which the engine service processes requests, and records the number of requests submitted to the process engine in each time slice;

H3: obtain the number of requests, requestSize, submitted by the immediate execution queue to the scheduling layer for dispatch during the past time slice, and update the historical load variable historySize by the following formula:

historySize = α*historySize + (1-α)*requestSize

where: α is a weighting factor expressing the degree to which the previous second's historySize value is decayed.
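The update in step H3 is an exponentially weighted moving average of the per-slice dispatch count; a direct transcription (the value 0.7 for α is an illustrative choice, not from the patent):

```python
def update_history(history_size, request_size, alpha=0.7):
    """historySize = α*historySize + (1-α)*requestSize (claim 7, H3).
    Larger α weights past load more heavily; smaller α tracks the
    most recent time slice more closely."""
    return alpha * history_size + (1 - alpha) * request_size

h = 0.0
for dispatched in [10, 10, 4]:   # requests dispatched in each past time slice
    h = update_history(h, dispatched)
print(round(h, 3))  # → 4.77
```

Because old slices decay geometrically, a single burst raises historySize only briefly, which makes the variable a smoothed load signal rather than a raw counter.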
8. The SLA-based stateless cloud workflow load balancing scheduling method of claim 7, wherein the engine busyness is calculated by the following formula:

busyness_i = w1*cpu_i + w2*ram_i, with w1 + w2 = 1

where: the weights w1 and w2 express the relative importance of the two load parameters, CPU and RAM, respectively, and must be configured according to the characteristics of the hardware resources.
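The busyness formula of claim 8, combined with the β-weighted choice of step S204, can be sketched as follows. The weights 0.6/0.4 are illustrative, and since the claim text does not reproduce the exact comparison inequality, the `busyness(e1) <= busyness(e2) + beta` test below is an assumption:

```python
def busyness(cpu, ram, w1=0.6, w2=0.4):
    """busyness_i = w1*cpu_i + w2*ram_i with w1 + w2 = 1 (claim 8)."""
    assert abs(w1 + w2 - 1.0) < 1e-9
    return w1 * cpu + w2 * ram

def pick_engine(e1_candidates, e2_candidates, beta):
    """Sketch of step S204: take the least busy engine of each set and
    prefer the E1 engine unless the E2 engine beats it by more than the
    cost margin `beta` (hypothetical inequality)."""
    e1 = min(e1_candidates, key=lambda e: busyness(*e["load"]))
    e2 = min(e2_candidates, key=lambda e: busyness(*e["load"]))
    if busyness(*e1["load"]) <= busyness(*e2["load"]) + beta:
        return e1
    return e2

engines1 = [{"name": "eng-a", "load": (0.9, 0.8)},
            {"name": "eng-b", "load": (0.5, 0.4)}]  # load = (cpu, ram) in [0, 1]
engines2 = [{"name": "eng-c", "load": (0.2, 0.1)}]
print(pick_engine(engines1, engines2, beta=0.1)["name"])  # → eng-c
```

With a larger β (say 0.5), the same inputs would keep the request on eng-b: the cost of moving to a new engine outweighs its lower busyness.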
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910028641.0A CN109861850B (en) | 2019-01-11 | 2019-01-11 | SLA-based stateless cloud workflow load balancing scheduling method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109861850A CN109861850A (en) | 2019-06-07 |
CN109861850B true CN109861850B (en) | 2021-04-02 |
Family
ID=66894503
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910028641.0A Active CN109861850B (en) | 2019-01-11 | 2019-01-11 | SLA-based stateless cloud workflow load balancing scheduling method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109861850B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110659463B (en) * | 2019-08-23 | 2021-11-12 | 苏州浪潮智能科技有限公司 | Distributed operation method and device of stateless system |
CN110737485A (en) * | 2019-09-29 | 2020-01-31 | 武汉海昌信息技术有限公司 | workflow configuration system and method based on cloud architecture |
US11042415B2 (en) * | 2019-11-18 | 2021-06-22 | International Business Machines Corporation | Multi-tenant extract transform load resource sharing |
CN110941681B (en) * | 2019-12-11 | 2021-02-23 | 南方电网数字电网研究院有限公司 | Multi-tenant data processing system, method and device of power system |
US11841871B2 (en) | 2021-06-29 | 2023-12-12 | International Business Machines Corporation | Managing extract, transform and load systems |
CN115237573B (en) * | 2022-08-05 | 2023-08-18 | 中国铁塔股份有限公司 | Data processing method, device, electronic equipment and readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101105843A (en) * | 2006-07-14 | 2008-01-16 | 上海移动通信有限责任公司 | Electronic complaint processing system and method in communication field |
CN101694709A (en) * | 2009-09-27 | 2010-04-14 | 华中科技大学 | Service-oriented distributed work flow management system |
CN104778076A (en) * | 2015-04-27 | 2015-07-15 | 东南大学 | Scheduling method for cloud service workflow |
CN106095569A (en) * | 2016-06-01 | 2016-11-09 | 中山大学 | A kind of cloud workflow engine scheduling of resource based on SLA and control method |
CN108665157A (en) * | 2018-05-02 | 2018-10-16 | 中山大学 | A method of realizing cloud Workflow system flow instance balance dispatching |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8984503B2 (en) * | 2009-12-31 | 2015-03-17 | International Business Machines Corporation | Porting virtual images between platforms |
US8756251B2 (en) * | 2011-07-19 | 2014-06-17 | HCL America Inc. | Method and apparatus for SLA profiling in process model implementation |
Non-Patent Citations (2)
Title |
---|
An Improved Adaptive Workflow Scheduling Algorithm in Cloud Environments;Yinjuan Zhang et al;《IEEE》;20160321;full text *
Coevolutionary multi-population optimization for multi-objective cloud workflow scheduling;Liu Yuxiao et al;《Computer Engineering and Design》;20180516;full text *
Also Published As
Publication number | Publication date |
---|---|
CN109861850A (en) | 2019-06-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109861850B (en) | SLA-based stateless cloud workflow load balancing scheduling method | |
Salot | A survey of various scheduling algorithm in cloud computing environment | |
CN111427679B (en) | Computing task scheduling method, system and device for edge computing | |
US10474504B2 (en) | Distributed node intra-group task scheduling method and system | |
US8701121B2 (en) | Method and system for reactive scheduling | |
US9218213B2 (en) | Dynamic placement of heterogeneous workloads | |
US20130007753A1 (en) | Elastic scaling for cloud-hosted batch applications | |
US11102143B2 (en) | System and method for optimizing resource utilization in a clustered or cloud environment | |
Ashouraei et al. | A new SLA-aware load balancing method in the cloud using an improved parallel task scheduling algorithm | |
CN109857535B (en) | Spark JDBC-oriented task priority control implementation method and device | |
CN108428051B (en) | MapReduce job scheduling method and device facing big data platform and based on maximized benefits | |
CN106095581B (en) | Network storage virtualization scheduling method under private cloud condition | |
CN104239154A (en) | Job scheduling method in Hadoop cluster and job scheduler | |
CN110196773B (en) | Multi-time-scale security check system and method for unified scheduling computing resources | |
CN117573373B (en) | CPU virtualization scheduling method and system based on cloud computing | |
US20220405133A1 (en) | Dynamic renewable runtime resource management | |
Singh et al. | A comparative study of various scheduling algorithms in cloud computing | |
CN116708451B (en) | Edge cloud cooperative scheduling method and system | |
CN109783236A (en) | Method and apparatus for output information | |
CN116820729A (en) | Offline task scheduling method and device and electronic equipment | |
CN117112199A (en) | Multi-tenant resource scheduling method, device and storage medium | |
Yakubu et al. | Priority based delay time scheduling for quality of service in cloud computing networks | |
Naik | A deadline-based elastic approach for balanced task scheduling in computing cloud environment | |
CN114035940A (en) | Resource allocation method and device | |
CN112988363A (en) | Resource scheduling method, device, server and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||