CN109861850B - SLA-based stateless cloud workflow load balancing scheduling method - Google Patents


Info

Publication number: CN109861850B (application CN201910028641.0A)
Authority: CN (China)
Prior art keywords: request, queue, delay, service, rtl
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201910028641.0A
Other languages: Chinese (zh)
Other versions: CN109861850A (en)
Inventors: 余阳, 黄钦开
Current Assignee: Sun Yat Sen University (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Sun Yat Sen University
Application filed by Sun Yat Sen University
Priority to CN201910028641.0A
Publication of CN109861850A
Application granted
Publication of CN109861850B
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses an SLA-based stateless cloud workflow load balancing scheduling method. Tenants select different SLA levels according to their service scenarios and process models, and through these levels the system provides differentiated request throughput and tiered service for different process requests. By combining shared memory with the distribution of process models across engines, the method monitors engine load in real time, flattens the request peaks of the engine services, and reduces the overall memory overhead of the engine cluster. This improves the load balancing capability of a cloud workflow under a multi-tenant architecture, so that a process service provider can serve more tenants while still meeting each tenant's requirements on request throughput and on the parsing and execution performance of different process definitions.

Description

SLA-based stateless cloud workflow load balancing scheduling method
Technical Field
The invention relates to the technical field of workflow and cloud computing, in particular to a stateless cloud workflow load balancing scheduling method based on SLA.
Background
With the development of distributed computing, and of grid technology in particular, cloud computing has emerged as a new model of service computing. Cloud computing is a mode of delivering and using resources: the resources an application needs, including hardware, platforms, and software, are obtained through a network, and the network providing them is called a "cloud". In cloud computing, everything is a service, and services are generally divided into three levels: infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS).
Cloud workflow is a PaaS-level service: a distributed system that provides workflow services under the platform-as-a-service cloud computing model. Compared with a traditional workflow system, its main advantages are twofold. First, cloud workflow offers an on-demand, pay-per-use model, which effectively reduces the cost for enterprises of adopting workflow management software and lowers the barrier to entry. Second, cloud workflow achieves high resource utilization and high service performance: centralized management can fully exploit the available computing power, and flexible resource configuration can cope with the request loads of different time periods.
A traditional workflow engine is usually implemented as a stateful service. In a cloud environment, a workflow engine built on a stateless scheme better exploits the elasticity of cloud resources and improves the reliability of the cloud workflow system, and therefore better fits the needs of cloud workflow. For a cloud workflow system built on stateless workflow engines, two factors matter. On one hand, owing to the nature of workflow services, process models must be parsed and the parsing results stored, which occupies computing and storage resources. On the other hand, in a cloud environment the cloud workflow must support multi-tenant business process execution, so its load scenario is far more complex than that of a traditional workflow engine. When such a system faces request loads from multiple tenants, multiple process models, and multiple process instances, scheduling that considers only the statelessness of the service cannot fully exploit the characteristics of workflow services, and therefore cannot achieve the best request load balancing effect and user experience.
At present, the system architecture and management structure of existing workflow cluster systems usually serve only a single user or a single organization. Under a cloud service business model, a process service provider wants to provide process parsing services to more tenants with the same hardware resources. Different tenants often have different request throughput requirements for the engine service depending on their business scenarios, and the same tenant often has different parsing and execution performance requirements for different process definitions. The tenant and the process service provider therefore sign an SLA contract, and the system provides the corresponding service level to the tenant according to that contract.
Disclosure of Invention
To address the differing service level requirements of tenants under cloud workflow and the process service provider's need to serve more tenants with the same hardware resources, the invention provides an SLA (service level agreement) based stateless cloud workflow load balancing scheduling method. It optimizes the load balancing effect and execution performance of cloud workflow requests while guaranteeing the cloud tenants' service experience, so that the cloud workflow system can provide process parsing services to more tenants in a normal service state.
In order to achieve the purpose of the invention, the technical scheme is as follows: an SLA-based stateless cloud workflow load balancing scheduling method, wherein when a process instance request corresponding to a process model uploaded by a tenant is received, the cloud workflow schedules the request to a stateless workflow engine in the cluster. Execution comprises the following steps:
Admission layer load waveform smoothing:
S101: the admission layer receives a tenant process instance request and, according to the tenant ID or the process instance request information, obtains from the tenant SLA warehouse the tenant's service request arrival rate (RAR) index and the request response time level (RTL) of the process instance request;
S102: judging, by the system current-limiting algorithm, whether the tenant's service request rate meets the RAR index; if it exceeds the service request rate specified by the RAR index, the request is filtered directly and fed back to the tenant with a prompt to purchase a higher RAR level; otherwise the next step is executed;
S103: judging the RTL level and executing the scheduling layer's balanced request dispatch according to it: the request counts of the current immediate execution queue and the delay queues are obtained, a score is calculated for the current process instance request against each delay queue from the delay queues' request counts and the historical load variable historySize, and the request is placed in the delay queue with the highest score;
Scheduling layer balanced request dispatch:
S201: the scheduling layer receives a request from the immediate execution queue of the admission layer and obtains from the shared memory the load information set E = [e1, …, em] of the process engine services published by the process service layer, where ei = (cpui, rami), cpui denotes the current cpu occupancy of process engine service ei, and rami denotes its current ram occupancy;
S202: the scheduling layer obtains from the process instance warehouse the distribution condition set D = [d1, d2, …, dm], di ∈ {0, 1}, of the process instances of the requested process model across the process engine services; di = 0 means the process model has not run on engine ei, and di = 1 means it has;
S203: the process engine services are divided into two groups E1 and E2 according to the distribution condition set D: E1 stores all engines on which the process model has already been executed (di = 1), and E2 stores the remaining engines;
S204: the engine busyness calculation is performed on the elements of E1 and E2 to obtain the least busy engine service of each group, e1* and e2* respectively, and the inequality

busyness(e1*) − busyness(e2*) > β

is judged; if it does not hold, the process instance request is dispatched to e1*, otherwise it is dispatched to e2*; the distribution condition set in the process instance warehouse is modified, completing the request scheduling of the process instance;
where β is a cost parameter for allocating the process instance request to a new engine, and may be set according to the specific hardware resource characteristics.
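The S201 to S204 selection rule can be sketched as below. This is a minimal illustration, not the patent's implementation: the engine names, the weight values w1 = w2 = 0.5, and the β value are assumptions, and the judged inequality busyness(e1*) − busyness(e2*) > β is reconstructed from the surrounding text (the original renders it only as an image).

```python
# Sketch of scheduling-layer engine selection (S201-S204).
# Engine names, w1/w2 weights, and beta values are illustrative assumptions.

def busyness(cpu: float, ram: float, w1: float = 0.5, w2: float = 0.5) -> float:
    # Engine busyness: weighted sum of cpu and ram occupancy, with w1 + w2 = 1.
    return w1 * cpu + w2 * ram

def select_engine(load, distribution, beta):
    """load: {engine: (cpu, ram)}; distribution: {engine: 0 or 1} for the
    requested model; beta: cost of parsing the model on a fresh engine."""
    # S203: split engines into E1 (model already deployed) and E2 (the rest).
    e1 = [e for e, d in distribution.items() if d == 1]
    e2 = [e for e, d in distribution.items() if d == 0]

    def least_busy(group):
        return min(group, key=lambda e: busyness(*load[e])) if group else None

    b1, b2 = least_busy(e1), least_busy(e2)
    if b1 is None:
        return b2
    if b2 is None:
        return b1
    # S204: leave the "warm" engine only when it is busier by more than beta.
    if busyness(*load[b1]) - busyness(*load[b2]) > beta:
        return b2
    return b1

load = {"e1": (0.9, 0.8), "e2": (0.2, 0.3), "e3": (0.5, 0.4)}
dist = {"e1": 1, "e2": 0, "e3": 1}
print(select_engine(load, dist, beta=0.2))  # e3: least busy warm engine wins
```

The sketch keeps the intent of β: an engine that has already parsed the model is preferred unless it is busier than the best fresh engine by more than the parsing cost.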
Preferably, in step S101, the service request arrival rate RAR measures the throughput of process instance requests and represents the highest number of process instance requests the tenant may send per second;
the RAR index is divided into three levels. Define v0, v1, v2, where v0, v1, v2 are integers and v0 > v1 > v2. The three levels are described as follows:
RAR 0 means the service request arrival rate is at most v0;
RAR 1 means the service request arrival rate is at most v1;
RAR 2 means the service request arrival rate is at most v2.
Different RAR levels correspond to different charges.
Preferably, in step S101, the request response time level RTL measures the processing performance of different process requests; RTL is proposed based on the diversity of workflow execution time ranges;
the RTL level is divided into three levels. Define parameters a, b, and t, where a and b are integers with a < b, and t is the time the engine needs to process one process instance request (the length of one time slice), which can be obtained by testing the process engine service. The levels are:
RTL 0: the process instance request is responded to within 1 time slice, i.e., t;
RTL 1: the request is responded to within (a+1) time slices at the latest, i.e., (a+1)t;
RTL 2: the request is responded to within (b+1) time slices at the latest, i.e., (b+1)t.
Further, the system current-limiting algorithm adopts a sliding window algorithm to guarantee the tenant's RAR index; the RTL level is implemented with a request cache, as follows:
the admission layer maintains b+1 queues for storing process instance requests. Each queue corresponds to a delay duration variable representing how long process instance requests stay in that queue, with values 0t, 1t, 2t, …, bt, where t is the time slice defined in RTL. The queue whose delay duration variable is 0t is the immediate execution queue; the queues with 1t, 2t, …, bt are delay queues. The admission layer must also update the delay queues and the historical load variable after every time slice; the historical load variable measures the request count of each past time slice. The admission layer places each new process instance request into the corresponding delay queue according to the RTL level the tenant set for the process task and the number of process instance requests currently stored in each queue.
Still further, step S103 specifically determines the RTL level:
if the RTL level is RTL 0, the scheduling layer's balanced request dispatch is executed directly;
if the RTL level is RTL 1, meaning the delay is at most a time slices, the request counts of the current immediate execution queue and the first a delay queues are obtained as the set N = [n0, n1, …, na], where ni is the request count of the i-th queue, i = 0, 1, …, a;
if the RTL level is RTL 2, meaning the delay is at most b time slices, the request counts of the current immediate execution queue and all delay queues are obtained as the set N = [n0, n1, …, nb], where nj is the request count of the j-th queue, j = 0, 1, …, b.
Still further, in step S103, a score is calculated for the current request against each delay queue as a function of ni ∈ N (the queue's request count), the queue's position i in the set N, and the historical load variable historySize (the score formula is given in the original only as an image and is not reproduced here).
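The placement step can be sketched as below. Note the hedge: the patent's score formula survives only as an image, so the score function here is a hypothetical stand-in (spare capacity historySize − ni, discounted by queue depth); only the overall shape, computing a score per delay queue and picking the highest, is taken from the text.

```python
def place_request(counts, history_size):
    """counts[i] = request count of the i-th queue in N (index 0 is the
    immediate execution queue). Returns the index of the delay queue with
    the highest score. The score function is a hypothetical stand-in; the
    patent gives its formula only as an image."""
    def score(i, n):
        # Stand-in: spare capacity vs. historical load, discounted by delay.
        return (history_size - n) / (i + 1)
    # Delay queues start at index 1; pick the highest-scoring one.
    return max(range(1, len(counts)), key=lambda i: score(i, counts[i]))

print(place_request([5, 4, 1, 3], history_size=6))  # queue 2 scores highest
```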
Still further, the admission layer updates the delay queues and the historical load variable after every time slice, in the following steps:
H1: subtract 1 time slice from the delay duration variable of every delay queue; if a queue's variable reaches 0, append all of its requests to the immediate execution queue and reset its delay duration to bt;
H2: the immediate execution queue keeps a thread running to check for pending requests, submits them in order to the scheduling layer for dispatch at the rate at which the engine services process requests, and records the number of requests submitted to the process engines in each time slice;
H3: obtain requestSize, the number of requests the immediate execution queue submitted to the scheduling layer for dispatch during the past time slice, and update the historical load variable historySize by:
historySize = α * historySize + (1 − α) * requestSize
where α is a weighting factor that controls how quickly earlier historySize values decay.
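Steps H1 and H3 can be sketched as a per-time-slice update. The queue representation and the α value are illustrative, and H2 (the dispatch thread) is represented only by the `submitted` count it records:

```python
from collections import deque

def tick(immediate, delays, history_size, submitted, alpha=0.7):
    """One time-slice update (H1 and H3). delays[k] holds requests whose
    remaining delay is k+1 slices; `submitted` is the number of requests
    the immediate queue dispatched during the past slice (recorded by H2)."""
    # H1: every delay queue moves one slice closer to execution; the queue
    # reaching delay 0 flushes into the immediate execution queue, and an
    # empty queue with the maximum delay b*t takes its place at the back.
    immediate.extend(delays.pop(0))
    delays.append(deque())
    # H3: exponentially weighted update of the historical load variable.
    return alpha * history_size + (1 - alpha) * submitted

immediate = deque()
delays = [deque(["r1"]), deque(["r2", "r3"])]
hs = tick(immediate, delays, history_size=10.0, submitted=4)
print(list(immediate), round(hs, 2))  # ['r1'] 8.2
```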
Still further, the engine busyness is calculated as follows:
busynessi = w1 * cpui + w2 * rami,  w1 + w2 = 1
where w1 and w2 represent the relative importance of the two load parameters, cpu and ram, and need to be configured according to the hardware resource characteristics.
In the scheduling layer's balanced request dispatch, the scheduling layer uses the load condition of the process engine services of the process service layer and the characteristics of the stateless workflow engine to allocate the requests of the same process model to a small number of engines, reducing the computation and memory consumption caused by repeatedly parsing the same process model and storing the results.
The invention has the following beneficial effects. Tenants select different SLA levels according to their service scenarios and process models, and through these SLA levels the cloud workflow system provides different request throughput services to tenants and tiered service to different process requests. Real-time engine load monitoring, combined through shared memory with the distribution of process models across engines, reduces the request peaks of the engine services while also reducing the overall memory overhead of the engine cluster. This improves the load balancing capability of the cloud workflow under a multi-tenant architecture and lets a process service provider serve more tenants while meeting each tenant's requirements on request throughput and on the parsing and execution performance of different process definitions.
Drawings
FIG. 1 is a cloud workflow core component diagram.
Detailed Description
The invention is described in detail below with reference to the drawings and the detailed description.
Example 1
As shown in fig. 1, an SLA-based stateless cloud workflow load balancing scheduling method includes the steps of admission layer load waveform smoothing and scheduling layer balanced request dispatch. Before introducing the two steps in detail, the cloud workflow service SLA defines quantitative indexes for request throughput and for the processing performance of different process requests. Request throughput is measured by the service request arrival rate RAR, the highest number of process instance requests a tenant may send per second. The processing performance of different process requests is measured by the request response time level RTL, which is proposed based on the diversity of workflow execution time ranges; execution times vary from a few microseconds to several months.
In this embodiment, according to the tenants' needs for their service scenarios, the RAR index is divided into three levels. Define v0, v1, v2, where v0, v1, v2 are integers and v0 > v1 > v2. The three levels are described as follows:
RAR 0, for high-concurrency service scenarios: the service request arrival rate is at most v0;
RAR 1, for ordinary-concurrency service scenarios: the service request arrival rate is at most v1;
RAR 2, for low-concurrency service scenarios: the service request arrival rate is at most v2.
Different RAR levels correspond to different charges; a level with a higher service request arrival rate carries a correspondingly higher charge.
In this embodiment, according to the differing real-time processing requirements of different process requests, the RTL level is divided into three levels. Before the detailed statement, define parameters a, b, and t, where a and b are integers with a < b, and t is the time the engine needs to process one process instance request (the length of one time slice), which can be obtained by testing the engine service. The levels are stated as follows:
RTL 0: for process instance requests with higher real-time requirements, the request is responded to within 1 time slice, i.e., t; these are mostly automated process tasks;
RTL 1: for process instance requests with ordinary real-time requirements, the request is responded to within (a+1) time slices at the latest, i.e., (a+1)t;
RTL 2: for process instance requests with lower real-time requirements, the request is responded to within (b+1) time slices at the latest, i.e., (b+1)t.
When uploading a process model, the tenant can select different SLA indexes for different tasks in the model according to the model's actual usage, yielding different charges; the higher the SLA index, the higher the corresponding charge.
In the SLA-based stateless cloud workflow load balancing scheduling method of this embodiment, when a process instance request corresponding to a process model uploaded by a tenant is received, the cloud workflow schedules the request to a stateless workflow engine in the cluster. Execution includes the following steps:
Admission layer load waveform smoothing:
S101: the admission layer receives a tenant process instance request and obtains the tenant's RAR index and the RTL level of the process instance request from the tenant SLA warehouse according to the tenant ID and the process request information;
S102: judging, by a time window algorithm, whether the tenant's service request rate meets the RAR index; if it exceeds the service request rate specified by the RAR index, the request is filtered directly and fed back to the tenant with a prompt to purchase a higher RAR level; otherwise the next step is executed;
S103: judging the RTL level; if the RTL level is RTL 0, the scheduling layer's balanced request dispatch is executed directly; if the RTL level is RTL 1, meaning at most a time slices may be delayed, the request counts of the current immediate execution queue and the first a delay queues are obtained as the set N = [n0, n1, …, na], where ni is the request count of the i-th queue, i = 0, 1, …, a; if the RTL level is RTL 2, meaning at most b time slices may be delayed, the request counts of the current immediate execution queue and all delay queues are obtained as the set N = [n0, n1, …, nb], where nj is the request count of the j-th queue, j = 0, 1, …, b;
S104: using the historical load variable historySize, a score is calculated for the current request against each delay queue from the queue's request count ni ∈ N, its position i in the set N, and historySize (the score formula is given in the original only as an image and is not reproduced here);
the request is then placed in the delay queue with the highest score calculated above.
The scheduling layer's balanced request dispatch comprises the following steps:
S201: the scheduling layer receives a request from the immediate execution queue of the admission layer and obtains from the shared memory the load information set E = [e1, …, em] of the process engine services published by the process service layer, where ei = (cpui, rami), cpui denotes the current cpu occupancy of process engine service ei, and rami denotes its current ram occupancy;
S202: the scheduling layer obtains from the process instance warehouse the distribution condition set D = [d1, d2, …, dm], di ∈ {0, 1}, of the process instances of the requested process model across the process engine services; di = 0 means the process model has not run on engine ei, and di = 1 means it has;
S203: the process engine services are divided into two groups E1 and E2 according to the distribution condition set D: E1 stores all engines on which the process model has already been executed (di = 1), and E2 stores the remaining engines;
S204: the engine busyness of each element of E1 and E2 is calculated by the following formula:

busynessi = w1 * cpui + w2 * rami,  w1 + w2 = 1

where w1 and w2 represent the relative importance of the two load parameters, cpu and ram, and need to be configured according to the hardware resource characteristics. Through the busyness formula, the least busy engine service of each group is obtained: e1* for E1 and e2* for E2. Then whether the inequality

busyness(e1*) − busyness(e2*) > β

holds is judged; if the inequality holds, the request is dispatched to e2*, otherwise it is dispatched to e1*, and the distribution condition set in the process instance warehouse is modified;
where β is a cost parameter for allocating the process instance request to a new engine, and may be set according to the specific hardware resource characteristics;
S205: request scheduling is complete.
In this embodiment's balanced request dispatch, the scheduling layer uses the load condition of the process engine services of the process service layer and the characteristics of the stateless workflow engine to allocate the requests of the same process model to a small number of engines, reducing the computation and memory consumption caused by repeatedly parsing the same process model and storing the results.
In the system current-limiting algorithm of this embodiment, a sliding window algorithm is adopted to guarantee the tenant's RAR index; the RTL level is implemented with a request cache, as follows:
the admission layer maintains b+1 queues for storing process instance requests. Each queue corresponds to a delay duration variable representing how long process instance requests stay in that queue, with values 0t, 1t, 2t, …, bt, where t is the time slice defined in RTL. The queue whose delay duration variable is 0t is the immediate execution queue; the queues with 1t, 2t, …, bt are delay queues. The admission layer must also update the delay queues and the historical load variable after every time slice; the historical load variable measures the request count of each past time slice. The admission layer places each new process instance request into the corresponding delay queue according to the RTL level the tenant set for the process task and the number of process instance requests currently stored in each queue.
The admission layer updates the delay queues and the historical load variable after every time slice, in the following steps:
H1: subtract 1 time slice from the delay duration variable of every delay queue; if a queue's variable reaches 0, append all of its requests to the immediate execution queue and reset its delay duration to bt;
H2: the immediate execution queue keeps a thread running to check for pending requests, submits them in order to the scheduling layer for dispatch at the rate at which the engine services process requests, and records the number of requests submitted to the process engines in each time slice;
H3: obtain requestSize, the number of requests the immediate execution queue submitted to the scheduling layer for dispatch during the past time slice, and update the historical load variable historySize by:
historySize = α * historySize + (1 − α) * requestSize
where α is a weighting factor that controls how quickly earlier historySize values decay.
The method is based on the different request throughput requirements of different tenants' service scenarios and the different parsing and execution performance requirements of different process definitions. It delays requests by exploiting the diversity of execution time ranges in workflows, in particular the fact that some tasks need no real-time response, to realize tenant rate limiting and request load balancing optimization under that rate-limiting constraint. Combined with the characteristics of stateless workflow engine services, it implements a load balancing strategy that dispatches each process model's requests to a minimal number of engines, and it uses shared memory for access to the engine services' load information. The method optimizes the load balancing effect and execution performance of cloud workflow requests while guaranteeing the cloud tenants' service experience, so that the cloud workflow system can provide process parsing services to more tenants in a normal service state.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (8)

1. A method for load balancing scheduling of stateless cloud workflows based on SLA is characterized in that: when a process instance request corresponding to a tenant uploading process model is received, the cloud workflow schedules the process instance request to a stateless workflow engine in a cluster, and the execution comprises the following steps:
and admission layer load waveform smoothing:
s101: an admission layer receives a tenant process instance request, and acquires a service request arrival rate RAR index of a tenant and a request response time level RTL for the process instance request from a tenant SLA warehouse according to a tenant ID or process instance request information;
s102: judging whether the tenant service request rate meets an RAR index or not according to a system current-limiting algorithm, if the tenant service request rate exceeds the service request rate specified by the RAR index, directly filtering the request, feeding back the request to the tenant, and prompting to purchase a higher RAR level, otherwise, executing the next step;
s103: judging RTL levels, executing the request balanced distribution of a scheduling layer according to different RTL levels, acquiring the request number of a current immediate execution queue and a delay queue, calculating the score of a current process instance request for each delay queue according to the request number of the delay queue by using a historical load variable historySize, and placing the process instance request in the delay queue with the highest score;
and the scheduling layer's balanced request dispatch:
S201: the scheduling layer receives a request from the immediate execution queue of the admission layer and obtains from the shared memory the load information set E = [e1, …, em] of the process engine services published by the process service layer, where ei = (cpui, rami), cpui denotes the current cpu occupancy of process engine service ei, and rami denotes its current ram occupancy;
S202: the scheduling layer obtains from the process instance warehouse the distribution condition set D = [d1, d2, …, dm], di ∈ {0, 1}, of the process instances of the requested process model across the process engine services; di = 0 means the process model has not run on engine ei, and di = 1 means it has;
S203: dividing the process engine services into two groups E1 and E2 according to the distribution condition set D, where E1 stores all engines on which the process model has already been executed (di = 1) and E2 stores the remaining engines;
S204: performing the engine busyness calculation on the elements of E1 and E2 to obtain the least busy engine service of each group, e1* and e2* respectively; judging whether the inequality
busyness(e1*) − busyness(e2*) > β
holds; if it does not hold, dispatching the process instance request to e1*, otherwise dispatching it to e2*; modifying the distribution condition set in the process instance warehouse; completing the request scheduling of the process instance;
where β is a cost parameter for allocating the process instance request to a new engine, and may be set according to specific hardware resource characteristics.
2. The SLA-based stateless cloud workflow load balancing scheduling method of claim 1, wherein: step S101, the service request arrival rate RAR is used for measuring the throughput of the process instance request and representing the highest process instance request number which can be sent by the tenant per second;
the RAR index is divided into three levels. Define v0, v1 and v2, where v0, v1 and v2 are integers with v0 > v1 > v2; the three levels are described as follows:
RAR 0 means the service request arrival rate is at most v0;
RAR 1 means the service request arrival rate is at most v1;
RAR 2 means the service request arrival rate is at most v2.
Different RAR levels correspond to different charges.
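The per-tenant arrival-rate cap can be enforced with a sliding window over recent request timestamps, which is the mechanism claim 4 names; a minimal sketch, with the window granularity and all names as illustrative assumptions:

```python
from collections import deque

class SlidingWindowLimiter:
    """Admit at most max_per_second requests in any 1-second window."""

    def __init__(self, max_per_second):
        self.max = max_per_second   # v0, v1 or v2 for the tenant's RAR level
        self.events = deque()       # timestamps of admitted requests

    def allow(self, now):
        # Drop timestamps that have slid out of the 1-second window.
        while self.events and now - self.events[0] >= 1.0:
            self.events.popleft()
        if len(self.events) < self.max:
            self.events.append(now)
            return True
        return False

lim = SlidingWindowLimiter(max_per_second=2)          # e.g. RAR 2 with v2 = 2
print([lim.allow(t) for t in (0.0, 0.1, 0.2, 1.1)])  # third request rejected
```

Because the window slides over exact timestamps rather than fixed buckets, a tenant cannot burst 2·v requests across a bucket boundary, which keeps the admitted rate within the SLA cap at every instant.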
3. The SLA-based stateless cloud workflow load balancing scheduling method of claim 1, wherein: in step S101, the request response time level RTL measures the processing performance of different process requests; RTL is proposed based on the diversity of workflow execution time ranges;
the RTL is divided into three levels. Define parameters a, b and t, where a, b and t are integers, a < b, and t is the time required by the engine to process one process instance request, i.e. the length of one time slice. The RTL levels are as follows:
RTL 0: the process instance request is responded to within 1 time slice, i.e. t;
RTL 1: the process instance request is responded to within at most (a+1) time slices, i.e. (a+1)t;
RTL 2: the process instance request is responded to within at most (b+1) time slices, i.e. (b+1)t.
4. The SLA-based stateless cloud workflow load balancing scheduling method of claim 3, wherein: the system throttling algorithm adopts a sliding window algorithm to guarantee the tenant's RAR index, while the RTL level is implemented by means of request caching, in the following way:
the admission layer maintains b+1 queues for storing process instance requests; each queue corresponds to a delay-duration variable representing how long the process instance requests in that queue are delayed, with values 0t, 1t, 2t, …, bt, where t is the time slice defined for RTL; the queue whose delay duration is 0t is the immediate-execution queue, and the queues with delay durations 1t, 2t, …, bt are delay queues; after every time slice the admission layer updates the delay queues and the historical load variable, which measures the number of requests seen in past time slices; the admission layer places each new process instance request into the corresponding delay queue according to the RTL level the tenant has set for the process task and the number of process instance requests currently stored in each queue.
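The queue layout above can be sketched as an array of b+1 FIFO queues indexed by delay duration; the candidate-queue sets follow the RTL definitions of claim 3, and the placement shown is only to illustrate indexing (the patent selects the queue by score, not by depth). All names are illustrative:

```python
from collections import deque

b = 4
# Queue i holds requests that still wait i time slices before execution;
# queue 0 is the immediate-execution queue (delay durations 0t, 1t, ..., bt).
queues = [deque() for _ in range(b + 1)]

def candidate_queues(rtl, a=2):
    # Queues a request may legally wait in, per its RTL level:
    # RTL 0 -> immediate only; RTL 1 -> up to a slices; RTL 2 -> up to b.
    limit = {0: 0, 1: a, 2: b}[rtl]
    return list(range(limit + 1))

# Park a hypothetical RTL 1 request in its deepest legal queue (for
# illustration only; the real choice maximizes the per-queue score).
queues[candidate_queues(1)[-1]].append("req-42")
print(candidate_queues(0), candidate_queues(1), len(queues[2]))
```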
5. The SLA-based stateless cloud workflow load balancing scheduling method of claim 4, wherein step S103 judges the RTL level specifically as follows:
if the RTL level is RTL 0, directly perform the balanced request dispatch of the scheduling layer;
if the RTL level is RTL 1, the delay duration is at most a time slices; obtain the request counts of the current immediate-execution queue and the first a delay queues, giving the set N = [n0, n1, …, na], where ni is the number of requests in the i-th queue, i = 0, 1, …, a;
if the RTL level is RTL 2, the delay duration is at most b time slices; obtain the request counts of the current immediate-execution queue and all delay queues, giving the set N = [n0, n1, …, nb], where nj is the number of requests in the j-th queue, j = 0, 1, …, b.
6. The SLA-based stateless cloud workflow load balancing scheduling method of claim 5, wherein: in step S103, the score of the current request for each delay queue is computed from ni and the historical load variable historySize, where ni ∈ N and i denotes the position of the current delay queue in the set N.
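The claim computes a per-queue score from ni, historySize and the queue position i, then places the request in the highest-scoring delay queue. The sketch below takes the score function as an injected parameter, since only its inputs are specified here; the example score shown is a purely hypothetical stand-in, not the patent's formula:

```python
def pick_queue(counts, history_size, score_fn):
    """counts: [n_0, ..., n_k], the request counts of the candidate queues
    for the request's RTL level. Returns the index of the queue whose
    score is highest."""
    scores = [score_fn(i, n, history_size) for i, n in enumerate(counts)]
    return max(range(len(counts)), key=lambda i: scores[i])

# Hypothetical example score: favour later (more slack) and emptier queues.
example_score = lambda i, n, hist: i * hist - n

print(pick_queue([5, 3, 1], history_size=2, score_fn=example_score))
```

Spreading requests toward emptier, later queues is what flattens the per-slice request peak that the abstract describes; any concrete score with that shape plugs into pick_queue unchanged.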
7. The SLA-based stateless cloud workflow load balancing scheduling method of claim 6, wherein the admission layer updates the delay queues and the historical load variable after every time slice, comprising the following steps:
h1: decrease the delay-duration variable of every delay queue by one time slice and check whether it has reached 0; if so, move all requests in that delay queue to the immediate-execution queue and reset its delay duration to bt;
h2: the immediate-execution queue keeps a running thread that checks whether the queue holds requests, submits them in order to the scheduling layer for dispatch at the rate at which the engine services process requests, and records the number of requests submitted to the process engines in each time slice;
h3: obtain requestSize, the number of requests the immediate-execution queue submitted to the scheduling layer for dispatch during the past time slice, and update the historical load variable historySize according to the following formula:
historySize = α * historySize + (1 - α) * requestSize
where α is a weighting factor that controls how strongly the previous historySize value is attenuated.
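A minimal sketch of the per-time-slice maintenance in steps H1 and H3 (H2, the dispatch thread, is omitted); the queue layout matches claim 4 and all names are illustrative assumptions:

```python
from collections import deque

def tick(immediate, delayed, delays, history_size, request_size, alpha, b):
    # H1: every delay queue gets one slice closer to execution; queues that
    # reach 0 drain into the immediate-execution queue and reset to b slices.
    for k in range(len(delayed)):
        delays[k] -= 1
        if delays[k] == 0:
            immediate.extend(delayed[k])
            delayed[k].clear()
            delays[k] = b
    # H3: exponentially weighted moving average of dispatched requests,
    # per the formula historySize = a*historySize + (1-a)*requestSize.
    history_size = alpha * history_size + (1 - alpha) * request_size
    return history_size

immediate = deque(["r0"])
delayed = [deque(["r1"]), deque(["r2", "r3"])]
delays = [1, 2]
hs = tick(immediate, delayed, delays, history_size=10.0,
          request_size=4, alpha=0.5, b=2)
print(list(immediate), delays, hs)
```

Running the example moves r1 into the immediate-execution queue (its delay expired), leaves r2 and r3 waiting one more slice, and decays the load history toward the latest dispatch count.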
8. The SLA-based stateless cloud workflow load balancing scheduling method of claim 7, wherein: the engine busyness calculation formula is as follows:
busynessi = w1 * cpui + w2 * rami,  w1 + w2 = 1
wherein: w1 and w2 represent the relative importance of the two load parameters, CPU and RAM, respectively, and are configured according to the characteristics of the hardware resources.
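The busyness formula transcribes directly; the weights below (0.7/0.3) and the least-busy selection over a small load set are purely illustrative:

```python
def busyness(cpu, ram, w1=0.7, w2=0.3):
    # Weighted CPU/RAM occupancy; the claim requires w1 + w2 = 1.
    assert abs(w1 + w2 - 1.0) < 1e-9
    return w1 * cpu + w2 * ram

# Hypothetical load snapshot (cpu_i, ram_i) for three engine services.
loads = [(0.8, 0.2), (0.3, 0.9), (0.5, 0.5)]
scores = [busyness(c, r) for c, r in loads]
print(min(range(len(loads)), key=lambda i: scores[i]))  # least busy engine
```

Weighting CPU above RAM (or vice versa) lets operators bias the balancer toward whichever resource their engine hardware exhausts first.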
CN201910028641.0A 2019-01-11 2019-01-11 SLA-based stateless cloud workflow load balancing scheduling method Active CN109861850B (en)

Publications (2)

Publication Number Publication Date
CN109861850A CN109861850A (en) 2019-06-07
CN109861850B true CN109861850B (en) 2021-04-02
