CN109861850A - An SLA-based method for stateless cloud workflow load-balancing scheduling - Google Patents


Info

Publication number
CN109861850A
Authority
CN
China
Prior art keywords
request
queue
engine
rtl
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910028641.0A
Other languages
Chinese (zh)
Other versions
CN109861850B (en)
Inventor
余阳
黄钦开
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University
Priority to CN201910028641.0A
Publication of CN109861850A
Application granted
Publication of CN109861850B
Legal status: Active
Anticipated expiration


Abstract

The invention discloses an SLA-based method for stateless cloud workflow load-balancing scheduling. According to the needs of different tenants for their business scenarios and process models, different SLA levels are selected; through these SLA levels, the cloud workflow system provides tenants with different request-throughput services and classifies different process requests by service grade. Combined with real-time monitoring of engine load implemented via shared memory and with the distribution of process models over the engines, the method smooths the request peaks seen by the engine service while also reducing the overall memory overhead of the engine cluster, thereby improving the load-balancing capability of the cloud workflow under a multi-tenant architecture, so that the process service provider can serve more tenants while meeting different tenants' demands on request throughput and on the parsing and execution performance of different process definitions.

Description

An SLA-based method for stateless cloud workflow load-balancing scheduling
Technical field
The present invention relates to the fields of workflow and cloud computing technology, and more particularly to an SLA-based method for stateless cloud workflow load-balancing scheduling.
Background technique
With the development of distributed computing, and of grid computing in particular, cloud computing has emerged as a novel service computation model. Cloud computing is a model of resource delivery and use in which the resources an application needs (hardware, platforms, software, and so on) are obtained over the network; the network that provides the resources is called the "cloud". In cloud computing everything is a service, generally divided into three levels: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).
Cloud workflow is a PaaS-level service: a distributed system that provides workflow services in the platform-as-a-service mode of cloud computing. Compared with conventional workflow systems, its main advantages are the following. Cloud workflow is used on demand and paid for by usage, which effectively reduces the cost for an enterprise of adopting workflow management software and lowers the barrier to entry. Cloud workflow also offers high resource utilization and high-performance service: centralized management can make full use of computing power, and flexible resource allocation can absorb the request load of different periods.
Conventional workflow engines are usually implemented as stateful. In a cloud environment, however, a workflow engine implemented as stateless better exploits the elasticity of cloud resources and improves the reliability of the cloud workflow system, and thus better meets the demands of cloud workflow. For a cloud workflow system built on stateless workflow engines, on the one hand, owing to the nature of the workflow service itself, parsing process models and storing the parse results remain indispensable and occupy a certain amount of computing and storage resources; on the other hand, in the cloud the system must execute the business processes of multiple tenants, so the load scenario is far more complex than for a conventional workflow engine. When such a system faces the load of multi-tenant requests, multiple process models, and multiple process instances, scheduling that considers only the statelessness of the service fails to exploit the characteristics of the workflow service itself, and therefore cannot achieve the best request load-balancing effect and user experience.
At present, the architecture and management structure of existing workflow cluster systems usually serve only a single user or a single organization. In the cloud-service business model, the process service provider wishes to offer process-parsing services to more tenants on the same hardware resources. Different tenants often place different requirements on the request throughput of the engine service according to their business scenarios, and the same tenant often places different requirements on the parsing and execution performance of its different process definitions. The tenant and the process service provider therefore sign an SLA contract, and the system provides the tenant with the corresponding service level according to the SLA agreement.
Summary of the invention
To address the differing service-level demands of different tenants under cloud workflow, and the process service provider's need to serve more tenants on the same hardware resources, the present invention provides an SLA-based method for stateless cloud workflow load-balancing scheduling. While guaranteeing the service experience of cloud tenants, it optimizes the load-balancing effect and execution performance of cloud workflow requests, so that the cloud workflow system can provide process-parsing services for more tenants in a normal service state.
To achieve the above purpose, the adopted technical solution is as follows. In an SLA-based method for stateless cloud workflow load-balancing scheduling, when a process-instance request corresponding to a process model uploaded by a tenant is received, the cloud workflow dispatches the process-instance request to a stateless workflow engine in the cluster, executing the following steps:
Access-layer load-waveform smoothing:
S101: The access layer receives the tenant's process-instance request and, according to the tenant ID or the process-instance request information, obtains from the tenant SLA repository the tenant's service request arrival rate (RAR) index and the request response time level (RTL) of the process-instance request;
S102: According to the system rate-limiting algorithm, judge whether the tenant's service request rate satisfies the RAR index. If it exceeds the service request rate specified by the RAR index, the request is filtered directly and feedback is returned to the tenant prompting it to purchase a higher RAR level; otherwise proceed to the next step;
S103: Judge the RTL level. Depending on the RTL level, either perform dispatch-layer balanced assignment directly, or obtain the request counts of the current immediate-execution queue and the delay queues, use the historic load variable historySize and the request counts of the delay queues to compute the score of the current process-instance request for each delay queue, and place the request into the highest-scoring delay queue;
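The access-layer placement of step S103 can be sketched as follows. Because the patent's scoring formula survives only as a figure, the `score` function below is an assumed stand-in illustrating one plausible choice (queues whose pending count sits well below the historic per-slice load score higher); it is not the patented formula, and all names are illustrative.

```python
# Sketch of access-layer request placement (step S103), under the assumptions
# stated in the lead-in. queues[0] is the immediate-execution queue and
# queues[1..b] are the delay queues.
from collections import deque

def place_request(request, queues, history_size, rtl_level, a, b):
    """Place a request into the immediate-execution or a delay queue.

    Returns the index of the chosen queue."""
    if rtl_level == 0:
        queues[0].append(request)            # RTL 0: dispatch immediately
        return 0
    horizon = a if rtl_level == 1 else b     # eligible delay queues per RTL
    counts = [len(q) for q in queues[:horizon + 1]]   # N = [n0, ..., n_horizon]

    def score(i, n_i):
        # Assumed stand-in for the patent's scoring formula: emptier queues
        # (relative to the historic per-slice load) score higher; the small
        # index penalty breaks ties in favor of earlier slots.
        return (history_size - n_i) - 0.01 * i

    best = max(range(len(counts)), key=lambda i: score(i, counts[i]))
    queues[best].append(request)
    return best
```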
Dispatch-layer balanced assignment of requests:
S201: The dispatch layer receives requests from the access layer's immediate-execution queue, and obtains from shared memory the load information set E = [e_1, ..., e_m] of the process engine services sent by the process service layer, where e_i = (cpu_i, ram_i), cpu_i denotes the current CPU occupancy of engine service e_i, and ram_i denotes its current RAM occupancy;
S202: The dispatch layer obtains from the process-instance repository the distribution set D = [d_1, d_2, ..., d_m] of the process model corresponding to the request over the process engine services, where d_i ∈ {0, 1}; d_i = 0 indicates that the process model has never run on engine e_i, and conversely;
S203: According to the distribution set D, the process engine services are divided into two groups E_1 and E_2: E_1 holds all engines that have executed the process model (i.e. d_i = 1), and E_2 holds the remaining engines;
S204: Compute the engine busyness for the elements of E_1 and E_2, obtaining the least-busy engine service of each group, denoted e_1* and e_2* respectively. Judge whether the inequality holds; if it does, assign the process-instance request to e_1*, otherwise assign it to e_2* and modify the distribution set in the process-instance repository. The scheduling of the process-instance request is complete;
where β is the cost parameter of assigning a process-instance request to a new engine, which can be set according to the characteristics of the particular hardware resources.
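The dispatch-layer steps S201 to S204 can be sketched as follows. The concrete inequality of S204 survives only as a figure, so the comparison below assumes the natural reading: stay on a "warm" engine (one that has already parsed the model) unless it is more than β busier than the best "cold" engine. All names and values are illustrative.

```python
# Sketch of dispatch-layer assignment (steps S201-S204), under the assumed
# form of the S204 inequality stated in the lead-in.

def busyness(cpu, ram, w1=0.5, w2=0.5):
    # busyness_i = w1*cpu_i + w2*ram_i, with w1 + w2 = 1
    return w1 * cpu + w2 * ram

def dispatch(load, distribution, beta=0.2):
    """load: list of (cpu, ram) per engine; distribution: D with d_i in {0, 1}.

    Returns the index of the engine to which the request is assigned."""
    e1 = [i for i, d in enumerate(distribution) if d == 1]  # engines that ran the model
    e2 = [i for i, d in enumerate(distribution) if d == 0]  # remaining engines
    best1 = min(e1, key=lambda i: busyness(*load[i])) if e1 else None
    best2 = min(e2, key=lambda i: busyness(*load[i])) if e2 else None
    if best1 is None:
        return best2          # no engine has run this model yet
    if best2 is None:
        return best1          # every engine has already run it
    # Assumed inequality: prefer the warm engine unless it is more than
    # `beta` busier than the best cold engine.
    if busyness(*load[best1]) <= busyness(*load[best2]) + beta:
        return best1
    return best2              # caller must then set d_i = 1 for this engine
```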
Preferably, in step S101 the service request arrival rate RAR measures the process-instance request throughput and denotes the maximum number of process-instance requests the tenant may send per second;
The RAR index is divided into three levels, defined by v_0, v_1, v_2, where v_0, v_1, v_2 are integers with v_0 > v_1 > v_2. The three levels are described as follows:
RAR 0: the service request arrival rate may be at most v_0;
RAR 1: the service request arrival rate may be at most v_1;
RAR 2: the service request arrival rate may be at most v_2;
RARs of different levels correspond to different charges.
Preferably, in step S101 the request response time level RTL measures the processing performance of different process requests; RTL is motivated by the diversity of execution time ranges in workflow;
The RTL is divided into three levels. Define parameters a, b, t, where a, b, t are integers and a is less than b; t denotes the time the engine needs to handle one process-instance request, and also the length of one time slice, and can be obtained by testing the process engine service. The RTL levels are as follows:
RTL 0: the process-instance request responds within 1 time slice, i.e. t;
RTL 1: the request responds within (a+1) time slices at the latest, i.e. (a+1)t;
RTL 2: the request responds within (b+1) time slices at the latest, i.e. (b+1)t.
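The three deadlines above reduce to simple arithmetic on the time slice t; a direct transcription, with illustrative values for t, a, and b:

```python
# RTL deadlines as defined above: RTL 0 -> t, RTL 1 -> (a+1)*t, RTL 2 -> (b+1)*t.
def rtl_deadline(level, t, a, b):
    """Return the latest allowed response time for a request of this RTL level."""
    return {0: t, 1: (a + 1) * t, 2: (b + 1) * t}[level]
```

For example, with t = 100 ms, a = 2, b = 5, an RTL 1 request must respond within 300 ms and an RTL 2 request within 600 ms.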
Further, the system rate-limiting algorithm uses a sliding-window algorithm to guarantee the tenant's RAR index, and the RTL level is realized by request caching, in the following way:
The access layer maintains b+1 queues for storing process-instance requests. Each queue corresponds to a delay-duration variable representing the delay applied to the process-instance requests in that queue, with values 0t, 1t, 2t, ..., bt, where t is the time slice defined in the RTL. The queue whose delay-duration variable is 0t is the immediate-execution queue; the queues whose delay-duration variables are 1t, 2t, ..., bt are delay queues. After every time slice the access layer must also update the delay queues and the historic load variable, which measures the request counts of past time slices. According to the RTL that the tenant has set for the process task and the number of process-instance requests currently stored in each queue, the access layer places new process-instance requests into the corresponding delay queues.
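The sliding-window check named above can be sketched as follows. The patent does not fix the exact window variant, so this is one common realization under that assumption: a request is admitted only if fewer than RAR requests arrived within the last window.

```python
# Minimal sliding-window rate limiter for the RAR check (step S102).
from collections import deque

class SlidingWindowLimiter:
    def __init__(self, rar, window=1.0):
        self.rar = rar            # max requests per window (the tenant's RAR)
        self.window = window      # window length in seconds
        self.arrivals = deque()   # timestamps of admitted requests

    def allow(self, now):
        """Admit the request arriving at time `now` if the RAR permits it."""
        while self.arrivals and now - self.arrivals[0] >= self.window:
            self.arrivals.popleft()           # drop arrivals outside the window
        if len(self.arrivals) < self.rar:
            self.arrivals.append(now)
            return True
        return False    # over RAR: filter the request and notify the tenant
```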
Still further, step S103 specifically judges the RTL level as follows:
If the RTL level is RTL 0, dispatch-layer balanced assignment is executed directly;
If the RTL level is RTL 1, the delay time is at most a time slices, so the request counts of the current immediate-execution queue and the first a delay queues are obtained as the set N = [n_0, n_1, ..., n_a];
If the RTL level is RTL 2, the delay time is at most b time slices, so the request counts of the current immediate-execution queue and all delay queues are obtained as the set N = [n_0, n_1, ..., n_b].
Still further, in step S103 the score of the current request for each delay queue is computed by the scoring formula, where n_i ∈ N and i denotes the position of the current delay queue in the set N.
Still further, after every time slice the access layer updates the delay queues and the historic load variable through the following steps:
H1: The delay-duration variables of all delay queues are decremented by 1; if a delay-duration variable reaches 0, all requests in that queue are moved to the immediate-execution queue and its delay duration is reset to bt;
H2: The immediate-execution queue keeps a running thread that checks whether the queue has requests and, at the rate at which the engine service handles requests, submits them in turn to the dispatch layer for assignment, recording the number of requests submitted to the process engines in each time slice;
H3: Obtain the number of requests requestSize that the immediate-execution queue submitted to the dispatch layer for assignment in the past time slice, and update the historic load variable historySize by the following formula:
historySize = α * historySize + (1 - α) * requestSize
where α is a weight factor representing the degree of decay of the previous historySize value.
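The H3 update is an exponentially weighted moving average: each time slice, the previous historySize decays by the factor α and the slice's observed requestSize contributes the remainder. A direct transcription:

```python
# EWMA update of the historic load (step H3):
# historySize = alpha * historySize + (1 - alpha) * requestSize
def update_history(history_size, request_size, alpha=0.8):
    """Return the new historySize after one time slice."""
    return alpha * history_size + (1 - alpha) * request_size
```

With α = 0.8, a previous historySize of 10 and an observed requestSize of 20 yield a new historySize of 12: the load estimate moves toward the new observation but keeps most of its history.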
Still further, the engine busyness is computed by the following formula:
busyness_i = w_1 * cpu_i + w_2 * ram_i, with w_1 + w_2 = 1
where the two parameters w_1 and w_2 denote the relative importance of the CPU and RAM load parameters and must be set according to the hardware resource characteristics.
In the dispatch-layer balanced assignment of the present invention, the dispatch layer uses the load states of the process engine services of the process service layer, together with the characteristics of stateless workflow engines, to ensure that the requests corresponding to the same process model are assigned to a small number of engines, thereby reducing the computation and memory consumption caused by repeatedly parsing the same process model and storing its parse results.
The beneficial effects of the present invention are as follows. According to the needs of different tenants for their business scenarios and process models, different SLA levels are selected; through these SLA levels, the cloud workflow system provides tenants with different request-throughput services and classifies different process requests by service grade. Combined with real-time monitoring of engine load implemented via shared memory and with the distribution of process models over the engines, the method smooths the request peaks seen by the engine service while also reducing the overall memory overhead of the engine cluster, thereby improving the load-balancing capability of the cloud workflow under a multi-tenant architecture, so that the process service provider can serve more tenants while meeting different tenants' demands on request throughput and on the parsing and execution performance of different process definitions.
Detailed description of the invention
Fig. 1 is a diagram of the cloud workflow core components.
Specific embodiment
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Embodiment 1
As shown in Fig. 1, the SLA-based method for stateless cloud workflow load-balancing scheduling comprises two steps: access-layer load-waveform smoothing and dispatch-layer balanced assignment of requests. Before these two steps are introduced in detail, quantitative indices must be defined that characterize, in the cloud workflow service SLA, the different request throughputs and the different processing performances of process requests. Request throughput is measured by the service request arrival rate RAR, which denotes the maximum number of process-instance requests the tenant may send per second. The processing performance of different process requests is measured by the request response time level RTL, which is motivated by the diversity of execution time ranges in workflow, ranging from a few microseconds to several months.
In this embodiment, according to tenants' needs for their business scenarios, the RAR index is divided into three levels, defined by v_0, v_1, v_2, where v_0, v_1, v_2 are integers with v_0 > v_1 > v_2. The three levels are described as follows:
RAR 0: for high-concurrency business scenarios; the service request arrival rate may be at most v_0;
RAR 1: for ordinary-concurrency business scenarios; the service request arrival rate may be at most v_1;
RAR 2: for low-concurrency business scenarios; the service request arrival rate may be at most v_2;
RARs of different levels correspond to different charges; a level with a higher service request arrival rate also carries a higher charge.
According to the different demands of different process requests on real-time processing, this embodiment divides the RTL into three levels. Before the detailed statement, define parameters a, b, t, where a, b, t are integers and a is less than b; t denotes the time the engine needs to handle one process-instance request, and also the length of one time slice, which can be obtained by testing the engine service. The levels are as follows:
RTL 0: for process instances with high real-time requirements, mostly automated processes; the request responds within 1 time slice, i.e. t;
RTL 1: for process instances with ordinary real-time requirements; the request responds within (a+1) time slices at the latest, i.e. (a+1)t;
RTL 2: for process instances with low real-time requirements; the request responds within (b+1) time slices at the latest, i.e. (b+1)t.
When uploading a process model, the tenant can select different SLA indices for different tasks in the model according to the model's actual use, obtaining different charges; the higher the processing-performance SLA index, the higher the charge.
In the SLA-based method for stateless cloud workflow load-balancing scheduling of this embodiment, when a process-instance request corresponding to an uploaded process model is received, the cloud workflow dispatches the process-instance request to a stateless workflow engine in the cluster, executing the following steps:
Access-layer load-waveform smoothing:
S101: The access layer receives the tenant's process-instance request and, according to the tenant ID and the request information, obtains from the tenant SLA repository the tenant's RAR index and the RTL of the process-instance request;
S102: According to the sliding-window algorithm, judge whether the tenant's service request rate satisfies the RAR index. If it exceeds the service request rate specified by the RAR index, the request is filtered directly and feedback is returned to the tenant prompting it to purchase a higher RAR level; otherwise proceed to the next step;
S103: Judge the RTL level. If the RTL level is RTL 0, dispatch-layer balanced assignment is executed directly. If the RTL level is RTL 1, the request may be delayed at most a time slices, so the request counts of the current immediate-execution queue and the first a delay queues are obtained as the set N = [n_0, n_1, ..., n_a]. If the RTL level is RTL 2, the request may be delayed at most b time slices, so the request counts of the current immediate-execution queue and all delay queues are obtained as the set N = [n_0, n_1, ..., n_b];
S104: Using the historic load variable historySize and the elements of the set N, compute the score of the current request for each delay queue,
where n_i ∈ N and i denotes the position of the current delay queue in the set N.
With the score of each delay queue computed by the above steps, the request is placed into the highest-scoring delay queue.
Dispatch-layer balanced assignment of requests comprises the following steps:
S201: The dispatch layer receives requests from the access layer's immediate-execution queue, and obtains from shared memory the load information set E = [e_1, ..., e_m] of the process engine services sent by the process service layer, where e_i = (cpu_i, ram_i), cpu_i denotes the current CPU occupancy of engine service e_i, and ram_i denotes its current RAM occupancy;
S202: The dispatch layer obtains from the process-instance repository the distribution set D = [d_1, d_2, ..., d_m] of the process model corresponding to the request over the process engine services, where d_i ∈ {0, 1}; d_i = 0 indicates that the process model has never run on engine e_i, and conversely;
S203: According to the distribution set D, the process engine services are divided into two groups E_1 and E_2: E_1 holds all engines that have executed the process model (i.e. d_i = 1), and E_2 holds the remaining engines;
S204: Compute the engine busyness for the elements of E_1 and E_2 by the following formula:
busyness_i = w_1 * cpu_i + w_2 * ram_i, with w_1 + w_2 = 1
where the two parameters w_1 and w_2 denote the relative importance of the CPU and RAM load parameters and must be set according to the hardware resource characteristics. By the busyness formula, obtain the least-busy engine services of E_1 and E_2, denoted e_1* and e_2* respectively.
Judge whether the inequality holds; if it does, assign the request to e_1*, otherwise assign it to e_2* and modify the distribution set in the process-instance repository;
where β is the cost parameter of assigning a process-instance request to a new engine, which can be set according to the characteristics of the particular hardware resources;
S205: Request scheduling is complete.
In the dispatch-layer balanced assignment of this embodiment, the dispatch layer uses the load states of the process engine services of the process service layer, together with the characteristics of stateless workflow engines, to ensure that the requests corresponding to the same process model are assigned to a small number of engines, thereby reducing the computation and memory consumption caused by repeatedly parsing the same process model and storing its parse results.
The system rate-limiting algorithm of this embodiment uses a sliding-window algorithm to guarantee the tenant's RAR index, and the RTL level is realized by request caching, in the following way:
The access layer maintains b+1 queues for storing process-instance requests. Each queue corresponds to a delay-duration variable representing the delay applied to the process-instance requests in that queue, with values 0t, 1t, 2t, ..., bt, where t is the time slice defined in the RTL. The queue whose delay-duration variable is 0t is the immediate-execution queue; the queues whose delay-duration variables are 1t, 2t, ..., bt are delay queues. After every time slice the access layer must also update the delay queues and the historic load variable, which measures the request counts of past time slices. According to the RTL that the tenant has set for the process task and the number of process-instance requests currently stored in each queue, the access layer places new process-instance requests into the corresponding delay queues.
After every time slice the access layer updates the delay queues and the historic load variable through the following steps:
H1: The delay-duration variables of all delay queues are decremented by 1; if a delay-duration variable reaches 0, all requests in that queue are moved to the immediate-execution queue and its delay duration is reset to bt;
H2: The immediate-execution queue keeps a running thread that checks whether the queue has requests and, at the rate at which the engine service handles requests, submits them in turn to the dispatch layer for assignment, recording the number of requests submitted to the process engines in each time slice;
H3: Obtain the number of requests requestSize that the immediate-execution queue submitted to the dispatch layer for assignment in the past time slice, and update the historic load variable historySize by the following formula:
historySize = α * historySize + (1 - α) * requestSize
where α is a weight factor representing the rate of decay of the previous historySize value.
This embodiment is characterized by the following: it builds on the different demands of different tenants' business scenarios on request throughput and on the parsing and execution performance of different process definitions; it uses the diversity of execution time ranges in workflow, in particular the fact that some tasks do not need real-time responses, to delay requests, thereby realizing tenant rate limiting and request load-balancing optimization under the rate-limit constraint; it combines the characteristics of stateless workflow engine services to realize a load-balancing strategy that distributes the requests of a process model over a minimum number of engines; and it uses shared memory to access the engine-service load information. The method set forth above can, while guaranteeing the service experience of cloud tenants, optimize the load-balancing effect and execution performance of cloud workflow requests, allowing the cloud workflow system to provide process-parsing services for more tenants in a normal service state.
Obviously, the above embodiment of the present invention is merely an example for clearly illustrating the present invention and is not a limitation on its implementation. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the claims of the present invention.

Claims (8)

1. An SLA-based method for stateless cloud workflow load-balancing scheduling, characterized in that: when a process-instance request corresponding to a process model uploaded by a tenant is received, the cloud workflow dispatches the process-instance request to a stateless workflow engine in the cluster, executing the following steps:
Access-layer load-waveform smoothing:
S101: The access layer receives the tenant's process-instance request and, according to the tenant ID or the process-instance request information, obtains from the tenant SLA repository the tenant's service request arrival rate (RAR) index and the request response time level (RTL) of the process-instance request;
S102: According to the system rate-limiting algorithm, judge whether the tenant's service request rate satisfies the RAR index. If it exceeds the service request rate specified by the RAR index, the request is filtered directly and feedback is returned to the tenant prompting it to purchase a higher RAR level; otherwise proceed to the next step;
S103: Judge the RTL level. Depending on the RTL level, either perform dispatch-layer balanced assignment directly, or obtain the request counts of the current immediate-execution queue and the delay queues, use the historic load variable historySize and the request counts of the delay queues to compute the score of the current process-instance request for each delay queue, and place the process-instance request into the highest-scoring delay queue;
Dispatch-layer balanced assignment of requests:
S201: The dispatch layer receives requests from the access layer's immediate-execution queue, and obtains from shared memory the load information set E = [e_1, ..., e_m] of the process engine services sent by the process service layer, where e_i = (cpu_i, ram_i), cpu_i denotes the current CPU occupancy of engine service e_i, and ram_i denotes its current RAM occupancy;
S202: The dispatch layer obtains from the process-instance repository the distribution set D = [d_1, d_2, ..., d_m] of the process model corresponding to the request over the process engine services, where d_i ∈ {0, 1}; d_i = 0 indicates that the process model has never run on engine e_i, and conversely;
S203: According to the distribution set D, the process engine services are divided into two groups E_1 and E_2: E_1 holds all engines that have executed the process model (i.e. d_i = 1), and E_2 holds the remaining engines;
S204: Compute the engine busyness for the elements of E_1 and E_2, obtaining the least-busy engine service of each group, denoted e_1* and e_2* respectively. Judge whether the inequality holds; if it does, assign the process-instance request to e_1*, otherwise assign it to e_2* and modify the distribution set in the process-instance repository. The scheduling of the process-instance request is complete;
where β is the cost parameter of assigning a process-instance request to a new engine, which can be set according to the characteristics of the particular hardware resources.
2. the method for the stateless cloud workflow load balance scheduling according to claim 1 based on SLA, feature exist In: step S101, the service request arrival rate RAR indicate that tenant is per second most for measuring flow instance request handling capacity High transmissible flow instance number of request;
The RAR index is divided into three-level, defines v0, v1, v2, wherein v0、v1、v2For integer, and there is v0> v1> v2, then three Rank is described as follows:
RAR 0: refer to service request arrival rate up to equal to v0
RAR 1: refer to service request arrival rate up to equal to v1
RAR 2: refer to service request arrival rate up to equal to v2
The RAR of different stage corresponds to different chargings.
3. The SLA-based stateless cloud workflow load-balancing scheduling method according to claim 1, characterized in that: in step S101, the request response time level (RTL) measures the processing performance of different process requests; RTL is motivated by the diversity of execution-time requirements across workflows;
The RTL is divided into three levels, defined by parameters a, b, t, where a, b, t are integers with a < b, and t denotes the time the engine needs to process one process-instance request, which is also the length of one time slice; the RTL levels are as follows:
RTL 0: the process-instance request is responded to within 1 time slice, i.e. t;
RTL 1: the process-instance request is responded to within (a+1) time slices at the latest, i.e. (a+1)t;
RTL 2: the process-instance request is responded to within (b+1) time slices at the latest, i.e. (b+1)t.
4. The SLA-based stateless cloud workflow load-balancing scheduling method according to claim 3, characterized in that: the system rate-limiting algorithm guarantees a tenant's RAR index using a sliding-window algorithm, and realizes the RTL levels by means of request caching, specifically as follows:
The access layer maintains b+1 queues for storing process-instance requests; each queue has a corresponding delay-duration variable representing how long the process-instance requests in that queue are postponed, with values 0t, 1t, 2t, …, bt respectively, where t is the time slice defined by the RTL. The queue whose delay duration is 0t is the immediate-execution queue; the queues with delay durations 1t, 2t, …, bt are delay queues. After every elapsed time slice, the access layer must also update the delay queues and the historic-load variable, which measures the number of requests in each past time slice. According to the RTL level the tenant has set for the process task and the number of requests currently stored in each queue, the access layer puts each new process-instance request into the corresponding delay queue.
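Claim 4 states that a tenant's RAR is guaranteed with a sliding-window algorithm; a minimal sketch follows. The class name is an illustrative assumption, and the one-second window length follows the per-second definition of RAR in claim 2.

```python
import time
from collections import deque

# Minimal sliding-window limiter for the per-tenant RAR guarantee.
# Class and method names are illustrative assumptions.

class SlidingWindowLimiter:
    def __init__(self, max_requests_per_second):
        self.limit = max_requests_per_second   # v0, v1 or v2 from claim 2
        self.window = deque()                  # arrival timestamps

    def allow(self, now=None):
        """Return True if the request may pass, False if it is rejected."""
        now = time.monotonic() if now is None else now
        # drop arrivals that have left the one-second window
        while self.window and now - self.window[0] >= 1.0:
            self.window.popleft()
        if len(self.window) < self.limit:
            self.window.append(now)
            return True
        return False
```

The access layer would keep one such limiter per tenant, sized to the v-value of the tenant's RAR level.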
5. The SLA-based stateless cloud workflow load-balancing scheduling method according to claim 4, characterized in that: in step S103, the RTL level is judged as follows:
If the RTL level is RTL 0, the balanced assignment of the dispatch layer is executed directly;
If the RTL level is RTL 1, the delay may be at most a time slices, so the current request counts of the immediate-execution queue and the first a delay queues are obtained, forming the set N = [n0, n1, …, na];
If the RTL level is RTL 2, the delay may be at most b time slices, so the request counts of the immediate-execution queue and all delay queues are obtained, forming the set N = [n0, n1, …, nb].
6. The SLA-based stateless cloud workflow load-balancing scheduling method according to claim 5, characterized in that: in step S103, the score of the current request for each delay queue is calculated by the following formula:
where ni ∈ N and i denotes the position of the current delay queue in the set N.
7. The SLA-based stateless cloud workflow load-balancing scheduling method according to claim 6, characterized in that: the access layer updates the delay queues and the historic-load variable after every elapsed time slice, in the following steps:
H1: the delay-duration variable of every delay queue is decremented by 1; if a queue's delay duration reaches 0, all requests in that delay queue are moved to the immediate-execution queue and its delay duration is reset to bt;
H2: the immediate-execution queue keeps a running thread that checks whether the queue holds requests and submits them in turn to the dispatch layer for assignment, at the rate at which the engine services process requests, recording in each time slice the number of requests submitted to the flow engines;
H3: requestSize, the number of requests the immediate-execution queue submitted to the dispatch layer for assignment during the past time slice, is obtained, and the historic-load variable historySize is updated according to the following formula:
historySize = α * historySize + (1 − α) * requestSize
where α is a weight factor representing the degree to which the historySize value of the previous second is decayed.
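Steps H1 and H3 can be sketched as follows. The list-based queue representation and the function name `tick` are assumptions; `historySize`, `requestSize`, and α follow the claim, and delays are counted in time slices, so the reset to bt appears as b slices.

```python
# Sketch of the per-time-slice update of claim 7 (steps H1 and H3).
# Queue representation is an assumption: each delay queue is a mutable
# pair [slices_remaining, requests].

def tick(delay_queues, immediate_queue, history_size, request_size,
         b, alpha=0.5):
    """Run once per elapsed time slice; returns the new historySize.

    delay_queues    -- list of [slices_remaining, [requests]] pairs
    immediate_queue -- list of requests ready for the dispatch layer
    request_size    -- requests submitted during the past slice (H3)
    """
    # H1: decrement every delay duration; expired queues drain into the
    # immediate-execution queue and are reset to a delay of b slices (bt).
    for q in delay_queues:
        q[0] -= 1
        if q[0] == 0:
            immediate_queue.extend(q[1])
            q[1].clear()
            q[0] = b

    # H3: exponentially weighted historic load.
    return alpha * history_size + (1 - alpha) * request_size
```

The historySize update is an exponential moving average: α close to 1 makes the historic load change slowly, α close to 0 makes it track the most recent slice.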
8. The SLA-based stateless cloud workflow load-balancing scheduling method according to claim 7, characterized in that: the busyness of an engine is calculated as follows:
busyness_i = w1 * cpu_i + w2 * ram_i,  where w1 + w2 = 1
and the parameters w1 and w2 indicate the relative importance of the CPU and RAM load respectively, and are configured according to the hardware-resource characteristics.
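The busyness formula transcribes directly into code; the helper for picking the least-busy engine, as used in step S204, is an added convenience, and the weight values are illustrative.

```python
# Busyness of claim 8: a weighted sum of CPU and RAM load, w1 + w2 = 1.

def busyness(cpu, ram, w1=0.6, w2=0.4):
    assert abs(w1 + w2 - 1.0) < 1e-9     # claim 8 requires w1 + w2 = 1
    return w1 * cpu + w2 * ram

def least_busy(engines):
    """engines: dict engine id -> (cpu load, ram load), both in [0, 1]."""
    return min(engines, key=lambda e: busyness(*engines[e]))
```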
CN201910028641.0A 2019-01-11 2019-01-11 SLA-based stateless cloud workflow load balancing scheduling method Active CN109861850B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910028641.0A CN109861850B (en) 2019-01-11 2019-01-11 SLA-based stateless cloud workflow load balancing scheduling method


Publications (2)

Publication Number Publication Date
CN109861850A true CN109861850A (en) 2019-06-07
CN109861850B CN109861850B (en) 2021-04-02

Family

ID=66894503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910028641.0A Active CN109861850B (en) 2019-01-11 2019-01-11 SLA-based stateless cloud workflow load balancing scheduling method

Country Status (1)

Country Link
CN (1) CN109861850B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101105843A (en) * 2006-07-14 2008-01-16 上海移动通信有限责任公司 Electronic complaint processing system and method in communication field
CN101694709A (en) * 2009-09-27 2010-04-14 华中科技大学 Service-oriented distributed work flow management system
US20110161952A1 (en) * 2009-12-31 2011-06-30 International Business Machines Corporation Porting Virtual Images Between Platforms
US20140358624A1 (en) * 2011-07-19 2014-12-04 HCL America Inc. Method and apparatus for sla profiling in process model implementation
CN104778076A (en) * 2015-04-27 2015-07-15 东南大学 Scheduling method for cloud service workflow
CN106095569A (en) * 2016-06-01 2016-11-09 中山大学 A kind of cloud workflow engine scheduling of resource based on SLA and control method
CN108665157A (en) * 2018-05-02 2018-10-16 中山大学 A method of realizing cloud Workflow system flow instance balance dispatching


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YINJUAN ZHANG ET AL: "An Improved Adaptive Workflow Scheduling Algorithm in Cloud Environments", IEEE *
LIU Yuxiao et al.: "Coevolutionary multi-swarm optimization for multi-objective cloud workflow scheduling", Computer Engineering and Design *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110659463A (en) * 2019-08-23 2020-01-07 苏州浪潮智能科技有限公司 Distributed operation method and device of stateless system
CN110659463B (en) * 2019-08-23 2021-11-12 苏州浪潮智能科技有限公司 Distributed operation method and device of stateless system
CN110737485A (en) * 2019-09-29 2020-01-31 武汉海昌信息技术有限公司 workflow configuration system and method based on cloud architecture
WO2021099903A1 (en) * 2019-11-18 2021-05-27 International Business Machines Corporation Multi-tenant extract transform load resource sharing
GB2603098A (en) * 2019-11-18 2022-07-27 Ibm Multi-tenant extract transform load resource sharing
GB2603098B (en) * 2019-11-18 2022-12-14 Ibm Multi-tenant extract transform load resource sharing
CN110941681A (en) * 2019-12-11 2020-03-31 南方电网数字电网研究院有限公司 Multi-tenant data processing system, method and device of power system
US11841871B2 (en) 2021-06-29 2023-12-12 International Business Machines Corporation Managing extract, transform and load systems
CN115237573A (en) * 2022-08-05 2022-10-25 中国铁塔股份有限公司 Data processing method and device, electronic equipment and readable storage medium
CN115237573B (en) * 2022-08-05 2023-08-18 中国铁塔股份有限公司 Data processing method, device, electronic equipment and readable storage medium


Similar Documents

Publication Publication Date Title
CN109861850A (en) A method of the stateless cloud workflow load balance scheduling based on SLA
CN103605567B (en) Cloud computing task scheduling method facing real-time demand change
Ge et al. GA-based task scheduler for the cloud computing systems
Salot A survey of various scheduling algorithm in cloud computing environment
CN108845874B (en) Dynamic resource allocation method and server
CN110297699B (en) Scheduling method, scheduler, storage medium and system
US8843929B1 (en) Scheduling in computer clusters
CN103927225A (en) Multi-core framework Internet information processing and optimizing method
Bi et al. SLA-based optimisation of virtualised resource for multi-tier web applications in cloud data centres
CN106233276A (en) The coordination access control of network-accessible block storage device
US8219716B2 (en) Methods for accounting seek time of disk accesses
CN108762687B (en) I/O service quality control method, device, equipment and storage medium
CN107038071A (en) A kind of flexible dispatching algorithm of Storm tasks predicted based on data flow
CN109491761A (en) Cloud computing multiple target method for scheduling task based on EDA-GA hybrid algorithm
Xin et al. A load balance oriented cost efficient scheduling method for parallel tasks
Ashouraei et al. A new SLA-aware load balancing method in the cloud using an improved parallel task scheduling algorithm
CN106095581B (en) Network storage virtualization scheduling method under private cloud condition
CN108563495A (en) The cloud resource queue graded dispatching system and method for data center's total management system
CN102917014A (en) Resource scheduling method and device
Shu-Jun et al. Optimization and research of hadoop platform based on fifo scheduler
CN117112199A (en) Multi-tenant resource scheduling method, device and storage medium
CN112363827A (en) Multi-resource index Kubernetes scheduling method based on delay factors
CN113656150A (en) Deep learning computing power virtualization system
CN110096364B (en) Cloud server computing set control method and system
Han et al. Elastic allocator: An adaptive task scheduler for streaming query in the cloud

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant