CN111352728A - Self-adaptive scheduling method of data service cluster - Google Patents

Publication number
CN111352728A
Authority
CN
China
Prior art keywords
request
interface
execution
efficiency function
next instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910803526.6A
Other languages
Chinese (zh)
Other versions
CN111352728B (en
Inventor
黄罡
董瀚
景翔
蔡华谦
姜海鸥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Information Technology Institute (Tianjin Binhai)
Original Assignee
Peking University Information Technology Institute (Tianjin Binhai)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Information Technology Institute (Tianjin Binhai)
Priority to CN201910803526.6A
Publication of CN111352728A
Application granted
Publication of CN111352728B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806: Task transfer initiation or dispatching
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/54: Interprogram communication
    • G06F9/546: Message passing systems or structures, e.g. queues
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the field of task scheduling, in particular to a self-adaptive scheduling method for a data service cluster. The method comprises: issuing a call request, parsing the request, and reading the requested interface; screening candidate devices that meet the conditions; selecting the candidate device with the lowest load; and executing the request on that device. If execution exceeds the set time or fails, the failure is recorded, the failure condition is evaluated, and the next instruction is executed; if execution succeeds, the success is recorded, the success condition is evaluated, and the next instruction is executed. By selecting the device with the lowest load, the invention distributes traffic evenly without accurately monitoring the instantaneous request traffic reaching each device; by adjusting automatically according to the success or failure of interface calls, it adapts automatically to unknown interfaces.

Description

Self-adaptive scheduling method of data service cluster
Technical Field
The invention relates to the field of data scheduling, in particular to a self-adaptive scheduling method of a data service cluster.
Background
Artificial intelligence today depends on data, and the collection of big data has become an obvious bottleneck. Numerous data barriers have appeared, and because of the many "information islands", big data research faces the dilemma that data is unavailable. The "information island" problem is more serious for smart devices because of the inherently closed nature of mobile applications.
One approach to the information island problem of smart devices is to develop a new software-definition theory on the basis of the classical one: the controllable components of a smart device are exposed through application programming interfaces (APIs) to enable on-demand management and on-demand service of the device.
Unlike a classical data service cluster, the new data service cluster has the following characteristics: the service capabilities of different interfaces differ greatly, and the service capability of a device is strongly affected by the request traffic.
Disclosure of Invention
The embodiment of the invention provides a self-adaptive scheduling method for a data service cluster. The method maintains a receive window for each device to control the number of requests the device processes simultaneously, and adjusts the size of the receive window by feedback from the request processing results, so that flow control is self-adaptive.
According to a first aspect of the embodiments of the present invention, a method for adaptive scheduling of a data service cluster includes:
issuing call requests and arranging them in a request queue in first-in, first-out order;
reading the request at the head of the queue, parsing it, and reading the interface it requests;
screening candidate devices that meet the conditions;
if no candidate device exists, appending the call request to the tail of the queue; if candidate devices exist, selecting the one with the lowest load;
executing the request on the selected candidate device; if execution exceeds the set time or fails, recording the failure and executing the next instruction;
if execution succeeds, recording the success and executing the next instruction.
After a failure is recorded (execution exceeded the set time or failed), a failure condition is evaluated before the next instruction is executed. Specifically:
determine whether $F_{j',i'} \ge \lambda V_{j',i'}$ holds, where $F_{j',i'}$ is the count of failed executions of interface $i'$ on device $j'$, $V_{j',i'}$ is the rate at which requests calling interface $i'$ arrive at device $j'$, and $\lambda$ is a real number between 0 and 1, for example 0.5. If it holds, set
Figure RE-GDA0002480920810000021
where $W_{j',i'}$ records the maximum number of simultaneous calls that interface $i'$ on device $j'$ can accept and $\alpha$ is a real number between 0 and 1; also set $S_{j',i'} = 0$ and $F_{j',i'} = 0$, where $S_{j',i'}$ is the count of successful executions of interface $i'$ on device $j'$; then return and execute the next instruction. If $F_{j',i'} \ge \lambda V_{j',i'}$ does not hold, return and execute the next instruction.
After a success is recorded, a success condition is evaluated before the next instruction is executed. Specifically:
determine whether $S_{j',i'} \ge \mu V_{j',i'}$ holds, where $\mu$ is a rational number between 0 and 1, $S_{j',i'}$ is the count of successful executions of interface $i'$ on device $j'$, and $V_{j',i'}$ is the rate at which requests calling interface $i'$ arrive at device $j'$. If it holds, set $W_{j',i'} = W_{j',i'} + 1$, $S_{j',i'} = 0$ and $F_{j',i'} = 0$, where $F_{j',i'}$ is the count of failed executions of interface $i'$ on device $j'$, then return and execute the next instruction. If $S_{j',i'} \ge \mu V_{j',i'}$ does not hold, return and execute the next instruction.
In the adaptive scheduling method, the values of the parameters α, λ and μ are optimized as follows:
select a set of parameters α, λ and μ, each in the range 0 to 1;
send requests calling interface $i$ to the cluster gateway at rate $v_i = v_0$, where $v_0$ is a set rate, and measure the rate $w'$ at which requests are completed; repeat the operation several times to obtain the mean of $w'$, denoted $\overline{w'}$.
The rate at which the cluster completes requests for interface $i$ is then
$G_{Q,i}(v_0) = \overline{w'}$
Choosing different values of $v_i$, the rate $G_{Q,i}(v_i)$ at which the cluster completes requests for interface $i$ can be measured in the same way for each $v_i$;
Compute the scheduling efficiency function $H_{Q,i}(v_i)$ of interface $i$;
Switch the interface requested from the cluster gateway and repeat the calculation until all interfaces have been traversed, obtaining $H_{Q,i}(v_i)$ for each; then determine the comprehensive scheduling efficiency function $H_Q(v)$ of the $m$ interfaces on cluster $Q$;
Set a step size, vary α, λ and μ, and repeat the measurement to obtain $H_Q(v)$ for each setting; select the α', λ' and μ' for which $H_Q(v)$ is largest.
The scheduling efficiency function $H_{Q,i}(v_i)$ is calculated as:
Figure RE-GDA0002480920810000032
where $F''_i(\cdot)$ is an approximate representation of the ideal value, under flow control, of $F_i(\cdot)$, the rate at which a single device completes requests for interface $i$:
Figure RE-GDA0002480920810000033
where $v_i^*$ is the overload threshold, i.e. the value of $v_i$ up to which the measured $F_i(v_i)$ keeps increasing as $v_i$ increases; $F_i(v_i^*)$ is the rate at which a single device completes requests at rate $v_i^*$; and $|D_i|$ is the number of devices on which interface $i$ is deployed.
$v_i^*$ and $F_i(v_i)$ are measured as follows:
control the external variables;
send requests calling interface $i$ to the device at rate $v_i = v_0$ and measure the rate $w'$ at which requests are completed; repeat the operation several times to obtain the mean of $w'$, denoted $\overline{w'}$.
Then
$F_i(v_0) = \overline{w'}$
Vary $v_i$ and measure $F_i(v_i)$ at other values of $v_i$ to obtain $F_i(v_i)$ and $v_i^*$.
Varying $v_i$ and measuring $F_i(v_i)$ at other values of $v_i$ specifically comprises:
increasing $v_i$ exponentially to determine the overall trend of $F_i(v_i)$ and the interval containing its maximum;
traversing that interval linearly in $v_i$ to determine the maximum of $F_i(v_i)$ and how $F_i(v_i)$ varies around the maximum.
Setting a step size, varying α, λ and μ, repeating the measurement of the comprehensive scheduling efficiency function $H_Q(v)$, and selecting the α', λ' and μ' for which $H_Q(v)$ is largest specifically comprises:
fixing α and λ and optimizing μ, measuring $H_Q(v)$ to obtain the optimal μ'; fixing α and μ and optimizing λ, measuring $H_Q(v)$ to obtain the optimal λ'; fixing λ and μ and optimizing α, measuring $H_Q(v)$ to obtain the optimal α'; thereby obtaining α', λ' and μ'.
The optimal parameters (α', λ', μ') are then verified for local optimality by comparing them with the 26 adjacent combinations (α, λ, μ). If (α', λ', μ') is better than all 26 combinations, local optimality is satisfied; if one or more combinations are better than (α', λ', μ'), the (α, λ, μ) with the highest $H_Q(v)$ is taken as the optimal (α', λ', μ').
The comprehensive scheduling efficiency function $H_Q(v)$ is calculated as:
Figure RE-GDA0002480920810000041
where $m$ is the number of interfaces.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
Based on the service-capability characteristics of a single device, the invention establishes an adaptive scheduling algorithm model that improves the scheduling efficiency of a device cluster, and considers how the parameters controlling the receive window affect the algorithm. Specifically, a receive window is maintained for each device to control the number of requests it processes simultaneously, and the window size is adjusted by feedback from the request processing results, so that flow control is self-adaptive.
$W_{j,i}$ indirectly describes the service capability $F_i(v)$ of a single device, and $C_{j,i}$ describes the load of the device. Ensuring $C_{j,i} \le W_{j,i}$ achieves flow control. Selecting the device for which
Figure RE-GDA0002480920810000042
is smallest distributes traffic evenly without accurately monitoring the instantaneous request traffic reaching each device. Automatically adjusting $W_{j,i}$ according to the success or failure of interface calls adapts automatically to interfaces whose $v^*$ is unknown.
The invention finds experimentally that the service capability of a single device is highly sensitive to traffic overload, and therefore proposes a method for calculating the ideal value of single-device service capability under flow control.
The invention tests and optimizes the adaptive scheduling algorithm on a data service cluster. Comparison with a single device without flow control verifies that the algorithm controls traffic effectively. Each parameter of the algorithm is optimized and analysed independently to obtain the optimal parameter combination.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow chart of the adaptive scheduling method for a data service cluster according to the present invention;
FIG. 2 shows the measured interface service capability of a single device in the second embodiment, with v increasing exponentially;
FIG. 3 shows the ideal value of the interface service capability of a single device under flow control in the second embodiment;
FIG. 4 shows the measured interface service capability of a single device in the second embodiment, with v traversing linearly the interval containing the maximum;
FIG. 5 shows the data structure used by the adaptive scheduling algorithm in the second embodiment;
FIG. 6 shows the flow-control effect of the adaptive scheduling algorithm in the second embodiment.
Detailed Description
Example one
As shown in fig. 1, the present invention provides an adaptive scheduling method for a data service cluster, including:
an API call request is issued and placed behind the existing requests in first-in, first-out (FIFO) order;
the request at the front of the queue is read and parsed, and its interface $i'$ is read;
the columns of $W$ and $C$ corresponding to $i'$ are traversed, where $W_{n\times m}$ is the receive window matrix whose element $W_{j,i}$ records the maximum number of simultaneous calls that interface $i$ on device $j$ can accept, and $C_{n\times m}$ is the concurrency matrix whose element $C_{j,i}$ records the current concurrency (the number of requests being processed) of interface $i$ on device $j$; this yields the set of candidate devices (devices whose current concurrency is below the receive window) $D' = \{\, j \mid C_{j,i'} < W_{j,i'},\ j \in D \,\}$, where $D$ is the set of all devices;
when $D'$ is empty, the API call request is put back into the request queue in order;
when $D'$ is not empty, the device $j'$ with the lowest load is selected from $D'$:
Figure RE-GDA0002480920810000061
$C_{j',i'}$ is incremented by 1 and the request is assigned to device $j'$ for execution; if the update time has been reached, $V_{j',i'}$ is updated, where $V_{j',i'}$ records the rate at which requests calling interface $i'$ arrive at device $j'$; $V_{j',i'}$ does not need to be updated frequently (e.g., every minute);
when execution on device $j'$ times out or fails, $F_{j',i'}$ is incremented by 1, where $F$ is the failed-request count matrix; determine whether $F_{j',i'} \ge \lambda V_{j',i'}$ holds, where $\lambda$ is a rational number between 0 and 1, for example 0.5; if it holds, set
Figure RE-GDA0002480920810000066
where $\alpha$ is a rational number between 0 and 1, for example 0.5, and set $S_{j',i'} = 0$ and $F_{j',i'} = 0$; then return and execute the next instruction; if $F_{j',i'} \ge \lambda V_{j',i'}$ does not hold, return and execute the next instruction;
when device $j'$ executes successfully, $S_{j',i'}$ is incremented by 1, where $S_{n\times m}$ is the successful-request count matrix and $S_{j',i'}$ counts the successful executions of interface $i'$ on device $j'$; determine whether $S_{j',i'} \ge \mu V_{j',i'}$ holds, where $\mu$ is a rational number between 0 and 1, for example 0.5; if it holds, set $W_{j',i'} = W_{j',i'} + 1$, $S_{j',i'} = 0$ and $F_{j',i'} = 0$, then return and execute the next instruction; if $S_{j',i'} \ge \mu V_{j',i'}$ does not hold, return and execute the next instruction;
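The dispatch part of the steps above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `dispatch_once`, the dictionary layout of `W` and `C`, and the reading of "lowest load" as the smallest ratio $C_{j,i'} / W_{j,i'}$ are all assumptions, since the selection formula is only available as an image.

```python
from collections import deque


def dispatch_once(queue, W, C, devices):
    """Try to dispatch the request at the head of the FIFO queue.

    queue   : deque of (request_id, interface) pairs
    W, C    : dicts mapping (device, interface) -> receive window / concurrency
    devices : iterable of all device ids (the set D)
    Returns the chosen device, or None if the request was re-queued.
    """
    request_id, iface = queue.popleft()
    # Candidate set D' = { j in D : C[j, i'] < W[j, i'] }
    candidates = [j for j in devices if C[(j, iface)] < W[(j, iface)]]
    if not candidates:
        # No device has a free window slot: put the request back at the tail.
        queue.append((request_id, iface))
        return None
    # Assumption: "lowest load" means the smallest occupancy ratio C / W.
    target = min(candidates, key=lambda j: C[(j, iface)] / W[(j, iface)])
    C[(target, iface)] += 1  # one more request in flight on the target
    return target


W = {("a", "search"): 2, ("b", "search"): 4}
C = {("a", "search"): 1, ("b", "search"): 1}
queue = deque([(1, "search")])
chosen = dispatch_once(queue, W, C, ["a", "b"])
```

In this example device `b` is chosen because its occupancy ratio (1/4) is lower than that of device `a` (1/2). Every candidate has $W \ge 1$, so the ratio is always defined.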
preferably, the values of the parameters α, λ and μ are optimized as follows:
select a set of parameters α, λ and μ, each in the range 0 to 1;
requests calling interface $i$ are sent to the cluster gateway at rate $v_i = v_0$, where $v_0$ is a set rate, and the rate $w'$ at which requests are completed is measured; preferably, the operation is repeated several times to obtain the mean of $w'$, denoted $\overline{w'}$.
Then
$G_{Q,i}(v_0) = \overline{w'}$
where $G_{Q,i}(v_0)$ is the rate at which the cluster completes requests;
letting $v_i$ increase exponentially (e.g. $v_i = 1, 2, 2^2, 2^3, \ldots, 2^n$), $G_{Q,i}(v_i)$ can be measured in the same way;
The scheduling efficiency function is calculated:
Figure RE-GDA0002480920810000064
where $F''_i(\cdot)$ is an approximate representation of $F'_i(\cdot)$, the ideal value under flow control of the rate $F_i(\cdot)$ at which a single device completes requests:
Figure RE-GDA0002480920810000065
where $v_i^*$ is the overload threshold, i.e. the value of $v_i$ up to which the measured $F_i(v_i)$ keeps increasing as $v_i$ increases, and $F_i(v_i^*)$ is the rate at which a single device completes requests at rate $v_i^*$;
The interface requested from the cluster gateway is switched and the calculation repeated until all interfaces have been traversed, obtaining the scheduling efficiency function $H_{Q,i}(v_i)$ for each. The comprehensive scheduling efficiency function $H_Q(v)$ of the $m$ interfaces on cluster $Q$ is defined as:
Figure RE-GDA0002480920810000071
A step size is set, α, λ and μ are varied, and the measurement is repeated to obtain $H_Q(v)$ for each setting; the α', λ' and μ' for which $H_Q(v)$ is largest are the optimal parameters.
Preferably, α and λ are fixed and μ is optimized: with α = 0.5 and λ = 0.5, μ ∈ {0.2, 0.4, 0.6, 0.8}, giving the optimal μ'. Then α and μ are fixed and λ is optimized: with α = 0.5 and μ = 0.5, λ ∈ {0.2, 0.4, 0.6, 0.8}, giving the optimal λ'. Finally λ and μ are fixed and α is optimized: with λ = 0.5 and μ = 0.5, α ∈ {0.2, 0.4, 0.6, 0.8}, giving the optimal α'.
Preferably, local optimality is verified: the optimal parameters are combined as (α', λ', μ') and compared with the 26 adjacent combinations (α, λ, μ). If (α', λ', μ') is better than all 26 combinations, local optimality is satisfied; if one or more combinations are better than (α', λ', μ'), the (α, λ, μ) with the highest $H_Q(v)$ is taken as the optimal (α', λ', μ').
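The one-parameter-at-a-time search and the 26-neighbour check can be sketched in Python as follows. This is an illustration under stated assumptions: `efficiency` is a hypothetical callable standing in for a full measurement of $H_Q(v)$ at a given (α, λ, μ), the neighbour spacing equals the grid step 0.2, and neighbours falling outside the open interval (0, 1) are skipped.

```python
from itertools import product

GRID = (0.2, 0.4, 0.6, 0.8)


def coordinate_search(efficiency, grid=GRID, default=0.5):
    """Optimise one parameter at a time with the other two held at default."""
    best = []
    for axis in range(3):  # 0: alpha, 1: lambda, 2: mu
        def score(value, axis=axis):
            params = [default] * 3
            params[axis] = value
            return efficiency(*params)
        best.append(max(grid, key=score))
    return tuple(best)


def verify_local(efficiency, point, step=0.2):
    """Compare point with its 26 neighbours and return the best triple found."""
    best, best_score = point, efficiency(*point)
    for offsets in product((-step, 0.0, step), repeat=3):
        cand = tuple(round(p + d, 10) for p, d in zip(point, offsets))
        # Skip the point itself and (assumption) neighbours outside (0, 1).
        if cand == point or not all(0.0 < c < 1.0 for c in cand):
            continue
        cand_score = efficiency(*cand)
        if cand_score > best_score:
            best, best_score = cand, cand_score
    return best
```

If `verify_local` returns the point it was given, that point is locally optimal on the grid; otherwise it returns the better neighbour, matching the selection rule described above. In practice each call to `efficiency` is a full cluster measurement, so results would normally be cached.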
Example two
Single device service capability measurement
Step 1: controlled variable
(1) Software and hardware configuration of the equipment: the measurement is carried out on equipment with the same manufacturer, the same model and the same Android version, all the applications allowed to be unloaded are unloaded, and only the applications and the interfaces to be measured are installed. And the equipment is connected to a power supply to keep the full-charged state of the battery.
(2) Network environment: the device is connected to a stable Wi-Fi access point, and measurements are taken during off-peak hours.
Step 2: Measure $F_i(v)$
Requests calling interface $i$ are sent to the device at rate $v = v_0$, and the rate $w'$ at which requests are completed is measured; the operation is repeated several times to obtain the mean of $w'$, denoted $\overline{w'}$.
Then
$F_i(v_0) = \overline{w'}$
In the same way, $F_i(v)$ can be measured for other values of $v$. Because a single measurement of $F_i(v)$ is time-consuming and the range of $v$ is large, a poor choice of $v$ values would make the total measurement time prohibitive. To determine the overall trend of $F_i(v)$, its maximum, and its variation around the maximum within a reasonable time, the following procedure is used:
(1) First, $v$ is increased exponentially (e.g. $v = 1, 2, 2^2, 2^3, \ldots, 2^n$) to determine the overall trend of $F_i(v)$ and the interval containing its maximum.
(2) Then, $v$ traverses that interval linearly to determine the maximum of $F_i(v)$ and how $F_i(v)$ varies around the maximum.
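The two-phase sweep can be sketched in Python as follows. This is an illustration only: `measure` is a hypothetical callable that returns the mean completion rate $F_i(v)$ for a request rate $v$, and the choice of the bracket (one exponential step on either side of the best coarse point) and of the number of linear points are assumptions.

```python
def sweep_rates(measure, n=8, linear_points=8):
    """Two-phase search for the rate v* that maximises measure(v).

    measure : hypothetical callable returning the mean completion rate F_i(v)
    Returns (v_star, results), where results maps each tested v to F_i(v).
    """
    # Phase 1: exponential sweep v = 1, 2, 4, ..., 2**n
    coarse = [2 ** k for k in range(n + 1)]
    results = {v: measure(v) for v in coarse}
    best = max(coarse, key=results.get)
    k = coarse.index(best)
    lo = coarse[max(k - 1, 0)]
    hi = coarse[min(k + 1, n)]
    # Phase 2: linear sweep across the bracket [lo, hi]
    step = max((hi - lo) // linear_points, 1)
    for v in range(lo, hi + 1, step):
        if v not in results:
            results[v] = measure(v)
    v_star = max(results, key=results.get)
    return v_star, results
```

With `n = 8` the coarse phase needs only nine measurements to cover rates from 1 to 256, and the linear phase adds a handful more inside the bracket, which keeps the total measurement time bounded.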
Preferably, the completion rate of a single device is determined as follows:
to determine the overall trend of $F_i(v_i)$, its maximum, and its variation around the maximum within a reasonable time,
(1) first, $v_i$ is increased exponentially (e.g. $v_i = 1, 2, 2^2, 2^3, \ldots, 2^n$) to determine the overall trend of $F_i(v_i)$ and the interval containing its maximum;
(2) then, $v_i$ traverses that interval linearly to determine the maximum of $F_i(v_i)$ and how $F_i(v_i)$ varies around the maximum.
Preferably, variables are controlled during measurement. Hardware and software configuration of the devices: measurements are made on devices of the same manufacturer, model and Android version; all applications that can be uninstalled are uninstalled, and only the applications and interfaces to be measured are installed; the device is connected to a power supply so that the battery stays fully charged. Network environment: the device is connected to a stable Wi-Fi access point, and measurements are taken during off-peak hours.
The service capability of a subset of behavioral reflection interfaces was measured on a single device. The experimental device is a Changhong S07 mobile phone running Android 6.0. The selected interfaces are exposed by seven common applications (see Table 1). Except for the interface that retrieves the new-product list of the 'split-many' application, all interfaces take parameters. To prevent a possible application-local cache from affecting measurement accuracy, a randomly chosen parameter is used for each request. 'Keywords' are drawn from a general Chinese lexicon, while 'stock names or codes' and 'city codes' come from specialized lexicons.
TABLE 1 Measured behavioral reflection interfaces
Figure RE-GDA0002480920810000081
Figure RE-GDA0002480920810000091
FIG. 2 shows the measurement results as $v$ increases exponentially, revealing the overall trend of $F_i(v)$:
(1) As $v$ increases, $F_i(v)$ first increases, then decreases, and finally approaches 0; this variation corresponds to the typical "idle, saturation, overload" progression.
(2) The more closely the growing phase of the $F_i(v)$ curve (before the maximum is reached) overlaps the line $F_i(v) = v$ (grey dashed line), the more stably the interface serves requests before it is saturated by the request traffic.
(3) The maxima of $F_i(v)$ differ between interfaces, possibly widely, reflecting differences in service capability. For example, $\max F_i(v)$ of the 'search offer information' interface of the 'Mei Tuo' application is less than 5, whereas that of the 'search stock information' interface of 'the classic edition of the Yilian playsman' is more than 100, a difference of more than 20 times.
FIG. 4 shows the measurement results as $v$ traverses linearly the interval containing the maximum of $F_i(v)$. The abscissa, ordinate and grey dashed line have the same meaning as in FIG. 2. In each row, the left image is the corresponding curve from FIG. 2 ($v$ grows exponentially), with the interval containing the maximum highlighted; the right image shows in detail how $F_i(v)$ varies around the maximum ($v$ increases linearly). The images show that if $v$ continues to increase after $F_i(v)$ reaches its maximum, the fluctuation of $F_i(v)$ increases significantly; that is, once saturated by request traffic, the interface is sensitive to further increases in traffic.
From these measurements it can be inferred that if a flow control mechanism rejects part of the requests when $v$ is too large, so that the rate $v'$ at which the device accepts requests satisfies the following formula, the degradation of service capability caused by overload can be avoided.
Figure RE-GDA0002480920810000092
As shown in FIG. 3, the ideal value of $F_i(v)$ under flow control is defined as $F'_i(v)$:
Figure RE-GDA0002480920810000093
When $v < v^*$, $F_i(v)$ can be approximated as $F_i(v) \approx v$, so $F'_i(v)$ can be approximated by $F''_i(v)$.
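One form of $F''_i(v)$ consistent with the two statements above is written out below. It is given as an assumption rather than as the patent's exact formula: below the threshold the completion rate is taken equal to the arrival rate, and above it flow control is assumed to hold the device at its completion rate at $v^*$.

```latex
F''_i(v) =
\begin{cases}
  v,        & v \le v^{*} \\
  F_i(v^{*}), & v > v^{*}
\end{cases}
```

Under this reading $F''_i$ is non-decreasing and concave, which is what makes an even split of traffic across devices the best possible allocation.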
The "swallow cloud" data service cluster suitable for the adaptive scheduling algorithm should satisfy the following assumptions:
(1) The cluster consists of $n$ devices (numbered $1, 2, \ldots, n$) with the same hardware and software configuration (same manufacturer, model and Android version).
(2) The network environment where the cluster is located is stable.
(3) The cluster provides $m$ interfaces (numbered $1, 2, \ldots, m$); the set of numbers of the devices on which interface $i$ is deployed is $D_i$.
(4) The service capability of a given interface is the same on every device and is represented by the function $F_i(v)$.
(5) Different interfaces deployed on the same device do not affect one another; that is, $F_i(v)$ is independent of the other interfaces deployed on the device.
(6) Interface call requests are stateless, so the gateway can forward a request to any device in the cluster on which the corresponding interface is deployed.
A "swallow cloud" data service cluster $Q$ satisfying the above assumptions can be represented by a $(2m+2)$-tuple:
$Q = (n, m, D_1, D_2, \ldots, D_m, F_1(v), F_2(v), \ldots, F_m(v))$
The total service capability of cluster $Q$ on interface $i$ can be defined as a function $G_{Q,i}(v)$, where $v$ is the rate at which requests arrive at the cluster and $G_{Q,i}(v)$ is the rate at which the cluster completes requests (both in requests per second). If the gateway assigns requests to device $j$ at rate $v_j$ (in requests per second), then:
Figure RE-GDA0002480920810000101
$F''_i(v)$ is the approximate representation of the ideal value of $F_i(v)$ under flow control; it is easy to prove that:
Figure RE-GDA0002480920810000102
that is,
Figure RE-GDA0002480920810000103
which gives the upper limit of $G_{Q,i}(v)$ for given $|D_i|$ and $F_i(v)$. Define the scheduling efficiency function $H_{Q,i}(v)$ of interface $i$ on cluster $Q$:
Figure RE-GDA0002480920810000104
Define the comprehensive scheduling efficiency function $H_Q(v)$ of the $m$ interfaces on cluster $Q$:
Figure RE-GDA0002480920810000105
Both the cluster parameters (i.e., $n$, $m$, $D_i$, $F_i$) and the scheduling algorithm affect $H_Q(v)$. When the cluster parameters are fixed, $H_Q(v)$ reflects the performance of the scheduling algorithm. The goal of the scheduling algorithm is therefore to increase $H_Q(v)$.
To increase $H_Q(v)$, $H_{Q,i}(v)$ must be increased. To increase $H_{Q,i}(v)$, $G_{Q,i}(v)$ must approach its upper limit, which requires:
Figure RE-GDA0002480920810000106
Equation ① requires controlling the request traffic so that the rate at which each device in $D_i$ receives requests does not exceed the threshold $v^*$; equation ② requires distributing the request traffic evenly so that every device in $D_i$ receives requests at the same rate (i.e., the loads are equal). The scheduling algorithm only needs to control and distribute the request traffic appropriately for $G_{Q,i}(v)$ to approach its upper limit, $H_{Q,i}(v)$ to approach 1, and finally $H_Q(v)$ to approach 1, achieving the optimization goal.
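A small numerical sketch in Python may make the efficiency measure concrete. The formulas themselves are only available as images, so the forms used here are assumptions: $F''_i(x)$ equals $x$ up to $v^*$ and $F_i(v^*)$ beyond it, the upper limit of $G_{Q,i}(v)$ is $|D_i| \cdot F''_i(v / |D_i|)$ (even split across devices), $H_{Q,i}(v)$ is the ratio of the measured rate to that limit, and $H_Q(v)$ is the arithmetic mean over interfaces. All function names are hypothetical.

```python
def ideal_rate(x, v_star, f_star):
    """Assumed F''_i(x): equal to x up to the threshold, capped at F_i(v*) beyond."""
    return x if x <= v_star else f_star


def interface_efficiency(g, v, n_devices, v_star, f_star):
    """Assumed H_{Q,i}(v) = G_{Q,i}(v) / (|D_i| * F''_i(v / |D_i|)).

    g         : measured cluster completion rate G_{Q,i}(v)
    v         : request arrival rate at the cluster
    n_devices : |D_i|, number of devices serving interface i
    """
    upper = n_devices * ideal_rate(v / n_devices, v_star, f_star)
    return g / upper


def overall_efficiency(per_interface):
    """Assumed H_Q(v): arithmetic mean of H_{Q,i} over the m interfaces."""
    return sum(per_interface) / len(per_interface)


# Four devices, threshold v* = 20 requests/s, F_i(v*) = 19 requests/s.
light = interface_efficiency(g=36, v=40, n_devices=4, v_star=20, f_star=19)
heavy = interface_efficiency(g=38, v=200, n_devices=4, v_star=20, f_star=19)
```

In the light-load case each device would ideally receive 10 requests/s, so the upper limit is 40 and the efficiency is 36/40. In the overloaded case the limit is capped at 4 × 19 = 76, so completing 38 requests/s gives an efficiency of one half.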
However, to satisfy the above conditions approximately, the scheduling algorithm must solve the following problems:
(1) The threshold $v^*$ differs between interfaces, the set of interface types the scheduling algorithm must support is not under its control, and the thresholds of all interfaces cannot be measured experimentally in advance.
(2) When the request traffic reaching the cluster is large, monitoring the instantaneous traffic reaching each device faces a conflict between low latency and high accuracy: monitoring traffic more accurately requires updating the data more often, which increases the latency of the scheduling algorithm.
One feasible solution is to control traffic adaptively, sensing the interface threshold and the device load dynamically, as follows:
(1) Maintain a receive window matrix $W_{n\times m}$, whose element $W_{j,i}$ records the maximum number of simultaneous calls that interface $i$ on device $j$ can accept. $W_{j,i}$ can be initialized to a small integer and is updated continuously while the algorithm runs.
(2) Maintain a current concurrency matrix $C_{n\times m}$, whose element $C_{j,i}$ records the current concurrency (the number of requests being processed) of interface $i$ on device $j$. When the gateway forwards an interface-$i$ call request to device $j$, $C_{j,i}$ is incremented by 1; when device $j$ finishes processing an interface-$i$ call request, whether or not execution succeeded, $C_{j,i}$ is decremented by 1.
(3) When the gateway receives a call request for interface $i$, it tries to find a target device $j'$ satisfying:
Figure RE-GDA0002480920810000111
(4) If no target device $j'$ satisfying the formula exists, a rate-limiting operation is performed: the request waits until a device meeting the condition appears, and is rejected if the wait times out. If a target device $j'$ exists, the request is forwarded to device $j'$ for execution and the gateway waits for the execution result.
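The bookkeeping in rule (2), where $C_{j,i}$ must be decremented whether or not the call succeeds, maps naturally onto a `try`/`finally` block. The Python sketch below is an illustration only; the helper name `track_call` and the dictionary representation of $C$ are assumptions.

```python
from contextlib import contextmanager


@contextmanager
def track_call(C, device, iface):
    """Count one in-flight call in C[(device, iface)] for its duration."""
    C[(device, iface)] = C.get((device, iface), 0) + 1
    try:
        yield
    finally:
        # Runs on success and on failure alike, matching rule (2).
        C[(device, iface)] -= 1


C = {}
with track_call(C, "device-1", "search"):
    in_flight = C[("device-1", "search")]  # the forwarded call is counted here
after = C[("device-1", "search")]
```

Because the decrement sits in `finally`, a timeout or an exception raised while the request is being forwarded cannot leave a stale count that would permanently shrink the usable window.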
Determining the adjustment rule for $W_{j,i}$ requires solving two key problems: (1) how to judge that a large number of requests have succeeded or failed within a short time; (2) by how much $W_{j,i}$ should be increased or decreased. The second problem is easily solved: because cluster devices are very sensitive to request overload, the adjustment of $W_{j,i}$ should follow the principle of "additive increase, multiplicative decrease". For the first problem, whether the number is "large" can be judged from the ratio of the successful or failed request count to the request traffic arriving at the device; what is truly difficult is defining "short time". The proposed "competition count" rule avoids judging "short time" directly while still achieving the expected effect. The "competition count" rule is as follows:
(1) Maintain a current traffic matrix $V_{n \times m}$, whose element $V_{j,i}$ records the rate at which requests calling interface i arrive at device j. $V_{j,i}$ does not need to be updated frequently (for example, once per minute).
(2) Maintain a successful request count matrix $S_{n \times m}$. Whenever device j successfully executes a call to interface i, $S_{j,i}$ is incremented by 1. If the updated $S_{j,i}$ satisfies $S_{j,i} \ge \mu V_{j,i}$, then $W_{j,i}$ is incremented by 1, and $S_{j,i}$ and $F_{j,i}$ are reset to 0.
(3) Maintain a failed request count matrix $F_{n \times m}$. Whenever device j fails to execute a call to interface i, $F_{j,i}$ is incremented by 1. If the updated $F_{j,i}$ satisfies $F_{j,i} \ge \lambda V_{j,i}$, then $W_{j,i}$ is updated according to
Figure RE-GDA0002480920810000121
and $S_{j,i}$ and $F_{j,i}$ are reset to 0.
The key to the "competition count" rule is that whichever of $S_{j,i}$ and $F_{j,i}$ reaches its threshold first causes $W_{j,i}$ to change, after which both $S_{j,i}$ and $F_{j,i}$ are reset to 0. The influence of any successful or failed request on $S_{j,i}$ and $F_{j,i}$ therefore does not persist, so $S_{j,i}$ and $F_{j,i}$ reflect the state of the cluster over a short time more accurately.
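A minimal sketch of the "competition count" rule for a single (device j, interface i) pair is given below. The increase step (add 1) follows the text. The decrease step is an assumption: the source gives that formula only as a figure, so the sketch uses $W \leftarrow \max(1, \lfloor \alpha W \rfloor)$ as one plausible multiplicative decrease. The names `WindowState` and `record_result` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class WindowState:
    """State of one (device j, interface i) pair."""
    W: int        # receive window W[j][i]
    V: float      # request arrival rate V[j][i], refreshed infrequently
    S: int = 0    # successes counted since the last window change
    F: int = 0    # failures counted since the last window change


def record_result(st: WindowState, ok: bool,
                  alpha: float, lam: float, mu: float) -> None:
    """Update the counters after one request and adjust W if a counter wins."""
    if ok:
        st.S += 1
        if st.S >= mu * st.V:
            st.W += 1                 # additive increase, as in the text
            st.S = st.F = 0           # both counters restart
    else:
        st.F += 1
        if st.F >= lam * st.V:
            # ASSUMED decrease step; the source formula is not reproduced.
            st.W = max(1, math.floor(alpha * st.W))
            st.S = st.F = 0           # both counters restart
```

For example, with V = 4 and (α, λ, μ) = (0.5, 0.5, 0.5), both thresholds equal 2: two successes before two failures raise W by 1, while two failures before two successes shrink W under the assumed decrease step.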
The parameters to be determined are α, λ and μ, with value range 0 < α, λ, μ < 1. An adaptive scheduling algorithm that updates $W_{j,i}$ using the "competition count" rule can be uniquely represented by a triplet (α, λ, μ).
As shown in FIG. 5, the data structures used by the adaptive scheduling algorithm are mainly queues and hash tables.
The queue stores the requests waiting to be processed. Its space complexity is O(N), where N is the maximum queue length; the average time complexity of the enqueue and dequeue operations is O(1).
The hash table stores global variables such as $W_{j,i}$. It is nested so that an element is accessed by "device-interface"; an element that is NULL indicates that the corresponding interface is not deployed on the device. Its space complexity is O(nm), where n is the number of devices and m is the number of interfaces. The average time complexity of accessing an element is O(1).
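The two data structures can be sketched in a few lines. This is an illustrative layout only: the class name `SchedulerState` and its methods are hypothetical, a Python `deque` stands in for the request queue, and a dict of dicts stands in for the nested hash table, with `None` playing the role of NULL.

```python
from collections import deque
from typing import Deque, Dict, Optional


class SchedulerState:
    """Bounded FIFO request queue plus a nested device -> interface table."""

    def __init__(self, max_queue: int) -> None:
        self.max_queue = max_queue            # N, the maximum queue length
        self.pending: Deque[dict] = deque()   # requests waiting to be processed
        self.W: Dict[str, Dict[str, int]] = {}  # W[device][interface]

    def enqueue(self, request: dict) -> bool:
        """Append at the tail in O(1); refuse when the queue is full."""
        if len(self.pending) >= self.max_queue:
            return False
        self.pending.append(request)
        return True

    def dequeue(self) -> Optional[dict]:
        """Pop from the head in O(1); None when the queue is empty."""
        return self.pending.popleft() if self.pending else None

    def deploy(self, device: str, interface: str, window: int = 1) -> None:
        """Register an interface on a device with an initial small window."""
        self.W.setdefault(device, {})[interface] = window

    def window(self, device: str, interface: str) -> Optional[int]:
        """Average O(1) lookup; None means the interface is not deployed."""
        return self.W.get(device, {}).get(interface)
```

The other matrices (C, V, S, F) could be stored in the same nested form.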
The core of the adaptive scheduling algorithm is adaptive flow control, and the factors that determine its performance are the adjustment rule and the parameters of the receive window. Implementing the adaptive scheduling algorithm therefore requires three steps:
(1) Select the adjustment rule for the receive window. Here the "competition count" rule is selected, and the parameters to be optimized are α, λ and μ.
(2) Verify the effectiveness of the algorithm, that is, determine whether it can effectively control the traffic reaching the device.
(3) Optimize the algorithm's parameters for the actual data service cluster to improve scheduling efficiency.
The service capability of a single device is measured with the adaptive scheduling algorithm in use and compared with the service capability without flow control, to verify that the adaptive scheduling algorithm effectively controls the traffic reaching the device and prevents the device from losing service capability due to overload. The verification method is as follows:
(1) Choose any legal parameter values for the "competition count" rule; here (α, λ, μ) = (0.5, 0.5, 0.5).
(2) Apply the algorithm to an identical single device and measure the device's service capability by the same method in the same experimental environment. As shown in FIG. 6, the abscissa represents the request arrival rate v and the ordinate represents the service capability $F_i(v)$. The black curve represents the result with the adaptive scheduling algorithm; it shows that the adaptive scheduling algorithm markedly improves the service capability of a single device under high load.
The algorithm parameters α, λ and μ are optimized on the "swallow cloud" data service cluster, with the goal of improving the comprehensive scheduling efficiency $H_Q(v)$. The optimization method is as follows:
(1) Assume that the three parameters can be optimized independently: when one parameter is optimized, the values of the other two are fixed.
(2) Fix α and λ, and select k values uniformly from the interval (0, 1) as candidates for μ (k = 4 in the experiments, giving the candidate values 0.2, 0.4, 0.6 and 0.8). Test in turn the effect of each μ on the cluster's comprehensive scheduling efficiency $H_Q(v)$, and select the value μ' for which $H_Q(v)$ is highest overall.
(3) Similarly, fix α and μ to optimize λ, and fix λ and μ to optimize α, which yields the combination of optimal values (α', λ', μ').
(4) Verify the local optimality of (α', λ', μ'), that is, that it is better than every (α' + pΔ, λ' + qΔ, μ' + rΔ) with p, q, r ∈ {-1, 0, 1} not all 0 (Δ = 0.1 in the experiments). This requires testing 26 parameter combinations and comparing the results with those of (α', λ', μ').
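The search procedure in steps (1) to (4) can be sketched as follows. The sketch assumes that a measurement run can be summarized by one number, here a hypothetical `score(params)` callable that is higher when $H_Q(v)$ is higher overall; how $H_Q(v)$ is reduced to such a number is not specified in the text. Skipping neighbours that fall outside (0, 1) is also an assumption. All function names are hypothetical.

```python
from itertools import product
from typing import Callable, List, Tuple

Params = Tuple[float, float, float]  # (alpha, lambda, mu)


def candidates(k: int) -> List[float]:
    """k evenly spaced values strictly inside (0, 1); k=4 gives 0.2 .. 0.8."""
    return [round(n / (k + 1), 10) for n in range(1, k + 1)]


def coordinate_search(score: Callable[[Params], float],
                      base: Params, k: int = 4) -> Params:
    """Optimize each parameter on its own, holding the other two at `base`."""
    best = []
    for axis in range(3):
        def trial(value: float, axis: int = axis) -> float:
            p = list(base)
            p[axis] = value
            return score((p[0], p[1], p[2]))
        best.append(max(candidates(k), key=trial))
    return (best[0], best[1], best[2])


def neighbours(p: Params, delta: float = 0.1) -> List[Params]:
    """All (p + step * delta) with step in {-1, 0, 1}^3, excluding p itself.

    Combinations that leave the legal range (0, 1) are skipped (assumption).
    """
    out = []
    for step in product((-1, 0, 1), repeat=3):
        if step == (0, 0, 0):
            continue
        q = tuple(round(x + s * delta, 10) for x, s in zip(p, step))
        if all(0 < x < 1 for x in q):
            out.append((q[0], q[1], q[2]))
    return out


def is_local_optimum(score: Callable[[Params], float], p: Params,
                     delta: float = 0.1) -> bool:
    """True when p scores strictly higher than every neighbour."""
    return all(score(p) > score(q) for q in neighbours(p, delta))
```

The count of 26 neighbours is $3^3 - 1$: each of the three parameters moves by -Δ, 0 or +Δ, and the combination with no movement is excluded.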
The parameters (α, λ, μ) were optimized as described above on a cluster Q' consisting of 20 identical devices (identical in software and hardware configuration and in the deployed behavioral reflex interfaces).
Fixing α and λ, optimizing μ
With α = 0.5, λ = 0.5 and μ ∈ {0.2, 0.4, 0.6, 0.8}, the best candidate value of μ is found to be 0.4, so μ' = 0.4.
Fixing α and μ, optimizing λ
With α = 0.5, μ = 0.5 and λ ∈ {0.2, 0.4, 0.6, 0.8}, the results show that λ' = 0.2.
Fixing λ and μ, optimizing α
With λ = 0.5, μ = 0.5 and α ∈ {0.2, 0.4, 0.6, 0.8}, the results show that α' = 0.8.
Verification of local optimality
The optimal parameter combination is (α', λ', μ') = (0.8, 0.2, 0.4). The 26 adjacent combinations (α, λ, μ) were compared with (α', λ', μ'), and the experimental results show that (α', λ', μ') is superior to all 26, so it satisfies local optimality.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (10)

1. An adaptive scheduling method for a data service cluster, comprising:
sending out calling requests, and arranging the calling requests in a request queue according to a first-in first-out sequence;
reading a request of a queue head, analyzing the request, and reading an interface of the request;
screening candidate equipment meeting the conditions;
if no candidate equipment exists, adding the call request to the tail of the queue; if the candidate equipment exists, selecting the candidate equipment with the lowest load;
executing the request on the candidate equipment, if the execution request exceeds the set time or the execution request fails, recording the execution failure, and executing the next instruction;
if the execution is successful, the execution is recorded to be successful, and the next instruction is executed.
2. The adaptive scheduling method of a data service cluster according to claim 1, wherein, after the request is executed on the candidate device and an execution failure is recorded because the request exceeded the set time or failed to execute, a failure condition is determined and the next instruction is executed, which specifically comprises:
determining whether $F_{j',i'} \ge \lambda V_{j',i'}$ holds, where $F_{j',i'}$ is the count of failed executions of calls to interface i' on device j', $V_{j',i'}$ is the rate at which requests calling interface i' arrive at device j', and λ is a real number between 0 and 1, which may be 0.5; if it holds, letting
Figure FDA0002182987190000011
$S_{j',i'} = 0$ and $F_{j',i'} = 0$, where $W_{j',i'}$ records the maximum number of calls to interface i' that device j' may process simultaneously, α is a real number between 0 and 1, and $S_{j',i'}$ is the count of successful executions of calls to interface i' on device j'; then returning to execute the next instruction; if $F_{j',i'} \ge \lambda V_{j',i'}$ does not hold, returning to execute the next instruction.
3. The adaptive scheduling method of a data service cluster according to claim 1, wherein, after the execution succeeds and the success is recorded, a success condition is determined and the next instruction is executed, which specifically comprises:
determining whether $S_{j',i'} \ge \mu V_{j',i'}$ holds, where μ is a rational number between 0 and 1, $S_{j',i'}$ is the count of successful executions of calls to interface i' on device j', and $V_{j',i'}$ is the rate at which requests calling interface i' arrive at device j'; if it holds, letting $W_{j',i'} = W_{j',i'} + 1$, $S_{j',i'} = 0$ and $F_{j',i'} = 0$, where $F_{j',i'}$ is the count of failed executions of calls to interface i' on device j', and then returning to execute the next instruction; if $S_{j',i'} \ge \mu V_{j',i'}$ does not hold, returning to execute the next instruction.
4. The adaptive scheduling method of a data service cluster according to claim 2 or 3, wherein the values of the parameters α, λ and μ are optimized, specifically:
selecting a group of parameters α, λ and μ, each in the range 0 to 1;
sending requests calling interface i to the cluster gateway at a rate $v_i = v_0$, where $v_0$ is a set rate, and measuring the rate w' at which the requests are completed; repeating the operation multiple times to obtain the mean of w',
Figure FDA0002182987190000021
the rate at which the cluster completes requests for interface i being
Figure FDA0002182987190000022
selecting different values of $v_i$ and measuring in the same way the rate $G_{Q,i}(v_i)$ at which the cluster completes requests for interface i at each $v_i$;
computing the scheduling efficiency function $H_{Q,i}(v_i)$ of interface i;
changing the interface i requested from the cluster gateway until all interfaces have been traversed, and repeating the calculation to obtain the scheduling efficiency function $H_{Q,i}(v_i)$ of each; measuring the comprehensive scheduling efficiency function $H_Q(v)$ of the m interfaces on cluster Q;
setting a step size, changing α, λ and μ, repeating the measurement to obtain the comprehensive scheduling efficiency function $H_Q(v)$, and selecting the α', λ' and μ' corresponding to the maximum value of $H_Q(v)$.
5. The method of claim 4, wherein the scheduling efficiency function $H_{Q,i}(v_i)$ is calculated as follows:
Figure FDA0002182987190000023
wherein $F''_i(\cdot)$ is an approximate representation of the ideal value of the rate at which a single device completes requests for interface i under flow control,
Figure FDA0002182987190000024
wherein $v_i^*$ is the critical value of $v_i$ at which $F_i(v_i)$, having increased with $v_i$, reaches overload; $F_i(v_i^*)$ is the rate at which a single device completes requests at the rate $v_i^*$; and $|D_i|$ is the number of devices of interface i.
6. The method of claim 5, wherein $v_i^*$ and $F_i(v_i)$ are measured as follows:
controlling external variables;
sending requests calling interface i to the device at a rate $v_i = v_0$ and measuring the rate w' at which the requests are completed; repeating the operation multiple times to obtain the mean of w',
Figure FDA0002182987190000031
then
Figure FDA0002182987190000032
changing $v_i$ and measuring $F_i(v_i)$ for other values of $v_i$ to obtain $F_i(v_i)$ and $v_i^*$.
7. The method of claim 6, wherein changing $v_i$ and measuring $F_i(v_i)$ for other values of $v_i$ specifically comprises:
increasing $v_i$ exponentially to determine the overall trend of $F_i(v)$ and the interval containing its maximum;
traversing that interval linearly in $v_i$ to determine the maximum of $F_i(v_i)$ and how $F_i(v_i)$ varies near the maximum.
8. The adaptive scheduling method of claim 7, wherein setting a step size, changing α, λ and μ, repeating the measurement to obtain the comprehensive scheduling efficiency function $H_Q(v)$, and selecting the α', λ' and μ' corresponding to the maximum value of $H_Q(v)$ specifically comprises:
fixing α and λ and optimizing μ, measuring the comprehensive scheduling efficiency function $H_Q(v)$ to obtain the optimal μ'; fixing α and μ and optimizing λ, measuring $H_Q(v)$ to obtain the optimal λ'; fixing λ and μ and optimizing α, measuring $H_Q(v)$ to obtain the optimal α'; thereby obtaining α', λ' and μ'.
9. The adaptive scheduling method of a data service cluster as claimed in claim 8, wherein the optimal parameters (α', λ', μ') are verified for local optimality: the 26 adjacent groups (α, λ, μ) are compared with (α', λ', μ'); if (α', λ', μ') is better than all 26 groups (α, λ, μ), local optimality is satisfied; if one or more of the groups are better than (α', λ', μ'), the (α, λ, μ) with the highest $H_Q(v)$ is selected as the optimal (α', λ', μ').
10. The method for adaptive scheduling of a data service cluster of claim 9, wherein the comprehensive scheduling efficiency function $H_Q(v)$ is calculated as follows:
Figure FDA0002182987190000041
where m is the number of interfaces.
CN201910803526.6A 2019-08-28 2019-08-28 Self-adaptive scheduling method of data service cluster Active CN111352728B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910803526.6A CN111352728B (en) 2019-08-28 2019-08-28 Self-adaptive scheduling method of data service cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910803526.6A CN111352728B (en) 2019-08-28 2019-08-28 Self-adaptive scheduling method of data service cluster

Publications (2)

Publication Number Publication Date
CN111352728A true CN111352728A (en) 2020-06-30
CN111352728B CN111352728B (en) 2023-10-03

Family

ID=71196890

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910803526.6A Active CN111352728B (en) 2019-08-28 2019-08-28 Self-adaptive scheduling method of data service cluster

Country Status (1)

Country Link
CN (1) CN111352728B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070061783A1 (en) * 2005-09-09 2007-03-15 Sun Microsystems, Inc. Task dispatch monitoring for dynamic adaptation to system conditions
WO2008151464A1 (en) * 2007-06-14 2008-12-18 Zte Corporation A wireless network simulation method
KR101396394B1 (en) * 2013-03-20 2014-05-19 주식회사 스마티랩 Methods to autonomously optimize performance using clustering in mobile cloud environment
WO2014173339A1 (en) * 2013-08-07 2014-10-30 中兴通讯股份有限公司 Task scheduling service system and method
CN104346215A (en) * 2013-08-07 2015-02-11 中兴通讯股份有限公司 Task scheduling service system and method
US10382380B1 (en) * 2016-11-17 2019-08-13 Amazon Technologies, Inc. Workload management service for first-in first-out queues for network-accessible queuing and messaging services

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐文莉: "分布式渲染调度策略优化研究与实现", pages 1 *

Also Published As

Publication number Publication date
CN111352728B (en) 2023-10-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant