CN111352728B - Self-adaptive scheduling method of data service cluster - Google Patents

Self-adaptive scheduling method of data service cluster

Publication number
CN111352728B
Authority
CN
China
Prior art keywords
request
interface
lambda
rate
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910803526.6A
Other languages
Chinese (zh)
Other versions
CN111352728A (en
Inventor
黄罡
董瀚
景翔
蔡华谦
姜海鸥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Information Technology Institute (tianjin Binhai)
Original Assignee
Peking University Information Technology Institute (tianjin Binhai)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Information Technology Institute (tianjin Binhai)
Priority to CN201910803526.6A
Publication of CN111352728A
Application granted
Publication of CN111352728B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806: Task transfer initiation or dispatching
    • G06F9/54: Interprogram communication
    • G06F9/546: Message passing systems or structures, e.g. queues
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to the field of task scheduling, in particular to a self-adaptive scheduling method of a data service cluster. The method comprises the following steps: sending a call request, parsing the request and reading the interface it requests; screening candidate devices that meet the conditions; selecting the candidate device with the lowest load; executing the request on that device; if execution exceeds the set time or fails, recording the failure, evaluating the failure condition, and executing the next instruction; if execution succeeds, recording the success, evaluating the success condition, and executing the next instruction. By selecting the device with the smallest load, the application distributes traffic evenly without accurately monitoring the instantaneous request traffic reaching each device; by adjusting automatically according to the success or failure of interface calls, it adapts automatically to unknown interfaces.

Description

Self-adaptive scheduling method of data service cluster
Technical Field
The present application relates to the field of data scheduling, and in particular, to a method for adaptive scheduling of a data service cluster.
Background
Current artificial intelligence cannot do without data. Acquiring big data has become an obvious bottleneck, numerous data barriers have appeared, and the many resulting "information islands" leave big-data research facing the dilemma of having no data available. The "information island" problem of smart devices is more serious because of the inherently closed nature of mobile applications.
One idea for solving the problem of "information island" of intelligent equipment is to develop a novel software definition theory based on a classical software definition theory, namely, exposing controllable components of the intelligent equipment through an application programming interface (application programming interface, API) so as to realize on-demand management and on-demand service of the intelligent equipment.
Unlike classical data service clusters, the novel data service cluster has the following characteristics: the service capability of different interfaces is greatly different; the service capabilities of the device are greatly affected by the requested traffic.
Disclosure of Invention
The embodiment of the application provides a self-adaptive scheduling method of a data service cluster, which maintains a receiving window for each device, controls the number of requests processed by the device at the same time, and simultaneously adjusts the size of the receiving window according to the feedback of the request processing result so as to realize the self-adaptation of flow control.
According to a first aspect of an embodiment of the present application, a method for adaptive scheduling of a data service cluster includes:
sending out call requests, and arranging the call requests in a request queue according to a first-in first-out sequence;
reading the request at the head of the queue, parsing the request, and reading the interface it requests;
screening candidate devices meeting the conditions;
if no candidate device exists, adding the calling request to the tail of the queue; if the candidate equipment exists, selecting the candidate equipment with the lowest load;
executing the request on the candidate equipment, if the execution request exceeds the set time or the execution request fails, recording the execution failure and executing the next instruction;
if the execution is successful, recording that the execution is successful, and executing the next instruction.
If execution of the request exceeds the set time or fails, the failure is recorded, the failure condition is evaluated, and the next instruction is executed. Evaluating the failure condition specifically comprises:
judging whether $F_{j',i'} \geq \lambda V_{j',i'}$ holds, where $F_{j',i'}$ is the count of failed calls of interface $i'$ on device $j'$, $V_{j',i'}$ is the rate at which requests calling interface $i'$ reach device $j'$, and $\lambda$ is a number between 0 and 1, for example 0.5. When it holds, the receive window $W_{j',i'}$, which records the maximum number of calls of interface $i'$ that device $j'$ may process simultaneously, is reduced multiplicatively by the factor $\alpha$, a number between 0 and 1, and the counters are cleared: $S_{j',i'} = 0$, $F_{j',i'} = 0$, where $S_{j',i'}$ is the count of successful calls of interface $i'$ on device $j'$; execution then returns to the next instruction. When $F_{j',i'} \geq \lambda V_{j',i'}$ does not hold, execution returns directly to the next instruction.
If execution succeeds, the success is recorded, the success condition is evaluated, and the next instruction is executed. Evaluating the success condition specifically comprises:
judging whether $S_{j',i'} \geq \mu V_{j',i'}$ holds, where $\mu$ is a number between 0 and 1, $S_{j',i'}$ is the count of successful calls of interface $i'$ on device $j'$, and $V_{j',i'}$ is the rate at which requests calling interface $i'$ reach device $j'$. When it holds, let $W_{j',i'} = W_{j',i'} + 1$, $S_{j',i'} = 0$, $F_{j',i'} = 0$, where $F_{j',i'}$ is the count of failed calls of interface $i'$ on device $j'$; execution then returns to the next instruction. When $S_{j',i'} \geq \mu V_{j',i'}$ does not hold, execution returns directly to the next instruction.
The self-adaptive scheduling method of the data service cluster optimizes the values of the parameters $\alpha$, $\lambda$ and $\mu$, specifically:
selecting a group of parameters $\alpha$, $\lambda$, $\mu$, each with a value between 0 and 1;
sending requests that call interface $i$ to the cluster gateway at a set rate $v_i = v_0$ and measuring the rate $w'$ at which the requests are completed; repeating the operation several times to obtain the mean $\overline{w'}$, which gives the cluster's request completion rate for interface $i$, $G_{Q,i}(v_0) = \overline{w'}$;
selecting different values of $v_i$ and measuring, in the same way, the corresponding cluster completion rate $G_{Q,i}(v_i)$ for interface $i$;
computing the scheduling efficiency function $H_{Q,i}(v_i)$ of interface $i$;
changing the interface $i$ requested from the cluster gateway until all interfaces have been traversed, repeating the calculation to obtain each scheduling efficiency function $H_{Q,i}(v_i)$, and determining the comprehensive scheduling efficiency function $H_Q(v)$ of the $m$ interfaces on cluster $Q$;
setting a step size, varying $\alpha$, $\lambda$ and $\mu$, repeating the measurement to obtain the comprehensive scheduling efficiency function $H_Q(v)$, and selecting the values $\alpha'$, $\lambda'$, $\mu'$ at which $H_Q(v)$ is maximal.
The scheduling efficiency function $H_{Q,i}(v_i)$ is calculated as:
$$H_{Q,i}(v_i) = \frac{G_{Q,i}(v_i)}{|D_i|\, F''_i\!\left(v_i / |D_i|\right)}$$
where $F''_i$ is an approximation of the ideal value $F'_i$ of the rate $F_i$ at which a single device completes requests for interface $i$ under flow control:
$$F''_i(v_i) = \begin{cases} v_i, & v_i < v_i^* \\ F_i(v_i^*), & v_i \geq v_i^* \end{cases}$$
Here $v_i^*$ is the critical value at which, as $v_i$ increases, $F_i(v_i)$ passes from increasing to overload; $F_i(v_i^*)$ is the completion rate of a single device when the request rate is $v_i^*$; and $|D_i|$ is the number of devices on which interface $i$ is deployed.
The method for measuring $v_i^*$ and $F_i(v_i)$ comprises:
controlling the external variables;
sending requests that call interface $i$ to the device at a rate $v_i = v_0$ and measuring the rate $w'$ at which the requests are completed; repeating the operation several times to obtain the mean $\overline{w'}$, so that $F_i(v_0) = \overline{w'}$;
changing $v_i$ and measuring $F_i(v_i)$ at other values of $v_i$ to obtain $F_i(v_i)$ and $v_i^*$.
Changing $v_i$ and measuring $F_i(v_i)$ at other values of $v_i$ specifically comprises:
increasing $v_i$ exponentially to determine the overall trend of $F_i(v_i)$ and the interval containing its maximum;
traversing that interval linearly in $v_i$ to determine the maximum of $F_i(v_i)$ and how $F_i(v_i)$ varies near the maximum.
Setting the step size, varying $\alpha$, $\lambda$ and $\mu$, repeating the measurement to obtain the comprehensive scheduling efficiency function $H_Q(v)$, and selecting the $\alpha'$, $\lambda'$, $\mu'$ at which $H_Q(v)$ is maximal specifically comprises:
fixing $\alpha$ and $\lambda$ and optimizing $\mu$ by measuring $H_Q(v)$ to obtain the optimal $\mu'$; fixing $\alpha$ and $\mu$ and optimizing $\lambda$ by measuring $H_Q(v)$ to obtain the optimal $\lambda'$; fixing $\lambda$ and $\mu$ and optimizing $\alpha$ by measuring $H_Q(v)$ to obtain the optimal $\alpha'$; this yields $\alpha'$, $\lambda'$, $\mu'$.
The local optimality of the optimal parameters $(\alpha', \lambda', \mu')$ is then verified by comparing them with the 26 adjacent parameter groups $(\alpha, \lambda, \mu)$: if $(\alpha', \lambda', \mu')$ is better than all 26 groups, local optimality is satisfied; if one or more groups are better than $(\alpha', \lambda', \mu')$, the $(\alpha, \lambda, \mu)$ with the highest $H_Q(v)$ is selected as the optimal $(\alpha', \lambda', \mu')$.
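The local-optimality check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `neighbors` and `refine`, the grid step, and the `efficiency` callback (standing in for one measurement of the comprehensive scheduling efficiency for a given parameter triple) are hypothetical and do not come from the source.

```python
from itertools import product


def neighbors(params, step):
    """Yield the (up to 26) grid neighbours of (alpha, lam, mu) that stay inside (0, 1)."""
    for deltas in product((-step, 0.0, step), repeat=3):
        if deltas == (0.0, 0.0, 0.0):
            continue  # skip the centre point itself
        cand = tuple(round(p + d, 10) for p, d in zip(params, deltas))
        if all(0.0 < x < 1.0 for x in cand):
            yield cand


def refine(params, step, efficiency):
    """Return params if no neighbour scores higher, else the best-scoring neighbour."""
    best = max(neighbors(params, step), key=efficiency, default=params)
    return best if efficiency(best) > efficiency(params) else params
```

Calling `refine` repeatedly until it returns its input unchanged is one possible way to reach a grid point that satisfies the local-optimality condition.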
The comprehensive scheduling efficiency function $H_Q(v)$ is calculated from the per-interface scheduling efficiency functions $H_{Q,i}(v)$, $i = 1, 2, \ldots, m$, where $m$ is the number of interfaces.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
the application researches and builds the self-adaptive dispatching algorithm model for improving the dispatching efficiency of the equipment cluster based on the characteristic of the service capability of the single equipment, and considers the influence of the parameters for controlling the receiving window on the algorithm. The specific implementation mode is that a receiving window is maintained for each device, the number of requests processed by the device is controlled, and meanwhile, the size of the receiving window is adjusted according to the feedback of the request processing result, so that the self-adaption of flow control is realized.
$W_{j,i}$ indirectly describes the service capability $F_i(v)$ of a single device, and $C_{j,i}$ describes the load of the device. Ensuring $C_{j,i} \leq W_{j,i}$ achieves flow control; selecting the device with the lowest load distributes traffic evenly without precisely monitoring the instantaneous request traffic reaching each device; and automatically adjusting $W_{j,i}$ according to the success or failure of interface calls adapts the method automatically to interfaces whose $v^*$ is unknown.
The application discovers that the service capacity of the single equipment is highly sensitive to the overload of the flow rate through experiments, and therefore provides a method for calculating the ideal value of the service capacity of the single equipment based on the flow rate control condition.
The application tests and optimizes the self-adaptive dispatching algorithm on the data service cluster. By comparing with a single device without flow control, the self-adaptive scheduling algorithm is verified to be capable of effectively controlling the flow. And obtaining the optimal parameter combination by carrying out independent optimization experiments and analysis on each parameter of the algorithm.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a flow chart of an adaptive scheduling method for a data service cluster according to the present application;
FIG. 2 shows the measured service capability of some interfaces on a single device in the second embodiment, with v increasing exponentially;
FIG. 3 shows the ideal values of the service capability of some interfaces on a single device under flow control in the second embodiment;
FIG. 4 shows the measured service capability of some interfaces on a single device in the second embodiment, with v traversing linearly the interval containing the maximum;
FIG. 5 shows the data structures used by the adaptive scheduling algorithm in the second embodiment;
FIG. 6 shows the effect of the adaptive scheduling algorithm in controlling traffic in the second embodiment.
Detailed Description
Example 1
As shown in FIG. 1, the present application provides an adaptive scheduling method for a data service cluster, including:
an API call request is issued and placed behind the existing requests in first-in, first-out (FIFO) order;
the request at the front is read and parsed to obtain the interface $i'$ it requests;
the columns of $W_{j,i}$ and $C_{j,i}$ corresponding to interface $i'$ are traversed, where $W_{n \times m}$ is the receive window matrix whose element $W_{j,i}$ records the maximum number of simultaneous calls of interface $i$ on device $j$, and $C_{n \times m}$ is the concurrency matrix whose element $C_{j,i}$ records the current concurrency (the number of requests being processed) of interface $i$ on device $j$; this yields the set $D'$ of candidate devices, i.e. devices whose current concurrency is less than the receive window, with $D$ the set of all devices: $D' = \{\, j \mid C_{j,i'} < W_{j,i'},\ j \in D \,\}$;
when $D'$ is empty, the API call request is placed back in the request queue in order;
when $D'$ is not empty, the device $j'$ with the lowest load is selected from $D'$;
$C_{j',i'}$ is incremented by 1 and the request is assigned to device $j'$ for execution; if the update time has been reached, $V_{j',i'}$ is updated, where $V_{j',i'}$ records the rate at which requests calling interface $i'$ reach device $j'$ and does not need frequent updates (for example, once per minute);
when execution on device $j'$ times out or fails, $F_{j',i'}$ is incremented by 1, where $F_{n \times m}$ is the failed-request count matrix, and it is judged whether $F_{j',i'} \geq \lambda V_{j',i'}$ holds, $\lambda$ being a number between 0 and 1, for example 0.5; when it holds, $W_{j',i'}$ is reduced multiplicatively by the factor $\alpha$, a number between 0 and 1, for example 0.5, and $S_{j',i'} = 0$, $F_{j',i'} = 0$; execution then returns to the next instruction; when $F_{j',i'} \geq \lambda V_{j',i'}$ does not hold, execution returns directly to the next instruction;
when execution on device $j'$ succeeds, $S_{j',i'}$ is incremented by 1, where $S_{n \times m}$ is the successful-request count matrix and $S_{j',i'}$ is the count of successful calls of interface $i'$ on device $j'$, and it is judged whether $S_{j',i'} \geq \mu V_{j',i'}$ holds, $\mu$ being a number between 0 and 1, for example 0.5; when it holds, let $W_{j',i'} = W_{j',i'} + 1$, $S_{j',i'} = 0$, $F_{j',i'} = 0$; execution then returns to the next instruction; when $S_{j',i'} \geq \mu V_{j',i'}$ does not hold, execution returns directly to the next instruction;
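One dispatch step of the method above (read the head of the queue, screen candidates, pick the least-loaded device, or re-queue the request) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the name `schedule_once`, the dictionary layout keyed by `(device, interface)`, and the choice of "load" as the fraction of the receive window in use are assumptions made here.

```python
from collections import deque


def schedule_once(queue, W, C):
    """Place the head request on the least-loaded candidate device.

    queue: deque of (request_id, interface) pairs in FIFO order.
    W, C:  dicts mapping (device, interface) to receive window / current concurrency.
    Returns (request_id, device), or None if the queue is empty or the request was re-queued.
    """
    if not queue:
        return None
    request_id, iface = queue.popleft()
    # Candidate set D': devices deploying the interface whose concurrency is below the window.
    candidates = [j for (j, i) in W if i == iface and C[(j, i)] < W[(j, i)]]
    if not candidates:
        queue.append((request_id, iface))  # no candidate: back to the tail of the queue
        return None
    # Assumption: load is measured as concurrency divided by window size.
    target = min(candidates, key=lambda j: C[(j, iface)] / W[(j, iface)])
    C[(target, iface)] += 1  # the request is now in flight on the target device
    return request_id, target
```

The caller is expected to decrement the concurrency entry again when the request finishes, whether it succeeded or failed.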
Preferably, the values of the parameters $\alpha$, $\lambda$ and $\mu$ are optimized, specifically:
selecting a group of parameters $\alpha$, $\lambda$, $\mu$, each with a value between 0 and 1;
sending requests that call interface $i$ to the cluster gateway at a set rate $v_i = v_0$ and measuring the rate $w'$ at which the requests are completed; preferably, repeating the operation several times to obtain the mean $\overline{w'}$, so that $G_{Q,i}(v_0) = \overline{w'}$, where $G_{Q,i}(v_0)$ is the rate at which the cluster completes requests;
increasing $v_i$ exponentially (e.g. $v_i = 1, 2, 2^2, 2^3, \ldots, 2^n$) and measuring $G_{Q,i}(v_i)$ in the same way;
calculating the scheduling efficiency function $H_{Q,i}(v_i) = G_{Q,i}(v_i) / \left(|D_i|\, F''_i(v_i / |D_i|)\right)$, where $F''_i$ is an approximation of the ideal value $F'_i$ of the rate $F_i$ at which a single device completes requests under flow control, with $F''_i(v_i) = v_i$ for $v_i < v_i^*$ and $F''_i(v_i) = F_i(v_i^*)$ for $v_i \geq v_i^*$; $v_i^*$ is the critical value at which, as $v_i$ increases, $F_i(v_i)$ passes from increasing to overload, and $F_i(v_i^*)$ is the rate at which a single device completes requests when the request rate is $v_i^*$;
changing the interface $i$ requested from the cluster gateway until all interfaces have been traversed, repeating the calculation to obtain each scheduling efficiency function $H_{Q,i}(v_i)$, and defining from them the comprehensive scheduling efficiency function $H_Q(v)$ of the $m$ interfaces on cluster $Q$;
setting a step size, varying $\alpha$, $\lambda$ and $\mu$, repeating the measurement to obtain the comprehensive scheduling efficiency function $H_Q(v)$, and selecting the $\alpha'$, $\lambda'$, $\mu'$ at which $H_Q(v)$ is maximal as the optimal parameters.
Preferably, $\alpha$ and $\lambda$ can be fixed to optimize $\mu$: let $\alpha = 0.5$, $\lambda = 0.5$, $\mu \in \{0.2, 0.4, 0.6, 0.8\}$ to obtain the optimal $\mu'$; $\alpha$ and $\mu$ are fixed to optimize $\lambda$: let $\alpha = 0.5$, $\mu = 0.5$, $\lambda \in \{0.2, 0.4, 0.6, 0.8\}$ to obtain the optimal $\lambda'$; $\lambda$ and $\mu$ are fixed to optimize $\alpha$: let $\lambda = 0.5$, $\mu = 0.5$, $\alpha \in \{0.2, 0.4, 0.6, 0.8\}$ to obtain the optimal $\alpha'$.
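The one-parameter-at-a-time sweep above can be sketched in Python as follows. This is a sketch under assumptions: `sweep_one_at_a_time` and the keyword-argument `efficiency` callback (standing in for one measurement of the comprehensive scheduling efficiency) are hypothetical names, while the grid {0.2, 0.4, 0.6, 0.8} and the fixed value 0.5 are taken from the text.

```python
def sweep_one_at_a_time(efficiency, grid=(0.2, 0.4, 0.6, 0.8), default=0.5):
    """Optimize alpha, lam and mu independently, holding the other two at `default`.

    efficiency(alpha=..., lam=..., mu=...) returns the measured scheduling efficiency.
    Returns the triple (alpha', lam', mu').
    """
    names = ("alpha", "lam", "mu")
    best = {}
    for name in names:
        fixed = {n: default for n in names}

        def score(value, name=name, fixed=fixed):
            # Override only the parameter being swept; the other two stay fixed.
            return efficiency(**{**fixed, name: value})

        best[name] = max(grid, key=score)
    return best["alpha"], best["lam"], best["mu"]
```

Because each parameter is swept with the other two held at 0.5, the result is a starting point rather than a guaranteed optimum, which is why a neighbourhood check follows.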
Preferably, local optimality is verified: with the optimal parameter combination $(\alpha', \lambda', \mu')$, the 26 adjacent groups $(\alpha, \lambda, \mu)$ are compared with $(\alpha', \lambda', \mu')$; if $(\alpha', \lambda', \mu')$ is better than all 26 groups, local optimality is satisfied; if one or more groups are better than $(\alpha', \lambda', \mu')$, the $(\alpha, \lambda, \mu)$ with the highest $H_Q(v)$ is selected as the optimal $(\alpha', \lambda', \mu')$.
Example two
Single device service capability measurement
Step 1: controlling variables
(1) Software and hardware configuration of the device: the measurement is carried out on devices of the same manufacturer, the same model and the same Android version; all applications that can be uninstalled are uninstalled, and only the applications and interfaces to be measured are installed. The device is connected to a power supply and the battery is kept fully charged.
(2) Network environment: the device is connected to a stable Wi-Fi access point, and measurements are made during off-peak hours.
Step 2: measuring $F_i(v)$
Requests that call interface $i$ are sent to the device at a rate $v = v_0$, and the rate $w'$ at which the requests are completed is measured; the operation is repeated several times to obtain the mean $\overline{w'}$, so that $F_i(v_0) = \overline{w'}$. The value of $F_i(v)$ at other values of $v$ can be measured in the same way. Because a single measurement of $F_i(v)$ is time-consuming and the range of $v$ is large, the total measurement time becomes unacceptable if the values of $v$ are not chosen properly. To determine the overall trend of $F_i(v)$, its maximum, and its variation near the maximum in a reasonable time, $v$ should be chosen as follows:
(1) First, $v$ increases exponentially (e.g. $v = 1, 2, 2^2, 2^3, \ldots, 2^n$) to determine the overall trend of $F_i(v)$ and the interval containing its maximum.
(2) Then, $v$ traverses that interval linearly to determine the maximum of $F_i(v)$ and how $F_i(v)$ varies near the maximum.
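The two-phase choice of request rates above can be sketched in Python as follows. This is an illustrative sketch only: `find_peak`, its default arguments, and the `measure` callback (standing in for one averaged measurement of the single-device completion rate at request rate v) are assumptions introduced here, as is bracketing the maximum by the coarse points on either side of the best one.

```python
def find_peak(measure, max_exp=10, linear_points=16):
    """Locate the request rate at which the measured completion rate peaks.

    measure(v) returns the completion rate observed at request rate v.
    Returns (v_star, measure(v_star)) over the rates that were actually probed.
    """
    # Phase 1: exponential sweep v = 1, 2, 4, ..., 2**max_exp to find the overall trend.
    rates = [2 ** k for k in range(max_exp + 1)]
    coarse = [measure(v) for v in rates]
    k = max(range(len(rates)), key=coarse.__getitem__)
    lo = rates[max(k - 1, 0)]
    hi = rates[min(k + 1, len(rates) - 1)]
    # Phase 2: linear traversal of the interval that brackets the coarse maximum.
    step = max((hi - lo) // linear_points, 1)
    fine = {v: measure(v) for v in range(lo, hi + 1, step)}
    v_star = max(fine, key=fine.get)
    return v_star, fine[v_star]
```

With these defaults the sketch probes 11 rates in the first phase and at most 17 in the second, instead of every rate up to 2**max_exp.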
Preferably, the completion rate of a single device is determined as follows, so that the overall trend of $F_i(v_i)$, its maximum, and its variation near the maximum can be determined in a reasonable time:
(1) First, $v_i$ increases exponentially (e.g. $v_i = 1, 2, 2^2, 2^3, \ldots, 2^n$) to determine the overall trend of $F_i(v_i)$ and the interval containing its maximum.
(2) Then, $v_i$ traverses that interval linearly to determine the maximum of $F_i(v_i)$ and how $F_i(v_i)$ varies near the maximum.
Preferably, the following variables are controlled during measurement. Software and hardware configuration of the device: the measurement is carried out on devices of the same manufacturer, the same model and the same Android version; all applications that can be uninstalled are uninstalled, and only the applications and interfaces to be measured are installed; the device is connected to a power supply and the battery is kept fully charged. Network environment: the device is connected to a stable Wi-Fi access point, and measurements are made during off-peak hours.
The service capability of some behavioral reflection interfaces on a single device was measured. The experimental device is a Changhong S07 mobile phone running Android 6.0. The selected interfaces were opened from seven common applications (see Table 1). All of the interfaces take parameters except the "acquire new product list" interface of the "more than one" application. To prevent any local caching in the applications from affecting the accuracy of the measurement, each request uses randomly chosen parameters: keywords are drawn from a general Chinese lexicon, and stock names or codes and city codes are drawn from special-purpose lexicons.
Table 1: Measured behavioral reflection interfaces
As shown in FIG. 2, the measurements with $v$ increasing exponentially show the overall trend of $F_i(v)$:
(1) As $v$ increases, $F_i(v)$ first increases, then decreases, and finally approaches 0; this variation corresponds to a typical "idle, saturated, overloaded" progression.
(2) In the growth phase (before the maximum is reached), the curve of $F_i(v)$ closely overlaps the line $F_i(v) = v$ (grey dashed line), indicating that the interface provides service stably before the request traffic saturates it.
(3) The maxima of $F_i(v)$ differ between interfaces, sometimes greatly, reflecting differences in service capability. For example, the "search coupon" interface of the "beauty team" application has $\max F_i(v) < 5$, while the "search stock information" interface of the "Yi Union playman classical edition" application has $\max F_i(v) > 100$, a difference of more than 20 times.
FIG. 4 shows the measurements with $v$ traversing linearly the interval containing the maximum of $F_i(v)$. The abscissa, the ordinate, and the grey dashed line have the same meaning as in FIG. 2. In each row, the image on the left is the corresponding curve from FIG. 2 ($v$ increasing exponentially), with the interval containing the maximum highlighted; the image on the right shows the details of how $F_i(v)$ changes near the maximum ($v$ increasing linearly). The images show that if $v$ continues to increase after $F_i(v)$ reaches its maximum, the fluctuation of $F_i(v)$ increases markedly, indicating that the interface becomes more sensitive to increases in request traffic once it is saturated.
From the above measurements it can be inferred that if a flow control mechanism rejects part of the requests when $v$ is too large, so that the rate $v'$ at which the device accepts requests satisfies the following equation, the degradation of service capability caused by overload can be avoided:
$$v' = \begin{cases} v, & v < v^* \\ v^*, & v \geq v^* \end{cases}$$
As shown in FIG. 3, the ideal value $F'_i(v)$ of $F_i(v)$ under flow control is defined as:
$$F'_i(v) = \begin{cases} F_i(v), & v < v^* \\ F_i(v^*), & v \geq v^* \end{cases}$$
When $v < v^*$, $F_i(v)$ can be approximated as $F_i(v) = v$, so $F'_i(v)$ can be approximately expressed as $F''_i(v)$:
$$F''_i(v) = \begin{cases} v, & v < v^* \\ F_i(v^*), & v \geq v^* \end{cases}$$
The "Yan Yun" data service cluster for which the adaptive scheduling algorithm is applicable should satisfy the following assumptions:
(1) The cluster consists of $n$ devices (numbered $1, 2, \ldots, n$).
(2) The network environment in which the cluster is located is stable.
(3) $m$ interfaces (numbered $1, 2, \ldots, m$) are deployed; the set of numbers of the devices on which interface $i$ is deployed is $D_i$.
(4) The service capability of the same interface is the same on every device and is represented by the function $F_i(v)$.
(5) Different interfaces deployed on the same device do not affect each other, i.e. the function $F_i(v)$ is independent of the other interfaces deployed on the device.
(6) Interface call requests are stateless, and the gateway can forward a request to any device in the cluster on which the corresponding interface is deployed.
A "Yan Yun" data service cluster $Q$ satisfying the above assumptions can be represented by a $(2m+2)$-tuple:
$$Q = (n, m, D_1, D_2, \ldots, D_m, F_1(v), F_2(v), \ldots, F_m(v))$$
The total service capability of cluster $Q$ on interface $i$ can be defined as a function $G_{Q,i}(v)$, where $v$ is the rate (in requests per second) at which requests arrive at the cluster and $G_{Q,i}(v)$ is the rate (in requests per second) at which the cluster completes requests. If the gateway assigns a request rate $v_j$ (in requests per second) to device $j$, then:
$$G_{Q,i}(v) = \sum_{j \in D_i} F_i(v_j)$$
With $F''_i(v)$ the approximate representation of the ideal value of $F_i(v)$ under flow control, it is easy to prove that:
$$G_{Q,i}(v) \leq |D_i|\, F''_i\!\left(\frac{v}{|D_i|}\right)$$
i.e. $|D_i|\, F''_i(v / |D_i|)$ is the upper limit of $G_{Q,i}(v)$ for given $|D_i|$ and $F_i(v)$. The scheduling efficiency function $H_{Q,i}(v)$ of interface $i$ on cluster $Q$ is defined as:
$$H_{Q,i}(v) = \frac{G_{Q,i}(v)}{|D_i|\, F''_i\!\left(v / |D_i|\right)}$$
The comprehensive scheduling efficiency function $H_Q(v)$ of the $m$ interfaces on cluster $Q$ is defined from the per-interface functions $H_{Q,i}(v)$.
Both the cluster parameters (i.e. $n$, $m$, $D_i$, $F_i$) and the scheduling algorithm affect $H_Q(v)$; when the cluster parameters are unchanged, $H_Q(v)$ reflects the performance of the scheduling algorithm. The aim of the scheduling algorithm is therefore to improve $H_Q(v)$.
To improve $H_Q(v)$, $H_{Q,i}(v)$ must be improved. To improve $H_{Q,i}(v)$, $G_{Q,i}(v)$ must approach its upper limit, which requires satisfying both:
$$v_j \leq v^* \quad \text{for all } j \in D_i \tag{1}$$
$$v_j = v_k \quad \text{for all } j, k \in D_i \tag{2}$$
Formula (1) requires controlling the request traffic so that the rate at which each device in $D_i$ accepts requests does not exceed the critical value $v^*$; formula (2) requires distributing the request traffic evenly so that the rates at which the devices in $D_i$ accept requests are equal (i.e. the loads are equal). Only by properly controlling and distributing the request traffic can the scheduling algorithm make $G_{Q,i}(v)$ approach its upper limit, so that $H_{Q,i}(v)$ approaches 1 and finally $H_Q(v)$ approaches 1, achieving the optimization objective.
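The efficiency measures discussed above can be illustrated with a short Python sketch. It rests on assumptions that are not confirmed by the source: the upper bound of the cluster completion rate is taken to be the device count times the approximate ideal rate evaluated at the evenly split request rate, and the comprehensive efficiency is taken to be the arithmetic mean of the per-interface efficiencies. The function names are hypothetical.

```python
def ideal_rate(v, v_star, f_at_v_star):
    """Approximate ideal completion rate of one device under flow control.

    Below the critical rate v_star every request completes; at or above it
    the device is held at the completion rate measured at v_star.
    """
    return v if v < v_star else f_at_v_star


def interface_efficiency(g, v, n_devices, v_star, f_at_v_star):
    """Ratio of the measured cluster completion rate g to its assumed upper bound."""
    upper = n_devices * ideal_rate(v / n_devices, v_star, f_at_v_star)
    return g / upper


def overall_efficiency(per_interface):
    """Assumption: combine per-interface efficiencies by their arithmetic mean."""
    return sum(per_interface) / len(per_interface)
```

For example, four devices receiving 40 requests per second in total, each with a critical rate of 20, have an upper bound of 40 completions per second, so a measured rate of 36 corresponds to an efficiency of 0.9.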
However, in order to approximately satisfy the above conditions, the scheduling algorithm needs to solve the following problems:
(1) The critical values $v^*$ of different interfaces differ; the interfaces that the scheduling algorithm must support are not under its control, and the critical values of all interfaces cannot be measured experimentally in advance.
(2) When the request traffic reaching the cluster is large, monitoring the instantaneous traffic reaching each device faces a conflict between low latency and high accuracy: to monitor traffic data more accurately, the data must be updated more frequently, but this increases the latency of the scheduling algorithm.
One possible solution is to control traffic adaptively, dynamically sensing the interface critical value and the device load, as follows:
(1) Maintain a receive window matrix $W_{n \times m}$ whose element $W_{j,i}$ records the maximum number of simultaneous calls of interface $i$ on device $j$. The initial value of $W_{j,i}$ can be set to a small integer; $W_{j,i}$ is updated continuously while the algorithm runs.
(2) Maintain a current concurrency matrix $C_{n \times m}$ whose element $C_{j,i}$ records the current concurrency (the number of requests being processed) of interface $i$ on device $j$. When the gateway forwards a call request for interface $i$ to device $j$, $C_{j,i}$ is incremented by 1; each time device $j$ finishes processing a call request for interface $i$, whether or not execution succeeded, $C_{j,i}$ is decremented by 1.
(3) When the gateway receives a call request for interface $i$, it tries to find a target device $j'$ that satisfies $C_{j',i} < W_{j',i}$ and has the lowest load among such devices.
(4) If no target device $j'$ satisfying this condition exists, a rate-limiting operation is performed: the request waits until a qualifying device appears and is then processed, and it is rejected if it times out while waiting. If the target device $j'$ exists, the request is forwarded to device $j'$ for execution, and the gateway waits for the execution result.
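The gateway behaviour in steps (1) to (4) can be sketched in Python as follows. This is a sketch under assumptions, not the source's implementation: `forward_with_limit`, the polling wait, the `execute` callback, and the use of concurrency divided by window size as the load measure are all introduced here for illustration. Polling only makes sense when other workers are completing requests concurrently and thereby lowering the concurrency entries.

```python
import time


def forward_with_limit(iface, W, C, execute, timeout_s=1.0, poll_s=0.01):
    """Forward one call of `iface` to an eligible device, or reject it on timeout.

    W, C: dicts mapping (device, interface) to receive window / current concurrency.
    execute(device, iface) performs the call and returns its result.
    Raises TimeoutError if no device becomes eligible within timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        eligible = [j for (j, i) in W if i == iface and C[(j, i)] < W[(j, i)]]
        if eligible:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError("no eligible device for interface %r" % iface)
        time.sleep(poll_s)  # rate limiting: wait for a qualifying device to appear
    # Assumption: load is concurrency relative to the receive window.
    target = min(eligible, key=lambda j: C[(j, iface)] / W[(j, iface)])
    C[(target, iface)] += 1
    try:
        return execute(target, iface)
    finally:
        # Decrement whether or not execution succeeded.
        C[(target, iface)] -= 1
```

The `try`/`finally` pair mirrors the rule that the concurrency count is reduced after every processed request, regardless of the outcome.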
Determining W j,i The adjustment rules of (a) need to solve two key problems: (1) How to judge success or failure of a large number of requests in a short time; (2) W (W) j,i The amount should be increased or decreased. The second problem is not difficult to solve, since the cluster equipment is very sensitive to request overload, for W j,i The adjustment of (c) should follow the principle of "additive increase, multiplicative decrease". The "large number" of decisions in the first question may be based on the ratio of the successful or failed request count to the traffic of requests to arrive at the device, with the real difficulty of defining a "short time". The proposed 'competition counting' rule can avoid the direct judgment of 'short time' and achieve the expected effect. The following is the "contention count" rule:
(1) Maintain a current traffic matrix $V_{n \times m}$. Element $V_{j,i}$ records the rate at which requests invoking interface i arrive at device j. $V_{j,i}$ does not need to be updated frequently (for example, once per minute).
(2) Maintain a successful request count matrix $S_{n \times m}$. Whenever device j executes a call to interface i successfully, $S_{j,i}$ is incremented by 1. If the updated $S_{j,i}$ satisfies $S_{j,i} \ge \mu V_{j,i}$, then $W_{j,i}$ is incremented by 1, and $S_{j,i}$ and $F_{j,i}$ are reset to zero.
(3) Maintain a failed request count matrix $F_{n \times m}$. Whenever device j fails to execute a call to interface i, $F_{j,i}$ is incremented by 1. If the updated $F_{j,i}$ satisfies $F_{j,i} \ge \lambda V_{j,i}$, then $W_{j,i}$ is decreased multiplicatively using the factor $\alpha$ [formula omitted in source], and $S_{j,i}$ and $F_{j,i}$ are reset to zero.
The key to the "competition counting" rule is that $S_{j,i}$ and $F_{j,i}$ compete: whichever variable reaches its threshold first triggers the adjustment of $W_{j,i}$, after which both $S_{j,i}$ and $F_{j,i}$ are reset to zero. The influence of a successful or failed request on $S_{j,i}$ and $F_{j,i}$ therefore does not persist, and $S_{j,i}$ and $F_{j,i}$ reflect the cluster state over a short time more accurately.
The parameters to be determined are α, λ and μ, each in the range 0 < α, λ, μ < 1. The rule for updating $W_{j,i}$ by "competition counting" is uniquely determined by the triple (α, λ, μ).
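The rule can be illustrated with a short Python sketch for a single (device, interface) pair. The exact multiplicative-decrease formula is not reproduced in this text, so the sketch assumes $W \leftarrow \max(1, \lfloor \alpha W \rfloor)$; the class name `WindowState` and the function names are hypothetical.

```python
import math

class WindowState:
    """Per (device j, interface i) state: window W, arrival rate V, counters S and F."""
    def __init__(self, window: int = 2, rate: float = 10.0):
        self.W = window   # receive window W[j][i]
        self.V = rate     # request arrival rate V[j][i], refreshed infrequently
        self.S = 0        # successes since the last window adjustment
        self.F = 0        # failures since the last window adjustment

def on_success(st: WindowState, mu: float) -> None:
    st.S += 1
    if st.S >= mu * st.V:      # successes reached their threshold first
        st.W += 1              # additive increase
        st.S = st.F = 0        # both counters are cleared

def on_failure(st: WindowState, alpha: float, lam: float) -> None:
    st.F += 1
    if st.F >= lam * st.V:     # failures reached their threshold first
        # Assumed form of the multiplicative decrease; never drops below 1.
        st.W = max(1, math.floor(alpha * st.W))
        st.S = st.F = 0        # both counters are cleared
```

With `rate=10.0`, μ = 0.5 and λ = 0.2, five successes in a row enlarge the window by one, while two failures shrink it, whichever threshold is reached first.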
As shown in fig. 5, the data structures used by the adaptive scheduling algorithm are mainly queues and hash tables.
The queue stores pending requests. Its space complexity is O(N), where N is the maximum length of the queue; the average time complexity of enqueue and dequeue operations is O(1).
The hash tables store global variables such as $W_{j,i}$. They are nested, and an element is accessed by device and then by interface; an element that is NULL indicates that the corresponding interface is not deployed on the device. The space complexity is O(nm), where n is the number of devices and m is the number of interfaces. The average time complexity of accessing an element is O(1).
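A minimal Python sketch of these two structures follows, using `collections.deque` for the pending-request queue and nested dictionaries for the device-then-interface table. The device names, interface names and the helper `is_deployed` are illustrative assumptions, and `None` plays the role of NULL.

```python
from collections import deque

# FIFO queue of pending requests: O(1) append at the tail and pop at the head.
pending = deque()
pending.append({"id": 1, "interface": "query"})
pending.append({"id": 2, "interface": "update"})
head = pending.popleft()  # the oldest request is served first

# Nested hash table for the receive windows, keyed by device and then interface.
W = {
    "device-a": {"query": 2, "update": None},  # "update" is not deployed on device-a
    "device-b": {"query": 3, "update": 1},
}

def is_deployed(table: dict, device: str, interface: str) -> bool:
    """True if the interface has an entry (not None) on the given device."""
    return table.get(device, {}).get(interface) is not None
```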
The core of the adaptive scheduling algorithm is adaptive flow control, and the key factors affecting its performance are the adjustment rule of the receive window and the parameters of that rule. Three steps are therefore needed to implement the adaptive scheduling algorithm:
(1) Select an adjustment rule for the receive window and determine the parameters to be optimized. Here the "competition counting" rule is selected, and the parameters to be optimized are α, λ and μ.
(2) Verify the validity of the algorithm, that is, determine whether it effectively controls the traffic reaching a device.
(3) Optimize the algorithm parameters for the actual data service cluster to improve scheduling efficiency.
The service capacity of a single device is measured with the adaptive scheduling algorithm in use and compared with the service capacity without flow control, to verify that the adaptive scheduling algorithm effectively controls the traffic reaching the device and prevents the device from losing service capacity due to overload. The verification method is as follows:
(1) Select the "competition counting" rule and assign any legal values to the parameters. Here the parameters are (α, λ, μ) = (0.5, 0.5, 0.5).
(2) Apply the algorithm to an identical single device and measure its service capacity with the same method in the same experimental environment. As shown in fig. 6, the abscissa is the request arrival rate v and the ordinate is the service capacity $F_i(v)$. The black curve is the result obtained with the adaptive scheduling algorithm; it shows that the adaptive scheduling algorithm significantly improves the service capacity of a single device under high load.
The algorithm parameters α, λ, μ are optimized on the "Yan Yun" data service cluster. The optimization goal is to improve the comprehensive scheduling efficiency $H_Q(v)$. The optimization method is as follows:
(1) Assume that the three parameters can be optimized independently, so that the values of the other two parameters are fixed while one parameter is optimized.
(2) Fix α and λ, select k values uniformly from the interval (0, 1) as candidates for μ (in the experiments k = 4, giving the candidates 0.2, 0.4, 0.6, 0.8), test in turn the comprehensive scheduling efficiency $H_Q(v)$ of the cluster for each μ, and select the value μ' for which $H_Q(v)$ is highest overall.
(3) Similarly, fix α and μ to optimize λ, and fix λ and μ to optimize α, which yields the optimal combination (α', λ', μ').
(4) Verify the local optimality of (α', λ', μ'), that is, that it is better than all (α' + pΔ, λ' + qΔ, μ' + rΔ) with p, q, r ∈ {-1, 0, 1} not all 0. This requires testing 26 parameter combinations and comparing the results with (α', λ', μ').
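The procedure in steps (1) to (4) can be sketched in Python as a coordinate-wise search followed by a neighborhood check. In this sketch `score` is a hypothetical stand-in for measuring $H_Q(v)$ on a real cluster and reducing it to one number; the function names and the toy score used in the usage note are assumptions, not part of the method.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Params = Tuple[float, float, float]  # (alpha, lambda, mu)

def coordinate_search(score: Callable[[Params], float],
                      candidates: Sequence[float],
                      start: Params) -> Params:
    """Optimize each parameter on its own while the other two stay at `start`."""
    best = []
    for axis in range(3):
        def trial(value: float) -> float:
            params = list(start)
            params[axis] = value
            return score(tuple(params))
        best.append(max(candidates, key=trial))
    return tuple(best)

def neighbors(center: Params, delta: float) -> List[Params]:
    """All 26 points center + (p, q, r) * delta with p, q, r in {-1, 0, 1}, not all 0."""
    points = []
    for offsets in product((-1, 0, 1), repeat=3):
        if offsets == (0, 0, 0):
            continue
        # Rounding only removes floating-point noise such as 0.30000000000000004.
        points.append(tuple(round(c + o * delta, 10) for c, o in zip(center, offsets)))
    return points

def is_local_optimum(score: Callable[[Params], float],
                     center: Params, delta: float) -> bool:
    """True if `center` scores strictly better than every one of its 26 neighbors."""
    return all(score(center) > score(p) for p in neighbors(center, delta))
```

For example, with a toy score that peaks at (0.8, 0.2, 0.4), `coordinate_search(score, (0.2, 0.4, 0.6, 0.8), (0.5, 0.5, 0.5))` selects that point from the candidate grid. The count of 26 is $3^3 - 1$: three offsets per parameter, minus the center itself.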
The parameters (α, λ, μ) are optimized on the cluster Q' as described above. Cluster Q' consists of 20 identical devices (identical in software and hardware configuration and in the deployed behavioral reflection interfaces). The experimental results are as follows.
Fixing α and λ, optimizing μ
Let α = 0.5, λ = 0.5, μ ∈ {0.2, 0.4, 0.6, 0.8}. The result shows that 0.4 is optimal among the candidates for μ, so μ' = 0.4.
Fixing α and μ, optimizing λ
Let α = 0.5, μ = 0.5, λ ∈ {0.2, 0.4, 0.6, 0.8}. The result shows that λ' = 0.2.
Fixing λ and μ, optimizing α
Let λ = 0.5, μ = 0.5, α ∈ {0.2, 0.4, 0.6, 0.8}. The result shows that α' = 0.8.
Verification of local optimality
The optimal parameter combination is (α', λ', μ') = (0.8, 0.2, 0.4). The 26 adjacent groups (α, λ, μ) are compared with (α', λ', μ'); the experimental result shows that (α', λ', μ') is better than all 26 groups, so local optimality is satisfied.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the application is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of those technical features or their equivalents without departing from the inventive concept described above, for example, solutions in which the above features are replaced with (but not limited to) technical features with similar functions disclosed in the present application.

Claims (7)

1. An adaptive scheduling method for a data service cluster, comprising:
issuing call requests and arranging them in a request queue in first-in, first-out order;
reading the request at the head of the queue, parsing the request, and reading the interface it requests;
screening candidate devices meeting the conditions;
if no candidate device exists, adding the call request to the tail of the queue; if candidate devices exist, selecting the candidate device with the lowest load;
executing the request on the candidate device, and if execution of the request exceeds the set time or the request fails, recording an execution failure and executing the next instruction;
if the execution is successful, recording the execution success and executing the next instruction;
if execution of the request exceeds the set time or the request fails, recording an execution failure, judging a failure condition, and executing the next instruction, wherein judging the failure condition specifically comprises: judging whether $F_{j',i'} \ge \lambda V_{j',i'}$ holds, where $F_{j',i'}$ is the count of failed executions of calls to interface i' on device j', $V_{j',i'}$ is the rate at which requests invoking interface i' reach device j', and λ is a real number between 0 and 1; when it holds, letting [formula omitted in source], where $W_{j',i'}$ records the maximum number of calls to interface i' that can run simultaneously on device j' and α is a real number between 0 and 1, and letting $S_{j',i'} = 0$ and $F_{j',i'} = 0$, where $S_{j',i'}$ is the count of successful executions of calls to interface i' on device j'; then returning to execute the next instruction; when $F_{j',i'} \ge \lambda V_{j',i'}$ does not hold, returning to execute the next instruction;
if the execution is successful, after recording the execution success, judging a success condition and executing the next instruction, wherein judging the success condition and executing the next instruction specifically comprises: judging whether $S_{j',i'} \ge \mu V_{j',i'}$ holds, where μ is a rational number between 0 and 1, $S_{j',i'}$ is the count of successful executions of calls to interface i' on device j', and $V_{j',i'}$ is the rate at which requests invoking interface i' reach device j'; when it holds, letting $S_{j',i'} = 0$ and $F_{j',i'} = 0$, where $F_{j',i'}$ is the count of failed executions of calls to interface i' on device j', and then returning to execute the next instruction; when $S_{j',i'} \ge \mu V_{j',i'}$ does not hold, returning to execute the next instruction;
optimizing the values of the parameters α, λ and μ, specifically: selecting a group of parameters α, λ and μ, each with a value between 0 and 1; sending requests calling interface i to the cluster gateway at a rate $v_i = v_0$, where $v_0$ is a set rate, and measuring the rate w' at which the requests are completed; repeating this operation multiple times to obtain the mean $\bar{w}$ of w', which gives the interface i cluster request completion rate $G_{Q,i}(v_0)$; selecting different $v_i$ and measuring in the same way the interface i cluster request completion rate $G_{Q,i}(v_i)$ corresponding to each $v_i$; computing the scheduling efficiency function $H_{Q,i}(v_i)$ of interface i; changing the interface i requested from the cluster gateway until all interfaces have been traversed, and repeating the calculation to obtain the scheduling efficiency function $H_{Q,i}(v_i)$ of each interface; determining the comprehensive scheduling efficiency function $H_Q(v)$ of the m interfaces on cluster Q; setting a step size, changing α, λ and μ, repeating the measurement to obtain the comprehensive scheduling efficiency function $H_Q(v)$, and selecting the α', λ', μ' corresponding to the maximum of $H_Q(v)$.
2. The adaptive scheduling method of a data service cluster according to claim 1, wherein the scheduling efficiency function $H_{Q,i}(v_i)$ is calculated as:
[formula omitted in source]
wherein $F'_i$ denotes the ideal value of $F_i$, the rate at which a single device completes requests for interface i under flow control, [formula omitted in source]; $v_i^*$ is the critical value of $v_i$ at which, as $v_i$ increases, the measured $F_i(v_i)$ reaches overload; $F_i(v_i^*)$ is the rate at which a single device completes requests at rate $v_i^*$; and $|D_i|$ is the number of devices on which interface i is deployed.
3. The adaptive scheduling method of a data service cluster according to claim 2, wherein the method for measuring $v_i^*$ and $F_i(v_i)$ comprises:
controlling external variables;
sending requests calling interface i to the device at a rate $v_i = v_0$, and measuring the rate w' at which the requests are completed; repeating this operation multiple times to obtain the mean $\bar{w}$ of w'; then $F_i(v_0) = \bar{w}$;
changing $v_i$ and measuring $F_i(v_i)$ for other values of $v_i$, to obtain $F_i(v_i)$ and $v_i^*$.
4. The adaptive scheduling method of a data service cluster according to claim 3, wherein changing $v_i$ and measuring $F_i(v_i)$ for other values of $v_i$ comprises:
increasing $v_i$ exponentially to determine the overall trend of $F_i(v_i)$ and the interval in which its maximum lies;
traversing $v_i$ linearly over the interval in which the maximum lies, to determine the maximum of $F_i(v_i)$ and the behavior of $F_i(v_i)$ around the maximum.
5. The adaptive scheduling method of a data service cluster according to claim 4, wherein setting a step size, changing α, λ and μ, repeating the measurement to obtain the comprehensive scheduling efficiency function $H_Q(v)$, and selecting the α', λ', μ' corresponding to the maximum of $H_Q(v)$ specifically comprises:
fixing α and λ and optimizing μ, measuring the comprehensive scheduling efficiency function $H_Q(v)$ to obtain the optimal μ'; fixing α and μ and optimizing λ, measuring $H_Q(v)$ to obtain the optimal λ'; fixing λ and μ and optimizing α, measuring $H_Q(v)$ to obtain the optimal α'; thereby obtaining α', λ', μ'.
6. The adaptive scheduling method of a data service cluster according to claim 5, wherein local optimality is verified for the optimal parameters (α', λ', μ'): the 26 groups (α, λ, μ) generated according to the set step size are compared with (α', λ', μ'); if (α', λ', μ') is better than all 26 groups (α, λ, μ), local optimality is satisfied; if one or more groups are better than (α', λ', μ'), the (α, λ, μ) with the highest $H_Q(v)$ is selected as the optimal (α', λ', μ').
7. The adaptive scheduling method of a data service cluster according to claim 6, wherein the comprehensive scheduling efficiency function $H_Q(v)$ is calculated as:
[formula omitted in source]
wherein m is the number of interfaces.
CN201910803526.6A 2019-08-28 2019-08-28 Self-adaptive scheduling method of data service cluster Active CN111352728B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910803526.6A CN111352728B (en) 2019-08-28 2019-08-28 Self-adaptive scheduling method of data service cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910803526.6A CN111352728B (en) 2019-08-28 2019-08-28 Self-adaptive scheduling method of data service cluster

Publications (2)

Publication Number Publication Date
CN111352728A CN111352728A (en) 2020-06-30
CN111352728B true CN111352728B (en) 2023-10-03

Family

ID=71196890

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910803526.6A Active CN111352728B (en) 2019-08-28 2019-08-28 Self-adaptive scheduling method of data service cluster

Country Status (1)

Country Link
CN (1) CN111352728B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008151464A1 (en) * 2007-06-14 2008-12-18 Zte Corporation A wireless network simulation method
KR101396394B1 (en) * 2013-03-20 2014-05-19 주식회사 스마티랩 Methods to autonomously optimize performance using clustering in mobile cloud environment
WO2014173339A1 (en) * 2013-08-07 2014-10-30 中兴通讯股份有限公司 Task scheduling service system and method
US10382380B1 (en) * 2016-11-17 2019-08-13 Amazon Technologies, Inc. Workload management service for first-in first-out queues for network-accessible queuing and messaging services

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8082545B2 (en) * 2005-09-09 2011-12-20 Oracle America, Inc. Task dispatch monitoring for dynamic adaptation to system conditions

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008151464A1 (en) * 2007-06-14 2008-12-18 Zte Corporation A wireless network simulation method
KR101396394B1 (en) * 2013-03-20 2014-05-19 주식회사 스마티랩 Methods to autonomously optimize performance using clustering in mobile cloud environment
WO2014173339A1 (en) * 2013-08-07 2014-10-30 中兴通讯股份有限公司 Task scheduling service system and method
CN104346215A (en) * 2013-08-07 2015-02-11 中兴通讯股份有限公司 Task scheduling service system and method
US10382380B1 (en) * 2016-11-17 2019-08-13 Amazon Technologies, Inc. Workload management service for first-in first-out queues for network-accessible queuing and messaging services

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐文莉.分布式渲染调度策略优化研究与实现.《软件导刊》.2010,第1节到第6节. *

Also Published As

Publication number Publication date
CN111352728A (en) 2020-06-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant