CN114143327B

CN114143327B - Cluster resource quota allocation method and device and electronic equipment

Info

Publication number: CN114143327B
Application number: CN202111503812.4A
Authority: CN
Inventors: 韩向前; 谢健; 邸帅; 卢道和
Original assignee: WeBank Co Ltd
Current assignee: WeBank Co Ltd
Priority date: 2021-12-09
Filing date: 2021-12-09
Publication date: 2024-04-09
Anticipated expiration: 2041-12-09
Also published as: CN114143327A; WO2023103342A1

Abstract

The embodiments of the application provide a cluster resource quota allocation method, a cluster resource quota allocation device and electronic equipment. The method comprises: receiving a service processing request of a service to be processed sent by a client; performing resource quota verification on the service to be processed according to the latest load increment and pre-stored resource configuration information to obtain a verification result; if the verification result is that the verification is passed, allocating a resource quota for the service to be processed according to the service processing request, and processing the service to be processed according to the allocated resource quota to obtain a service processing result; and if the verification result is that the verification is not passed, processing the service to be processed according to a pre-stored super quota request processing rule. The method and the device reduce inaccurate cluster load judgment, improve the rationality and accuracy of resource quota allocation, and thereby help ensure the normal operation of each financial service.

Description

Cluster resource quota allocation method and device and electronic equipment

Technical Field

The embodiment of the application relates to the technical field of big data, in particular to a cluster resource quota allocation method and device and electronic equipment.

Background

With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually shifting to financial technology (Fintech). Big data technology is no exception, but the security and real-time requirements of the financial industry also place higher demands on it. To meet the growing demands of various financial services, clustered applications are becoming more and more popular.

In the prior art, one cluster usually serves multiple financial services. When it does, the request volume of one service may suddenly surge and occupy a large share of cluster resources, leaving other services short of resources. To avoid this, a fixed resource quota may be set for each financial service; before each financial service request is processed, it is determined whether the request exceeds the quota, and if so, an exception is thrown.

However, configuring a fixed resource quota requires knowing the current load of the cluster and the load it can bear. These are generally determined from stress test data collected when the cluster was built. Because the cluster's capacity to process service requests changes dynamically, and the initial stress test scenario rarely matches the actual production scenario, the cluster load is often judged inaccurately, which reduces the accuracy of resource quota allocation and in turn affects the normal operation of each financial service.

Disclosure of Invention

The embodiment of the application provides a cluster resource quota allocation method, a cluster resource quota allocation device and electronic equipment, so that accuracy of resource quota allocation is improved.

In a first aspect, an embodiment of the present application provides a method for allocating a cluster resource quota, including:

receiving a service processing request of a service to be processed sent by a client;

verifying the resource quota of the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result;

if the verification result is that the verification is passed, distributing resource quota for the service to be processed according to the service processing request, and processing the service to be processed according to the distributed resource quota to obtain a service processing result;

and if the verification result is that the verification is not passed, processing the service to be processed according to a pre-stored super quota request processing rule.

Optionally, the service processing request includes a load request amount, and the verifying the resource quota of the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result includes:

summing the current load of the cluster in the pre-stored resource configuration information and the load request quantity to obtain a summation result;

acquiring the latest load increment;

judging whether the summation result is higher than the sum of the latest load increment and the resource quota distributed for the service to be processed in the pre-stored resource configuration information;

and if the summation result is higher than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information, determining that the verification result is that the verification is not passed.

Optionally, before the step of obtaining the latest load increment, the method further includes:

judging whether the summation result is higher than the resource quota distributed for the service to be processed in the pre-stored resource configuration information;

and if the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, executing the step of acquiring the latest load increment.

Optionally, after it is determined that the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, the method further includes:

determining whether the pre-stored resource configuration information contains overflow ratio information;

if the overflow ratio information is contained, determining the highest quota of the service to be processed according to the overflow ratio information;

judging whether the summation result is higher than the highest quota;

if the summation result is not higher than the highest quota, continuing to execute the step of acquiring the latest load increment and the subsequent steps;

and if the summation result is higher than the highest quota, determining that the verification result is that the verification is not passed.
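The overflow-ratio check in the steps above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes the highest quota is computed as RNQ × (1 + overflow percentage / 100), the form used by the configuration principle in the detailed description, and the function names `highest_quota` and `fails_overflow_cap` are hypothetical.

```python
from typing import Optional


def highest_quota(rnq: float, overflow_percent: float) -> float:
    """Highest quota for a service, assumed to be RNQ * (1 + overflow_percent / 100)."""
    return rnq * (1 + overflow_percent / 100)


def fails_overflow_cap(summation: float, rnq: float,
                       overflow_percent: Optional[float]) -> bool:
    """Return True when the summation result exceeds the highest quota.

    Assumption: when no overflow ratio is configured (None), no cap is
    applied and the caller proceeds to the load-increment check.
    """
    if overflow_percent is None:
        return False
    return summation > highest_quota(rnq, overflow_percent)
```

With a quota of 100 and an overflow percentage of 50, the highest quota is 150, so a summation result of 151 fails immediately while 150 moves on to the load-increment check.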

Optionally, the obtaining the latest load increment includes:

acquiring total processing time length, a processing time length threshold value, the lowest load of the cluster and the highest load of the cluster according to a preset acquisition rule;

and determining the latest load increment according to the total processing time length, the processing time length threshold value, the lowest load of the cluster, the highest load of the cluster and the current load of the cluster in the pre-stored resource configuration information.

Optionally, the determining the latest load increment according to the total processing duration, the processing duration threshold, the lowest load of the cluster, the highest load of the cluster and the current load of the cluster in the pre-stored resource configuration information includes:

if the total processing time length is greater than or equal to the processing time length threshold, determining that the latest load increment is zero;

if the total processing time length is smaller than the processing time length threshold value and the current load of the cluster in the pre-stored resource configuration information is smaller than the lowest load of the cluster, determining that the latest load increment is the difference between the lowest load of the cluster and the current load of the cluster in the pre-stored resource configuration information;

if the total processing time length is smaller than the processing time length threshold value, and the current load of the cluster in the pre-stored resource configuration information is larger than the lowest load of the cluster and smaller than the highest load of the cluster, determining that the latest load increment is the difference between the highest load of the cluster and the current load of the cluster in the pre-stored resource configuration information;

and if the total processing time length is smaller than the processing time length threshold value and the current load of the cluster in the pre-stored resource configuration information is larger than the highest load of the cluster, determining the latest load increment as a preset load threshold value.
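The four cases above can be sketched as a single function. This is an illustrative sketch under stated assumptions: the name `latest_load_increment` and its parameter names are hypothetical, and because the text does not say what happens when the current load exactly equals the lowest or highest load, the sketch assigns those boundary values to the next branch.

```python
def latest_load_increment(total_time: float, time_threshold: float,
                          low_tps: float, up_tps: float,
                          current_tps: float, preset_increment: float) -> float:
    """Determine the latest load increment from the four cases in the text."""
    # Case 1: processing is already slow, so grant no extra load.
    if total_time >= time_threshold:
        return 0.0
    # Case 2: below the lowest load, allow growth up to the lowest load.
    if current_tps < low_tps:
        return low_tps - current_tps
    # Case 3: between lowest and highest load, allow growth up to the highest load.
    # (Assumption: current_tps == low_tps is handled here.)
    if current_tps < up_tps:
        return up_tps - current_tps
    # Case 4: at or above the highest load, fall back to a preset load threshold.
    # (Assumption: current_tps == up_tps is handled here.)
    return preset_increment
```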

Optionally, the acquiring the lowest load of the cluster according to the preset acquiring rule includes:

acquiring a preset processing time threshold, and the total processing time and the load at a target moment;

judging the processing time threshold, and the total processing time and the load at the target moment, according to a preset minimum load judging rule, and determining the initial cluster minimum load;

acquiring a plurality of historical minimum loads in a preset historical time period, and determining a historical average minimum load according to the plurality of historical minimum loads;

and taking the minimum value of the initial cluster minimum load and the historical average minimum load as the cluster minimum load.

Optionally, the obtaining the highest load of the cluster according to the preset obtaining rule includes:

judging the processing time threshold, and the total processing time and the load at the target moment, according to a preset maximum load judging rule, and determining the initial cluster highest load;

acquiring a plurality of historical highest loads in a preset historical time period, and determining a historical average highest load according to the plurality of historical highest loads;

and taking the maximum value of the initial cluster highest load and the historical average highest load as the cluster highest load.
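The final step of both procedures, combining an initial value with a historical average, can be sketched as below. The sketch takes the initial lowest and highest loads as inputs, since the preset judging rules that produce them are not detailed here; the function names are hypothetical, and falling back to the initial value when no history exists is an assumption.

```python
from statistics import mean
from typing import Sequence


def cluster_lowest_load(initial_low: float, historical_lows: Sequence[float]) -> float:
    """Lowest load: the smaller of the initial value and the historical average."""
    if not historical_lows:  # Assumption: no history yet, keep the initial value.
        return initial_low
    return min(initial_low, mean(historical_lows))


def cluster_highest_load(initial_high: float, historical_highs: Sequence[float]) -> float:
    """Highest load: the larger of the initial value and the historical average."""
    if not historical_highs:  # Assumption: no history yet, keep the initial value.
        return initial_high
    return max(initial_high, mean(historical_highs))
```

Taking the minimum for the lowest load and the maximum for the highest load keeps the guaranteed floor conservative while letting the ceiling follow what the cluster has actually sustained.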

Optionally, the method further comprises:

and if the summation result is not higher than the sum of the latest load increment and the resource quota distributed for the service to be processed in the pre-stored resource configuration information, determining that the verification result is verification passing.

Optionally, if the verification result is that verification is not passed, processing the service to be processed according to a pre-stored super quota request processing rule, including:

if the verification result is that the verification is not passed, judging whether the pre-stored resource configuration information contains a retry identifier;

if the pre-stored resource configuration information contains a retry identifier, adding the service processing request of the service to be processed to a request queue;

and if the pre-stored resource configuration information does not contain the retry identification, generating a processing abnormality prompt.
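A minimal sketch of this super quota request handling follows. The types `ResourceConfig` and `QuotaExceededError`, and the use of an in-memory `deque` as the request queue, are assumptions made for illustration; the text only requires that a request is queued when a retry identifier is present and that a processing abnormality prompt is generated otherwise.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the processing abnormality prompt."""


@dataclass
class ResourceConfig:
    """Hypothetical view of the pre-stored resource configuration information."""
    retry: bool = False


def handle_over_quota(request: str, config: ResourceConfig,
                      queue: Deque[str]) -> None:
    """Queue the request for a later retry, or report an abnormality."""
    if config.retry:
        queue.append(request)
        return
    raise QuotaExceededError(f"request {request!r} exceeds its resource quota")
```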

Optionally, before the obtaining the latest load increment, the method further includes:

judging whether a preset updating time threshold is reached or not;

if the update duration threshold is reached, acquiring the latest load increment;

or judging whether the total processing quantity of the service processing requests reaches a preset quantity threshold value;

and if the number threshold is reached, acquiring the latest load increment.
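The two alternative triggers above can be sketched as one predicate. The text presents them as alternatives; combining them with a logical OR in a single function, and the name `should_refresh_increment`, are choices made for this illustration only.

```python
def should_refresh_increment(elapsed_seconds: float,
                             update_interval_seconds: float,
                             processed_requests: int,
                             request_count_threshold: int) -> bool:
    """Return True when either the time trigger or the count trigger is reached."""
    time_trigger = elapsed_seconds >= update_interval_seconds
    count_trigger = processed_requests >= request_count_threshold
    return time_trigger or count_trigger
```

When neither trigger fires, a caller would keep using the previously computed load increment rather than recomputing it on every request.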

In a second aspect, an embodiment of the present application provides a cluster resource quota allocation apparatus, including:

the receiving module is used for receiving a service processing request of a service to be processed, which is sent by the client;

the processing module is used for verifying the resource quota of the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result;

the processing module is further configured to allocate a resource quota for the service to be processed according to the service processing request if the verification result is that the verification is passed, and process the service to be processed according to the allocated resource quota, so as to obtain a service processing result;

and the processing module is further configured to process the service to be processed according to a pre-stored super quota request processing rule if the verification result is that the verification is not passed.

In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, and a memory communicatively coupled to the processor;

the memory stores computer-executable instructions;

the processor executes the computer-executed instructions stored in the memory to implement the cluster resource quota allocation method according to the first aspect and the various possible designs of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer readable storage medium, where computer executable instructions are stored, and when executed by a processor, implement a cluster resource quota allocation method according to the first aspect and various possible designs of the first aspect.

In a fifth aspect, embodiments of the present application provide a computer program product, including a computer program, where the computer program is executed by a processor, to implement the cluster resource quota allocation method according to the first aspect and the various possible designs of the first aspect.

After the scheme is adopted, the service processing request of the service to be processed sent by the client can be received, and then the service to be processed is subjected to resource quota verification according to the latest load increment and the pre-stored resource configuration information, so that a verification result is obtained. In one implementation, if the verification result is that the verification is passed, allocating a resource quota for the service to be processed according to the service processing request, and then processing the service to be processed according to the allocated resource quota to obtain a service processing result. In another implementation manner, if the verification result is that the verification is not passed, the service to be processed may be processed according to a pre-stored super quota request processing rule. The resource quota allocation is carried out on the service to be processed by combining the pre-stored resource allocation information with the dynamically determined load increment, so that the situation that the cluster load judgment is inaccurate is reduced, the rationality and the accuracy of the resource quota allocation are improved, and the normal realization of each financial service is further ensured.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive faculty for a person skilled in the art.

Fig. 1 is a schematic architecture diagram of an application system of a cluster resource quota allocation method provided in an embodiment of the present application;

Fig. 2 is a flow chart of a cluster resource quota allocation method provided in an embodiment of the present application;

Fig. 3 is a flowchart of a cluster resource quota allocation method according to another embodiment of the present application;

Fig. 4 is a schematic structural diagram of a cluster resource quota allocation apparatus provided in an embodiment of the present application;

Fig. 5 is a schematic hardware structure of an electronic device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.

The terms "first," "second," "third," "fourth" and the like in the description and in the claims of this application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the present application described herein may be capable of including other sequential examples in addition to those illustrated or described. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In the prior art, financial services may be account transfer, loan limit adjustment, balance inquiry and the like. Each financial service relies on an allocated resource quota, and a fixed resource quota is generally allocated to each. Configuring a fixed resource quota for each financial service requires knowing the current load of the cluster and the load it can bear, which are generally determined from stress test data collected when the cluster was built. Because the cluster's capacity to process service requests changes dynamically and the initial stress test scenario rarely matches the actual production scenario, the cluster load is often judged inaccurately, which reduces the accuracy of resource quota allocation. In addition, in a scenario with multiple financial services, the service request volume usually follows a fluctuating curve, and the times at which service peaks and troughs occur can be predicted from the business scenario. To use cluster resources effectively, services whose request peaks are staggered share the same cluster, and each service's request peak is set as its quota. Determining the request peak is therefore important. The peak usually depends on the experience of operation and maintenance personnel and on historical data, but when an unexpected event (for example, a cluster active-standby switchover caused by a host fault) makes the request volume of a service surge, all of the burst requests fail under the rate-limiting logic, which affects the normal operation of each financial service.

To address the above technical problems, the present application allocates resource quota to the service to be processed by combining the pre-stored resource configuration information with a dynamically determined load increment. This reduces inaccurate cluster load judgment, improves the rationality and accuracy of resource quota allocation, and thereby helps ensure the normal operation of each financial service.

Fig. 1 is a schematic architecture diagram of an application system of a cluster resource quota allocation method provided in an embodiment of the present application. As shown in Fig. 1, the application system includes a cluster 101, a database 102 and a client 103. The cluster 101 can receive a service processing request sent by the client 103, acquire pre-stored resource configuration information from the database 102, perform resource quota verification on the service to be processed in combination with the most recently acquired load increment to obtain a verification result, and then proceed according to the verification result.

There may be one or more clients 103; a client may be, for example, a smartphone, a tablet, a personal computer or a wearable smart device.

The cluster 101 may be an HBase cluster. HBase is a distributed, scalable, highly available, high-performance NoSQL database that supports random, real-time reads and writes on very large tables.

The technical scheme of the present application is described in detail below with specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.

Fig. 2 is a flow chart of a cluster resource quota allocation method provided in an embodiment of the present application, where the method of the present embodiment may be executed by a cluster 101. As shown in fig. 2, the method of the present embodiment may include:

S201: Receiving a service processing request of the service to be processed sent by the client.

In this embodiment, when a user wants to implement a financial service, the user may send, through a client, a service processing request corresponding to a service to be processed to a cluster.

S202: Verifying the resource quota of the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result.

In this embodiment, after receiving a service processing request, the latest load increment and the pre-stored resource configuration information may be acquired, and the service to be processed is subjected to resource quota verification according to the acquired latest load increment and the pre-stored resource configuration information, so as to obtain a verification result.

The pre-stored resource configuration information may include pre-configured overflow ratio information, the resource quota allocated to each service to be processed (RNQ, Request Num Quota), the current load of the cluster, and a retry identifier indicating whether to retry. The RNQ is a quota on the number of requests per unit time, where the time unit may be sec, min, hour or day and the number of requests is denoted req, for example 1000req/sec or 50000req/hour.

Specifically, the overflow ratio information can be customized according to the actual application scenario; for the specific setting, refer to the calculation described below. The overflow ratio information allows bursts of service access to be handled dynamically: when cluster resources are available, it avoids normal requests failing merely because the resource quota requested by the service has reached its limit.

The configuration principle for allocating a resource quota to each service to be processed is that, while meeting the service requirement of each service, the sum of the RNQs of all services to be processed is less than or equal to the lowest guaranteed load of the cluster, and the sum of (quota overflow percentage/100 + 1) × RNQ over all services to be processed is less than or equal to the highest reachable load of the cluster. The configuration policy may specifically be to run a stress test on the cluster with an existing stress test tool when the cluster is first built, so as to obtain the initial cluster lowest load (LowTPS, Low Transactions Per Second) and cluster highest load (UpTPS, Up Transactions Per Second). When each service goes online, operation and maintenance personnel can estimate the TPS of the service from the business volume of the service system (TPS, Transactions Per Second, is the number of transactions executed per second and an important measure of cluster throughput), set the estimated TPS as the RNQ of the service, and ensure that the sum of the RNQs of all services is less than or equal to the lowest load of the cluster. If the sum of the RNQs is greater than the lowest load of the cluster, the cluster can be expanded to raise its lowest load. Correspondingly, if the TPS of a service contains burrs (short spikes), the TPS peak excluding the burrs may be set as the RNQ of the service; if the TPS of the service has no burrs, the peak TPS of the service may be set as its RNQ.
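The two constraints of the configuration principle can be checked with a short sketch. It assumes each service is described by a pair of RNQ and quota overflow percentage; the function name `quotas_fit_cluster` and the tuple layout are hypothetical.

```python
from typing import Sequence, Tuple


def quotas_fit_cluster(services: Sequence[Tuple[float, float]],
                       low_tps: float, up_tps: float) -> bool:
    """Check the configuration principle for (rnq, overflow_percent) pairs.

    Constraint 1: the sum of RNQ must not exceed the lowest guaranteed load.
    Constraint 2: the sum of (overflow_percent / 100 + 1) * RNQ must not
    exceed the highest reachable load.
    """
    total_rnq = sum(rnq for rnq, _ in services)
    total_with_overflow = sum(rnq * (op / 100 + 1) for rnq, op in services)
    return total_rnq <= low_tps and total_with_overflow <= up_tps
```

For example, two services with RNQ 100 (overflow 50%) and RNQ 200 (overflow 0%) need a lowest load of at least 300 and a highest load of at least 350; if the first constraint fails, the text suggests expanding the cluster.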

In addition, the overflow ratio information may be calculated by:

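Let the burr TPS of service 1 be MTPS1, the burr TPS of service n be MTPSn, the overflow ratio information of service 1 be op1, and the overflow ratio information of service n be opn. Then:

$$\left(\frac{\sum_{i=1}^{n}\mathrm{MTPS}_i}{\mathrm{RNQ}} - 1\right)\times 100 = \sum_{i=1}^{n}\mathrm{op}_i$$

and, for each service k from 1 to n:

$$\mathrm{op}_k = \frac{\mathrm{MTPS}_k}{\sum_{i=1}^{n}\mathrm{MTPS}_i}\times\sum_{i=1}^{n}\mathrm{op}_i$$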
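The overflow ratio calculation above can be sketched as follows. The text writes a single RNQ in the denominator without saying whether it is a per-service or an aggregate quota, so this sketch simply takes it as one input number; the function name `overflow_ratios` is hypothetical.

```python
from typing import List, Sequence


def overflow_ratios(burr_tps: Sequence[float], rnq: float) -> List[float]:
    """Split the total overflow percentage across services by burr TPS share."""
    total_burr = sum(burr_tps)
    # Total overflow percentage: (sum(MTPS) / RNQ - 1) * 100.
    total_op = (total_burr / rnq - 1) * 100
    # Each service receives a share proportional to its burr TPS.
    return [m / total_burr * total_op for m in burr_tps]
```

With burr TPS values of 150 and 150 and an RNQ of 200, the total overflow percentage is 50, and each service receives 25.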

In addition, each service and the cluster can be continuously monitored to obtain the latest TPS of each service and the lowest load, highest load and current load of the cluster. These are plotted as indicator charts, providing a data basis for the subsequent configuration of RNQ and overflow ratio information.

The cluster current load (CTPS, CurrentTPS) may be computed by continuously monitoring the RegionServer service of the cluster and sampling the RegionServer's total request count indicator (totalRequestCount) at a fixed interval (5 seconds by default). Illustratively, if totalRequestCount at time T is RequestCount(T) and totalRequestCount at time T1 is RequestCount(T1), then CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T). The total number of processed service requests is an indicator provided by the cluster itself and can be obtained directly through existing functions; it is a dynamic indicator that increases by 1 each time a request is processed. After CurrentTPS is obtained, it may be stored in the pre-stored resource configuration information, that is, the current load of the cluster in the pre-stored resource configuration information is updated.
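The CurrentTPS formula can be written directly as a small function. This sketch takes the two counter samples and their timestamps as plain numbers; how the counter is actually read from the RegionServer is not shown, and the function name `current_tps` is hypothetical.

```python
def current_tps(count_t: int, t: float, count_t1: int, t1: float) -> float:
    """CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T)."""
    if t1 <= t:
        raise ValueError("t1 must be later than t")
    return (count_t1 - count_t) / (t1 - t)
```

For instance, if the counter rises from 1000 to 1500 over the default 5-second interval, the current load is 100 requests per second.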

Further, the service processing request includes a load request amount, and the verifying the resource quota of the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result may specifically include:

Summing the current load of the cluster in the pre-stored resource configuration information and the load request amount to obtain a summation result.

The latest load increment is obtained.

And judging whether the summation result is higher than the sum of the latest load increment and the resource quota distributed for the service to be processed in the pre-stored resource configuration information.

And if the summation result is not higher than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information, determining that the verification result is verification passing.

Furthermore, before the step of obtaining the latest load increment, the method may further include: judging whether the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information.

In addition, if the summation result is not higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, the verification result is determined to be that the verification is passed.

Specifically, the service processing request may directly include the service request amount, or may include a type identifier representing the request type, where different type identifiers may correspond to different service request amounts. For example, the type identifier may be Put, Get, Scan or Multi: the service request amount corresponding to the Put type identifier is 1, that corresponding to the Get type identifier is 1, that corresponding to the Scan type identifier is 1, and that corresponding to the Multi type identifier is the number of regions involved.
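The mapping from type identifier to service request amount can be sketched as a lookup. The batch type is spelled `Multi` here, which is an assumption about the intended identifier; the function name `load_request_amount` and the `regions_involved` parameter are hypothetical.

```python
def load_request_amount(request_type: str, regions_involved: int = 1) -> int:
    """Map a request type identifier to its service request amount."""
    if request_type in ("Put", "Get", "Scan"):
        # Single-row style operations each count as one request.
        return 1
    if request_type == "Multi":
        # Batch operations count once per region they touch.
        return regions_involved
    raise ValueError(f"unknown request type: {request_type!r}")
```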

Once the load request amount and the current load of the cluster have been obtained, they can be summed to obtain a summation result. There are two cases: the summation result is less than or equal to the resource quota allocated in advance to the service to be processed, or it is greater than that quota. In one approach, the latest load increment is acquired directly after the summation result is obtained, and it is then judged whether the summation result is higher than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information. That is, the latest load increment is acquired every time, the summation result is compared directly against that sum, and further processing follows from the comparison.

Alternatively, after the summation result is obtained, the latest load increment is not acquired first. Instead, it is first judged whether the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information. If the summation result is less than or equal to that resource quota, the verification result is determined to be that the verification is passed, and the service to be processed can subsequently be processed according to the service processing request. If the summation result is greater than that resource quota, the latest load increment determined by the dynamic load calculation module is acquired, and it is then judged whether the summation result is higher than the sum of the latest load increment and the resource quota. If the summation result is less than or equal to that sum, the verification result is determined to be that the verification is passed, and the service to be processed is subsequently processed according to the service processing request. If the summation result is higher than that sum, the verification result is directly determined to be that the verification is not passed.
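The two-stage check described above can be sketched as follows. The load increment is passed in as a callable so that it is only fetched when the first comparison fails; the function name `verify_quota` and the callable parameter are assumptions made for this illustration.

```python
from typing import Callable


def verify_quota(current_load: float, request_amount: float, rnq: float,
                 get_increment: Callable[[], float]) -> bool:
    """Return True when resource quota verification passes."""
    summation = current_load + request_amount
    # Stage 1: within the allocated quota, no increment lookup is needed.
    if summation <= rnq:
        return True
    # Stage 2: over the quota, so compare against quota plus the latest increment.
    return summation <= rnq + get_increment()
```

With a current load of 100, a quota of 100 and a latest load increment of 10, a request amount of 9 passes (109 ≤ 110) and a request amount of 11 fails (111 > 110).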

Acquisition of a new load increment can be triggered according to a preset load increment triggering rule. When the preset load increment triggering rule is met, the newly acquired load increment is the latest load increment; if the rule is not met, the previously acquired load increment continues to serve as the latest load increment.

For example, suppose the latest load increment is 10 and the resource quota allocated for the service is 100, so that the sum of the latest load increment and the resource quota allocated for the service to be processed is 100+10=110. If the current request load is 11, the summation result is 100+11=111 (the current load of the cluster being 100 in this example), which is higher than 110, and the verification result is verification failed. If the current request load is 9, the summation result is 100+9=109, which is lower than 110, and the verification result is verification passed. By combining the resource quota with the latest load increment, the resource quota is allocated dynamically and accurately, which improves the rationality of the quantity of resource quota allocated for the service to be processed and further ensures the normal realization of each service.
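The two-stage check described above can be sketched as follows. This is a minimal illustration only: the function name `verify_quota` and its parameter names are hypothetical and do not appear in the original text, and the numbers in the usage note are those of the example above.

```python
def verify_quota(current_load: float, request_load: float,
                 quota: float, latest_increment: float) -> bool:
    """Return True when the resource quota verification passes.

    Hypothetical helper illustrating the check described in the text.
    """
    # The "summation result": current cluster load plus the requested load.
    total = current_load + request_load
    if total <= quota:
        # Within the pre-allocated quota: no load increment is needed.
        return True
    # Above the static quota: allow up to quota + latest load increment.
    return total <= quota + latest_increment
```

With a quota of 100, a latest load increment of 10 and a current load of 100, a request load of 9 passes (109 is not higher than 110), while a request load of 11 fails (111 is higher than 110).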

In addition, after the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, the method may further include:

and determining whether the pre-stored resource configuration information contains overflow ratio information.

And if the overflow ratio information is contained, determining the highest quota of the service to be processed according to the overflow ratio information.

And judging whether the summation result is higher than the highest quota.

And if the summation result is not higher than the highest quota, continuing to execute the steps of acquiring the latest load increment and the later.

Specifically, after determining that the summation result is higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information, it may be determined whether the pre-stored resource configuration information contains overflow ratio information. When setting the resource quota for each service, a positive number can be set as the overflow ratio information, namely the maximum percentage by which the actual request quantity of the service may exceed the quota when the overall resources of the cluster are sufficient; the default is 0, meaning no overflow. The highest quota of the service to be processed, that is, the allowed maximum request amount, is determined according to the overflow ratio information as: allowed maximum request amount = quota × (overflow percentage/100 + 1). After the highest quota is determined, it can be judged whether the summation result is greater than the highest quota. If the summation result is smaller than or equal to the highest quota, the latest load increment is acquired, and it is judged whether the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information. If the summation result is not higher than that sum, the verification result is determined to be verification passed. If the summation result is higher than that sum, the verification result is determined to be verification failed. If the summation result is higher than the highest quota, the verification result is likewise determined to be verification failed.

For example, the highest quota may be 120, the latest load increment 10, and the resource quota allocated for the service 100, so that the sum of the latest load increment and the resource quota allocated for the service to be processed is 100+10=110. If the current request load is 9, the summation result is 100+9=109. The summation result is first compared with the highest quota; since it is smaller than the highest quota, it is then compared with the sum of the latest load increment and the resource quota allocated for the service to be processed, and since 109 is not higher than 110, the verification result is verification passed. This two-stage comparison improves the rationality of the quantity of resource quota allocated for the service to be processed.

In addition, if the current request load is 11, the summation result is 100+11=111. In this case, although the summation result is lower than the highest quota, it is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed, so the verification result is verification failed. Setting the load increment allows the quantity of resource quota allocated for the service to be processed to be increased while avoiding allocating too much, which further improves the rationality of the allocation. At the same time, it reduces the situation in which all requests fail when the service request amount increases suddenly because of an unforeseen event.
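The check with overflow ratio information can be sketched as below. This is a hedged illustration: the names `highest_quota` and `verify_with_overflow` are hypothetical, an unconfigured overflow ratio is represented as `None`, and a request that exceeds the quota without a configured overflow ratio is treated as verification failed, following the flow of fig. 3.

```python
from typing import Optional


def highest_quota(quota: float, overflow_percent: float) -> float:
    """Allowed maximum request amount = quota * (overflow percentage/100 + 1)."""
    return quota * (overflow_percent / 100 + 1)


def verify_with_overflow(current_load: float, request_load: float,
                         quota: float, latest_increment: float,
                         overflow_percent: Optional[float] = None) -> bool:
    """Return True when verification passes (hypothetical helper)."""
    total = current_load + request_load  # the "summation result"
    if total <= quota:
        return True
    if overflow_percent is None:
        # Assumption taken from the fig. 3 flow: no overflow ratio configured,
        # so an over-quota request goes to the super quota handling.
        return False
    if total > highest_quota(quota, overflow_percent):
        return False  # above the hard ceiling
    # Second comparison: quota plus the dynamically determined increment.
    return total <= quota + latest_increment
```

With a quota of 100, an overflow percentage of 20 (highest quota 120), a latest load increment of 10 and a current load of 100, a request load of 9 passes and a request load of 11 fails, matching the two examples above.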

S203: if the verification result is that the verification is passed, allocating a resource quota for the service to be processed according to the service processing request, and processing the service to be processed according to the allocated resource quota to obtain a service processing result.

In this embodiment, the verification result may include two cases, one being verification passed and the other being verification failed. If the verification result is that the verification is passed, the current load condition of the cluster indicates that the cluster can serve the service to be processed, so that a resource quota can be allocated to the service to be processed according to the service processing request, and the service to be processed is processed according to the allocated resource quota, and a service processing result is obtained.

S204: and if the verification result is that the verification is not passed, processing the service to be processed according to a pre-stored super quota request processing rule.

In this embodiment, if the verification result is that the verification is not passed, a pre-stored super quota request processing rule may be obtained first, and then the service to be processed may be processed according to the obtained super quota request processing rule.

Further, if the verification result is that the verification is not passed, the processing the service to be processed according to a pre-stored super quota request processing rule may specifically include:

If the verification result is that the verification is not passed, judging whether the pre-stored resource configuration information contains a retry identifier or not.

And if the pre-stored resource configuration information contains a retry identifier, adding the service processing request of the service to be processed into a request queue.

Specifically, existing quota-based flow limiting logic simply compares the request with the quota, discards any request that exceeds the quota, and throws an exception. In scenarios where a write may be delayed but must not be discarded when the quota is exceeded, the prior art may cause data loss and affect data integrity. In the present application, whether the service to be processed needs to be retried can be configured in advance and stored in the resource configuration information. A retry identifier can be configured for the service to be processed in scenarios with high data consistency requirements where data write failures are not permitted, so that data integrity is better maintained.

Correspondingly, if the pre-stored resource configuration information contains the retry identifier, the service processing request of the service to be processed can be added to the request queue, and the cluster can process the services to be processed sequentially in the order in which their service processing requests were added to the request queue. If the pre-stored resource configuration information does not contain the retry identifier, an exception handling prompt can be generated, that is, an exception is thrown, so that operation and maintenance personnel are reminded to intervene in time. This gives operation and maintenance personnel more options when handling over-quota conditions and ensures the normal realization of each service.
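The handling of a request that fails verification can be sketched as follows. The sketch is illustrative only: the exception class `QuotaExceededError`, the function `handle_over_quota`, and the configuration key `"retry"` are hypothetical names, and an in-memory `deque` stands in for the request queue.

```python
from collections import deque


class QuotaExceededError(Exception):
    """Raised when a request exceeds its quota and no retry is configured."""


def handle_over_quota(request: str, config: dict, request_queue: deque) -> None:
    """Apply the super quota request processing rule (hypothetical sketch)."""
    if config.get("retry", False):
        # Retry identifier present: enqueue so the request is processed
        # later, in the order in which requests were added.
        request_queue.append(request)
        return
    # No retry identifier: surface the condition to operations staff.
    raise QuotaExceededError(f"request {request!r} exceeds the resource quota")
```

A service whose writes may be delayed but not dropped would be configured with the retry identifier, so its over-quota requests wait in the queue instead of being discarded.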

After the scheme is adopted, the service processing request of the service to be processed, which is sent by the client, can be received first, and then the resource quota verification is carried out on the service to be processed according to the latest load increment and the pre-stored resource configuration information, so that a verification result is obtained. In one implementation, if the verification result is that the verification is passed, allocating a resource quota for the service to be processed according to the service processing request, and then processing the service to be processed according to the allocated resource quota to obtain a service processing result. In another implementation manner, if the verification result is that the verification is not passed, the service to be processed may be processed according to a pre-stored super quota request processing rule. The resource quota allocation is carried out on the service to be processed by combining the pre-stored resource allocation information with the dynamically determined load increment, so that the situation that the cluster load judgment is inaccurate is reduced, the rationality and the accuracy of the resource quota allocation are improved, and the normal realization of each financial service is further ensured.

The examples of the present specification also provide some specific embodiments of the method based on the method of fig. 2, which is described below.

Furthermore, in another embodiment, obtaining the latest load delta may include:

And acquiring the total processing time length, the processing time length threshold value, the lowest cluster load and the highest cluster load according to a preset acquisition rule.

In this embodiment, the total processing duration may be denoted TotalCallTime, which is the duration from when a request to be processed is received by the cluster until its processing is completed. This metric directly reflects the processing performance of the current RegionServer (i.e. cluster) service. Correspondingly, TotalCallTime = QueueCallTime + ProcessCallTime. QueueCallTime is a RegionServer-level metric in the cluster: after receiving a request to be processed from the client, the cluster puts the request into a request queue, and a dedicated thread then consumes the request from the queue and hands it to a processing thread; the time one request to be processed waits in the queue is the QueueCallTime. ProcessCallTime is also a RegionServer-level metric in the cluster and refers to the time from when a request to be processed is consumed from the queue until its processing is completed. It is a key metric reflecting the processing efficiency of the cluster.

The processing duration threshold may be denoted MaxCallTime, which is the maximum value of TotalCallTime that the service can tolerate, set from the perspective of the service side. Illustratively, if a query of the service to be processed tolerates a latency of at most 0.5 s (if the request cannot be processed within 0.5 s, the service application reports an error), MaxCallTime may be set to a value of no more than 0.5 s.

For the lowest load of the cluster, acquiring the lowest load of the cluster according to a preset acquisition rule specifically may include:

and acquiring a preset processing time threshold, and the total processing time and the load at the target moment.

And judging the processing time threshold and the total processing time and the load at the target moment according to a preset minimum load judging rule, and determining the initial cluster lowest load.

And acquiring a plurality of historical minimum loads in a preset historical time period, and determining a historical average minimum load according to the plurality of historical minimum loads.

Specifically, the lowest load of the cluster may also be referred to as LowTPS, which represents the TPS lower limit when TotalCallTime reaches MaxCallTime, that is, the TPS load that the current RegionServer service can provide when its performance is worst. The load (TPS, Transactions Per Second) represents the number of transactions executed per second and is an important measure of cluster throughput. Correspondingly, the RegionServer service can be continuously monitored, with the TotalCallTime at time t denoted CTt and the TPS at time t denoted TPSt, and the following calculation can then be performed at fixed time intervals:

The minimum value of TPSt over all moments satisfying MaxCallTime × 99% <= CTt <= MaxCallTime × 101% is taken as the LowTPS of the current time interval and archived. If the CTt in the current time interval is less than 99% of MaxCallTime, the RegionServer is operating under light load; the CTt values are sorted, and the TPS value corresponding to the maximum CTt is taken as the LowTPS of the current time interval. If the CTt in the current time interval is greater than 101% of MaxCallTime, the RegionServer is overloaded, and there is no valid LowTPS data for the current time interval. The minimum of the average LowTPS over the historical time period and the LowTPS of the latest time interval is then taken as the LowTPS of the current cluster service, with a validity period of one time interval. The time interval and the historical time period can be customized according to the actual application scenario; the time interval may be any value from 3 to 6 minutes, and the historical time period may be any value from 1 to 3 months.
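The LowTPS rule can be sketched as follows, under stated assumptions. The function names and the `(CTt, TPSt)` sample representation are hypothetical. The text does not say what to do when an interval mixes light-load and overloaded samples with none inside the 99% to 101% band, so this sketch treats that case as having no valid data; falling back to the historical average when the latest interval has no valid LowTPS is likewise an assumption.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (CTt, TPSt) observed at one moment


def interval_low_tps(samples: Sequence[Sample],
                     max_call_time: float) -> Optional[float]:
    """LowTPS of one time interval, or None when no valid data exists."""
    lo, hi = 0.99 * max_call_time, 1.01 * max_call_time
    in_band = [tps for ct, tps in samples if lo <= ct <= hi]
    if in_band:
        return min(in_band)
    if samples and all(ct < lo for ct, _ in samples):
        # Light load: take the TPS observed at the largest CTt.
        _, tps = max(samples, key=lambda s: s[0])
        return tps
    return None  # overloaded (or mixed, by assumption): no valid LowTPS


def current_low_tps(historical: Sequence[float],
                    latest: Optional[float]) -> Optional[float]:
    """Minimum of the historical average LowTPS and the latest interval's."""
    candidates = []
    if historical:
        candidates.append(sum(historical) / len(historical))
    if latest is not None:
        candidates.append(latest)
    return min(candidates) if candidates else None
```

Taking the minimum keeps LowTPS conservative: a single poor interval lowers the estimate immediately, while a good interval cannot raise it above the historical average.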

For the highest load of the cluster, acquiring the highest load of the cluster according to a preset acquisition rule may specifically include:

And judging the processing time threshold and the total processing time and the load at the target moment according to a preset maximum load judging rule, and determining the initial cluster highest load.

And acquiring a plurality of historical highest loads in a preset historical time period, and determining a historical average highest load according to the plurality of historical highest loads.

Specifically, the highest load of the cluster may also be referred to as UpTPS, which represents the TPS upper limit when TotalCallTime reaches MaxCallTime, that is, the TPS load that the current RegionServer service can provide when its performance is best. The load (TPS, Transactions Per Second) represents the number of transactions executed per second and is an important measure of cluster throughput. Correspondingly, the RegionServer service can be continuously monitored, with the TotalCallTime at time t denoted CTt and the TPS at time t denoted TPSt, and the following calculation can then be performed at fixed time intervals:

The maximum value of TPSt over all moments satisfying MaxCallTime × 99% <= CTt <= MaxCallTime × 101% is taken as the UpTPS of the current time interval. If the CTt in the current time interval is less than 99% of MaxCallTime, the RegionServer is operating under light load; the CTt values are sorted, and the TPS value corresponding to the maximum CTt, multiplied by 101%, is taken as the UpTPS of the current time interval. If the CTt in the current time interval is greater than 101% of MaxCallTime, the RegionServer is overloaded, and there is no valid UpTPS data for the current time interval. The maximum of the average UpTPS over the historical time period and the UpTPS of the latest time interval is then taken as the UpTPS of the current RegionServer service, with a validity period of one time interval. The time interval and the historical time period can be customized according to the actual application scenario; the time interval may be any value from 3 to 6 minutes, and the historical time period may be any value from 1 to 3 months.
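The UpTPS rule mirrors the LowTPS rule, using maxima and a 101% scaling in the light-load case. The following sketch is illustrative: the function names and the `(CTt, TPSt)` sample representation are hypothetical, and intervals with no sample in the band that are not entirely under light load are assumed to yield no valid data.

```python
from typing import Optional, Sequence, Tuple


def interval_up_tps(samples: Sequence[Tuple[float, float]],
                    max_call_time: float) -> Optional[float]:
    """UpTPS of one time interval from (CTt, TPSt) samples, or None."""
    lo, hi = 0.99 * max_call_time, 1.01 * max_call_time
    in_band = [tps for ct, tps in samples if lo <= ct <= hi]
    if in_band:
        return max(in_band)
    if samples and all(ct < lo for ct, _ in samples):
        # Light load: TPS at the largest CTt, scaled by 101%.
        _, tps = max(samples, key=lambda s: s[0])
        return tps * 1.01
    return None  # overloaded: no valid UpTPS for this interval


def current_up_tps(historical: Sequence[float],
                   latest: Optional[float]) -> Optional[float]:
    """Maximum of the historical average UpTPS and the latest interval's."""
    candidates = []
    if historical:
        candidates.append(sum(historical) / len(historical))
    if latest is not None:
        candidates.append(latest)
    return max(candidates) if candidates else None
```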

The current load of the cluster can be obtained directly from the pre-stored resource configuration information, in which it is updated in real time. As in the foregoing embodiment, the RegionServer service may be continuously monitored, and each time a fixed time interval (5 seconds by default) elapses, the TotalRequestCount metric of the RegionServer is sampled. With the TotalRequestCount at time T recorded as RequestCount(T) and the TotalRequestCount at time T1 recorded as RequestCount(T1), CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T). The newly determined CurrentTPS may then be stored in the resource configuration information, that is, only the current load of the cluster in the configuration information is updated. Because the current load of the cluster in the resource configuration information is updated in real time, the current load condition of the cluster can be determined accurately. This provides a basis for deciding whether to add further services and at what scale, avoids wasting resources, avoids overloaded operation, and ensures the normal realization of each service.
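The CurrentTPS calculation is a simple rate over two counter samples. A minimal sketch follows; the function name and argument names are hypothetical, and timestamps are assumed to be in seconds.

```python
def current_tps(count_t: int, t: float, count_t1: int, t1: float) -> float:
    """CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T)."""
    if t1 <= t:
        raise ValueError("T1 must be later than T")
    return (count_t1 - count_t) / (t1 - t)
```

For instance, if the request counter rises from 1000 to 1500 over the default 5-second interval, the current load is 100 TPS.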

Further, after obtaining the total processing time length, the processing time length threshold, the cluster lowest load, the cluster highest load and the cluster current load, the latest load increment may be determined, that is, the latest load increment (may also be referred to as a cluster bearable load) may be determined according to the total processing time length, the processing time length threshold, the cluster lowest load, the cluster highest load and the cluster current load in the pre-stored resource configuration information, which may specifically include:

And if the total processing time length is greater than or equal to the processing time length threshold value, determining that the latest load increment is zero.

And if the total processing time length is smaller than the processing time length threshold value and the current load of the cluster in the pre-stored resource configuration information is smaller than the lowest load of the cluster, determining the latest load increment as the difference between the lowest load of the cluster and the current load of the cluster in the pre-stored resource configuration information.

And if the total processing time length is smaller than the processing time length threshold value and the current load of the cluster in the pre-stored resource configuration information is larger than the lowest load of the cluster and smaller than the highest load of the cluster, determining the latest load increment as the difference between the highest load of the cluster and the current load of the cluster in the pre-stored resource configuration information.

Specifically, if TotalCallTime >= MaxCallTime, it is determined that the current RegionServer load has reached its maximum and quota overflow cannot be performed. If TotalCallTime < MaxCallTime and CurrentTPS < LowTPS, it is determined that the RegionServer is lightly loaded and has idle capacity, and the load that can be added is LowTPS - CurrentTPS. If TotalCallTime < MaxCallTime and LowTPS < CurrentTPS < UpTPS, it is determined that the RegionServer has idle capacity, and the load that can be added is UpTPS - CurrentTPS. If TotalCallTime < MaxCallTime and CurrentTPS > UpTPS, it is determined that the RegionServer load has reached a new high while the processing capacity still meets the requirement, and the load can be increased on a small scale. The size of the small-scale increase can be customized according to the actual application scenario; for example, 10 TPS may be added.
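The four cases above can be sketched as a single function. The function and parameter names are hypothetical. The text leaves the boundary cases CurrentTPS = LowTPS and CurrentTPS = UpTPS unspecified; this sketch assigns them to the next higher branch as an assumption.

```python
def latest_load_increment(total_call_time: float, max_call_time: float,
                          current_tps: float, low_tps: float, up_tps: float,
                          small_step: float = 10.0) -> float:
    """Load that the cluster can additionally bear (hypothetical sketch)."""
    if total_call_time >= max_call_time:
        return 0.0  # already at the latency limit: no overflow allowed
    if current_tps < low_tps:
        return low_tps - current_tps  # lightly loaded
    if current_tps < up_tps:
        # Between LowTPS and UpTPS (LowTPS itself included, by assumption).
        return up_tps - current_tps
    # At or above UpTPS with latency still acceptable: small fixed step.
    return small_step
```

With LowTPS 80 and UpTPS 120, a current load of 50 TPS yields an increment of 30, a current load of 100 TPS yields 20, and a current load of 130 TPS yields the small step of 10.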

In addition, when the latest load increment is acquired, a plurality of trigger mechanisms can be provided, and the trigger mechanisms specifically can be as follows:

in one implementation, a determination may be made as to whether a preset update duration threshold is reached.

And if the update duration threshold is reached, acquiring the latest load increment.

The update duration threshold may be any value from 3 to 6 minutes. When the update duration threshold is reached, acquisition of the latest load increment is triggered automatically.

In another implementation, it may be determined whether the total processing number of service processing requests reaches a preset number threshold.

And if the number threshold is reached, acquiring the latest load increment.

For example, the number threshold may be 10000. When the accumulated number of service processing requests reaches 10000, acquisition of the latest load increment is triggered automatically. At the same time, the count of service processing requests can be cleared so that counting restarts from zero.

In addition, if neither the update duration threshold nor the number threshold has been reached when the latest load increment is requested, the previously acquired load increment may be used as the latest load increment.
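The trigger mechanisms can be sketched as a small cache around the load increment calculation. This is an illustration under assumptions: the class name `LoadIncrementCache` and its parameters are hypothetical, the two triggers are combined in one object although the text presents them as alternative implementations, and `get` is assumed to be called once per service processing request so that it can also serve as the request counter.

```python
import time
from typing import Callable


class LoadIncrementCache:
    """Recompute the load increment only when a trigger rule is met."""

    def __init__(self, compute: Callable[[], float],
                 max_age_s: float = 300.0, max_requests: int = 10000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._compute = compute
        self._max_age_s = max_age_s
        self._max_requests = max_requests
        self._clock = clock
        self._value = compute()
        self._computed_at = clock()
        self._requests = 0

    def get(self) -> float:
        """Return the latest load increment for one incoming request."""
        self._requests += 1
        now = self._clock()
        expired = now - self._computed_at >= self._max_age_s
        if expired or self._requests >= self._max_requests:
            self._value = self._compute()
            self._computed_at = now
            self._requests = 0  # restart counting from zero
        # Otherwise the previously acquired value is reused.
        return self._value
```

Injecting the clock makes the time-based trigger easy to exercise without waiting for the real update duration to elapse.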

Fig. 3 is a flow chart of a cluster resource quota allocation method according to another embodiment of the present application. As shown in fig. 3, in this embodiment the method may include: receiving a to-be-processed request of the to-be-processed service, then parsing the request and determining the load request quantity. After the load request quantity is determined, a summation result can be determined from the current load of the cluster in the pre-stored resource configuration information and the load request quantity, and it is judged whether the summation result exceeds the resource quota allocated for the service to be processed. If not, the to-be-processed request is processed. If so, it is judged whether overflow ratio information is configured. If it is configured, the highest quota is determined according to the overflow ratio information, and it is judged whether the summation result exceeds the highest quota. If the highest quota is not exceeded, the latest load increment is obtained, and it is judged whether the summation result exceeds the sum of the latest load increment and the resource quota allocated for the service to be processed; if it does not, the to-be-processed request is processed. If it does, whether to retry is determined according to the pre-stored super quota request processing rule: if a retry is required, the request is added to the request queue again; otherwise, an exception is thrown directly.

In addition, if the overflow ratio information is not configured, whether to retry is determined according to the pre-stored super quota request processing rule: if a retry is required, the request is added to the request queue again; otherwise, an exception is thrown directly.

If the highest quota is exceeded, whether to retry is likewise determined according to the pre-stored super quota request processing rule: if a retry is required, the request is added to the request queue again; otherwise, an exception is thrown directly.

The determination of the latest load increment has two mechanisms that trigger recalculation: a timed calculation, which by default runs every 5 minutes, and a count-based calculation, in which a new round of calculation is triggered once a certain number of processed requests (10000 by default) is reached.

Based on the same idea, the embodiment of the present disclosure further provides a device corresponding to the method, and fig. 4 is a schematic structural diagram of a cluster resource quota allocation device provided in the embodiment of the present disclosure, where, as shown in fig. 4, the device provided in the embodiment may include:

the receiving module 401 is configured to receive a service processing request of a service to be processed sent by a client.

And the processing module 402 is configured to perform resource quota verification on the service to be processed according to the latest load increment and the pre-stored resource configuration information, so as to obtain a verification result.

In this embodiment, the service processing request includes a load request amount, and the processing module 402 is further configured to:

The latest load increment is obtained.

Furthermore, the processing module 402 is further configured to:

and judging whether the summation result is higher than the resource quota distributed for the service to be processed in the pre-stored resource configuration information.

Furthermore, the processing module 402 is further configured to:

Still further, the processing module 402 is further configured to:

And judging whether the summation result is higher than the highest quota.

The processing module 402 is further configured to allocate a resource quota for the service to be processed according to the service processing request if the verification result is that the verification is passed, and process the service to be processed according to the allocated resource quota, so as to obtain a service processing result.

The processing module 402 is further configured to process the service to be processed according to a pre-stored super quota request processing rule if the verification result is that verification is not passed.

In this embodiment, the processing module 402 is further configured to:

Furthermore, in another embodiment, the processing module 402 is further configured to:

In this embodiment, the processing module 402 is further configured to:

In addition, the processing module 402 is further configured to:

In this embodiment, the processing module 402 is further configured to:

Furthermore, the processing module 402 is further configured to:

judging whether a preset updating time threshold value is reached.

Or judging whether the total processing quantity of the service processing requests reaches a preset quantity threshold value.

And if the number threshold is reached, acquiring the latest load increment.

The device provided in the embodiment of the present application may implement the method of the embodiment shown in fig. 2, and its implementation principle and technical effects are similar, and are not described herein again.

Fig. 5 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of the present application. As shown in fig. 5, the device 500 provided in this embodiment includes a processor 501 and a memory 502 communicatively coupled to the processor. The processor 501 and the memory 502 are connected by a bus 503.

In a specific implementation process, the processor 501 executes the computer-executable instructions stored in the memory 502, so that the processor 501 performs the cluster resource quota allocation method of the method embodiment described above.

The specific implementation process of the processor 501 may refer to the above-mentioned method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein again.

In the embodiment shown in fig. 5, it should be understood that the processor may be a central processing unit (Central Processing Unit, CPU), or may be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.

The memory may comprise high speed RAM memory or may further comprise non-volatile storage NVM, such as at least one disk memory.

The bus may be an industry standard architecture (Industry Standard Architecture, ISA) bus, an external device interconnect (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or one type of bus.

The embodiment of the application also provides a computer readable storage medium in which computer-executable instructions are stored. When a processor executes the computer-executable instructions, the cluster resource quota allocation method of the method embodiment is implemented.

The embodiment of the application also provides a computer program product, which comprises a computer program, wherein the computer program realizes the cluster resource quota allocation method when being executed by a processor.

The computer readable storage medium described above may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. A readable storage medium can be any available medium that can be accessed by a general purpose or special purpose computer.

An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. In the alternative, the readable storage medium may be integral to the processor. The processor and the readable storage medium may reside in an application specific integrated circuit (Application Specific Integrated Circuit, ASIC). Alternatively, the processor and the readable storage medium may reside as discrete components in a device.

Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present application, not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents, and that such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims

1. A method for allocating a cluster resource quota, comprising:

determining the latest load increment according to the total processing time length, the processing time length threshold value, the lowest load of the cluster, the highest load of the cluster and the current load of the cluster in the pre-stored resource configuration information;

performing resource quota verification on the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result;

2. The method of claim 1, wherein the service processing request includes a load request amount, and the performing resource quota verification on the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result includes:

3. The method of claim 2, wherein prior to the step of obtaining the latest load increment, the method further comprises:

4. The method of claim 3, further comprising, after determining that the summation result is higher than the allocated resource quota for the service to be processed in the pre-stored resource configuration information:

judging whether the summation result is higher than the highest quota;

5. The method of claim 1, wherein the determining the latest load increment according to the total processing time length, the processing time length threshold value, the lowest load of the cluster, the highest load of the cluster, and the current load of the cluster in the pre-stored resource configuration information comprises:

6. The method of claim 1, wherein the obtaining the cluster lowest load according to the preset obtaining rule comprises:

7. The method of claim 1, wherein the obtaining the highest load of the cluster according to the preset obtaining rule comprises:

8. The method according to claim 2, wherein the method further comprises:

9. The method according to any one of claims 1-4, wherein the processing the service to be processed according to a pre-stored super quota request processing rule if the verification result is a verification failure includes:

10. The method of any one of claims 2-4, further comprising, prior to said obtaining the latest load increment:

judging whether a preset update count threshold is reached;

and if the update count threshold is reached, acquiring the latest load increment.

11. A cluster resource quota allocation apparatus, comprising:

the processing module is used for acquiring total processing time length, a processing time length threshold value, the lowest load of the cluster and the highest load of the cluster according to a preset acquisition rule; determining the latest load increment according to the total processing time length, the processing time length threshold value, the lowest load of the cluster, the highest load of the cluster and the current load of the cluster in the pre-stored resource configuration information;

the processing module is further configured to perform resource quota verification on the service to be processed according to the latest load increment and the pre-stored resource configuration information, so as to obtain a verification result;

12. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;

the memory stores computer-executable instructions;

the processor executing computer-executable instructions stored in the memory to implement the cluster resource quota allocation method of any one of claims 1 to 10.

13. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor implement the cluster resource quota allocation method of any of claims 1 to 10.

14. A computer program product comprising a computer program which, when executed by a processor, implements a cluster resource quota allocation method as claimed in any one of claims 1 to 10.