WO2023103342A1

WO2023103342A1 - Cluster resource quota allocation method and apparatus, and electronic device

Info

Publication number: WO2023103342A1
Application number: PCT/CN2022/100694
Authority: WO
Inventors: 韩向前; 谢健; 邸帅; 卢道和
Original assignee: 深圳前海微众银行股份有限公司
Priority date: 2021-12-09
Filing date: 2022-06-23
Publication date: 2023-06-15
Also published as: CN114143327A; CN114143327B

Abstract

Provided in the embodiments of the present application are a cluster resource quota allocation method and apparatus, and an electronic device. The method comprises: receiving a service processing request, which is sent by a client for a service to be processed; performing resource quota verification on said service according to the latest load increment and pre-stored resource configuration information, so as to obtain a verification result; if the verification result is that the verification is passed, allocating a resource quota to said service according to the service processing request, and processing said service according to the allocated resource quota, so as to obtain a service processing result; and if the verification result is that the verification is not passed, processing said service according to a pre-stored over-quota request processing rule. By means of the present application, inaccurate determination of a cluster load is reduced, the rationality and accuracy of resource quota allocation are improved, and thus, the normal implementation of various financial services is ensured.

Description

Cluster resource quota allocation method, device and electronic equipment

This application claims priority to Chinese patent application No. 202111503812.4, titled "Cluster Resource Quota Allocation Method, Device, and Electronic Equipment" and filed with the China Patent Office on December 09, 2021, the entire contents of which are incorporated herein by reference.

Technical field

The embodiments of the present application relate to the technical field of big data, and in particular to a cluster resource quota allocation method, device and electronic equipment.

Background technique

With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (Fintech). Big data technology is no exception. However, the security and real-time requirements of the financial industry also place higher demands on big data technology. To meet the growing needs of various financial services, the use of clusters has become increasingly common.

In the prior art, a cluster usually serves multiple financial services. When it does, the request volume of one service may surge and occupy a large share of the cluster's resources, leaving the other services with insufficient resources. To avoid this, a fixed resource quota can be set for each financial service; before each service request is processed, it is determined whether the request exceeds the quota, and an exception is thrown if it does.

However, configuring a fixed resource quota usually requires knowing the current cluster load and the tolerable load, and both are generally determined from the stress test data collected when the cluster was built. The cluster's ability to process service requests changes dynamically, and the initial stress test scenario rarely matches the actual production scenario. The resulting inaccurate cluster load judgments reduce the accuracy of resource quota allocation, which in turn affects the normal operation of services.

Technical solution

The purpose of the present application is to provide a cluster resource quota allocation method, device and electronic equipment, so as to improve the accuracy of resource quota allocation.

In the first aspect, the embodiment of the present application provides a cluster resource quota allocation method, including:

receiving a service processing request, sent by a client, for a service to be processed;

performing resource quota verification on the service to be processed according to the latest load increment and pre-stored resource configuration information, to obtain a verification result;

if the verification result is that the verification is passed, allocating a resource quota for the service to be processed according to the service processing request, and processing the service to be processed according to the allocated resource quota to obtain a service processing result;

if the verification result is that the verification fails, processing the service to be processed according to a pre-stored over-quota request processing rule.

Optionally, the service processing request includes a load request amount, and the resource quota verification is performed on the service to be processed according to the latest load increment and pre-stored resource configuration information, and the verification result is obtained, including:

summing the current cluster load in the pre-stored resource configuration information and the load request amount to obtain a summation result;

obtaining the latest load increment;

judging whether the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information;

If the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the prestored resource configuration information, it is determined that the verification result is a verification failure.

Optionally, before the step of obtaining the latest load increment, the method further includes:

judging whether the summation result is higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information;

If the summation result is higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information, the step of acquiring the latest load increment is performed.

Optionally, after it is determined that the summation result is higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information, the method further includes:

Determine whether the pre-stored resource configuration information includes overflow ratio information;

If overflow ratio information is included, then determine the maximum quota of the service to be processed according to the overflow ratio information;

Judging whether the summation result is higher than the maximum quota;

If the summation result is not higher than the maximum quota, continue to perform the steps of obtaining the latest load increment and subsequent steps;

If the summation result is higher than the maximum quota, it is determined that the verification result is verification failure.

Optionally, the acquisition of the latest load increment includes:

Obtain the total processing time, processing time threshold, cluster minimum load, and cluster maximum load according to preset acquisition rules;

The latest load increment is determined according to the total processing duration, the processing duration threshold, the cluster minimum load, the cluster maximum load, and the cluster current load in the prestored resource configuration information.

Optionally, determining the latest load increment according to the total processing duration, the processing duration threshold, the cluster minimum load, the cluster maximum load, and the current cluster load in the pre-stored resource configuration information includes:

If the total processing duration is greater than or equal to the processing duration threshold, determining that the latest load increment is zero;

If the total processing duration is less than the processing duration threshold, and the current cluster load in the pre-stored resource configuration information is less than the cluster minimum load, determining that the latest load increment is the difference between the cluster minimum load and the current cluster load in the pre-stored resource configuration information;

If the total processing duration is less than the processing duration threshold, and the current cluster load in the pre-stored resource configuration information is greater than the cluster minimum load and less than the cluster maximum load, determining that the latest load increment is the difference between the cluster maximum load and the current cluster load in the pre-stored resource configuration information;

If the total processing duration is less than the processing duration threshold, and the current cluster load in the pre-stored resource configuration information is greater than the cluster maximum load, determining that the latest load increment is a preset load threshold.
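The four cases above can be sketched as a small function. This is only a minimal illustration under stated assumptions, not the applicant's implementation: the function and parameter names are hypothetical, and the treatment of a current load exactly equal to the cluster minimum or maximum load, which the text does not specify, is an assumption.

```python
def latest_load_increment(total_duration, duration_threshold,
                          min_load, max_load, current_load,
                          preset_load_threshold):
    """Return the latest load increment for the four cases in the text."""
    # Case 1: processing already takes too long, so no extra headroom.
    if total_duration >= duration_threshold:
        return 0
    # Case 2: the cluster is running below its minimum load.
    if current_load < min_load:
        return min_load - current_load
    # Case 3: the load lies between the minimum and maximum load.
    # Assumption: the boundary values (== min_load, == max_load) fall here.
    if current_load <= max_load:
        return max_load - current_load
    # Case 4: the load is above the maximum load.
    return preset_load_threshold
```

For example, with a duration threshold of 100, a minimum load of 500 and a maximum load of 1000, a current load of 700 would give an increment of 300 as long as the total processing duration stays below the threshold.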

Optionally, the acquisition of the minimum load of the cluster according to the preset acquisition rules includes:

Obtain the preset processing time threshold and the total processing time and load at the target moment;

Evaluating the processing duration threshold and the total processing duration and load at the target moment according to a preset minimum load judgment rule, to determine the initial cluster minimum load;

Obtaining several historical minimum loads within a preset historical time period, and determining the historical average minimum load according to the several historical minimum loads;

The minimum value of the initial cluster minimum load and the historical average minimum load is used as the cluster minimum load.

Optionally, the acquisition of the highest load of the cluster according to the preset acquisition rules includes:

Judging the processing duration threshold and the total processing duration and load at the target time according to the preset highest load judgment rule, and determining the initial cluster maximum load;

Obtaining several historical maximum loads within a preset historical time period, and determining the historical average maximum load according to the several historical maximum loads;

The maximum value of the initial cluster maximum load and the historical average maximum load is used as the maximum cluster load.
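The combination of the initial value with the historical average can be sketched as follows. The sketch assumes that the "historical average" is an arithmetic mean and takes the initial minimum and maximum loads as inputs, because the preset judgment rules that produce them are not detailed here; all names are hypothetical.

```python
from statistics import mean


def cluster_min_load(initial_min_load, historical_min_loads):
    """Smaller of the initial minimum load and the historical average."""
    return min(initial_min_load, mean(historical_min_loads))


def cluster_max_load(initial_max_load, historical_max_loads):
    """Larger of the initial maximum load and the historical average."""
    return max(initial_max_load, mean(historical_max_loads))
```

Taking the minimum for the lower bound and the maximum for the upper bound means a single unusual measurement at the target moment only widens the range between the two loads and never narrows it.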

Optionally, the method also includes:

If the summation result is not higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information, it is determined that the verification result is that the verification is passed.

Optionally, if the verification result is that the verification fails, processing the pending service according to the pre-stored over-quota request processing rules includes:

If the verification result is that the verification fails, judging whether the pre-stored resource configuration information includes a retry identifier;

If the pre-stored resource configuration information includes a retry identifier, adding the service processing request of the pending service to a request queue;

If the pre-stored resource configuration information does not include a retry identifier, a processing exception prompt is generated.
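The over-quota handling above can be sketched as follows. This is an illustration only: the dictionary key `retry`, the `OverQuotaError` exception that stands in for the "processing exception prompt", and the in-memory queue are all assumptions that do not come from the text.

```python
from collections import deque


class OverQuotaError(Exception):
    """Hypothetical stand-in for the 'processing exception prompt'."""


def handle_over_quota(request, resource_config, request_queue):
    """Queue the request for retry if configured, otherwise raise."""
    if resource_config.get("retry"):
        # Retry identifier present: keep the request for later processing.
        request_queue.append(request)
        return "queued"
    # No retry identifier: report the failure to the caller.
    raise OverQuotaError("request exceeds the resource quota")


queue = deque()
status = handle_over_quota({"id": 1}, {"retry": True}, queue)
```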

Optionally, before acquiring the latest load increment, further include:

Determine whether the preset update duration threshold is reached;

If the update duration threshold is reached, the latest load increment is obtained;

Alternatively, it is judged whether the total processing quantity of business processing requests reaches a preset quantity threshold;

If the quantity threshold is reached, the latest load increment is obtained.
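The two trigger conditions can be sketched with a small helper class. The text presents the duration threshold and the quantity threshold as alternatives; combining them with a logical "or" in one class, resetting both counters after a refresh, and the injectable clock are assumptions made for illustration, and all names are hypothetical.

```python
import time


class IncrementRefreshTrigger:
    """Decides when the latest load increment should be obtained again."""

    def __init__(self, update_interval_s, count_threshold, now=time.monotonic):
        self._interval = update_interval_s
        self._count_threshold = count_threshold
        self._now = now                  # injectable clock, eases testing
        self._last_update = now()
        self._processed = 0

    def record_request(self):
        """Call once for every processed service processing request."""
        self._processed += 1

    def should_refresh(self):
        """True when either the duration or the quantity threshold is reached."""
        due = (self._now() - self._last_update >= self._interval
               or self._processed >= self._count_threshold)
        if due:
            # Assumption: both counters restart after a refresh.
            self._last_update = self._now()
            self._processed = 0
        return due
```

When `should_refresh()` returns `False`, the caller would keep using the load increment obtained last time.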

In the second aspect, the embodiment of the present application provides a cluster resource quota allocation device, including:

The receiving module is used to receive the service processing request of the pending service sent by the client;

A processing module, configured to verify the resource quota of the service to be processed according to the latest load increment and pre-stored resource configuration information, and obtain a verification result;

The processing module is further configured to, if the verification result is that the verification is passed, allocate a resource quota for the service to be processed according to the service processing request, and process the service to be processed according to the allocated resource quota to obtain a service processing result;

The processing module is further configured to process the service to be processed according to a pre-stored over-quota request processing rule if the verification result is that the verification fails.

In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, and a memory communicatively connected to the processor;

the memory stores computer-executable instructions;

The processor executes the computer-executed instructions stored in the memory to implement the cluster resource quota allocation method described in the first aspect and various possible designs of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions; when a processor executes the computer-executable instructions, the cluster resource quota allocation method described in the first aspect and the various possible designs of the first aspect is implemented.

In a fifth aspect, an embodiment of the present application provides a computer program product including a computer program; when the computer program is executed by a processor, the cluster resource quota allocation method described in the first aspect and the various possible designs of the first aspect is implemented.

The embodiments of the present application provide a cluster resource quota allocation method, device, and electronic equipment. With the above scheme, the service processing request of the pending service sent by the client is received first, and resource quota verification is then performed on the pending service according to the latest load increment and the pre-stored resource configuration information to obtain a verification result. In one implementation, if the verification result is that the verification is passed, a resource quota is allocated to the pending service according to the service processing request, and the pending service is then processed according to the allocated resource quota to obtain a service processing result. In another implementation, if the verification result is that the verification fails, the pending service may be processed according to the pre-stored over-quota request processing rule. Allocating resource quotas for pending services by combining pre-stored resource configuration information with dynamically determined load increments reduces inaccurate cluster load judgments and improves the rationality and accuracy of resource quota allocation, thereby ensuring the normal operation of various financial services.

Description of drawings

FIG. 1 is a schematic diagram of the architecture of the application system of the cluster resource quota allocation method provided by the embodiment of the present application;

FIG. 2 is a schematic flowchart of a method for allocating cluster resource quotas provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of a cluster resource quota allocation method provided by another embodiment of the present application;

FIG. 4 is a schematic structural diagram of a cluster resource quota allocation device provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.

Embodiments of the present invention

The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments of the application. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the scope of protection of this application.

The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of this application and in the above drawings are used to distinguish similar objects and not necessarily to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments of the application described herein can be implemented in sequences other than those illustrated or described. Furthermore, the terms "comprising" and "having", as well as any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product or device comprising a series of steps or elements is not necessarily limited to those expressly listed, but may include other steps or elements not explicitly listed or inherent to the process, method, product or device.

In the existing technology, financial services may include transfers, loans, loan amount adjustments, balance inquiries, and so on. Each financial service is carried out using the resource quota allocated to it, and in the prior art a fixed resource quota is generally allocated to each financial service. Configuring fixed resource quotas for the various financial services usually requires knowing the current cluster load and the tolerable load, and both are generally determined from the stress test data collected when the cluster was built. However, the cluster's ability to process service requests changes dynamically, and the initial stress test scenario rarely matches the actual production scenario, so cluster load judgments are often inaccurate, which reduces the accuracy of resource quota allocation. In addition, in scenarios with multiple financial services, the volume of service requests is usually a fluctuating curve, and the times at which peaks and troughs occur can be predicted from the business scenario. A cluster is set up and the peak request volume is set as the quota, so determining the peak request volume is particularly important. The peak value often depends on the experience of the operation and maintenance personnel and on historical data. When an unexpected event occurs (such as a standby switchover), all of the burst requests fail under the current rate-limiting logic, which affects the normal operation of the various financial services.

Based on the above technical problems, this application allocates resource quotas for pending services by combining pre-stored resource configuration information with dynamically determined load increments. This reduces inaccurate cluster load judgments and improves the rationality and accuracy of resource quota allocation, thereby achieving the technical effect of ensuring the normal operation of the various financial services.

FIG. 1 is a schematic diagram of the architecture of the application system of the cluster resource quota allocation method provided by the embodiment of the present application. As shown in FIG. 1, the application system includes a cluster 101, a database 102, and a client 103. The cluster 101 can receive the service processing request sent by the client 103, obtain the pre-stored resource configuration information from the database 102, verify the resource quota of the service to be processed using that information together with the latest load increment to obtain a verification result, and then carry out further processing according to the verification result.

Among them, there may be one or more clients 103, for example, they may be smart phones, tablets, personal computers, or wearable smart devices.

The cluster 101 may be an HBase cluster. HBase is a distributed, scalable, highly available, high-performance NoSQL database that supports random or real-time reads and writes on very large tables.

The technical solution of the present application will be described in detail below with specific embodiments. The following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.

FIG. 2 is a schematic flowchart of a method for allocating cluster resource quotas provided by an embodiment of the present application. The method of this embodiment may be executed by the cluster 101 . As shown in Figure 2, the method of this embodiment may include:

S201: Receive a service processing request of a service to be processed sent by a client.

In this embodiment, when a user wants to implement a financial service, he may send a service processing request corresponding to the service to be processed to the cluster through the client.

S202: Perform resource quota verification on the service to be processed according to the latest load increment and pre-stored resource configuration information, and obtain a verification result.

In this embodiment, after the service processing request is received, the latest load increment and the pre-stored resource configuration information can be obtained, and resource quota verification can be performed on the service to be processed according to them to obtain the verification result.

The pre-stored resource configuration information may include pre-configured overflow ratio information, the resource quota (RNQ, Request Num Quota) allocated for each pending service, the current load of the cluster, and a retry identifier. The RNQ is the quota of the number of requests per unit time. The time unit may be sec, min, hour, or day, and the number of requests is denoted by req, for example 1000req/sec or 50000req/hour.
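As an illustration of the quota notation, the following sketch parses strings such as 1000req/sec. Converting every quota to requests per second is an assumption made here so that quotas with different time units can be compared; the text does not say how the units are handled, and the function names are hypothetical.

```python
import re

# Seconds in each time unit named in the text.
_SECONDS_PER_UNIT = {"sec": 1, "min": 60, "hour": 3600, "day": 86400}


def parse_rnq(text):
    """Parse a quota such as '1000req/sec' into (request_count, unit)."""
    match = re.fullmatch(r"\s*(\d+)\s*req/(sec|min|hour|day)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised quota: {text!r}")
    return int(match.group(1)), match.group(2)


def rnq_per_second(text):
    """Assumption: normalise a quota to requests per second."""
    count, unit = parse_rnq(text)
    return count / _SECONDS_PER_UNIT[unit]
```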

Specifically, the overflow ratio information can be customized according to the actual application scenario; for the specific setting method, refer to the calculation method below. The overflow ratio information allows sudden bursts of service access to be handled dynamically, avoiding the situation in which normal requests fail because a service's resource quota has reached its limit even though cluster resources are still available.

The principle for configuring the resource quota of each pending service is that, while the business needs of each pending service are met, the sum of the RNQs of all pending services is less than or equal to the minimum guaranteed load of the cluster, and the sum of (overflow percentage/100+1)*RNQ over all pending services is less than or equal to the maximum achievable load. A specific configuration strategy is to stress test the cluster with an existing stress test tool when the cluster is first built, obtaining the initial minimum load of the cluster (LowTPS, Low Transactions Per Second) and the maximum load of the cluster (UpTPS, Up Transactions Per Second). When a service goes online, the operation and maintenance personnel can evaluate the TPS of the service (Transactions Per Second, the number of transactions executed per second, an important measure of cluster throughput) according to the business volume of the business system, and set the evaluated TPS as the RNQ of the service, while ensuring that the sum of the RNQs of all services is less than or equal to the minimum load of the cluster. If the sum of the RNQs is greater than the minimum load of the cluster, the cluster can be expanded to raise its minimum load. Correspondingly, if the TPS of the service contains glitches, the peak TPS excluding glitches can be set as the RNQ of the service; if it contains no glitches, the peak TPS of the service can be set as the RNQ of the service.
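The two configuration constraints can be checked with a few lines of code. This is a minimal sketch: the mapping layout (service name to an `(RNQ, overflow percentage)` pair), the service names, and the function name are assumptions made for illustration.

```python
def check_quota_configuration(services, low_tps, up_tps):
    """Check both configuration constraints described in the text.

    services: hypothetical mapping of name -> (rnq, overflow_percent).
    """
    # Constraint 1: sum of RNQs <= minimum guaranteed load (LowTPS).
    total_rnq = sum(rnq for rnq, _ in services.values())
    # Constraint 2: sum of (overflow%/100 + 1) * RNQ <= maximum load (UpTPS).
    total_max = sum((op / 100 + 1) * rnq for rnq, op in services.values())
    return total_rnq <= low_tps and total_max <= up_tps


services = {"transfer": (400, 20), "loan": (300, 10)}
ok = check_quota_configuration(services, low_tps=800, up_tps=1000)
```

In this example the RNQs sum to 700 and the overflow-adjusted quotas sum to about 810, so the configuration fits a cluster with LowTPS 800 and UpTPS 1000. If the first constraint failed, the text suggests expanding the cluster to raise its minimum load.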

In addition, the calculation method of overflow ratio information can be:

Let the glitch TPS of service 1 be MTPS1 and the glitch TPS of service n be MTPSn, and let the overflow ratio information of service 1 be op1 and that of service n be opn. Then:

(sum(MTPS1...MTPSn)/RNQ - 1)*100 = sum(op1...opn)

opn = MTPSn/sum(MTPS1...MTPSn)*sum(op1...opn)
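The two formulas can be evaluated as follows. The text does not state whether RNQ in the first formula is the quota of a single service or a summed quota, so the sketch simply takes that value as the parameter `rnq` and leaves the choice to the caller; the function name and the sample numbers are hypothetical.

```python
def overflow_ratios(glitch_tps, rnq):
    """Split the total overflow percentage in proportion to glitch TPS.

    glitch_tps: list of MTPS1..MTPSn.
    rnq: the RNQ value used in the first formula (its scope is not
         specified in the text, so it is passed in as given).
    """
    total_mtps = sum(glitch_tps)
    # (sum(MTPS) / RNQ - 1) * 100 = sum(op)
    total_op = (total_mtps / rnq - 1) * 100
    # op_n = MTPS_n / sum(MTPS) * sum(op)
    return [mtps / total_mtps * total_op for mtps in glitch_tps]


ratios = overflow_ratios([600, 400], rnq=800)
```

With glitch TPS values of 600 and 400 and an RNQ of 800, the total overflow percentage is 25, split into roughly 15 and 10.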

In addition, each service and the cluster can be continuously monitored to obtain the latest TPS of each service, the minimum load of the cluster, the maximum load of the cluster, and the current load of the cluster, and an indicator chart can be drawn to provide a data basis for the subsequent configuration of RNQ and overflow ratio information.

The current load of the cluster (CTPS, CurrentTPS) can be obtained by continuously monitoring the RegionServer service of the cluster and sampling the TotalRequestCount indicator of the RegionServer at a fixed interval (5 seconds by default). Exemplarily, if the TotalRequestCount at time T is recorded as RequestCount(T) and the TotalRequestCount at time T1 is recorded as RequestCount(T1), then CurrentTPS=(RequestCount(T1)-RequestCount(T))/(T1-T). TotalRequestCount represents the total number of service requests processed. It is a dynamic indicator provided by the cluster itself that can be read directly through existing functions, and it increases by 1 each time a request is processed. After the CurrentTPS is obtained, it can be stored in the pre-stored resource configuration information, that is, the current load of the cluster in the pre-stored resource configuration information is updated.
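The CurrentTPS formula amounts to a difference quotient over two samples of the counter. In the sketch below, how TotalRequestCount is actually read from the cluster is out of scope, and the function and parameter names are hypothetical.

```python
def current_tps(request_count_t, request_count_t1, t, t1):
    """CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T)."""
    if t1 <= t:
        raise ValueError("T1 must be later than T")
    return (request_count_t1 - request_count_t) / (t1 - t)


# Two samples taken 5 seconds apart (the default interval in the text).
tps = current_tps(10_000, 12_500, t=0, t1=5)
```

Here 2500 requests were processed in 5 seconds, giving a CurrentTPS of 500.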

Further, the service processing request includes the load request amount, and the resource quota verification is performed on the pending business according to the latest load increment and pre-stored resource configuration information, and the verification result is obtained, which may specifically include:

Summing the current load of the cluster in the pre-stored resource configuration information and the load request amount to obtain a summation result.

Obtaining the latest load increment.

Judging whether the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information.

In addition, if the summation result is not higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information, it is determined that the verification result is that the verification is passed.

In addition, before the step of acquiring the latest load increment, the method may further include: judging whether the summation result is higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information.

In addition, if the summation result is not higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information, it is determined that the verification result is that the verification is passed.

Specifically, the service processing request may directly contain the load request amount, or may contain a type identifier representing the request type, with different type identifiers corresponding to different load request amounts. Exemplarily, the type identifiers can be Put, Get, Scan, and Mulit. The load request amount corresponding to each of the Put, Get, and Scan type identifiers is 1, and the load request amount corresponding to the Mulit type identifier is the number of regions involved.
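The mapping from type identifier to request amount can be sketched as a simple lookup. The identifier strings are taken from the text as written; the function name and the `regions_involved` parameter are hypothetical.

```python
def load_request_amount(type_identifier, regions_involved=1):
    """Return the request amount that corresponds to a type identifier."""
    if type_identifier in ("Put", "Get", "Scan"):
        # Each of these request types counts as a single request.
        return 1
    if type_identifier == "Mulit":
        # Counts once per region involved in the request.
        return regions_involved
    raise ValueError(f"unknown type identifier: {type_identifier!r}")
```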

After the load request amount and the current load of the cluster are obtained, they can first be summed to obtain the summation result. There are two cases: the summation result is less than or equal to the resource quota allocated in advance for the service to be processed, or it is greater than that quota. After obtaining the summation result, the latest load increment can be obtained directly, and it can then be judged whether the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information. That is, each time, the latest load increment is obtained first, the summation result is compared directly with the sum of the latest load increment and the resource quota allocated for the service to be processed, and further processing is performed according to the comparison.

Alternatively, after the summation result is obtained, the latest load increment need not be obtained first. Instead, it can first be determined whether the summation result is higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information. If the summation result is less than or equal to that resource quota, the verification result is determined to be that the verification is passed, and the service to be processed can then be processed according to the service processing request. If the summation result is greater than that resource quota, the latest load increment determined by the dynamic load calculation module can be obtained, and it is then judged whether the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information. If the summation result is less than or equal to this sum, the verification result is determined to be that the verification is passed, and the service to be processed can be processed according to the service processing request. If the summation result is higher than this sum, it can be directly determined that the verification result is a verification failure.

The acquisition of a new load increment can be triggered according to a preset load increment trigger rule. When the preset load increment trigger rule is met, the newly acquired load increment is the latest load increment. When the load increment trigger rule is not met, the load increment acquired last time is the latest load increment.

Exemplarily, suppose the latest load increment is 10 and the resource quota allocated for the service is 100, so that the sum of the latest load increment and the resource quota allocated for the service to be processed is 100+10=110, and suppose the current load of the cluster is 100. If the load request amount of the current request is 11, the summation result is 100+11=111, which is higher than 110, and the verification result is that the verification fails. If the load request amount is 9, the summation result is 100+9=109, which is lower than 110, and the verification result is that the verification is passed. Introducing the latest load increment in this way allows resource quotas to be allocated dynamically and accurately, improves the rationality of allocating resource quotas for pending services, and thus ensures the normal operation of each service.
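The numerical example can be reproduced with a one-line check. The sketch reads the first term of 100+11 and 100+9 as a current cluster load of 100, which is an interpretation of the example; the function and parameter names are hypothetical.

```python
def verify_quota(current_load, request_amount, quota, load_increment):
    """Pass when current load + request amount <= quota + latest increment."""
    summation = current_load + request_amount
    return summation <= quota + load_increment


# Quota 100, latest load increment 10, current cluster load 100.
fails = verify_quota(100, 11, quota=100, load_increment=10)   # 111 > 110
passes = verify_quota(100, 9, quota=100, load_increment=10)   # 109 <= 110
```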

In addition, after the summation result is higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information, the method may further include:

Determine whether the prestored resource configuration information includes overflow ratio information.

If overflow ratio information is included, the maximum quota of the service to be processed is determined according to the overflow ratio information.

It is judged whether the summation result is higher than the maximum quota.

If the summation result is not higher than the maximum quota, continue to perform the steps of obtaining the latest load increment and subsequent steps.

Specifically, after it is determined that the summation result is higher than the resource quota allocated for the service to be processed in the pre-stored resource configuration information, it may first be determined whether the pre-stored resource configuration information includes overflow ratio information. When a resource quota is set for each business, a positive number can also be set as the overflow ratio information, that is, the maximum percentage by which the actual request volume of the business may exceed its quota when the overall resources of the cluster are sufficient; the default is 0, meaning no overflow. The maximum quota of the pending business can be determined according to the overflow ratio information, that is, maximum allowed request amount = (quota overflow percentage/100 + 1) * resource quota allocated for the pending business. After the maximum quota is determined, it can be judged whether the summation result is greater than the maximum quota. If the summation result is less than or equal to the maximum quota, the latest load increment is obtained, and it is judged whether the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information. If the summation result is not higher than this sum, the verification result is determined to be that the verification is passed. If the summation result is higher than this sum, the verification result is determined to be a verification failure. If the summation result is higher than the maximum quota, the verification result is likewise determined to be a verification failure.

Exemplarily, the maximum quota can be 120, the latest load increment 10, and the resource quota allocated for this business 100, so that the sum of the latest load increment and the resource quota allocated for the pending business is 100+10=110. If the current request load is 9, the summation result is 100+9=109, and the verification result is that the verification is passed. The summation result is first compared with the maximum quota; when it does not exceed the maximum quota, it is then compared with the sum of the latest load increment and the resource quota allocated for the pending business. This second comparison improves the rationality of allocating the resource quota for the pending business.

In addition, the current request load can also be 11, in which case the summation result is 100+11=111. Although the summation result is lower than the maximum quota, it is higher than the sum of the latest load increment and the resource quota allocated for the pending business, so the verification result is that the verification fails. Setting the load increment increases the resource quota available to the pending business while preventing that quota from growing too large, which further improves the rationality of allocating resource quotas for pending services and reduces the cases in which all requests fail when the requests of a certain service surge because of an unforeseen situation.
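Exemplarily, the maximum quota formula and the comparison against it may be sketched as follows. The names max_quota and verify_with_overflow are hypothetical. The formula is written as quota*(100+percentage)/100, which is algebraically the same as (percentage/100+1)*quota, and a missing overflow ratio is treated like the default value 0; both choices are assumptions of this sketch.

```python
def max_quota(quota, overflow_percent):
    """Maximum allowed request amount for a given overflow percentage."""
    return quota * (100 + overflow_percent) / 100


def verify_with_overflow(summation, quota, overflow_percent, latest_increment):
    """Return True if the verification is passed, False otherwise.

    summation is the current cluster load plus the load request amount.
    overflow_percent may be None when no overflow ratio is configured.
    """
    if summation <= quota:
        return True
    # No overflow ratio configured (or the default 0): the quota is a hard cap.
    if not overflow_percent:
        return False
    # The maximum quota is an upper bound regardless of the load increment.
    if summation > max_quota(quota, overflow_percent):
        return False
    return summation <= quota + latest_increment


if __name__ == "__main__":
    print(max_quota(100, 20))
    print(verify_with_overflow(109, 100, 20, 10))
    print(verify_with_overflow(111, 100, 20, 10))
```

With a quota of 100 and an overflow percentage of 20, the maximum quota is 120, matching the example above; a summation result of 125 would fail even if the latest load increment were 30, because the maximum quota is checked first.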

S203: If the verification result is that the verification is passed, allocate a resource quota for the business to be processed according to the business processing request, process the business to be processed according to the allocated resource quota, and obtain a business processing result.

In this embodiment, the verification result may be one of two situations: the verification is passed, or the verification is not passed. If the verification is passed, the current load of the cluster can serve the pending business. Therefore, the resource quota can be allocated for the pending business according to the business processing request, and the pending business can be processed according to the allocated resource quota to obtain a business processing result.

S204: If the verification result is that the verification fails, process the pending business according to the pre-stored over-quota request processing rules.

In this embodiment, if the verification result is that the verification fails, the pre-stored over-quota request processing rules may be obtained first, and then the pending business is processed according to the obtained over-quota request processing rules.

Further, if the verification result is that the verification fails, the pending business is processed according to the pre-stored over-quota request processing rules, which may specifically include:

If the verification result is that the verification fails, it is judged whether the pre-stored resource configuration information includes a retry flag.

If the pre-stored resource configuration information includes a retry identifier, adding the service processing request of the service to be processed to a request queue.

Specifically, the existing quota current-limiting logic simply compares the request concurrency with the quota, discards requests that exceed the quota, and throws an exception. For scenarios in which a write may be delayed but must not be discarded when the quota is exceeded, the existing technology may cause data loss and affect data integrity. In this application, however, whether the service to be processed needs to be retried can be pre-configured and stored in the resource configuration information. For scenarios with high data consistency requirements in which data writing failures are not allowed, a retry flag can be configured for the pending business to better maintain data integrity.

Correspondingly, if the pre-stored resource configuration information contains a retry flag, the service processing request of the pending service can be added to the request queue, and the cluster can subsequently process the pending business in the order in which service processing requests were added to the request queue. If the pre-stored resource configuration information does not include a retry flag, a processing exception prompt can be generated, that is, an exception is thrown to remind the operation and maintenance personnel to perform maintenance in time. This gives operation and maintenance personnel more options when dealing with over-quota situations and ensures the normal realization of each business.
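Exemplarily, the over-quota request processing rule may be sketched as follows. The exception class OverQuotaError, the function handle_over_quota, and the "retry" configuration key are hypothetical names chosen for illustration; the application does not specify how the retry flag or the request queue is represented.

```python
from collections import deque


class OverQuotaError(Exception):
    """Raised when an over-quota request has no retry flag configured."""


def handle_over_quota(request, resource_config, request_queue):
    """Apply the over-quota rule to a request that failed verification.

    resource_config is assumed to be a dict; the key "retry" stands in
    for the retry flag in the pre-stored resource configuration.
    """
    if resource_config.get("retry", False):
        # Delay rather than discard: re-enqueue for later processing (FIFO).
        request_queue.append(request)
        return "queued"
    # No retry flag: surface the failure so operators can intervene.
    raise OverQuotaError(f"request {request!r} exceeds its resource quota")


if __name__ == "__main__":
    queue = deque()
    print(handle_over_quota("write-1", {"retry": True}, queue), list(queue))
```

A deque is used here only because it preserves insertion order, which corresponds to processing requests in the order in which they were added to the queue.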

After adopting the above scheme, the service processing request of the pending business sent by the client can be received first, and then the resource quota verification of the pending business can be performed according to the latest load increment and pre-stored resource configuration information, and the verification result can be obtained. In one implementation manner, if the verification result is that the verification is passed, a resource quota is allocated to the pending business according to the business processing request, and then the pending business is processed according to the allocated resource quota to obtain a business processing result. In another implementation manner, if the verification result is that the verification fails, the pending service may be processed according to the pre-stored over-quota request processing rule. By combining pre-stored resource configuration information with dynamically determined load increments, resource quotas are allocated for pending services, which reduces inaccurate cluster load judgments, improves the rationality and accuracy of resource quota allocation, and thus ensures the normal realization of various financial services.

Based on the method in FIG. 2 , the embodiment of this specification also provides some specific implementations of the method, which will be described below.

In addition, in another embodiment, obtaining the latest load increment may include:

Obtain the total processing time, processing time threshold, minimum cluster load, and maximum cluster load according to preset acquisition rules.

In this embodiment, the total processing time can be represented by TotalCallTime, which is the time from when a pending request is received by the cluster until it has been processed; this indicator directly reflects the processing performance of the current RegionServer (that is, the cluster) service. Correspondingly, TotalCallTime=QueueCallTime+ProcessCallTime. Among them, QueueCallTime is an indicator at the RegionServer level in the cluster. After the cluster receives a pending request from the client, it can first put the request into the request queue; a dedicated thread then consumes the request from the queue and hands it to a processing thread. The length of time a pending request waits in the queue is QueueCallTime. ProcessCallTime is also an indicator at the RegionServer level in the cluster. It refers to the time from when a pending request is consumed from the queue until its processing is completed, and it is a key indicator of the processing efficiency of the cluster.

The processing time threshold can be represented by MaxCallTime, which is the maximum value of TotalCallTime that can be tolerated by the service set from the perspective of the service side. Exemplarily, the query of the service to be processed can tolerate a maximum time-consuming of 0.5s (if the request cannot be processed within 0.5s, the service application will report an error), then MaxCallTime can be set to a value not greater than 0.5s.

For the minimum load of the cluster, obtain the minimum load of the cluster according to the preset acquisition rules, which can include:

Obtain the preset processing time threshold and the total processing time and load at the target moment.

The processing duration threshold, the total processing duration and the load at the target time are judged according to the preset minimum load judgment rule, and the minimum initial cluster load is determined.

Several historical minimum loads within a preset historical time period are obtained, and a historical average minimum load is determined according to the several historical minimum loads.

Specifically, the minimum cluster load can also be called LowTPS, which means the lower limit of TPS when TotalCallTime reaches MaxCallTime, that is, the TPS load that the current Regionserver service can provide when the performance is the worst. Load (TPS, Transactions Per Second) indicates the number of transactions executed per second, which is an important measure of cluster throughput. Correspondingly, the RegionServer service can be continuously monitored, the TotalCallTime at the time t is CTt, and the TPS is TPSt, and then it can be calculated at a fixed time interval:

Take the minimum value of TPSt over all moments in the time interval at which MaxCallTime*99%<=CTt<=MaxCallTime*101%, as the LowTPS of the current time interval, and archive it. If CTt remains less than MaxCallTime*99% throughout the current time interval, the RegionServer is running under a light load; sort the CTt values and take the TPS value corresponding to the maximum CTt as the LowTPS of the current time interval. If CTt remains greater than MaxCallTime*101% throughout the current time interval, the RegionServer is overloaded and there is no valid LowTPS data in the current time interval. Then take the smaller of the average LowTPS over the historical time period and the LowTPS of the latest time interval as the LowTPS of the current cluster service, valid for one time interval. Both the time interval and the historical time period can be customized according to actual application scenarios. Exemplarily, the time interval can be any value within 3-6 minutes, and the historical time period can be any value from 1-3 months.
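Exemplarily, the LowTPS calculation may be sketched as follows. The function names and the representation of the monitored samples as (CTt, TPSt) pairs are hypothetical. Two cases are not spelled out in the text and are handled here by assumption: an interval in which some samples are below and some above the 99%-101% band with none inside it yields no valid value, and when the latest interval has no valid value the historical average alone is used.

```python
def interval_low_tps(samples, max_call_time):
    """LowTPS for one time interval, or None if there is no valid data.

    samples: list of (ct, tps) pairs, where ct is TotalCallTime and tps
    is the load observed at the same moment.
    """
    lower, upper = max_call_time * 0.99, max_call_time * 1.01
    in_band = [tps for ct, tps in samples if lower <= ct <= upper]
    if in_band:
        return min(in_band)
    if samples and all(ct < lower for ct, _ in samples):
        # Light load: use the TPS observed at the moment of maximum CT.
        return max(samples, key=lambda s: s[0])[1]
    return None  # overloaded (or mixed, by assumption): no valid LowTPS


def current_low_tps(historical_low_tps, latest_low_tps):
    """Smaller of the historical average LowTPS and the latest interval value."""
    if not historical_low_tps:
        return latest_low_tps
    average = sum(historical_low_tps) / len(historical_low_tps)
    if latest_low_tps is None:
        return average  # assumption: fall back to history alone
    return min(average, latest_low_tps)


if __name__ == "__main__":
    samples = [(0.50, 900), (0.498, 950), (0.30, 1200)]
    print(interval_low_tps(samples, 0.5))
    print(current_low_tps([800, 1000], 950))
```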

For the highest cluster load, obtain the highest cluster load according to the preset acquisition rules, which can include:

The processing duration threshold, the total processing duration and the load at the target time are judged according to the preset highest load judgment rule, and the initial maximum load of the cluster is determined.

Several historical maximum loads within a preset historical time period are obtained, and a historical average maximum load is determined according to the several historical maximum loads.

Specifically, the highest cluster load can also be called UpTPS, indicating the TPS upper limit when TotalCallTime reaches MaxCallTime, that is, the TPS load that the current RegionServer service can provide when the performance is the best. Load (TPS, Transactions Per Second) indicates the number of transactions executed per second, which is an important measure of cluster throughput. Correspondingly, the RegionServer service can be continuously monitored, the TotalCallTime at the time t is CTt, and the TPS is TPSt, and then it can be calculated at a fixed time interval:

Take the maximum value of TPSt over all moments in the time interval at which MaxCallTime*99%<=CTt<=MaxCallTime*101%, as the UpTPS of the current time interval. If CTt remains less than MaxCallTime*99% throughout the current time interval, the RegionServer is running under a light load; sort the CTt values and take the TPS*101% value corresponding to the moment of maximum CTt as the UpTPS of the current time interval. If CTt remains greater than MaxCallTime*101% throughout the current time interval, the RegionServer is overloaded and there is no valid UpTPS data in the current time interval. Then take the larger of the average UpTPS over the historical time period and the UpTPS of the latest time interval as the UpTPS of the current RegionServer service, valid for one time interval. Both the time interval and the historical time period can be customized according to actual application scenarios. Exemplarily, the time interval can be any value within 3-6 minutes, and the historical time period can be any value from 1-3 months.
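Exemplarily, the UpTPS calculation differs from the minimum-load case in three places: the maximum is taken inside the band, the light-load value is scaled by 101%, and the larger of the historical average and the latest value is kept. A minimal sketch follows; the function names and the (CTt, TPSt) sample representation are hypothetical, and intervals with no valid data are skipped by assumption.

```python
def interval_up_tps(samples, max_call_time):
    """UpTPS for one time interval, or None if there is no valid data."""
    lower, upper = max_call_time * 0.99, max_call_time * 1.01
    in_band = [tps for ct, tps in samples if lower <= ct <= upper]
    if in_band:
        return max(in_band)
    if samples and all(ct < lower for ct, _ in samples):
        # Light load: TPS at the moment of maximum CT, scaled by 101%.
        return max(samples, key=lambda s: s[0])[1] * 1.01
    return None  # overloaded: no valid UpTPS in this interval


def current_up_tps(historical_up_tps, latest_up_tps):
    """Larger of the historical average UpTPS and the latest interval value."""
    candidates = []
    if historical_up_tps:
        candidates.append(sum(historical_up_tps) / len(historical_up_tps))
    if latest_up_tps is not None:
        candidates.append(latest_up_tps)
    return max(candidates) if candidates else None


if __name__ == "__main__":
    samples = [(0.50, 900), (0.498, 950), (0.30, 1200)]
    print(interval_up_tps(samples, 0.5))
    print(current_up_tps([800, 1000], 950))
```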

The current load of the cluster can be obtained directly from the pre-stored resource configuration information, in which it is updated in real time. Similar to the previous embodiment, the RegionServer service can be continuously monitored, and the TotalRequestCount indicator of the RegionServer is sampled at a fixed time interval (5 seconds by default). The TotalRequestCount at time T is recorded as RequestCount(T), and the TotalRequestCount at time T1 is recorded as RequestCount(T1), so that CurrentTPS=(RequestCount(T1)-RequestCount(T))/(T1-T). The newly determined CurrentTPS is then stored in the resource configuration information, that is, only the current load of the cluster in the configuration information is updated. By updating the current load of the cluster in the resource configuration information in real time, the current load of the cluster can be accurately determined, which provides a basis for deciding whether more business can be admitted and at what scale, avoids both wasted resources and overloaded operation, and ensures the normal realization of various businesses.
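Exemplarily, the CurrentTPS formula may be sketched as follows. The function name and the dict standing in for the resource configuration information are hypothetical; times are assumed to be in seconds.

```python
def current_tps(request_count_t, t, request_count_t1, t1):
    """CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T)."""
    if t1 <= t:
        raise ValueError("T1 must be later than T")
    return (request_count_t1 - request_count_t) / (t1 - t)


if __name__ == "__main__":
    # Hypothetical stand-in for the pre-stored resource configuration.
    resource_config = {"current_load": 0.0}
    # Two samples of TotalRequestCount taken 5 seconds apart.
    resource_config["current_load"] = current_tps(1000, 0.0, 1500, 5.0)
    print(resource_config["current_load"])
```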

Further, after obtaining the total processing time, the processing time threshold, the minimum load of the cluster, the maximum load of the cluster, and the current load of the cluster, the latest load increment (also referred to as the acceptable load of the cluster) can be determined according to the total processing time, the processing time threshold, the minimum load of the cluster, the maximum load of the cluster, and the current load of the cluster in the pre-stored resource configuration information, which may specifically include:

If the total processing duration is greater than or equal to the processing duration threshold, it is determined that the latest load increment is zero.

If the total processing duration is less than the processing duration threshold, and the current load of the cluster in the pre-stored resource configuration information is less than the minimum load of the cluster, then determine that the latest load increment is the difference between the minimum load of the cluster and the current load of the cluster in the pre-stored resource configuration information.

If the total processing duration is less than the processing duration threshold, and the current load of the cluster in the pre-stored resource configuration information is greater than the minimum load of the cluster and less than the maximum load of the cluster, then determine that the latest load increment is the difference between the maximum load of the cluster and the current load of the cluster in the pre-stored resource configuration information.

Specifically, if TotalCallTime>=MaxCallTime, it is determined that the current RegionServer load has reached its maximum and no quota overflow is allowed. If TotalCallTime<MaxCallTime and CurrentTPS<LowTPS, it is determined that the RegionServer is lightly loaded and has idle capacity, and the load that can be added is LowTPS-CurrentTPS. If TotalCallTime<MaxCallTime and LowTPS<CurrentTPS<UpTPS, it is determined that the RegionServer has idle capacity, and the load that can be added is UpTPS-CurrentTPS. If TotalCallTime<MaxCallTime and CurrentTPS>UpTPS, it is determined that the RegionServer load has reached a new high while the processing capacity still meets the requirements, and the load can be increased on a small scale. The size of this small-scale increase can be customized according to actual application scenarios; for example, it can be 10 TPS.
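Exemplarily, the four cases above may be sketched as a single function. The function and parameter names are hypothetical. The text does not say what happens when CurrentTPS equals LowTPS or UpTPS exactly; this sketch assigns equality with LowTPS to the UpTPS-CurrentTPS case and equality with UpTPS to the small-scale case, which is an assumption.

```python
def latest_load_increment(total_call_time, max_call_time,
                          current_tps, low_tps, up_tps, small_step=10):
    """Load (in TPS) that the cluster can additionally accept."""
    if total_call_time >= max_call_time:
        return 0  # at capacity: no quota overflow allowed
    if current_tps < low_tps:
        return low_tps - current_tps  # lightly loaded
    if current_tps < up_tps:
        return up_tps - current_tps  # idle capacity up to the upper limit
    return small_step  # new high but still within latency: grow cautiously


if __name__ == "__main__":
    print(latest_load_increment(0.3, 0.5, 150, 100, 200))
```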

In addition, when obtaining the latest load increment, there can be multiple trigger mechanisms, specifically:

In an implementation manner, it may be determined whether a preset update duration threshold is reached.

If the update duration threshold is reached, the latest load increment is obtained.

Exemplarily, the update duration threshold may be any value in 3-6 minutes, and when the update duration threshold is reached, the rule for acquiring the latest load increment may be automatically triggered, and the latest load increment may be acquired.

In another implementation manner, it may be determined whether the total processing quantity of service processing requests reaches a preset quantity threshold.

If the quantity threshold is reached, the latest load increment is obtained.

Exemplarily, the quantity threshold may be 10000, and when the number of business processing requests reaches 10000, the rule for acquiring the latest load increment may be automatically triggered, and the latest load increment may be acquired. At the same time, the number of business processing requests can be cleared and the calculation can be restarted from zero.

In addition, if neither the update duration threshold nor the quantity threshold is reached when the latest load increment is acquired, the load increment obtained in the previous acquisition may be used as the newly acquired load increment.
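Exemplarily, the two trigger mechanisms and the reuse of the previous value may be sketched as follows. The class LoadIncrementCache and its members are hypothetical. Combining both triggers in one object, computing on the first call, and clearing the request counter on every recalculation are assumptions of this sketch; the text only states that the counter is cleared when the quantity threshold is reached.

```python
import time


class LoadIncrementCache:
    """Recompute the load increment on a time or request-count trigger."""

    def __init__(self, compute, update_interval_s=300.0,
                 request_threshold=10000, clock=time.monotonic):
        self._compute = compute            # callable returning a new increment
        self._interval = update_interval_s
        self._threshold = request_threshold
        self._clock = clock
        self._value = None
        self._last_update = None
        self._requests = 0

    def record_request(self):
        """Count one processed service processing request."""
        self._requests += 1

    def get(self):
        """Return the latest load increment, recomputing only when due."""
        now = self._clock()
        due = (self._last_update is None
               or now - self._last_update >= self._interval
               or self._requests >= self._threshold)
        if due:
            self._value = self._compute()
            self._last_update = now
            self._requests = 0  # restart the count from zero
        return self._value  # otherwise the previously acquired value
```

Passing the clock as a parameter is a design choice that keeps the time trigger testable without waiting for real time to elapse.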

Fig. 3 is a schematic flowchart of a method for allocating cluster resource quotas provided by another embodiment of the present application. As shown in Fig. 3, in this embodiment, the method may include: receiving a pending request of a pending service, and then parsing the request to determine the load request amount. After the load request amount is determined, the summation result can be determined from the current cluster load in the pre-stored resource configuration information and the load request amount, and it is judged whether the summation result exceeds the resource quota allocated for the business to be processed. If not, the pending request is processed. If so, it is judged whether overflow ratio information is configured. If so, the maximum quota is determined according to the overflow ratio information, and it is judged whether the summation result exceeds the maximum quota. If the maximum quota is not exceeded, the latest load increment is obtained, and it is judged whether the summation result exceeds the sum of the latest load increment and the resource quota allocated for the pending business. If this sum is not exceeded, the pending request is processed. If this sum is exceeded, whether to retry is determined according to the pre-stored over-quota request processing rules: if a retry is configured, the request is added to the request queue again; otherwise, an exception is thrown directly.

In addition, if the overflow ratio information is not configured, whether to retry is determined according to the pre-stored over-quota request processing rules: if a retry is configured, the request is added to the request queue again; otherwise, an exception is thrown directly.

If the maximum quota is exceeded, whether to retry is likewise determined according to the pre-stored over-quota request processing rules: if a retry is configured, the request is added to the request queue again; otherwise, an exception is thrown directly.

Among them, two mechanisms trigger recalculation of the latest load increment: a timed calculation, every 5 minutes by default, and a new round of calculation triggered once the number of processed requests reaches a certain amount (10000 by default).
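Exemplarily, the decision flow of Fig. 3 may be sketched end to end as follows. The function decide and its string results are hypothetical; in particular, the text describes throwing an exception where this sketch returns "reject", and a missing overflow ratio is treated like the default value 0.

```python
def decide(current_load, request_load, quota, overflow_percent,
           get_latest_increment, retry):
    """Return "process", "requeue", or "reject" for one pending request."""
    summation = current_load + request_load

    def over_quota():
        # Over-quota rule: requeue if a retry flag is configured.
        return "requeue" if retry else "reject"

    if summation <= quota:
        return "process"
    if not overflow_percent:  # overflow ratio not configured
        return over_quota()
    if summation > quota * (100 + overflow_percent) / 100:  # maximum quota
        return over_quota()
    if summation <= quota + get_latest_increment():
        return "process"
    return over_quota()


if __name__ == "__main__":
    print(decide(100, 9, 100, 20, lambda: 10, retry=False))
    print(decide(100, 11, 100, 20, lambda: 10, retry=True))
```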

Based on the same idea, the embodiment of this specification also provides a device corresponding to the above method. Figure 4 is a schematic structural diagram of the cluster resource quota allocation device provided by the embodiment of this application. As shown in Figure 4, the device provided by this embodiment may include:

The receiving module 401 is configured to receive the service processing request of the service to be processed sent by the client.

The processing module 402 is configured to perform resource quota verification on the service to be processed according to the latest load increment and pre-stored resource configuration information, and obtain a verification result.

In this embodiment, the service processing request includes a load request amount, and the processing module 402 is further configured to:

Get the latest load delta.

In addition, the processing module 402 is also used for:

Judging whether the summation result is higher than the resource quota allocated for the service to be processed in the prestored resource configuration information.

In addition, the processing module 402 is also used for:

Furthermore, the processing module 402 is also used for:

It is judged whether the summation result is higher than the maximum quota.

The processing module 402 is further configured to allocate a resource quota for the service to be processed according to the service processing request if the verification result is passed, and process the service to be processed according to the allocated resource quota, Get the business processing result.

The processing module 402 is further configured to process the service to be processed according to a pre-stored over-quota request processing rule if the verification result is that the verification fails.

In this embodiment, the processing module 402 is further configured to:

In addition, in another embodiment, the processing module 402 is further configured to:

In this embodiment, the processing module 402 is further configured to:

In addition, the processing module 402 is also used for:

In this embodiment, the processing module 402 is further configured to:

In addition, the processing module 402 is also used for:

Determine whether the preset update duration threshold is reached.

Alternatively, it is judged whether the total processing quantity of the service processing request reaches a preset quantity threshold.

If the quantity threshold is reached, the latest load increment is acquired.

The device provided in the embodiment of the present application can implement the method in the above embodiment as shown in FIG. 2 , and its implementation principle and technical effect are similar, and will not be repeated here.

FIG. 5 is a schematic diagram of a hardware structure of an electronic device provided in an embodiment of the present application. As shown in FIG. 5, a device 500 provided in this embodiment includes a processor 501 and a memory 502 communicatively connected to the processor. The processor 501 and the memory 502 are connected through a bus 503.

In a specific implementation process, the processor 501 executes the computer-executed instructions stored in the memory 502, so that the processor 501 executes the method for allocating cluster resource quotas in the foregoing method embodiments.

For the specific implementation process of the processor 501, reference may be made to the foregoing method embodiments. The implementation principles and technical effects thereof are similar, and details are not repeated here in this embodiment.

In the above-mentioned embodiment shown in FIG. 5, it should be understood that the processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. The steps of the method disclosed in conjunction with the invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.

The memory may include high-speed RAM memory, and may also include non-volatile storage NVM, such as at least one disk memory.

The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, the buses in the drawings of the present application are not limited to only one bus or one type of bus.

The embodiment of the present application also provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when the processor executes the computer-executable instructions, the cluster resource quota allocation method of the above method embodiment is implemented.

An embodiment of the present application further provides a computer program product, including a computer program, and when the computer program is executed by a processor, implements the cluster resource quota allocation method as described above.

The above-mentioned computer-readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. A readable storage medium can be any available medium that can be accessed by a general-purpose or special-purpose computer.

An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium can also be a component of the processor. The processor and the readable storage medium can be located in an application-specific integrated circuit (ASIC). Of course, the processor and the readable storage medium can also exist in the device as discrete components.

Those of ordinary skill in the art can understand that all or part of the steps for implementing the above method embodiments can be completed by program instructions and related hardware. The aforementioned program can be stored in a computer-readable storage medium. When the program is executed, it executes the steps including the above-mentioned method embodiments; and the aforementioned storage medium includes: ROM, RAM, magnetic disk or optical disk and other various media that can store program codes.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, rather than to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that it is still possible to modify the technical solutions described in the foregoing embodiments, or to make equivalent replacements for some or all of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the various embodiments of the present application.

Claims

A method for allocating cluster resource quotas, comprising:

Receive the business processing request of the pending business sent by the client;

Verifying the resource quota of the service to be processed according to the latest load increment and pre-stored resource configuration information, and obtaining a verification result;

If the verification result is that the verification is passed, allocating a resource quota for the pending business according to the business processing request, and processing the pending business according to the allocated resource quota to obtain a business processing result;

If the verification result is that the verification fails, the pending service is processed according to the pre-stored over-quota request processing rule.
The method according to claim 1, wherein the service processing request includes a load request amount, and performing the resource quota verification on the pending service according to the latest load increment and pre-stored resource configuration information to obtain a verification result comprises:

summing the current cluster load in the pre-stored resource configuration information and the load request amount to obtain a summation result;

Get the latest load increment;

judging whether the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the pre-stored resource configuration information;

If the summation result is higher than the sum of the latest load increment and the resource quota allocated for the service to be processed in the prestored resource configuration information, it is determined that the verification result is a verification failure.
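The comparison recited in this claim can be illustrated with a minimal sketch. The names `ResourceConfig` and `verify_quota` are hypothetical and do not appear in the application. Treating the "not higher" case as a pass is an assumption here, although it is consistent with the dependent claim that addresses that case.

```python
from dataclasses import dataclass


@dataclass
class ResourceConfig:
    """Hypothetical stand-in for the pre-stored resource configuration information."""
    current_cluster_load: float  # current cluster load
    allocated_quota: float       # resource quota allocated to the service


def verify_quota(config: ResourceConfig,
                 load_request: float,
                 latest_load_increment: float) -> bool:
    """Return True if the verification passes, False if it fails."""
    # Summation result: current cluster load plus the requested load amount.
    summation = config.current_cluster_load + load_request
    # Fail when the summation exceeds quota plus the latest load increment.
    return summation <= latest_load_increment + config.allocated_quota


cfg = ResourceConfig(current_cluster_load=60.0, allocated_quota=80.0)
print(verify_quota(cfg, load_request=30.0, latest_load_increment=5.0))
print(verify_quota(cfg, load_request=25.0, latest_load_increment=5.0))
```

With these illustrative numbers, the first request brings the summation to 90 against a limit of 85 and fails, while the second reaches exactly 85 and passes.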
The method according to claim 2, wherein, before the step of acquiring the latest load increment, the method further comprises:

determining whether the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information;

if the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, performing the step of acquiring the latest load increment.
The method according to claim 3, further comprising:

determining whether the pre-stored resource configuration information includes overflow ratio information;

if the overflow ratio information is included, determining a maximum quota of the service to be processed according to the overflow ratio information;

determining whether the summation result is higher than the maximum quota;

if the summation result is not higher than the maximum quota, continuing to perform the step of acquiring the latest load increment and the subsequent steps;

if the summation result is higher than the maximum quota, determining that the verification result is that the verification fails.
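The ordering of the checks in the two preceding claims can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the function name is hypothetical, the formula `quota * (1 + overflow_ratio)` for the maximum quota is assumed (the claim only says the maximum quota is determined according to the overflow ratio information), and passing immediately when the summation does not exceed the quota is also assumed.

```python
from typing import Callable, Optional


def verify_with_overflow(current_load: float,
                         load_request: float,
                         quota: float,
                         overflow_ratio: Optional[float],
                         get_latest_increment: Callable[[], float]) -> bool:
    """Return True if the verification passes, False if it fails."""
    summation = current_load + load_request

    # Within the plain quota: assumed to pass without fetching the increment.
    if summation <= quota:
        return True

    # Optional hard ceiling derived from the overflow ratio (assumed formula).
    if overflow_ratio is not None:
        max_quota = quota * (1 + overflow_ratio)
        if summation > max_quota:
            return False

    # Only now is the latest load increment acquired and compared.
    return summation <= get_latest_increment() + quota
```

Passing the increment as a callable keeps the potentially expensive load measurement off the fast path: it is invoked only when the request exceeds the plain quota but stays under the maximum quota.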
The method according to any one of claims 2-4, wherein acquiring the latest load increment comprises:

acquiring a total processing duration, a processing duration threshold, a cluster minimum load, and a cluster maximum load according to preset acquisition rules;

determining the latest load increment according to the total processing duration, the processing duration threshold, the cluster minimum load, the cluster maximum load, and the current cluster load in the pre-stored resource configuration information.
The method according to claim 5, wherein determining the latest load increment according to the total processing duration, the processing duration threshold, the cluster minimum load, the cluster maximum load, and the current cluster load in the pre-stored resource configuration information comprises:

if the total processing duration is greater than or equal to the processing duration threshold, determining that the latest load increment is zero;

if the total processing duration is less than the processing duration threshold, and the current cluster load in the pre-stored resource configuration information is less than the cluster minimum load, determining that the latest load increment is the difference between the cluster minimum load and the current cluster load in the pre-stored resource configuration information;

if the total processing duration is less than the processing duration threshold, and the current cluster load in the pre-stored resource configuration information is greater than the cluster minimum load and less than the cluster maximum load, determining that the latest load increment is the difference between the cluster maximum load and the current cluster load in the pre-stored resource configuration information;

if the total processing duration is less than the processing duration threshold, and the current cluster load in the pre-stored resource configuration information is greater than the cluster maximum load, determining that the latest load increment is a preset load threshold.
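The four cases of this claim map onto a short piecewise function. The sketch below uses hypothetical names. The claim does not say how a current load exactly equal to the cluster minimum or maximum load is handled; assigning equality with the minimum to the middle branch and equality with the maximum to the last branch is an assumption made only so that the function is total.

```python
def latest_load_increment(total_duration: float,
                          duration_threshold: float,
                          current_load: float,
                          min_load: float,
                          max_load: float,
                          preset_load_threshold: float) -> float:
    """Piecewise load increment following the four claimed cases."""
    # Processing is already slow: grant no additional headroom.
    if total_duration >= duration_threshold:
        return 0.0
    # Below the cluster minimum load: allow growth up to the minimum.
    if current_load < min_load:
        return min_load - current_load
    # Between minimum and maximum (boundary handling assumed): allow growth up to the maximum.
    if current_load < max_load:
        return max_load - current_load
    # At or above the cluster maximum load: fall back to the preset load threshold.
    return preset_load_threshold


# Illustrative values only: processing is fast and load sits between min and max.
print(latest_load_increment(5.0, 10.0, 50.0, 40.0, 90.0, 5.0))  # 40.0
```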
The method according to claim 5 or 6, wherein acquiring the cluster minimum load according to the preset acquisition rules comprises:

acquiring a preset processing duration threshold, and the total processing duration and the load at a target moment;

evaluating the processing duration threshold and the total processing duration and load at the target moment according to a preset minimum load judgment rule, to determine an initial cluster minimum load;

acquiring several historical minimum loads within a preset historical time period, and determining a historical average minimum load according to the several historical minimum loads;

taking the smaller of the initial cluster minimum load and the historical average minimum load as the cluster minimum load.
The method according to any one of claims 5-7, wherein acquiring the cluster maximum load according to the preset acquisition rules comprises:

acquiring a preset processing duration threshold, and the total processing duration and the load at a target moment;

evaluating the processing duration threshold and the total processing duration and load at the target moment according to a preset maximum load judgment rule, to determine an initial cluster maximum load;

acquiring several historical maximum loads within a preset historical time period, and determining a historical average maximum load according to the several historical maximum loads;

taking the larger of the initial cluster maximum load and the historical average maximum load as the cluster maximum load.
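The final combination step of the two preceding claims (minimum load and maximum load) can be sketched as below. The preset minimum and maximum load judgment rules are not specified in the claims, so the initial loads are simply taken as inputs. The function names, the use of an arithmetic mean for the "historical average", and the fallback to the initial value when no history exists are all assumptions.

```python
from statistics import fmean
from typing import Sequence


def cluster_min_load(initial_min_load: float,
                     historical_min_loads: Sequence[float]) -> float:
    """Smaller of the initial minimum load and the historical average minimum load."""
    if not historical_min_loads:  # assumed fallback when no history is available
        return initial_min_load
    return min(initial_min_load, fmean(historical_min_loads))


def cluster_max_load(initial_max_load: float,
                     historical_max_loads: Sequence[float]) -> float:
    """Larger of the initial maximum load and the historical average maximum load."""
    if not historical_max_loads:  # assumed fallback when no history is available
        return initial_max_load
    return max(initial_max_load, fmean(historical_max_loads))


print(cluster_min_load(40.0, [30.0, 35.0, 40.0]))  # 35.0
print(cluster_max_load(90.0, [80.0, 85.0, 90.0]))  # 90.0
```

Taking the smaller value for the minimum and the larger value for the maximum widens the band between the two, so a single unusual measurement at the target moment does not narrow the range used when computing the load increment.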
The method according to any one of claims 2-8, wherein the method further comprises:

if the summation result is not higher than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information, determining that the verification result is that the verification is passed.
The method according to any one of claims 1-9, wherein, if the verification result is that the verification fails, processing the service to be processed according to the pre-stored over-quota request processing rule comprises:

if the verification result is that the verification fails, determining whether the pre-stored resource configuration information includes a retry identifier;

if the pre-stored resource configuration information includes the retry identifier, adding the service processing request of the service to be processed to a request queue;

if the pre-stored resource configuration information does not include the retry identifier, generating a processing exception prompt.
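A minimal sketch of this over-quota branch follows. The function name, the dictionary-shaped request, the boolean `retry_enabled` standing in for the presence of the retry identifier, and the wording of the prompt are all hypothetical. How queued requests are later retried is outside this claim and is not shown.

```python
from collections import deque
from typing import Deque, Optional


def handle_over_quota(request: dict,
                      retry_enabled: bool,
                      request_queue: Deque[dict]) -> Optional[str]:
    """Queue the request for retry, or return a processing exception prompt."""
    if retry_enabled:
        # Retry identifier present: park the request in the request queue.
        request_queue.append(request)
        return None
    # No retry identifier: report the failure to the caller.
    return f"Resource quota exceeded for request {request.get('id')!r}"


queue: Deque[dict] = deque()
print(handle_over_quota({"id": 1}, retry_enabled=True, request_queue=queue))
print(handle_over_quota({"id": 2}, retry_enabled=False, request_queue=queue))
```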
The method according to any one of claims 2-9, wherein, before acquiring the latest load increment, the method further comprises:

determining whether a preset update duration threshold has been reached;

if the update duration threshold has been reached, acquiring the latest load increment;

or, determining whether the total number of processed service processing requests has reached a preset quantity threshold;

if the quantity threshold has been reached, acquiring the latest load increment.
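The two alternative triggers of this claim can be expressed as a single predicate. The sketch below is hypothetical: the claim presents the time-based and count-based conditions as alternatives, and allowing both to be configured at once, combined with a logical OR, is an assumption. A threshold left as `None` is treated as not configured.

```python
from typing import Optional


def should_refresh_increment(elapsed_seconds: float,
                             processed_requests: int,
                             update_duration_threshold: Optional[float] = None,
                             quantity_threshold: Optional[int] = None) -> bool:
    """Decide whether the latest load increment should be acquired again."""
    # Time-based trigger: the preset update duration threshold has been reached.
    if (update_duration_threshold is not None
            and elapsed_seconds >= update_duration_threshold):
        return True
    # Count-based trigger: enough service processing requests have been processed.
    if (quantity_threshold is not None
            and processed_requests >= quantity_threshold):
        return True
    return False


print(should_refresh_increment(30.0, 10, update_duration_threshold=60.0))  # False
print(should_refresh_increment(30.0, 100, quantity_threshold=100))         # True
```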
A device for allocating cluster resource quotas, comprising:

a receiving module, configured to receive a service processing request, sent by a client, for a service to be processed;

a processing module, configured to perform resource quota verification on the service to be processed according to the latest load increment and pre-stored resource configuration information, to obtain a verification result;

the processing module being further configured to, if the verification result is that the verification is passed, allocate a resource quota to the service to be processed according to the service processing request, and process the service to be processed according to the allocated resource quota, to obtain a service processing result;

the processing module being further configured to, if the verification result is that the verification fails, process the service to be processed according to a pre-stored over-quota request processing rule.
An electronic device, comprising: a processor, and a memory communicatively connected to the processor;

the memory stores computer-executable instructions;

the processor executes the computer-executable instructions stored in the memory, so as to implement the cluster resource quota allocation method according to any one of claims 1 to 11.
A computer-readable storage medium, wherein computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the cluster resource quota allocation method according to any one of claims 1 to 11 is implemented.
A computer program product, comprising a computer program, wherein, when the computer program is executed by a processor, the cluster resource quota allocation method according to any one of claims 1 to 11 is implemented.