WO2022110796A1

WO2022110796A1 - Cloud service request responding method and apparatus, electronic device, and storage medium

Info

Publication number: WO2022110796A1
Application number: PCT/CN2021/102872
Authority: WO
Inventors: 韩秋明; 李建; 符柱; 陈家园
Original assignee: 上海商汤智能科技有限公司
Priority date: 2020-11-24
Filing date: 2021-06-28
Publication date: 2022-06-02
Also published as: CN112395091A

Abstract

A cloud service request responding method and apparatus, an electronic device, and a storage medium. The method is applied to a cloud service system. The method may comprise: acquiring the total cloud service quota that a tenant applies for from a cloud service system, wherein the cloud service system comprises a system which is constructed on the basis of a distributed architecture (S102); on the basis of the total cloud service quota, allocating a working quota to a work node which is comprised in the distributed architecture and corresponds to the tenant, such that the work node responds, according to the working quota corresponding thereto, to a cloud service request initiated by the tenant (S104). By means of the method, the amount of cloud service requests of tenants that the cloud service system frequently reads and writes by means of communicating with each work node is reduced, such that the cloud service system performs, less frequently, network input and output operations and locking operations for the reading and writing of common storage, thereby increasing the response speed of the system to cloud service requests, and improving the tenant experience.

Description

Cloud service request response method and apparatus, electronic device and storage medium

CROSS-REFERENCE TO RELATED APPLICATIONS

The present disclosure is based on a Chinese patent application with application number 202011331362.0 and an application date of November 24, 2020, and claims the priority of the Chinese patent application, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The embodiments of the present application relate to the field of computer technologies, and relate to, but are not limited to, a cloud service request response method and apparatus, an electronic device, and a storage medium.

BACKGROUND

As the Internet becomes more and more developed, more and more tenants choose cloud services. In the case of choosing a cloud service, the tenant usually applies to the cloud service system for a certain total cloud service quota; and within the scope of the total cloud service quota, initiates a cloud service request.

After the cloud service system receives the cloud service request initiated by the tenant, it will only respond to the cloud service request after determining that the tenant's request is within the range of the above-mentioned total cloud service quota.

SUMMARY OF THE INVENTION

In view of this, the embodiment of the present application discloses at least one cloud service request response method, and the method is executed by a cloud service system; the above method includes:

Obtaining the total cloud service quota that the tenant applies for from the cloud service system, where the cloud service system includes a system constructed based on a distributed architecture;

Based on the total cloud service quota, allocating a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, where the work quota is used to trigger each working node to respond, according to its own work quota, to cloud service requests initiated by the tenant.

In some of the illustrated embodiments, the above-mentioned allocating a work quota to each worker node included in the above-mentioned distributed architecture based on the above-mentioned total cloud service quota includes:

Based on a part of the above-mentioned total cloud service quota, a work quota is allocated to the worker nodes corresponding to the above-mentioned tenants included in the above-mentioned distributed architecture.

In some of the illustrated embodiments, allocating a work quota, based on part of the total cloud service quota, to the working nodes that are included in the distributed architecture and correspond to the tenant includes: determining, according to the processing capability of a working node, the cloud service request response volume that the working node can reach within a preset time period, where the processing capability indicates the cloud service request response volume reached per unit time; and allocating a work quota to the working node according to that cloud service request response volume. In this way, work quotas can be assigned to the working nodes multiple times, which reduces the unreasonable assignment that a single one-off assignment may cause.
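The capability-based allocation described above can be sketched as follows. This is only an illustrative sketch: it assumes that processing capability is expressed as requests per second and the preset time period in seconds, and the function name `quota_for_window` and the cap by the available quota are assumptions that do not appear in the application.

```python
def quota_for_window(capability_per_sec: int, window_sec: int, available: int) -> int:
    """Work quota for one node (hypothetical helper).

    The quota equals the response volume the node can reach within the
    preset time period, capped by the part of the total quota that is
    still available for allocation (the cap is an assumption).
    """
    expected_volume = capability_per_sec * window_sec
    return min(expected_volume, available)


# A node that serves 50 requests per second, planned over a 10-second window.
print(quota_for_window(50, 10, available=10_000))
# The same node when only 300 units of quota are left to hand out.
print(quota_for_window(50, 10, available=300))
```

Sizing each grant to a short window, rather than handing out everything at once, is what allows the assignment to be repeated and corrected over time.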

In some of the illustrated embodiments, allocating a work quota, based on part of the total cloud service quota, to the working nodes that are included in the distributed architecture and correspond to the tenant includes: determining the quota weight corresponding to each working node included in the distributed architecture; and, based on part of the total cloud service quota, allocating to each working node a work quota that matches its quota weight. In this way, work quotas can be reasonably allocated to each working node, so that nodes with a higher configuration are allocated more work quota, thereby improving the response speed of the cloud service system and the tenant experience.

In some of the illustrated embodiments, determining the quota weight corresponding to each working node included in the distributed architecture includes: determining the quota weight corresponding to each working node based on the configuration information of that node and according to a preset quota weight determination rule; or determining the quota weight corresponding to each working node based on the processing capability of that node. In this way, the response speed of the cloud service system can be improved, and the tenant experience can be improved.
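A weight-based split of part of the total quota might look like the sketch below. The proportional rule, the integer weights, and the name `allocate_by_weight` are assumptions chosen for illustration; the application only requires that the work quota match the quota weight.

```python
from typing import Dict


def allocate_by_weight(partial_quota: int, weights: Dict[str, int]) -> Dict[str, int]:
    """Split `partial_quota` across nodes in proportion to their quota weights.

    Floor division is used, so any rounding remainder is simply not handed
    out and stays with the system as remaining quota (an assumption).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one node must have a positive weight")
    return {node: partial_quota * w // total_weight for node, w in weights.items()}


# Hypothetical weights, e.g. derived from node configuration or processing capability.
weights = {"A": 3, "B": 2, "C": 1}
print(allocate_by_weight(600, weights))
```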

In some of the illustrated embodiments, the above method further includes:

In the case of receiving a quota application request from any working node, allocating a work quota to that working node based on the remaining quota, where the remaining quota is the part of the total cloud service quota that is left after the already allocated work quotas are removed. In this way, working nodes that consume their work quota quickly can receive work quota allocations multiple times, thereby improving the response speed of the cloud service system and the tenant experience.

In some of the illustrated embodiments, allocating a work quota to the working node based on the remaining quota includes: based on the remaining quota, and according to the cloud service request response volume that the working node can reach within a preset time period, allocating to the working node a work quota that matches that response volume. In this way, the system can allocate to each working node a work quota that matches the node's processing capability, so that nodes with stronger processing capability are allocated more work quota, thereby improving the response speed of the cloud service system and the tenant experience.
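The replenishment from the remaining quota can be sketched as below. The class `QuotaCoordinator` and its method names are hypothetical; the sketch assumes that a node asks for the response volume it expects to reach in the next preset period and that a grant never exceeds what remains.

```python
class QuotaCoordinator:
    """Hypothetical system-side bookkeeping for one tenant's total quota."""

    def __init__(self, total_quota: int) -> None:
        self.total_quota = total_quota
        self.allocated = 0  # sum of all work quotas handed out so far

    @property
    def remaining(self) -> int:
        # Remaining quota = total quota minus the already allocated work quotas.
        return self.total_quota - self.allocated

    def apply(self, expected_volume: int) -> int:
        """Grant a work quota matching the node's expected response volume,
        limited by the remaining quota."""
        granted = min(expected_volume, self.remaining)
        self.allocated += granted
        return granted


coordinator = QuotaCoordinator(total_quota=1000)
print(coordinator.apply(600), coordinator.remaining)
print(coordinator.apply(600), coordinator.remaining)
```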

In some of the illustrated embodiments, the working node responding to the cloud service request initiated by the tenant according to its own work quota includes: after receiving the cloud service request initiated by the tenant, if the working node's own work quota still has a remainder, the working node provides the cloud service computation in response to the cloud service request and adjusts its remaining work quota according to the quota consumed by that computation. In this way, the working node can determine whether to respond to the cloud service request by checking its own work quota.

In some of the illustrated embodiments, the working node responding to the cloud service request initiated by the tenant according to its own work quota further includes: after receiving the cloud service request initiated by the tenant, if the working node's own work quota has no remainder, the working node submits a quota application request to the cloud service system; and, if the total cloud service quota still has a remainder, the working node receives the work quota that the cloud service system allocates to it based on the remaining quota and uses it to respond to the cloud service request. In this way, when the total cloud service quota still has a remainder, the working node can continue to receive work quota allocated by the cloud service system, thereby improving processing efficiency.

In some of the illustrated embodiments, the above method further includes: after the working node submits a quota application request to the cloud service system, if the total cloud service quota has no remainder, forwarding the cloud service request to another working node whose work quota still has a remainder for processing. In this way, the cloud service system can provide cloud services to the tenant within the total quota applied for by the tenant as far as possible, thereby improving the tenant experience.
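The three node-side cases described above (respond from the local quota, apply for more quota, forward to another node) can be combined in one sketch. The classes `System` and `Worker`, the fixed grant size, and the single-hop forwarding rule are all assumptions made for illustration and are not specified by the application.

```python
from typing import List, Optional


class System:
    """Hypothetical holder of the tenant's not-yet-allocated quota."""

    def __init__(self, remaining: int, grant_size: int) -> None:
        self.remaining = remaining
        self.grant_size = grant_size  # assumed fixed size of each grant

    def apply(self) -> int:
        granted = min(self.grant_size, self.remaining)
        self.remaining -= granted
        return granted


class Worker:
    def __init__(self, name: str, system: System, quota: int = 0) -> None:
        self.name, self.system, self.quota = name, system, quota
        self.peers: List["Worker"] = []

    def handle(self, cost: int = 1, forwarded: bool = False) -> Optional[str]:
        """Return the name of the node that served the request, or None if limited."""
        if self.quota < cost and not forwarded:
            # Local quota exhausted: submit a quota application request.
            self.quota += self.system.apply()
        if self.quota >= cost:
            # Respond and adjust the remaining work quota by the consumed amount.
            self.quota -= cost
            return self.name
        if not forwarded:
            # Total quota exhausted: forward once to peers that may still hold quota.
            for peer in self.peers:
                served = peer.handle(cost, forwarded=True)
                if served:
                    return served
        return None


system = System(remaining=0, grant_size=5)
a, b = Worker("A", system, quota=1), Worker("B", system, quota=2)
a.peers, b.peers = [b], [a]
print([a.handle() for _ in range(4)])
```

In this run node A serves the first request itself, the next two are forwarded to node B, and the fourth is limited because no node holds any work quota.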

In some of the illustrated embodiments, the above-mentioned method further includes: the working node bills the cloud service requests initiated by the tenant.

In some of the illustrated embodiments, the cloud service includes an artificial intelligence (AI) cloud service. Obtaining the total cloud service quota that the tenant applies for from the cloud service system includes: obtaining the total AI cloud service quota that the tenant applies for from the cloud service system. Allocating, based on the total cloud service quota, a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, where the work quota is used to trigger each working node to respond, according to its own work quota, to cloud service requests initiated by the tenant, includes: allocating, based on the total AI cloud service quota, a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, where the work quota is used to trigger each working node to respond, according to its own work quota, to AI cloud service requests initiated by the tenant. In this way, for the AI cloud service system, the working node responds to the AI cloud service request according to its own work quota, which can improve the response speed of the AI cloud service system and improve the tenant experience.

The embodiment of the present application also proposes a cloud service request response device, wherein the above device includes:

an obtaining module, configured to obtain the total cloud service quota applied by the tenant to the above-mentioned cloud service system; wherein, the above-mentioned cloud service system includes a system constructed based on a distributed architecture;

an allocation module, configured to allocate, based on the total cloud service quota, a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, where the work quota is used to trigger each working node to respond, according to its own work quota, to cloud service requests initiated by the tenant.

In some illustrated embodiments, the above allocation module is specifically configured as:

In some of the illustrated embodiments, the allocation module described above includes:

a first determining module, configured to determine the cloud service request response volume reached by the working node within a preset duration according to the processing capability corresponding to the working node; wherein the processing capability indicates the cloud service request response volume reached within a unit duration;

The allocation sub-module is configured to allocate a work quota to the working node according to the cloud service request response volume corresponding to that working node.

The second determination module is configured to determine the quota weights corresponding to the working nodes included in the distributed architecture;

The allocation sub-module is configured to allocate a work quota matching the quota weight corresponding to the above-mentioned working node to the above-mentioned working node based on a part of the quota in the above-mentioned total quota of the cloud service.

In some of the illustrated embodiments, the above-mentioned second determining module is specifically configured as:

Based on the configuration information of each work node, and according to the preset quota weight determination rule, determine the quota weight corresponding to each work node; or,

Based on the processing capability corresponding to each work node, the quota weight corresponding to each work node is determined.

In some of the illustrated embodiments, the above allocation module is further configured to:

In the case of receiving a quota application request from any working node, allocate a work quota to that working node based on the remaining quota, where the remaining quota is the part of the total cloud service quota that is left after the already allocated work quotas are removed.

Based on the remaining quota, and according to the cloud service request response volume that the worker node can reach within a preset time period, a work quota matching the cloud service request response volume is allocated to the worker node.

Based on the total cloud service quota, allocate a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, such that, after receiving a cloud service request initiated by the tenant, a working node whose own work quota still has a remainder provides the cloud service computation in response to the request and adjusts its remaining work quota according to the quota consumed by that computation.

Based on the total cloud service quota, allocate a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, such that, after receiving a cloud service request initiated by the tenant, a working node whose own work quota has no remainder submits a quota application request to the cloud service system and, if the total cloud service quota still has a remainder, receives the work quota that the cloud service system allocates to it based on the remaining quota in order to respond to the cloud service request.

Based on the total cloud service quota, allocate a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, such that, after a working node submits a quota application request to the cloud service system, if the total cloud service quota has no remainder, the cloud service request is forwarded to another working node whose work quota still has a remainder for processing.

In some of the illustrated embodiments, the above-mentioned apparatus further includes:

The billing module is configured to cause the working node to bill the cloud service requests initiated by the tenant.

In some of the illustrated embodiments, the above-mentioned cloud service includes an AI cloud service; the above-mentioned obtaining module is specifically configured as:

Obtain the total amount of AI cloud services applied by the tenant to the above cloud service system;

The specific configuration of the above allocation module is:

Based on the total AI cloud service quota, allocate a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, where the work quota is used to trigger each working node to respond, according to its own work quota, to AI cloud service requests initiated by the tenant.

The embodiment of the present application also proposes an electronic device, and the above-mentioned device includes:

a processor;

a memory for storing the above-mentioned processor-executable instructions;

The processor is configured to invoke the executable instructions stored in the memory to implement the cloud service request response method shown in any of the foregoing embodiments.

An embodiment of the present application further provides a computer-readable storage medium, characterized in that, the storage medium stores a computer program, and the computer program is used to execute the cloud service request response method shown in any of the foregoing embodiments.

In the above technical solution, the cloud service system constructed on a distributed architecture can allocate a work quota to each working node included in the distributed architecture based on the total cloud service quota that the tenant applies for from the system, so that each working node can independently respond to cloud service requests initiated by the tenant according to its own work quota. This reduces how often the cloud service system must communicate with the working nodes to read and write the tenant's cloud service request volume, which in turn reduces the frequent network I/O operations and the locking operations for reading and writing public storage, improves the speed at which the system responds to cloud service requests, and thus improves the tenant experience.

It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the embodiments of the present application.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to illustrate the technical solutions in one or more embodiments of the present application or in the related art more clearly, the accompanying drawings used in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings in the following description are only some of the embodiments described in one or more embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for responding to a cloud service request according to an embodiment of the application;

FIG. 2 is a schematic diagram of interaction between an AI cloud service system and a tenant according to an embodiment of the application;

FIG. 3 is a schematic diagram of total cloud service quota allocation according to an embodiment of the present application;

FIG. 4A is a schematic diagram of another allocation of the total cloud service quota according to an embodiment of the application;

FIG. 4B is a schematic diagram of total cloud service quota allocation according to an embodiment of the application;

FIG. 5 is a schematic structural diagram of a cloud service request response apparatus according to an embodiment of the application;

FIG. 6 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the application.

DETAILED DESCRIPTION

Exemplary embodiments will be described in detail below, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the embodiments of the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the embodiments of the present application as recited in the appended claims.

Terms used in the embodiments of the present application are only for the purpose of describing specific embodiments, and are not intended to limit the embodiments of the present application. As used in the embodiments of this application and the appended claims, the singular forms "a," "said," and "the" are intended to include plural forms as well, unless the context clearly dictates otherwise. It will also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items. It should also be understood that the word "if", as used herein, can be interpreted as "at the time of" or "when" or "in response to determining", depending on the context.

In the related art, in order to determine whether the cloud service request initiated by the tenant is within the range of the total cloud service quota applied by the tenant, the cloud service system will count the cloud service request amount of the tenant. The above tenant may include multiple users. Users can apply for cloud services using the tenant account assigned to them.

In practical applications, when the type of cloud service request is cloud service invocation, the cloud service system may use the number of cloud service invocations as a dimension to count the cloud service request volume of the tenant. When the type of cloud service request is stream data processing, the cloud service system can use the number of bytes of processing traffic as the dimension to count the cloud service request volume of the tenant.
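The two counting dimensions mentioned above can be illustrated with a small sketch. The function name `request_volume` and the string labels for the request types are assumptions made for this example only.

```python
def request_volume(request_type: str, payload: bytes = b"") -> int:
    """Amount of quota that one request counts for (hypothetical helper).

    - "invocation": each call counts as one unit.
    - "stream": the request counts as the number of bytes processed.
    """
    if request_type == "invocation":
        return 1
    if request_type == "stream":
        return len(payload)
    raise ValueError(f"unknown request type: {request_type}")


print(request_volume("invocation"))
print(request_volume("stream", b"hello"))
```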

The following description takes the cloud service request initiated by the tenant as the cloud service invocation request as an example.

For example, when a tenant initiates a cloud service invocation request, the cloud service system can determine whether the currently counted number of cloud service invocation requests of the tenant has reached the total cloud service quota applied for by the tenant (the total number of cloud service invocation requests). If not, the system responds to the request; otherwise, it limits the request.

It is not difficult to understand that when the cloud service system is a single-node system (that is, a system that provides cloud services through only one node), counting the number of invocations of the tenant's services or the number of bytes of processed traffic is relatively convenient. Therefore, counting the tenant's cloud service request volume is not complicated and does not affect the speed at which the cloud service system responds to requests. However, when the cloud service system is based on a distributed architecture, the distributed architecture may make counting the tenant's cloud service request volume very complicated, which affects the speed at which the cloud service system responds to requests.

For example, when the cloud service system is a system based on a distributed architecture, the system can allocate to the tenant a shared space (for example, a shared cache or a shared database) for storing the total cloud service quota applied for by the tenant and a usage quota indicating the number of calls the tenant has initiated.

When the cloud service system receives a cloud service invocation request initiated by a tenant, the request may be distributed to any node A under the distributed architecture. When node A receives the request, it reads, through I/O, the total cloud service quota stored in the shared space and the usage quota already used by the tenant (the number of calls initiated by the tenant). After reading the total cloud service quota and the usage quota, node A can determine whether the total cloud service quota is greater than the usage quota. If so, node A responds to the call request and increases the usage quota. Node A then writes the increased usage quota back to the shared space through I/O.

It is not difficult to find that when the above cloud service system is a system constructed based on a distributed architecture, the cloud service invocation request or traffic processing request initiated by the tenant may be distributed to any node under the distributed architecture. Therefore, the above cloud service system must frequently communicate with each node under the distributed architecture to read and write the cloud service request volume of the tenant. Frequent network I/O operations and locking operations of reading and writing public storage may cause the system's cloud service request response efficiency to become low, with delays, thereby affecting tenant experience.
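The related-art flow can be sketched as follows to make its cost visible. The class `SharedQuotaStore` is a hypothetical in-process stand-in for the shared cache or database; the `round_trips` counter only simulates network I/O operations, and `threading.Lock` stands in for whatever locking the shared storage actually uses.

```python
import threading


class SharedQuotaStore:
    """Stand-in for the shared space; every request takes the lock."""

    def __init__(self, total_quota: int) -> None:
        self._lock = threading.Lock()
        self.total_quota = total_quota
        self.used = 0
        self.round_trips = 0  # simulated count of network I/O operations

    def try_consume(self) -> bool:
        with self._lock:
            self.round_trips += 1  # read total quota and usage quota
            if self.total_quota > self.used:
                self.used += 1
                self.round_trips += 1  # write the increased usage quota back
                return True
            return False


store = SharedQuotaStore(total_quota=2)
print([store.try_consume() for _ in range(3)], store.round_trips)
```

Every request, on every node, pays for a lock acquisition and at least one simulated round trip, which is the overhead the per-node work quota is intended to avoid.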

In view of this, an embodiment of the present application proposes a method for responding to a cloud service request, which is applied to a cloud service system. Wherein, the above cloud service system includes a system constructed based on a distributed architecture.

The method allocates the total cloud service quota applied for by the tenant among the working nodes under the distributed architecture and triggers each working node to determine independently whether to respond to a cloud service request initiated by the tenant. This reduces how often the cloud service system must communicate with the working nodes to read and write the tenant's cloud service request volume, which in turn reduces the frequent network I/O operations and the locking operations for reading and writing public storage, improves the speed at which the system responds to cloud service requests, and thus improves the tenant experience.

Please refer to FIG. 1. FIG. 1 is a method flowchart of a method for responding to a cloud service request according to an embodiment of the present application.

As shown in FIG. 1 , the method for responding to the cloud service request shown in the embodiment of the present application may include:

S102, obtain the total cloud service quota applied by the tenant to the above-mentioned cloud service system; wherein, the above-mentioned cloud service system includes a system constructed based on a distributed architecture;

S104: Based on the total cloud service quota, allocate a work quota to the working nodes that are included in the distributed architecture and correspond to the tenant, where the work quota is used to trigger each working node to respond, according to its own work quota, to cloud service requests initiated by the tenant.
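Steps S102 and S104 can be sketched end to end as below. The dictionary `applications`, the even split, and the 60 percent share handed out initially are assumptions made purely for illustration; the application does not prescribe how the work quota is divided at this point.

```python
from typing import Dict, List


def obtain_total_quota(applications: Dict[str, int], tenant_id: str) -> int:
    """S102: look up the total quota the tenant applied for (hypothetical store)."""
    return applications[tenant_id]


def allocate_work_quotas(total_quota: int, nodes: List[str], percent: int = 60) -> Dict[str, int]:
    """S104: split `percent` of the total evenly across the tenant's nodes.

    The even split and the default share are illustrative assumptions.
    """
    per_node = (total_quota * percent // 100) // len(nodes)
    return {node: per_node for node in nodes}


applications = {"tenant-1": 1000}
total = obtain_total_quota(applications, "tenant-1")
work = allocate_work_quotas(total, ["A", "B", "C"])
remaining = total - sum(work.values())
print(work, remaining)
```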

The above cloud service system (hereinafter referred to as the "system") is specifically a system that provides cloud services to tenants. The foregoing system may include a certain number of hardware devices or software devices to provide cloud services, and the embodiments of the present application do not limit the types of hardware devices and software devices included in the foregoing system.

In practical applications, a tenant can apply to the above cloud service system for a certain total cloud service quota. In some examples, the total number of cloud service calls that can be initiated by the tenant may be used as a dimension to calculate the total cloud service quota. The tenant can initiate a cloud service invocation request to the above cloud service system within the scope of the above-mentioned total cloud service quota, so as to enjoy the services provided by the cloud service system.

The above cloud service system includes a system constructed based on a distributed architecture. The distributed architecture may be an architecture including several working nodes. Here, a working node (hereinafter referred to as a "node") may be a terminal or a server. The terminal or server may be a notebook computer, a desktop computer, a tablet computer (PAD), or the like; the embodiments of the present application do not limit the device type or model of the terminal or server.

The above-mentioned distributed architecture provides computing power through its included working nodes, so that the above-mentioned cloud service system can provide cloud services for tenants. It should be noted that the above cloud service type may be cloud service invocation or traffic storage, etc., and the embodiment of the present application does not limit the cloud service type.

In some embodiments, the above-mentioned cloud service system may include an AI cloud service system.

Please refer to FIG. 2, which is a schematic diagram of interaction between an AI cloud service system and a tenant according to an embodiment of the present application. As shown in Figure 2, the above AI cloud service system is a system constructed based on a distributed architecture. Wherein, the above-mentioned distributed architecture includes working nodes A, B, and C. It should be noted that the cloud service system shown in FIG. 2 is only a schematic illustration, and is not particularly limited.

In the AI cloud service scenario shown in FIG. 2, the tenant 201 may apply to the AI cloud service system 202 for a total cloud service quota for a certain number of calls. Then, the tenant 201 may initiate a service invocation request, such as model training, to the AI cloud service system 202 by calling an interface (for example, through a Hyper Text Transfer Protocol (HTTP) call). After the AI cloud service system 202 receives the invocation request, it can distribute the request to a target working node A under the distributed architecture according to a pre-stored distribution rule (for example, a load balancing distribution rule), so that node A can respond to the cloud service request initiated by the tenant according to its corresponding work quota and return the response result to the tenant.
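As one possible distribution rule, the sketch below uses simple round-robin selection over nodes A, B, and C. Round-robin is an assumption chosen for brevity; the application only mentions a load balancing distribution rule as an example and does not specify one. The class name `Dispatcher` is likewise hypothetical.

```python
from itertools import cycle
from typing import Iterable


class Dispatcher:
    """Picks the target working node for each incoming request in turn."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self._ring = cycle(list(nodes))

    def pick(self) -> str:
        return next(self._ring)


dispatcher = Dispatcher(["A", "B", "C"])
print([dispatcher.pick() for _ in range(4)])
```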

The above-mentioned total amount of cloud services includes the total amount of services provided by the cloud service system that tenants can enjoy.

In practical applications, if the type of cloud service applied by the tenant is cloud service invocation, the cloud service system may use the number of cloud service invocations as the dimension to count the above-mentioned total service volume of the tenant. If the cloud service type applied by the tenant is stream data processing, the cloud service system can count the above-mentioned total service volume of the tenant by taking the number of bytes of processed traffic as the dimension.

It should be noted that, on the one hand, the embodiment of the present application does not limit the statistical dimension of the total amount of cloud services. The following takes the cloud service type as the cloud service invocation request as an example for description. On the other hand, in some instances, tenants can apply for the above-mentioned total amount through a paid purchase. In some instances, tenants may apply for the total amount above by applying for a trial. This embodiment of the present application does not limit the manner in which the tenant applies for the total cloud service quota.

The above-mentioned work node can respond to the cloud service request initiated by the tenant according to its corresponding work quota.

In some examples, the above-mentioned work nodes may bill for cloud service requests initiated by tenants. For example, a work node can maintain a quota summary table corresponding to each tenant. The quota summary table can record the tenant's remaining quota, used quota and other information. After a work node responds to a cloud service request initiated by the tenant, the used quota may be increased to complete the billing for that request.
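
A quota summary table of this kind might look like the following sketch. The field names and the tenant identifier are assumptions, since the text only states that remaining and used quota are recorded.

```python
from dataclasses import dataclass


@dataclass
class QuotaSummary:
    """Per-tenant quota record kept on one work node (hypothetical layout)."""
    remaining: int
    used: int = 0

    def bill(self, units: int = 1) -> None:
        """Record `units` of consumed quota after a request has been served."""
        if units > self.remaining:
            raise ValueError("not enough remaining quota")
        self.remaining -= units
        self.used += units


# One summary table per work node, keyed by tenant identifier.
summary_table = {"tenant-201": QuotaSummary(remaining=100)}
summary_table["tenant-201"].bill()
```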

The above work quota refers to the amount of cloud service requests that a worker node can respond to. When the above-mentioned working node receives the cloud service request initiated by the tenant, it can determine whether it responds to the cloud service request by judging whether there is a remaining work quota.

When a work node responds to a cloud service request, it can consume the corresponding work quota accordingly. For example, when the cloud service request volume is counted by the number of calls, each time a work node responds to a call request initiated by the tenant, it can consume 1 unit of work quota.

In some examples, the above-mentioned work quota may include two kinds of quota. First, the work quota may be the quota initially allocated by the above-mentioned system to each node after the tenant applies for the total cloud service quota, so that each node can operate. Second, the work quota may be the quota that a node applies for from the system when its allocated work quota is exhausted during operation, so that the node can replenish its work quota and continue to operate.

The above cloud service request includes a cloud service request initiated by the tenant to the above system. The above cloud service requests may include cloud service invocation requests and/or stream data processing requests.

It should be noted that, generally, the type of cloud service request initiated by the tenant is related to the type of cloud service applied for by the tenant.

For example, if the cloud service type applied for by the tenant is cloud service invocation, the tenant can initiate a cloud service invocation request. When the type of cloud service applied for by the tenant includes both cloud service invocation and stream data processing, the tenant can initiate both a cloud service invocation request and a stream data processing request.

In some embodiments, after receiving the cloud service request initiated by the tenant, the above-mentioned working node can, when its corresponding work quota still remains, provide cloud service computation in response to the cloud service request, calculate the quota consumed by that computation, and adjust its remaining work quota accordingly.

For example, referring to FIG. 2, after receiving the cloud service request initiated by the tenant, the working node A can determine whether its own work quota remains. If its own work quota remains, node A can respond to the cloud service request and consume 1 unit of work quota. If no work quota remains, node A can restrict the cloud service request.

It should be noted that this embodiment of the present application does not limit the manner in which the working node determines whether there is a remaining work quota. In some embodiments, a work node may store the work quota allocated by the system, as well as the amount of cloud service requests that the node has already responded to. In this case, whether there is a remaining work quota can be determined by subtracting the amount of cloud service requests already responded to from the work quota. If the result is greater than 0, it is determined that work quota remains; otherwise, none remains. In some embodiments, a work node may store the remaining quota directly. That is, the initial value of the remaining quota is the work quota allocated by the system, and each time the work node responds to a cloud service request, the remaining quota value is adjusted. In this case, whether there is a remaining work quota can be determined by checking whether the remaining quota is greater than 0; if so, it is determined that work quota remains; otherwise, none remains.
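
The two bookkeeping styles described above can be sketched side by side. The class names are hypothetical; both classes answer the same question, namely whether the node still has work quota.

```python
class AllocatedAndServed:
    """Stores the allocated work quota and the amount already served."""

    def __init__(self, work_quota: int):
        self.work_quota = work_quota
        self.served = 0

    def has_remaining(self) -> bool:
        # Remaining quota is derived by subtraction on every check.
        return self.work_quota - self.served > 0

    def record_response(self, units: int = 1) -> None:
        self.served += units


class RemainingOnly:
    """Stores only the remaining quota, initialised to the allocated quota."""

    def __init__(self, work_quota: int):
        self.remaining = work_quota

    def has_remaining(self) -> bool:
        return self.remaining > 0

    def record_response(self, units: int = 1) -> None:
        self.remaining -= units
```

The first form keeps the served amount available for billing, while the second needs one fewer stored value.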

In some embodiments, after receiving the cloud service request initiated by the tenant, if its own work quota has no remainder, the above-mentioned working node may submit a quota application request to the above-mentioned system and, when the total cloud service quota still has a remainder, receive the work quota that the cloud service system allocates to it based on the remaining quota, so as to respond to the cloud service request.

For example, referring to FIG. 2, after receiving the cloud service request initiated by the tenant, the working node A can determine whether its own work quota remains. If no work quota remains, node A may first submit a quota application request to the system. After receiving the quota application request, the system can determine whether the total cloud service quota corresponding to the tenant still has a remainder and, if so, allocate a further work quota to node A. After node A receives the work quota, it continues to respond to the cloud service request.
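
The replenishment flow in this example can be sketched as follows. The fixed grant size and all class and method names are assumptions made for illustration; the text does not fix how large each replenishment is at this point.

```python
class QuotaSystem:
    """Holds the tenant's remaining total quota (hypothetical interface)."""

    def __init__(self, total_quota: int, grant_size: int):
        self.remaining_total = total_quota
        self.grant_size = grant_size  # assumed fixed size per application

    def apply(self) -> int:
        """Handle a quota application; returns 0 when nothing is left."""
        grant = min(self.grant_size, self.remaining_total)
        self.remaining_total -= grant
        return grant


class WorkNode:
    def __init__(self, system: QuotaSystem, work_quota: int = 0):
        self.system = system
        self.work_quota = work_quota

    def handle(self) -> bool:
        """Serve one call; returns False if the request must be restricted."""
        if self.work_quota == 0:
            # Local quota exhausted: apply to the system for more.
            self.work_quota += self.system.apply()
        if self.work_quota == 0:
            return False
        self.work_quota -= 1
        return True


system = QuotaSystem(total_quota=3, grant_size=2)
node_a = WorkNode(system)
results = [node_a.handle() for _ in range(4)]
```

Here node A contacts the system only twice for three served requests, and the fourth request is restricted because the total quota is exhausted.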

In some embodiments, after the above-mentioned work node submits a quota application request to the above-mentioned cloud service system, if the total cloud service quota has no remainder, the cloud service request is forwarded to another work node that still has remaining work quota for processing.

For example, referring to FIG. 2, it is assumed that the working state of each working node is stored in the above system (the working state refers to whether the node can respond to requests, that is, whether it still has work quota). When work node A receives a cloud service request initiated by a tenant, if its own work quota has no remainder and the total cloud service quota has no remainder either, node A can query the working state of the other work nodes through the system. If a node B that can still respond to the cloud service request is found, node A can route the request to node B, and node B responds to the request.

In this embodiment, after a work node receives the cloud service request initiated by the tenant, if its own work quota has no remainder and the total cloud service quota has no remainder either, the cloud service request is forwarded to another work node that still has remaining work quota for processing. Therefore, the cloud service system can provide cloud services to the tenant within the total quota applied for by the tenant as far as possible, thereby improving the tenant experience.

Of course, after a work node receives the cloud service request initiated by the tenant, if its own work quota has no remainder, the total cloud service quota has no remainder, and no other work node has remaining work quota, this cloud service request of the tenant is restricted.
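
The full decision sequence (serve locally, replenish, forward, restrict) can be sketched as below. This is an illustrative sketch only: the in-memory node registry stands in for the working state stored in the system, and all names and the grant size are hypothetical.

```python
from typing import Dict, Optional


class System:
    """Tracks the remaining total quota and the registered work nodes."""

    def __init__(self, total_remaining: int, grant_size: int):
        self.total_remaining = total_remaining
        self.grant_size = grant_size
        self.nodes: Dict[str, "Node"] = {}

    def grant(self) -> int:
        amount = min(self.grant_size, self.total_remaining)
        self.total_remaining -= amount
        return amount

    def find_node_with_quota(self, exclude: str) -> Optional["Node"]:
        for name, node in self.nodes.items():
            if name != exclude and node.work_quota > 0:
                return node
        return None


class Node:
    def __init__(self, name: str, system: System, work_quota: int):
        self.name = name
        self.system = system
        self.work_quota = work_quota
        system.nodes[name] = self

    def handle(self, request: str) -> str:
        if self.work_quota == 0:
            # Step 1: try to replenish from the remaining total quota.
            self.work_quota += self.system.grant()
        if self.work_quota > 0:
            self.work_quota -= 1
            return f"{self.name} served {request}"
        # Step 2: no local or total quota left, look for a peer with quota.
        peer = self.system.find_node_with_quota(exclude=self.name)
        if peer is not None:
            return peer.handle(request)
        # Step 3: nobody can serve the request, so it is restricted.
        return f"{request} restricted"


system = System(total_remaining=0, grant_size=5)
node_a = Node("A", system, work_quota=0)
node_b = Node("B", system, work_quota=1)
first = node_a.handle("r1")
second = node_a.handle("r2")
```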

In the above technical solution, the cloud service system constructed on a distributed architecture can allocate a work quota to each work node included in the distributed architecture based on the total cloud service quota that the tenant applies for from the system, and each work node independently responds to the cloud service requests initiated by the tenant according to its own work quota. This reduces how often the cloud service system has to communicate with the work nodes to read and write the tenant's cloud service request volume, and thus reduces the frequent network I/O operations and the locking operations on shared storage reads and writes, which improves the response speed of the system to cloud service requests and thereby improves the tenant experience.
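
A rough back-of-the-envelope model of this saving is sketched below. It assumes a single work node, call-counted quota, and exactly one round trip to the system per quota grant; these simplifications and the function name are assumptions, not part of the described method.

```python
import math


def central_round_trips(requests: int, units_per_grant: int) -> int:
    """Round trips to the central system needed to serve `requests` calls
    when each trip hands the node `units_per_grant` units of work quota."""
    return math.ceil(requests / units_per_grant)


# Checking the shared quota on every request versus in batches of 500.
per_request = central_round_trips(10_000, 1)
batched = central_round_trips(10_000, 500)
```

Under these assumptions, 10,000 calls need 10,000 round trips when checked individually and 20 when quota is handed out 500 units at a time.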

In some embodiments, when the above-mentioned system performs S104, that is, when it allocates a work quota, based on the total cloud service quota, to the work node that is included in the distributed architecture and corresponds to the tenant, the system may allocate the work quota based on only a part of the total cloud service quota.

Here, since only a part of the total quota is used when allocating the work quota to the above-mentioned work nodes, multiple work quota allocations to the work nodes can be realized, thereby reducing the problem of unreasonable distribution caused by one-time allocation.

In some embodiments, when allocating work quotas to the work nodes included in the distributed architecture and corresponding to the tenant based on part of the total cloud service quota, the quota is distributed to the work nodes according to the quota weight corresponding to each work node.

In practical applications, the above-mentioned system may first obtain the quota weights corresponding to the working nodes included in the above-mentioned distributed architecture. After determining the quota weights corresponding to the working nodes, the above-mentioned system may allocate a working quota matching the quota weights corresponding to the above-mentioned working nodes to the above-mentioned working nodes based on part of the quotas in the above-mentioned total cloud service quotas.

The quota weight corresponding to each of the above working nodes may specifically be a preset fixed value. For example, the quota weight corresponding to each worker node can be set to the same value. At this time, when allocating the work quota, the total quota can be equally distributed to each worker node.
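
A minimal sketch of weight-based allocation follows, assuming positive integer weights and integer quota units. Rounding down and leaving any remainder unallocated is a choice made here for simplicity, not something the text prescribes.

```python
def allocate_by_weight(quota: int, weights: dict) -> dict:
    """Split `quota` across nodes in proportion to their integer weights.

    Shares are rounded down; any leftover units stay with the caller
    (for example, in the remaining total quota).
    """
    total_weight = sum(weights.values())
    return {node: quota * w // total_weight for node, w in weights.items()}


# Equal weights give an equal split.
equal_split = allocate_by_weight(300, {"A": 1, "B": 1, "C": 1})
# Unequal weights give proportional shares.
weighted_split = allocate_by_weight(100, {"A": 3, "B": 1})
```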

In some embodiments, when determining the quota weight, the quota weight corresponding to each work node is determined based on the configuration information of each work node and according to a preset quota weight determination rule.

For example, when constructing the above-mentioned system, a configuration information table corresponding to each working node can be maintained, recording, for example, the CPU and GPU processing performance and the hard disk model of the work node. When determining the quota weight corresponding to each working node included in the distributed architecture, the configuration information table corresponding to each working node may be queried to determine the configuration information of each working node.

After determining the configuration information corresponding to each working node, the system may determine the quota weight corresponding to each working node according to a preset quota weight determination rule.

In some embodiments, the above-mentioned quota weight determination rule may be to score various configuration information of each working node first. Then the weighted summation of each score is carried out to obtain the total score corresponding to each work node. Finally, the weight of each work node is determined according to the total score corresponding to each work node.
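
The three steps of this rule (score, weighted sum, weight) might be sketched as follows. The score values, the coefficients, and the normalisation of total scores into weights that sum to 1 are all assumptions made for illustration.

```python
def total_score(scores: dict, coefficients: dict) -> float:
    """Weighted sum of a node's per-item configuration scores."""
    return sum(coefficients[item] * scores[item] for item in coefficients)


def quota_weights(node_scores: dict, coefficients: dict) -> dict:
    """Turn each node's total score into a quota weight.

    Normalising by the sum of all total scores is one possible mapping
    from total score to weight; the text leaves the mapping open.
    """
    totals = {n: total_score(s, coefficients) for n, s in node_scores.items()}
    grand_total = sum(totals.values())
    return {n: t / grand_total for n, t in totals.items()}


# Hypothetical scores for CPU, GPU and hard disk, and their coefficients.
node_scores = {
    "A": {"cpu": 8, "gpu": 6, "disk": 4},
    "B": {"cpu": 4, "gpu": 2, "disk": 4},
}
coefficients = {"cpu": 0.5, "gpu": 0.25, "disk": 0.25}
weights = quota_weights(node_scores, coefficients)
```

In this example node A's total score is 6.5 and node B's is 3.5, so A receives the larger weight.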

When determining the quota weight corresponding to each work node included in the distributed architecture, the system can determine the quota weight based on the configuration information of each work node and a preset quota weight determination rule. Therefore, work quotas can be allocated to the work nodes reasonably, so that nodes with a higher configuration are allocated more work quota, thereby improving the response speed of the cloud service system and improving the tenant experience.

In some embodiments, when determining the quota weight corresponding to each work node included in the distributed architecture, the system may determine the quota weight corresponding to each work node based on the processing capability corresponding to each work node, wherein the processing capability indicates the amount of cloud service request responses that can be achieved within a unit time.

For example, the above-mentioned system can determine the cloud service request response amount (processing capacity) that each working node can achieve within a unit time by means of testing. After determining the processing capability corresponding to each working node, the above-mentioned system may determine the quota weight of each working node according to the processing capability corresponding to each working node.

When the work quota is allocated to each work node, the allocation can be performed according to the processing capability of each work node. Therefore, work quotas can be allocated to the work nodes reasonably, so that nodes with stronger processing capabilities are allocated more work quota, thereby improving the response speed of the cloud service system and improving the tenant experience.

In some examples, when allocating a work quota to each work node that is included in the distributed architecture and corresponds to the tenant based on a part of the total cloud service quota, the system may determine, according to the processing capability corresponding to each work node, the amount of cloud service request responses that the work node can reach within a preset duration. The processing capability indicates the amount of cloud service request responses that can be achieved within a unit time. After determining the cloud service request response amount corresponding to each work node, the system may allocate a work quota to the work node according to that response amount.

In some examples, the system may determine the value of the partial quota that participates in the initial allocation as the sum of the cloud service request response amounts that the work nodes can reach within the preset duration. When allocating a work quota to each work node according to the corresponding cloud service request response amount, the system may take the response amount corresponding to each work node as that node's work quota and allocate it to the node.

The above-mentioned preset duration may specifically be a value set according to experience. For example, 1 minute.
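
The initial allocation described above amounts to multiplying each node's processing capability by the preset duration. The sketch below assumes capabilities expressed in requests per second and a preset duration of 60 seconds; the capability figures are made up for illustration.

```python
def initial_allocation(capacity_per_second: dict, preset_seconds: int) -> dict:
    """Work quota per node = responses it can reach in the preset duration."""
    return {
        node: rate * preset_seconds
        for node, rate in capacity_per_second.items()
    }


# Hypothetical processing capabilities in requests per second.
allocation = initial_allocation({"A": 50, "B": 30, "C": 20}, preset_seconds=60)
# The partial quota used for the initial allocation is the sum of the shares.
partial_quota = sum(allocation.values())
```

With these figures, node A receives 3000 units, node B 1800 and node C 1200, so 6000 units of the total quota take part in the initial allocation.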

Please refer to FIG. 3 , which is a schematic diagram of total quota allocation of cloud services according to an embodiment of the present application.

As shown in FIG. 3, the amount of cloud service request responses that work node A included in the distributed architecture can reach within 1 minute is represented by a dark gray square. The amount of cloud service request responses that work node B can reach within 1 minute is represented by a light gray square. The amount of cloud service request responses that work node C can reach within 1 minute is represented by a black square.

When the above-mentioned system allocates the work quota for the first time, it can allocate, from the total quota, the work quota indicated by the dark gray square to work node A, the work quota indicated by the light gray square to work node B, and the work quota indicated by the black square to work node C.

When a work quota is allocated to each work node included in the distributed architecture based on a part of the total cloud service quota, the system can determine, according to the processing capability corresponding to each work node, the amount of cloud service request responses that the work node can reach within a preset duration, where the processing capability indicates the amount of cloud service request responses that can be achieved within a unit time. After determining the cloud service request response amount corresponding to each work node, the system may take that response amount as the node's work quota and allocate it to the node. Therefore, a reasonable partial quota for the initial allocation can be determined and the initial work quota can be allocated to each node reasonably, thereby further improving the working efficiency of the cloud service.

In practice, since the rate at which each node consumes its work quota is not the same, if the total cloud service quota is allocated all at once, the work quota of some work nodes may already be consumed while other work nodes still have work quota remaining, so that some nodes sit idle, which affects the efficiency of the cloud service system.

To improve this situation, in some embodiments, the total quota need not be allocated to the work nodes all at once; instead, a further allocation is made whenever any work node applies for quota, so that a work node that consumes its work quota at a fast rate can receive work quota allocations multiple times, thereby improving the response speed of the cloud service system and improving the tenant experience.

In practical applications, when the above-mentioned system allocates work quotas to the work nodes included in the distributed architecture based on the total cloud service quota, it may first allocate work quotas based on only a part of the total cloud service quota. In addition, when the system receives a quota application request from any work node, it allocates a work quota to that work node based on the remaining quota.

Wherein, the above-mentioned remaining quota includes the remaining quota after deducting the allocated work quota from the above-mentioned total cloud service quota.

In some embodiments, when the above-mentioned system performs total quota allocation, the value of the initially allocated partial quota and the allocation rule may be determined first. For example, you can specify an initial distribution of one-third of the total quota, as well as an even distribution rule. At this time, the above-mentioned system can evenly distribute one-third of the total quota to each worker node.

After that, after receiving the quota application request from any work node, the above system can check whether there is any remaining quota, and if so, it can allocate the work quota to the above work node.
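
The two-phase scheme in this example (an even initial split of one third of the total, then allocation on demand) can be sketched as follows. The refill size and all names are hypothetical, and integer division is used so that quota stays in whole units.

```python
class QuotaAllocator:
    """Initial partial allocation followed by on-demand replenishment."""

    def __init__(self, total_quota: int, node_names: list, refill_size: int):
        self.remaining = total_quota
        self.node_names = node_names
        self.refill_size = refill_size  # assumed size of each later grant
        self.granted = {name: 0 for name in node_names}

    def initial_allocation(self) -> dict:
        """Evenly distribute one third of the total quota to all nodes."""
        share = (self.remaining // 3) // len(self.node_names)
        for name in self.node_names:
            self.granted[name] += share
        self.remaining -= share * len(self.node_names)
        return {name: share for name in self.node_names}

    def on_quota_application(self, node_name: str) -> int:
        """Grant more quota to a node if any of the total quota is left."""
        grant = min(self.refill_size, self.remaining)
        self.remaining -= grant
        self.granted[node_name] += grant
        return grant


allocator = QuotaAllocator(900, ["A", "B", "C"], refill_size=100)
first_round = allocator.initial_allocation()
extra_for_a = allocator.on_quota_application("A")
```

A node that drains its quota quickly simply applies more often and ends up with a larger share of the total.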

When allocating the total quota to the work nodes, the quota need not be allocated all at once but is allocated whenever any work node applies for quota, so that a work node that consumes its work quota at a high rate can receive work quota allocations multiple times, thereby improving the response speed of the cloud service system and improving the tenant experience.

In some embodiments, when allocating a work quota to the above-mentioned work node based on the remaining quota, the system may allocate to the work node a work quota that matches the cloud service request response amount that the work node can reach within a preset duration.

Please refer to FIG. 4A , which is a schematic diagram of another allocation of the total cloud service quota according to an embodiment of the present application. In FIG. 4A , the slashed box represents that the corresponding work quota of the node has been consumed. That is, after the quota on node A is used up, continue to divide a part of the quota from the remaining total quota to this node A until all the total quota is allocated. When the quota of all nodes is exhausted, it means that the total purchase quota of the tenant has been consumed.

Please refer to FIG. 4B , which is a schematic diagram of total quota allocation of cloud services according to an embodiment of the present application.

As shown in FIG. 4B, the slashed box represents that the corresponding work quota of the node has been consumed. When node A consumes its allocated work quota and initiates a quota application to the above-mentioned system, the system can carve out of the remaining quota the cloud service request response amount that node A can reach within 1 minute. The system can then allocate a work quota corresponding to that response amount (the dark gray box in FIG. 4B) to node A. When the quota on a node is used up and there is no remaining total quota left to divide, the node writes its current state into the shared storage service and no longer processes the user's requests itself, but forwards the service requests it receives to other nodes that still have quota.

When allocating the work quota to the above-mentioned working node, the system can allocate a work quota that matches the cloud service request response amount that the working node can reach within a preset duration. Therefore, the system can allocate to the working nodes work quotas that match their processing capabilities, so that nodes with stronger processing capabilities are allocated more work quota, thereby improving the response speed of the cloud service system and improving the tenant experience.
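
A capacity-matched refill with a state record might look like the sketch below. A plain dictionary stands in for the shared storage service, the state labels "has_quota" and "exhausted" are hypothetical, and for brevity the function itself records the state that the node would write.

```python
def refill(node: str, remaining_total: int,
           capacity_per_minute: dict, shared_state: dict):
    """Grant `node` up to one minute of its capacity from the remaining quota.

    Returns (grant, new_remaining_total) and records the node's state in
    `shared_state`, a plain dict standing in for the shared storage service.
    """
    grant = min(capacity_per_minute[node], remaining_total)
    shared_state[node] = "has_quota" if grant > 0 else "exhausted"
    return grant, remaining_total - grant


shared_state = {}
capacity = {"A": 60}
grant1, rest1 = refill("A", 100, capacity, shared_state)
grant2, rest2 = refill("A", rest1, capacity, shared_state)
grant3, rest3 = refill("A", rest2, capacity, shared_state)
```

After the third call nothing is left to grant, so node A is marked as exhausted and its later requests would be forwarded to nodes whose recorded state still shows quota.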

It can be understood that when the above-mentioned total quota is exhausted, it means that the total quota of cloud services applied by the tenant has been exhausted.

Corresponding to any of the foregoing embodiments, an embodiment of the present application further provides a cloud service request response apparatus.

Please refer to FIG. 5, which is a schematic structural diagram of a cloud service request response apparatus according to an embodiment of the present application.

As shown in FIG. 5, the above-mentioned apparatus 50 may include:

The obtaining module 51 is configured to obtain the total cloud service quota applied by the tenant to the above-mentioned cloud service system; wherein, the above-mentioned cloud service system includes a system constructed based on a distributed architecture;

The allocation module 52 is configured to allocate, based on the above-mentioned total cloud service quota, a work quota to the work node that is included in the distributed architecture and corresponds to the tenant, wherein the work quota is used to trigger the work node to respond, according to its corresponding work quota, to the cloud service request initiated by the tenant.

In some of the illustrated embodiments, the above allocation module 52 is specifically configured as:

In some of the illustrated embodiments, the distribution module 52 described above includes:

Based on the configuration information of each work node, and according to the preset quota weight determination rule, determine the quota weight corresponding to each of the above work nodes; or,

In some of the illustrated embodiments, the above allocation module 52 is further configured to:

In the case of receiving a quota application request from any work node, allocate a work quota to that work node based on the remaining quota; wherein the remaining quota includes the quota that remains of the total cloud service quota after the already allocated work quota is excluded.

Based on the above-mentioned total cloud service quota, a work quota is allocated to the work nodes that are included in the distributed architecture and correspond to the tenant. After a work node submits a quota application request to the cloud service system, in the case where the total cloud service quota has no remainder, the cloud service request is forwarded to other work nodes with remaining work quota for processing.

In some of the illustrated embodiments, the above-mentioned apparatus 50 further includes:

In some of the illustrated embodiments, the above-mentioned cloud service includes an AI cloud service; the above-mentioned obtaining module 51 is specifically configured as:

The above allocation module 52 is specifically configured as:

The embodiments of the cloud service request response apparatus shown in the embodiments of this application may be configured on an electronic device. Correspondingly, the embodiment of the present application discloses an electronic device, and the device may include a processor and a memory configured to store processor-executable instructions.

Please refer to FIG. 6 , which is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.

As shown in FIG. 6, the electronic device may include a processor for executing instructions, a network interface for network connection, a memory for storing operating data for the processor, and a non-volatile memory for storing the instructions corresponding to the cloud service request response apparatus.

The embodiment of the cloud service request response apparatus may be implemented by software, or may be implemented by hardware or a combination of software and hardware. Taking software implementation as an example, the apparatus in a logical sense is formed by the processor of the electronic device where the apparatus is located reading the corresponding computer program instructions from the non-volatile memory into the memory and running them. From a hardware perspective, in addition to the processor, memory, network interface, and non-volatile memory shown in FIG. 6, the electronic device in which the apparatus in the embodiment is located may also include other hardware according to the actual functions of the electronic device, which is not described in further detail here.

It can be understood that, in order to improve the processing speed, the corresponding instructions of the cloud service request response apparatus may also be directly stored in the memory, which is not limited herein.

An embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the storage medium, and the computer program is used to execute the cloud service request response method shown in any of the foregoing embodiments.

It should be understood by those skilled in the art that one or more of the embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, one or more of the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more of the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

"And/or" in the embodiments of the present application means at least one of the two. For example, "A and/or B" may include three schemes: A, B, and "A and B".

Each embodiment in the embodiments of the present application is described in a progressive manner, and the same and similar parts between the various embodiments may be referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, for the data processing device embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for related parts, please refer to the partial description of the method embodiment.

The above describes specific embodiments of the embodiments of the present application. Other embodiments are within the scope of the appended claims. In some cases, the acts or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. Additionally, the processes depicted in the figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

Embodiments of the subject matter and functional operations described in the embodiments of this application can be implemented in digital electronic circuits, in tangibly embodied computer software or firmware, in computer hardware that can include the structures disclosed in the embodiments of this application and their structural equivalents, or in a combination of one or more of them. Embodiments of the subject matter described in the embodiments of this application may be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical or electromagnetic signal, which is generated to encode and transmit information to a suitable receiver device for execution by the data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of these.

The processes and logic flows described in the embodiments of the present application can be executed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating according to input data and generating output. The processes and logic flows described above can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, eg, an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).

A computer suitable for the execution of a computer program may include, for example, a general and/or special purpose microprocessor, or any other type of central processing unit. Typically, the central processing unit will receive instructions and data from read only memory and/or random access memory. The basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include, or be operably coupled to, such mass storage devices to receive data therefrom or to include one or more mass storage devices, such as magnetic disks, magneto-optical disks, or optical disks, etc., for storing data. Send data to it, or both. However, the computer does not have to have such a device. Additionally, the computer may be embedded in another device, such as a mobile phone, personal digital assistant (PDA), mobile audio or video player, game console, global positioning system (GPS) receiver, or a universal serial bus (USB) ) flash drives for portable storage devices, to name a few.

Computer readable media suitable for storage of computer program instructions and data may include all forms of non-volatile memory, media, and memory devices, and may include, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and memory may be supplemented by or incorporated in special purpose logic circuitry.

Although the embodiments of the present application contain many specific implementation details, these should not be construed as limiting the scope of what is disclosed or what may be claimed, but rather as describing features of particular disclosed embodiments. Certain features that are described in the embodiments herein in the context of multiple embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination may in some instances be removed from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

Similarly, although operations are depicted in the figures in a particular order, this should not be construed as requiring that the operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve the desired result. In some cases, multitasking and parallel processing may be advantageous. Furthermore, the separation of the various system modules and components in the above-described embodiments should not be construed as requiring such separation in all embodiments, and it should be understood that the described program components and systems may generally be integrated together in a single software product or packaged into multiple software products.

Thus, specific embodiments of the subject matter have been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. Furthermore, the processes depicted in the figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.

The above are only preferred embodiments of one or more embodiments of the present application and are not intended to limit the one or more embodiments of the present application. Any modifications, equivalent replacements, improvements, etc. made within the principles of the one or more embodiments of the present application should be included within the protection scope of the one or more embodiments of the present application.

Industrial Applicability

The embodiments of the present application provide a cloud service request response method and apparatus, an electronic device, and a storage medium. The method is executed by a cloud service system. The method may include acquiring the total cloud service quota that a tenant applies for from the cloud service system, where the cloud service system includes a system constructed based on a distributed architecture. Based on the total cloud service quota, a work quota is allocated to the work node that is included in the distributed architecture and corresponds to the tenant, so that the work node responds, according to its corresponding work quota, to cloud service requests initiated by the tenant.
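As a minimal sketch of the allocation described above, the following Python example shows a system that hands part of a tenant's total quota to work nodes, which then answer requests from their locally held quota without contacting the system on each request. The names `CloudServiceSystem`, `WorkNode`, `allocate`, and `handle_request`, as well as the default cost of one quota unit per request, are assumptions made for illustration and do not appear in the application.

```python
import threading


class WorkNode:
    """Hypothetical work node that answers requests from a local work quota."""

    def __init__(self, name):
        self.name = name
        self.work_quota = 0  # quota currently held by this node

    def handle_request(self, cost=1):
        # Respond only while local quota remains; the shared total is not read here.
        if self.work_quota >= cost:
            self.work_quota -= cost
            return True
        return False


class CloudServiceSystem:
    """Hypothetical holder of the total quota a tenant applied for."""

    def __init__(self, total_quota):
        self.remaining = total_quota  # total quota minus allocated work quotas
        self._lock = threading.Lock()

    def allocate(self, node, amount):
        # The lock is taken once per allocation, not once per tenant request.
        with self._lock:
            granted = min(amount, self.remaining)
            self.remaining -= granted
        node.work_quota += granted
        return granted
```

In this sketch the shared counter is locked only when a work quota is handed out, which mirrors the stated aim of reducing how often common storage is read, written, and locked.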

Claims

A cloud service request response method, executed by a cloud service system, the method comprising:

Obtaining the total cloud service quota that a tenant applies for from the cloud service system, wherein the cloud service system includes a system constructed based on a distributed architecture;

Based on the total cloud service quota, allocating a work quota to the work node that is included in the distributed architecture and corresponds to the tenant, wherein the work quota is used to trigger the work node to respond, according to its corresponding work quota, to a cloud service request initiated by the tenant.
The method according to claim 1, wherein, based on the total cloud service quota, allocating a work quota to each work node included in the distributed architecture comprises:

Based on part of the total cloud service quota, a work quota is allocated to the worker nodes included in the distributed architecture and corresponding to the tenant.
The method according to claim 2, wherein the allocating, based on part of the total cloud service quota, a work quota to the work node that is included in the distributed architecture and corresponds to the tenant comprises:

According to the processing capability corresponding to the working node, determine the cloud service request response volume reached by the working node within a preset duration; wherein, the processing capability indicates the cloud service request response volume reached within a unit duration;

A work quota is allocated to the work node according to the cloud service request response volume corresponding to the work node.
The method according to claim 2 or 3, wherein the allocating work quotas to the worker nodes corresponding to the tenants included in the distributed architecture based on part of the total cloud service quotas includes:

determining the quota weights corresponding to the working nodes included in the distributed architecture;

Based on a portion of the total cloud service quota, the worker node is allocated a work quota that matches the quota weight corresponding to the worker node.
The method according to claim 4, wherein the determining the quota weight corresponding to each working node included in the distributed architecture comprises:

Based on the configuration information of each working node, according to a preset quota weight determination rule, determine the quota weight corresponding to each working node; or,

Based on the processing capability corresponding to each work node, the quota weight corresponding to each work node is determined.
The method according to any one of claims 2 to 5, wherein the method further comprises:

In the case of receiving a quota application request from any work node, allocating a work quota to the work node based on the remaining quota; wherein the remaining quota includes the amount that remains of the total cloud service quota after the allocated work quotas are excluded.
The method of claim 6, wherein the assigning a work quota to the worker nodes based on the remaining quota comprises:

Based on the remaining quota, according to the cloud service request response volume that the worker node can reach within a preset time period, a work quota matching the cloud service request response volume is allocated to the worker node.
The method according to any one of claims 1 to 7, wherein the worker node responds to the cloud service request initiated by the tenant according to its corresponding work quota, comprising:

After receiving the cloud service request initiated by the tenant, the work node, in the case that its corresponding work quota has not been used up, provides cloud service computing in response to the cloud service request and adjusts its remaining work quota according to the quota consumed by the computing.
The method according to any one of claims 1 to 8, wherein the worker node responds to the cloud service request initiated by the tenant according to its corresponding work quota, further comprising:

After receiving the cloud service request initiated by the tenant, the work node submits a quota application request to the cloud service system in the case that it has no remaining work quota; and, in the case that quota still remains in the total cloud service quota, receives the work quota allocated to the work node by the cloud service system based on the remaining quota, so as to respond to the cloud service request.
The method of claim 9, wherein the method further comprises:

After the work node submits a quota application request to the cloud service system, if no quota remains in the total cloud service quota, the cloud service request is forwarded to another work node that has remaining work quota for processing.
The method according to any one of claims 1 to 10, wherein the method further comprises:

The working node charges for the cloud service request initiated by the tenant.
The method according to any one of claims 1 to 11, wherein the cloud service includes an AI cloud service; and the acquiring the total cloud service quota applied by the tenant to the cloud service system includes:

Obtaining the total AI cloud service quota that the tenant applies for from the cloud service system;

The allocating a work quota to the work nodes corresponding to the tenant included in the distributed architecture based on the total cloud service quota includes:

Based on the total AI cloud service quota, allocating a work quota to the work node that is included in the distributed architecture and corresponds to the tenant, wherein the work quota is used to trigger the work node to respond, according to its corresponding work quota, to the AI cloud service request initiated by the tenant.
A cloud service request response apparatus, the apparatus comprising:

an obtaining module, configured to obtain the total cloud service quota applied by the tenant to the cloud service system; wherein the cloud service system includes a system constructed based on a distributed architecture;

an allocation module, configured to, based on the total cloud service quota, allocate a work quota to the work node that is included in the distributed architecture and corresponds to the tenant, wherein the work quota is used to trigger the work node to respond, according to its corresponding work quota, to a cloud service request initiated by the tenant.
The apparatus of claim 13, wherein the allocation module is further configured to:

Based on part of the total cloud service quota, allocate a work quota to the work node that is included in the distributed architecture and corresponds to the tenant.
The apparatus of claim 14, wherein the allocation module comprises:

a first determining module, configured to determine the cloud service request response volume reached by the working node within a preset duration according to the processing capability corresponding to the working node; wherein the processing capability indicates the cloud service request response volume reached within a unit duration;

an allocation sub-module, configured to allocate a work quota to the work node according to the cloud service request response volume corresponding to the work node.
The apparatus of claim 14 or 15, wherein the allocation module comprises:

a second determining module, configured to determine the quota weights corresponding to the work nodes included in the distributed architecture;

an allocation sub-module, configured to allocate to the work node, based on part of the total cloud service quota, a work quota that matches the quota weight corresponding to the work node.
The apparatus of claim 16, wherein the second determining module is further configured to:

Based on the configuration information of each work node, and according to a preset quota weight determination rule, determine the quota weight corresponding to each work node; or,

Based on the processing capability corresponding to each work node, the quota weight corresponding to each work node is determined.
The apparatus according to any one of claims 14 to 17, wherein the allocation module is further configured to:

In the case of receiving a quota application request from any work node, allocate a work quota to the work node based on the remaining quota; wherein the remaining quota includes the amount that remains of the total cloud service quota after the allocated work quotas are excluded.
The apparatus of claim 18, wherein the allocation module is further configured to:

Based on the remaining quota, and according to the cloud service request response volume that the worker node can achieve within a preset time period, a work quota matching the cloud service request response volume is allocated to the worker node.
The apparatus according to any one of claims 13 to 19, wherein the allocation module is further configured to:

Based on the total cloud service quota, allocate a work quota to the work node that is included in the distributed architecture and corresponds to the tenant, so that, after receiving the cloud service request initiated by the tenant, the work node, in the case that its corresponding work quota has not been used up, provides cloud service computing in response to the cloud service request and adjusts its remaining work quota according to the quota consumed by the computing.
The apparatus according to any one of claims 13 to 20, wherein the allocation module is further configured to:

Based on the total cloud service quota, allocate a work quota to the work node that is included in the distributed architecture and corresponds to the tenant, so that, after receiving the cloud service request initiated by the tenant, the work node submits a quota application request to the cloud service system in the case that it has no remaining work quota; and, in the case that quota still remains in the total cloud service quota, receives the work quota allocated to the work node by the cloud service system based on the remaining quota, so as to respond to the cloud service request.
The apparatus of claim 21, wherein the allocation module is further configured to:

Based on the total cloud service quota, allocate a work quota to the work node that is included in the distributed architecture and corresponds to the tenant, so that, after the work node submits a quota application request to the cloud service system, in the case that no quota remains in the total cloud service quota, the cloud service request is forwarded to another work node that has remaining work quota for processing.
The apparatus of any one of claims 13 to 22, wherein the apparatus further comprises:

a billing module, configured so that the work node performs billing for the cloud service request initiated by the tenant.
The apparatus according to any one of claims 13 to 23, wherein the cloud service includes an AI cloud service; the obtaining module is further configured to:

Obtain the total AI cloud service quota that the tenant applies for from the cloud service system;

The allocation module is further configured to:

Based on the total AI cloud service quota, allocate a work quota to the work node that is included in the distributed architecture and corresponds to the tenant, wherein the work quota is used to trigger the work node to respond, according to its corresponding work quota, to the AI cloud service request initiated by the tenant.
An electronic device, wherein the device comprises:

a processor; and

a memory for storing instructions executable by the processor;

wherein the processor is configured to invoke the executable instructions stored in the memory to implement the cloud service request response method according to any one of claims 1 to 12.
A computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is used to execute the cloud service request response method according to any one of claims 1 to 12.
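
The quota arithmetic recited in the method claims (weight-matched allocation, a grant sized to the response volume a node can reach within a preset duration, and forwarding when no quota remains) could be sketched as follows. This is an illustrative Python sketch only: the function names `split_by_weight`, `grant_on_request`, and `pick_fallback`, the use of integer quota units, and rounding shares down are all assumptions and are not specified by the claims.

```python
def split_by_weight(partial_quota, weights):
    """Split part of the total quota among nodes in proportion to integer weights.

    Shares are rounded down, so a few units may stay unallocated (an assumption).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: partial_quota * w // total_weight for name, w in weights.items()}


def grant_on_request(remaining_quota, requests_per_second, preset_seconds):
    """Grant a refill sized to the response volume reachable in the preset duration.

    The grant is capped by the remaining quota. Returns (granted, new_remaining).
    """
    wanted = requests_per_second * preset_seconds
    granted = min(wanted, remaining_quota)
    return granted, remaining_quota - granted


def pick_fallback(work_quotas, exclude):
    """Return the name of another node that still holds work quota, or None."""
    for name, quota in work_quotas.items():
        if name != exclude and quota > 0:
            return name
    return None
```

Under these assumptions, a node that receives a grant of zero from `grant_on_request` would use `pick_fallback` to choose a node to which the request is forwarded.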