WO2022110796A1 - Method and apparatus for responding to a cloud service request, electronic device, and storage medium - Google Patents

Method and apparatus for responding to a cloud service request, electronic device, and storage medium

Publication number
WO2022110796A1
Authority
WO
WIPO (PCT)
Prior art keywords
quota
cloud service
work
mentioned
node
Prior art date
Application number
PCT/CN2021/102872
Other languages
English (en)
Chinese (zh)
Inventor
韩秋明
李建
符柱
陈家园
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2022110796A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5013 Request control

Definitions

  • the embodiments of the present application relate to the field of computer technologies, and relate to, but are not limited to, a cloud service request response method and apparatus, an electronic device, and a storage medium.
  • the tenant usually applies to the cloud service system for a certain total cloud service quota; and within the scope of the total cloud service quota, initiates a cloud service request.
  • After the cloud service system receives the cloud service request initiated by the tenant, it responds to the request only after determining that the request falls within the total cloud service quota.
  • the embodiment of the present application discloses at least one cloud service request response method, and the method is executed by a cloud service system; the above method includes:
  • the above-mentioned cloud service system includes a system constructed based on a distributed architecture
  • a work quota is allocated to the worker nodes corresponding to the tenants included in the distributed architecture, and the work quotas are used to trigger the work nodes to respond to cloud service requests initiated by the tenants according to their corresponding work quotas.
  • the above-mentioned allocating a work quota to each worker node included in the above-mentioned distributed architecture based on the above-mentioned total cloud service quota includes:
  • a work quota is allocated to the worker nodes corresponding to the above-mentioned tenants included in the above-mentioned distributed architecture.
  • allocating a work quota to each working node that is included in the distributed architecture and corresponds to the tenant, based on part of the total cloud service quota, includes: determining, according to the processing capability of the working node, the cloud service request response volume that the working node reaches within a preset time period, where the processing capability indicates the cloud service request response volume reached per unit time; and allocating a work quota to the working node according to that response volume.
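  • As a hedged illustration of the capability-based rule above, the following Python sketch multiplies a node's per-unit-time response volume by the preset duration. The function name, the parameter names, and the cap at the quota still available are assumptions made for this example; the application does not prescribe them.

```python
def quota_from_capability(responses_per_second: int,
                          preset_seconds: int,
                          available_quota: int) -> int:
    """Return a work quota sized to what the node can answer in the preset window.

    responses_per_second: processing capability (responses per unit time).
    preset_seconds: the preset duration, in the same time unit.
    available_quota: quota the system can still hand out (assumed cap).
    """
    expected_volume = responses_per_second * preset_seconds
    # Never hand out more than the system still holds (assumption).
    return min(expected_volume, available_quota)


if __name__ == "__main__":
    print(quota_from_capability(50, 10, 10_000))
    print(quota_from_capability(50, 10, 300))
```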
  • multiple assignments of work quotas to the working nodes can be realized, thereby reducing the problem of unreasonable assignment caused by one assignment.
  • allocating a work quota to the worker nodes included in the distributed architecture and corresponding to the tenants, based on a partial quota in the total cloud service quota, includes: determining the quota weight corresponding to each worker node included in the distributed architecture; and, based on part of the total cloud service quota, allocating to each worker node a work quota that matches its quota weight. In this way, work quotas can be reasonably allocated to each work node, so that nodes with higher configurations receive more work quota, thereby improving the response speed of the cloud service system and the tenant experience.
  • determining the quota weight corresponding to each work node included in the distributed architecture includes: determining the quota weight of each work node based on its configuration information and according to a preset quota weight determination rule; or determining the quota weight of each work node based on its processing capability. In this way, the response speed of the cloud service system can be improved, and the tenant experience can be improved.
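  • One possible reading of "a work quota that matches the quota weight" is a proportional split. The sketch below is a minimal example under that assumption; the function name, the integer weights, and the way leftover units are handed to the highest-weight nodes are all illustrative choices, not taken from the application.

```python
def split_by_weight(partial_quota: int, weights: dict) -> dict:
    """Split partial_quota across nodes in proportion to their quota weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one node must have a positive weight")
    # Floor of each proportional share.
    shares = {node: partial_quota * w // total_weight for node, w in weights.items()}
    # Units lost to rounding go to the highest-weight nodes (assumption).
    leftover = partial_quota - sum(shares.values())
    for node in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        shares[node] += 1
    return shares


if __name__ == "__main__":
    print(split_by_weight(100, {"A": 1, "B": 2, "C": 2}))
```

  • Integer arithmetic keeps the sum of the shares exactly equal to the partial quota, so no unit of the tenant's quota is lost or created by rounding.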
  • the above method further includes:
  • a work quota is allocated to the work node based on the remaining quota, where the remaining quota is the portion of the total cloud service quota left after the already allocated work quotas are removed.
  • allocating a work quota to the working node based on the remaining quota includes: based on the remaining quota, allocating to the working node a work quota that matches the cloud service request response volume the node reaches within a preset period of time. In this way, the system can allocate to each working node a work quota that matches its processing capability, so that nodes with stronger processing capability receive more work quota, thereby improving the response speed of the cloud service system and the tenant experience.
  • the working node responding to the cloud service request initiated by the tenant according to its own work quota includes: after receiving the cloud service request initiated by the tenant, if the working node's own work quota still has a remainder, providing cloud service computation in response to the request and adjusting the remaining work quota according to the quota consumed by that computation. In this way, the worker node can determine whether to respond to the cloud service request by checking its own work quota.
  • the working node responding to the cloud service request initiated by the tenant according to its own work quota further includes: after receiving the cloud service request initiated by the tenant, if the working node's own work quota has no remainder, submitting a quota application request to the cloud service system; and, if the total cloud service quota still has a remainder, receiving the work quota that the cloud service system allocates to the working node from the remaining quota, so as to respond to the cloud service request. In this way, when the total cloud service quota still has a remainder, the worker node can continue to receive work quota allocated by the cloud service system, thereby improving processing efficiency.
  • the above method further includes: after the working node submits a quota application request to the cloud service system, if the total cloud service quota has no remainder, forwarding the cloud service request to another working node whose work quota still has a remainder for processing.
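  • The three cases described above (serve from the local work quota, apply for more quota, forward to another node) can be sketched as follows. This is a simplified single-process model, not the application's implementation: `QuotaSystem`, `WorkerNode`, `refill_size`, and `peers` are hypothetical names, and real nodes would communicate over a network.

```python
from dataclasses import dataclass, field


@dataclass
class QuotaSystem:
    """Hypothetical holder of the tenant's not-yet-allocated quota."""
    remaining: int

    def grant(self, requested: int) -> int:
        # Hand out at most what is left of the total quota.
        granted = min(requested, self.remaining)
        self.remaining -= granted
        return granted


@dataclass
class WorkerNode:
    name: str
    system: QuotaSystem
    work_quota: int = 0
    refill_size: int = 10  # assumed size of each quota application
    peers: list = field(default_factory=list)

    def handle(self, cost: int = 1) -> str:
        # Case 2: local quota exhausted, apply to the system for more.
        if self.work_quota < cost:
            self.work_quota += self.system.grant(self.refill_size)
        # Case 1: enough local quota, respond and deduct the consumption.
        if self.work_quota >= cost:
            self.work_quota -= cost
            return f"served by {self.name}"
        # Case 3: total quota exhausted, forward to a node that still has quota.
        for peer in self.peers:
            if peer.work_quota >= cost:
                peer.work_quota -= cost
                return f"served by {peer.name}"
        return "limited"
```

  • In this model the only shared state touched on the common path is the node's own counter; the system is contacted only when a node runs dry.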
  • the cloud service system can be made to provide cloud services to the tenants within the range of the total amount applied by the tenants as much as possible, thereby improving the experience of the tenants.
  • the above-mentioned method further includes: the working node bills the cloud service request initiated by the tenant.
  • the cloud service includes an artificial intelligence (AI) cloud service; obtaining the total cloud service quota applied for by the tenant from the cloud service system includes: obtaining the total AI cloud service quota applied for by the tenant; and allocating, based on the total cloud service quota, a work quota to the work nodes that are included in the distributed architecture and correspond to the tenant includes: allocating, based on the total AI cloud service quota, a work quota to those work nodes, where the work quota is used to trigger the work nodes to respond to the AI cloud service requests initiated by the tenant according to their corresponding work quotas.
  • the worker node responds to the AI cloud service request according to its own work quota, which can improve the response speed of the AI cloud service system and improve the tenant experience.
  • the embodiment of the present application also proposes a cloud service request response device, wherein the above device includes:
  • an obtaining module configured to obtain the total cloud service quota applied by the tenant to the above-mentioned cloud service system; wherein, the above-mentioned cloud service system includes a system constructed based on a distributed architecture;
  • the allocation module is configured to, based on the above-mentioned total cloud service quota, allocate a work quota to the work nodes corresponding to the above-mentioned tenants included in the above-mentioned distributed architecture, and the work quotas are used to trigger the above-mentioned work nodes to respond to the above-mentioned tenants according to their corresponding work quotas. cloud service requests.
  • the above allocation module is specifically configured as:
  • a work quota is allocated to the worker nodes corresponding to the above-mentioned tenants included in the above-mentioned distributed architecture.
  • the distribution module described above includes:
  • a first determining module configured to determine the cloud service request response volume reached by the working node within a preset duration according to the processing capability corresponding to the working node; wherein the processing capability indicates the cloud service request response volume reached within a unit duration;
  • the allocation sub-module is configured to allocate a work quota to the above-mentioned working nodes according to the above-mentioned cloud service request responses corresponding to the above-mentioned working nodes.
  • the distribution module described above includes:
  • the second determination module is configured to determine the quota weights corresponding to the working nodes included in the distributed architecture
  • the allocation sub-module is configured to allocate a work quota matching the quota weight corresponding to the above-mentioned working node to the above-mentioned working node based on a part of the quota in the above-mentioned total quota of the cloud service.
  • the above-mentioned second determining module is specifically configured as:
  • the quota weight corresponding to each work node is determined.
  • the above allocation module is further configured to:
  • a work quota is allocated to the work node based on the remaining quota, where the remaining quota is the portion of the total cloud service quota left after the already allocated work quotas are removed.
  • the above allocation module is specifically configured as:
  • a work quota matching the cloud service request response volume is allocated to the worker node.
  • the above allocation module is specifically configured as:
  • a work quota is allocated to the work nodes that are included in the distributed architecture and correspond to the tenant, so that after a work node receives a cloud service request initiated by the tenant, if its own work quota still has a remainder, it provides cloud service computation in response to the request and adjusts its remaining work quota according to the quota consumed by that computation.
  • the above allocation module is specifically configured as:
  • a work quota is allocated to the work nodes that are included in the distributed architecture and correspond to the tenant, so that after a work node receives a cloud service request initiated by the tenant, if its own work quota has no remainder, it submits a quota application request to the cloud service system; and, if the total cloud service quota still has a remainder, it receives the work quota that the cloud service system allocates to it from the remaining quota, so as to respond to the cloud service request.
  • the above allocation module is specifically configured as:
  • a work quota is allocated to the working nodes that are included in the distributed architecture and correspond to the tenant, so that after a working node submits a quota application request to the cloud service system, if the total cloud service quota has no remainder, the cloud service request is forwarded to another working node whose work quota still has a remainder for processing.
  • the above-mentioned apparatus further includes:
  • the billing module is configured to charge the cloud service request initiated by the tenant for the above-mentioned working node.
  • the above-mentioned cloud service includes an AI cloud service; the above-mentioned obtaining module is specifically configured as:
  • a work quota is allocated to the work nodes that are included in the distributed architecture and correspond to the tenant, and the work quota is used to trigger the work nodes to respond to the AI cloud service requests initiated by the tenant according to their corresponding work quotas.
  • the embodiment of the present application also proposes an electronic device, and the above-mentioned device includes:
  • a memory for storing the above-mentioned processor-executable instructions
  • the processor is configured to invoke the executable instructions stored in the memory to implement the cloud service request response method shown in any of the foregoing embodiments.
  • An embodiment of the present application further provides a computer-readable storage medium, characterized in that, the storage medium stores a computer program, and the computer program is used to execute the cloud service request response method shown in any of the foregoing embodiments.
  • the cloud service system constructed on a distributed architecture can allocate a work quota to each work node included in the distributed architecture based on the total cloud service quota applied for by the tenant, so that each work node can autonomously respond to cloud service requests initiated by the tenant according to its own work quota. This reduces how often the cloud service system must communicate with the work nodes to read and write the tenant's cloud service request volume, which in turn reduces the system's frequent network I/O operations and the locking operations on reads and writes of public storage, improving the system's response speed for cloud service requests and thereby the tenant experience.
  • FIG. 1 is a flowchart of a method for responding to a cloud service request shown in an embodiment of the application.
  • FIG. 2 is a schematic diagram of interaction between an AI cloud service system and a tenant according to an embodiment of the application.
  • FIG. 3 is a schematic diagram of total cloud service quota allocation shown in an embodiment of the present application.
  • FIG. 4A is a schematic diagram of another allocation of total cloud service quotas shown in an embodiment of the application.
  • FIG. 4B is a schematic diagram of total cloud service quota allocation shown in an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of a cloud service request response apparatus shown in an embodiment of the application.
  • FIG. 6 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the application.
  • in order to determine whether the cloud service request initiated by the tenant is within the total cloud service quota applied for by the tenant, the cloud service system counts the tenant's cloud service request volume.
  • the above tenant may include multiple users. Users can apply for cloud services using the tenant account assigned to them.
  • the cloud service system may use the number of cloud service invocations as a dimension to count the cloud service request volume of the tenant.
  • the cloud service system can use the number of bytes of processing traffic as the dimension to count the cloud service request volume of the tenant.
  • the following description takes the cloud service request initiated by the tenant as the cloud service invocation request as an example.
  • the cloud service system can determine whether the currently counted number of the tenant's cloud service invocation requests has reached the total cloud service quota applied for by the tenant (the total number of cloud service invocation requests); if not, it responds to the request; otherwise, it limits the request.
  • When the cloud service system is a single-node system (that is, a system that provides cloud services through only one node), counting the number of tenant service invocations or the number of bytes of processed traffic is relatively convenient. Therefore, counting the tenant's cloud service requests is not complicated and does not affect the speed at which the cloud service system responds to requests.
  • When the cloud service system is based on a distributed architecture, however, the distributed architecture may make counting the tenant's cloud service requests very complicated, which affects the speed at which the cloud service system responds to requests.
  • the system can allocate a shared space (for example, a shared cache or a shared database) for storing the total cloud service quota applied for by the tenant and a usage quota indicating the number of calls initiated by the tenant.
  • When the cloud service system receives a cloud service invocation request initiated by a tenant, the request may be distributed to any node A under the distributed architecture.
  • When node A receives the request, it reads, through I/O, the total cloud service quota stored in the shared space and the usage quota already used by the tenant (the number of calls initiated by the tenant). After reading both values, node A can determine whether the total cloud service quota is greater than the usage quota. If so, node A responds to the call request and increases the usage quota. Node A can then write the increased usage quota back to the shared space through I/O.
  • Because the cloud service system is constructed on a distributed architecture, a cloud service invocation request or traffic processing request initiated by the tenant may be distributed to any node under the distributed architecture. Therefore, the cloud service system must frequently communicate with each node under the distributed architecture to read and write the tenant's cloud service request volume. Frequent network I/O operations and locking operations on reads and writes of public storage may lower the system's efficiency in responding to cloud service requests and introduce delays, thereby affecting the tenant experience.
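  • To make the cost of this shared-space approach concrete, the sketch below models the read, compare, and write-back cycle that every request performs. It is an in-process stand-in: `SharedQuotaStore` and `handle_on_any_node` are hypothetical names, and a `threading.Lock` takes the place of whatever lock a real shared cache or database would use.

```python
import threading


class SharedQuotaStore:
    """Stand-in for the shared cache or database holding the tenant's counters."""

    def __init__(self, total_quota: int):
        self.total_quota = total_quota
        self.used_quota = 0
        self.lock = threading.Lock()


def handle_on_any_node(store: SharedQuotaStore) -> bool:
    """Return True if the request is answered, False if it is limited."""
    # Every request, on every node, must take the lock on shared storage.
    with store.lock:
        if store.total_quota > store.used_quota:  # read and compare
            store.used_quota += 1                 # write back the increased usage
            return True
        return False
```

  • Each call serializes on the same lock and, in a real deployment, adds network round trips for the read and the write, which is the overhead the paragraphs above describe.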
  • an embodiment of the present application proposes a method for responding to a cloud service request, which is applied to a cloud service system.
  • the above cloud service system includes a system constructed based on a distributed architecture.
  • the method allocates the total cloud service quota applied for by the tenant among the working nodes under the distributed architecture and triggers each working node to determine independently whether to respond to a cloud service request initiated by the tenant. This reduces how often the cloud service system must communicate with the working nodes to read and write the tenant's cloud service request volume, which reduces the system's frequent network I/O operations and the locking operations on reads and writes of public storage, improving the system's response speed for cloud service requests and thereby the tenant experience.
  • FIG. 1 is a method flowchart of a method for responding to a cloud service request according to an embodiment of the present application.
  • the method for responding to the cloud service request shown in the embodiment of the present application may include:
  • the above-mentioned cloud service system includes a system constructed based on a distributed architecture
  • S104: based on the total cloud service quota, assign a work quota to the work nodes included in the distributed architecture and corresponding to the tenants, where the work quotas are used to trigger the work nodes to respond to the cloud service requests initiated by the tenants according to their corresponding work quotas.
  • the above cloud service system (hereinafter referred to as the "system") is specifically a system that provides cloud services to tenants.
  • the foregoing system may include a certain number of hardware devices or software devices to provide cloud services, and the embodiments of the present application do not limit the types of hardware devices and software devices included in the foregoing system.
  • a tenant can apply to the above cloud service system for a certain total cloud service quota.
  • the total number of cloud service calls that can be initiated by the tenant may be used as a dimension to calculate the total cloud service quota.
  • the tenant can initiate a cloud service invocation request to the above cloud service system within the scope of the above-mentioned total cloud service quota, so as to enjoy the services provided by the cloud service system.
  • the above cloud service system includes a system constructed based on a distributed architecture.
  • the above-mentioned distributed architecture may be an architecture including several working nodes.
  • the working node is hereinafter referred to as a "node".
  • the terminal or server may be a notebook computer, a desktop computer, a tablet computer (Portable Android Device, PAD) terminal, etc.; the embodiments of the present application do not limit the device type or model of the terminal or server.
  • the above-mentioned distributed architecture provides computing power through its included working nodes, so that the above-mentioned cloud service system can provide cloud services for tenants.
  • the above cloud service type may be cloud service invocation or traffic storage, etc., and the embodiment of the present application does not limit the cloud service type.
  • the above-mentioned cloud service system may include an AI cloud service system.
  • FIG. 2 is a schematic diagram of interaction between an AI cloud service system and a tenant according to an embodiment of the present application.
  • the above AI cloud service system is a system constructed based on a distributed architecture.
  • the above-mentioned distributed architecture includes working nodes A, B, and C.
  • the cloud service system shown in FIG. 2 is only a schematic illustration, and is not particularly limited.
  • the tenant 201 may apply to the AI cloud service system 202 for a total cloud service quota covering a certain number of calls. Then, the tenant 201 may initiate a service invocation request, such as model training, to the AI cloud service system 202 by calling an interface (for example, a Hypertext Transfer Protocol (HTTP) invocation).
  • After the AI cloud service system 202 receives the invocation request, it can distribute the invocation request task to a target working node A under the distributed architecture according to a pre-stored distribution rule (for example, a load balancing distribution rule), so that node A can respond to the cloud service request initiated by the tenant according to its corresponding work quota and return the response result to the tenant.
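  • The application only says that a pre-stored distribution rule, such as load balancing, selects the target node. As one hypothetical instance of such a rule, a round-robin dispatcher could look like the sketch below; `RoundRobinDispatcher` and `pick` are illustrative names, and round-robin is an assumption rather than the rule the system necessarily uses.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Pick target working nodes in a fixed rotating order."""

    def __init__(self, node_names):
        names = list(node_names)
        if not names:
            raise ValueError("at least one working node is required")
        self._order = cycle(names)

    def pick(self) -> str:
        # Each call returns the next node in the rotation.
        return next(self._order)


if __name__ == "__main__":
    dispatcher = RoundRobinDispatcher(["A", "B", "C"])
    print([dispatcher.pick() for _ in range(4)])
```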
  • the above-mentioned total amount of cloud services includes the total amount of services provided by the cloud service system that tenants can enjoy.
  • the cloud service system may use the number of cloud service invocations as the dimension to count the above-mentioned total service volume of the tenant. If the cloud service type applied by the tenant is stream data processing, the cloud service system can count the above-mentioned total service volume of the tenant by taking the number of bytes of processed traffic as the dimension.
  • the embodiment of the present application does not limit the statistical dimension of the total amount of cloud services.
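  • The two statistical dimensions mentioned above can be expressed as a single cost function. The sketch below is illustrative only: the function name and the string labels "calls" and "bytes" are assumptions chosen for the example.

```python
def consumed_quota(dimension: str, payload: bytes) -> int:
    """Return how much quota one request consumes under the given dimension."""
    if dimension == "calls":
        # Invocation-type services: each request counts as one call.
        return 1
    if dimension == "bytes":
        # Stream data processing: count the bytes of processed traffic.
        return len(payload)
    raise ValueError(f"unsupported dimension: {dimension}")


if __name__ == "__main__":
    print(consumed_quota("calls", b"train-model"))
    print(consumed_quota("bytes", b"train-model"))
```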
  • the following takes the cloud service type as the cloud service invocation request as an example for description.
  • tenants can apply for the above-mentioned total amount through a paid purchase.
  • tenants may apply for the total amount above by applying for a trial.
  • This embodiment of the present application does not limit the manner in which the tenant applies for the total cloud service quota.
  • the above-mentioned work node can respond to the cloud service request initiated by the tenant according to its corresponding work quota.
  • the above-mentioned worker nodes may charge for cloud service requests initiated by tenants.
  • worker nodes can maintain a summary table of quotas corresponding to tenants.
  • the quota summary table can record the remaining quota, the used quota, and other information.
  • the above-mentioned used quota may be increased to complete the billing for the cloud service request initiated by the tenant.
  • the above work quota refers to the amount of cloud service requests that a worker node can respond to.
  • After the working node receives the cloud service request initiated by the tenant, it can decide whether to respond to the request by checking whether it has any work quota remaining.
  • When a worker node responds to a cloud service request, it consumes the corresponding amount of work quota. For example, when the cloud service request volume is counted by the number of calls, each time a worker node responds to a call request initiated by the tenant, it consumes 1 unit of work quota.
  • the work quota may include two kinds of quota.
  • the above-mentioned work quota may be the work quota initially allocated by the above-mentioned system to each of the above-mentioned nodes after the tenant applies for the total quota of cloud services, so that each node can operate.
  • the above-mentioned work quota may be the work quota applied to the above-mentioned system when the allocated work quota is exhausted during the operation of each node, so that each node can replenish the work quota and continue to operate.
  • the above cloud service request includes a cloud service request initiated by the tenant to the above system.
  • the above cloud service requests may include cloud service invocation requests and/or stream data processing requests.
  • the type of cloud service request initiated by the tenant is related to the type of cloud service applied for by the tenant.
  • the tenant can initiate a cloud service invocation request.
  • the tenant can initiate both a cloud service invocation request and a traffic processing request.
  • when its work quota still has a remainder, the working node can provide cloud service computation in response to the cloud service request and adjust its remaining work quota according to the quota consumed by that computation.
  • the above-mentioned working node A can determine whether its own work quota remains. If its own work quota remains, the above-mentioned node A can respond to the cloud service request and consume 1 unit of work quota. If the above-mentioned work quota is not left, the above-mentioned node A can limit the cloud service request.
  • a worker node may store the work quota assigned by the system, as well as the amount of cloud service requests that the node has responded to. In that case, to determine whether any work quota remains, the node subtracts the amount of cloud service requests already responded to from the work quota. If the result is greater than 0, the work quota has a remainder; otherwise, it does not. In some embodiments, worker nodes may instead store the remaining quota.
  • the initial value of the remaining quota is the work quota allocated by the system, and each time the worker node responds to a cloud service request, the remaining quota value is adjusted. In that case, to determine whether any work quota remains, the node checks whether the remaining quota is greater than 0; if so, the work quota has a remainder; otherwise, it does not.
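  • The two bookkeeping options described above lead to the same decision. A minimal sketch, with hypothetical class names and a fixed cost of 1 unit per response, might look like this:

```python
from dataclasses import dataclass


@dataclass
class AllocatedMinusResponded:
    """Option 1: store the allocated quota and the responded request volume."""
    allocated: int
    responded: int = 0

    def has_remaining(self) -> bool:
        return self.allocated - self.responded > 0

    def consume(self) -> None:
        self.responded += 1


@dataclass
class RemainingCounter:
    """Option 2: store only the remaining quota, starting at the allocated value."""
    remaining: int

    def has_remaining(self) -> bool:
        return self.remaining > 0

    def consume(self) -> None:
        self.remaining -= 1
```

  • Option 1 keeps the responded volume available for reporting, while option 2 needs a single counter per node; which one fits better depends on what else the node has to record.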
  • the above-mentioned working node may submit a quota application request to the above-mentioned system and, when the above-mentioned total cloud service quota still remains, receive the work quota allocated by the cloud service system to the working node based on the remaining quota, in order to respond to the above-mentioned cloud service request.
  • the above-mentioned working node A can determine whether any of its own work quota remains. If no work quota remains, node A may first submit a quota application request to the above-mentioned system. After receiving the quota application request, the system can determine whether the total cloud service quota corresponding to the tenant still remains and, if so, allocate a further work quota to node A. After node A receives the work quota, it continues to respond to the cloud service request.
  • after the above-mentioned work node submits a quota application request to the above-mentioned cloud service system, if none of the above-mentioned total cloud service quota remains, the above-mentioned cloud service request is forwarded to other work nodes with remaining work quotas for processing.
  • the working status of each working node is stored in the above system (the working status refers to whether the node can respond to the request, that is, whether there is still a working quota).
  • the cloud service system can be made to provide cloud services to the tenants within the range of the total amount applied by the tenants as much as possible, thereby improving the experience of the tenants.
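  • The request-handling flow described above (respond locally, apply for more quota when exhausted, forward when the total quota is used up) can be sketched as follows. The names `QuotaSystem`, `Node`, and `handle`, as well as the fixed `grant_size`, are hypothetical and chosen only for illustration.

```python
class QuotaSystem:
    """Hypothetical central system holding the tenant's unallocated quota."""

    def __init__(self, total_quota: int, grant_size: int):
        self.remaining = total_quota  # total quota minus quota already allocated
        self.grant_size = grant_size  # assumed fixed size of each new work quota

    def apply(self) -> int:
        """Handle a quota application; return 0 if nothing remains."""
        grant = min(self.grant_size, self.remaining)
        self.remaining -= grant
        return grant


class Node:
    def __init__(self, name: str, quota: int):
        self.name = name
        self.quota = quota  # remaining work quota of this node


def handle(node: Node, peers: list, system: QuotaSystem) -> str:
    """Return the name of the node that served the request, or 'limited'."""
    if node.quota == 0:
        node.quota += system.apply()  # quota application request
    if node.quota > 0:
        node.quota -= 1  # respond locally and consume 1 unit
        return node.name
    for peer in peers:  # total quota exhausted: forward to a node with quota
        if peer.quota > 0:
            peer.quota -= 1
            return peer.name
    return "limited"


system = QuotaSystem(total_quota=1, grant_size=1)
a, b = Node("A", 0), Node("B", 1)
served = [handle(a, [b], system) for _ in range(3)]
```

  • In this run node A first refills from the system, then forwards to node B once the total quota is gone, and the third request is limited because no node has quota left.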
  • the cloud service system built on the distributed architecture can allocate a work quota to each work node included in the distributed architecture based on the total cloud service quota applied for by the tenant, and each work node independently responds to the cloud service requests initiated by the tenant according to its own work quota. This reduces how often the cloud service system must communicate with the work nodes to read and write the tenant's cloud service request volume, which in turn reduces the system's frequent network I/O operations and the locking operations for reading and writing shared storage, improves the system's response speed for cloud service requests, and thereby improves the tenant experience.
  • when the above-mentioned system performs the above-mentioned S104 and allocates a work quota, based on the above-mentioned total cloud service quota, to the work nodes included in the distributed architecture and corresponding to the tenant, it may allocate the work quota based on only part of the total cloud service quota.
  • the above-mentioned total cloud service quota is distributed to each work node according to the quota weight corresponding to each work node.
  • the above-mentioned system may first obtain the quota weights corresponding to the working nodes included in the above-mentioned distributed architecture. After determining the quota weights corresponding to the working nodes, the above-mentioned system may allocate a working quota matching the quota weights corresponding to the above-mentioned working nodes to the above-mentioned working nodes based on part of the quotas in the above-mentioned total cloud service quotas.
  • the quota weight corresponding to each of the above working nodes may specifically be a preset fixed value.
  • the quota weight corresponding to each worker node can be set to the same value.
  • the total quota can be equally distributed to each worker node.
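  • Weight-based distribution can be sketched as a proportional split. This is a minimal sketch, assuming integer weights and integer quota units; the function name `allocate_by_weight` is hypothetical, and any rounding remainder is assumed to stay in the unallocated pool.

```python
def allocate_by_weight(partial_quota: int, weights: dict) -> dict:
    """Split partial_quota across nodes in proportion to their quota weights.

    Integer division is used, so a small remainder may stay unallocated.
    """
    total_weight = sum(weights.values())
    return {
        node: partial_quota * weight // total_weight
        for node, weight in weights.items()
    }


# Equal weights give an even split; unequal weights skew the split.
equal = allocate_by_weight(300, {"A": 1, "B": 1, "C": 1})
skewed = allocate_by_weight(300, {"A": 3, "B": 2, "C": 1})
```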
  • the quota weight corresponding to each work node is determined based on the configuration information of each work node and according to a preset quota weight determination rule.
  • a configuration information table corresponding to each working node can be maintained. The table may record, for example, the CPU and GPU processing performance and the hard disk model of the worker node.
  • the configuration information table corresponding to each working node may be queried to determine the configuration information of each working node.
  • the system may determine the quota weight corresponding to each working node according to a preset quota weight determination rule.
  • the above-mentioned quota weight determination rule may be to score various configuration information of each working node first. Then the weighted summation of each score is carried out to obtain the total score corresponding to each work node. Finally, the weight of each work node is determined according to the total score corresponding to each work node.
  • the system can determine the quota weight corresponding to each work node based on the configuration information of each work node and a preset quota weight determination rule. Therefore, it is possible to reasonably allocate work quotas to each work node, so that nodes with high configuration can be allocated more work quotas, thereby improving the response speed of the cloud service system and improving the tenant experience.
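  • One possible reading of the quota weight determination rule above (score each configuration item, take a weighted sum, derive the weight from the total score) is sketched below. The scores, the coefficients, and the final normalization step are assumptions made for illustration; the text does not fix any of them.

```python
def quota_weights(scores: dict, coefficients: dict) -> dict:
    """Derive per-node quota weights from per-item configuration scores."""
    # Weighted sum of the item scores gives each node's total score.
    totals = {
        node: sum(coefficients[item] * score for item, score in items.items())
        for node, items in scores.items()
    }
    # Assumption: a node's weight is its share of the sum of all total scores.
    grand_total = sum(totals.values())
    return {node: total / grand_total for node, total in totals.items()}


coefficients = {"cpu": 0.5, "gpu": 0.5}  # hypothetical item coefficients
scores = {
    "A": {"cpu": 8, "gpu": 4},  # hypothetical scores for a high-spec node
    "B": {"cpu": 2, "gpu": 2},  # hypothetical scores for a low-spec node
}
weights = quota_weights(scores, coefficients)
```

  • Here node A scores 6 and node B scores 2, so A receives three quarters of the weight.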
  • the system may determine the quota weight corresponding to each work node based on the processing capability corresponding to each work node, where the processing capability indicates the amount of cloud service request responses that can be achieved within a unit time.
  • the above-mentioned system can determine the cloud service request response amount (processing capacity) that each working node can achieve within a unit time by means of testing. After determining the processing capability corresponding to each working node, the above-mentioned system may determine the quota weight of each working node according to the processing capability corresponding to each working node.
  • the above-mentioned system may, according to the processing capability corresponding to the above-mentioned work node, determine the amount of cloud service request responses that the work node can reach within the preset time period, where the processing capability indicates the amount of cloud service request responses that can be achieved within a unit time.
  • the system may allocate a work quota to the working node according to the cloud service request response amount corresponding to the working node.
  • the above-mentioned system may determine the value of the above-mentioned partial quota for participating in the initial allocation according to the sum of the cloud service request response amounts that the working nodes can reach within a preset time period.
  • the system may take the cloud service request response amount corresponding to each work node as the work quota of that work node, and allocate it to that work node.
  • the above-mentioned preset duration may specifically be a value set according to experience. For example, 1 minute.
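  • The capability-based initial allocation can be illustrated with a short calculation: each node's work quota is its processing capability multiplied by the preset duration, and the partial quota taking part in the initial allocation is the sum over all nodes. The function name `initial_allocation` and the per-second rates below are hypothetical.

```python
def initial_allocation(rates_per_second: dict, preset_seconds: int) -> tuple:
    """Return (per-node work quotas, partial quota used for initial allocation)."""
    # Response amount reachable within the preset duration, per node.
    quotas = {
        node: rate * preset_seconds for node, rate in rates_per_second.items()
    }
    return quotas, sum(quotas.values())


# Hypothetical capabilities measured by testing, with a 1-minute preset duration.
quotas, partial_quota = initial_allocation({"A": 5, "B": 3, "C": 2}, 60)
```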
  • FIG. 3 is a schematic diagram of total quota allocation of cloud services according to an embodiment of the present application.
  • the amount of cloud service request responses that worker node A included in the distributed architecture can reach within 1 minute is represented by a dark gray square.
  • the amount of cloud service request responses that worker node B can reach within 1 minute is represented by a light gray square.
  • the amount of cloud service request responses that worker node C can reach within 1 minute is represented by a black square.
  • the work node A can be allocated the work quota indicated by the dark gray square, the work node B can be allocated the work quota indicated by the light gray square, and the work node C can be allocated the work quota indicated by the black square.
  • the above-mentioned system can determine, according to the processing capability corresponding to each of the above-mentioned work nodes, the cloud service request response amount that each work node can reach within a preset duration, where the processing capability indicates the amount of cloud service request responses that can be achieved within a unit time.
  • the system may determine the cloud service request responses corresponding to the working nodes as the work quotas corresponding to the working nodes, and assign them to the working nodes. Therefore, it is possible to determine a reasonable part of the quota for participating in the initial allocation, and reasonably allocate the initial work quota to each node, thereby further improving the work efficiency of the cloud service.
  • a worker node that processes requests quickly can receive work quota allocations multiple times, thereby improving the response speed of the cloud service system and improving the tenant experience.
  • the system may allocate work quotas to each work node included in the above-mentioned distributed architecture based on a part of the quota in the above-mentioned total cloud service quota.
  • when the system receives a quota application request from any of the above-mentioned work nodes, it allocates a work quota to that work node based on the remaining quota.
  • the above-mentioned remaining quota includes the remaining quota after deducting the allocated work quota from the above-mentioned total cloud service quota.
  • the value of the initially allocated partial quota and the allocation rule may be determined first. For example, you can specify an initial distribution of one-third of the total quota, as well as an even distribution rule. At this time, the above-mentioned system can evenly distribute one-third of the total quota to each worker node.
  • the above system can check whether there is any remaining quota, and if so, it can allocate the work quota to the above work node.
  • the above-mentioned system may allocate to the above-mentioned work node a work quota matching the cloud service request response amount that the work node can reach within a preset period of time.
  • the above-mentioned preset duration may specifically be a value set according to experience. For example, 1 minute.
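  • Allocation on request from the remaining quota can be sketched as follows. The class name `QuotaPool` and the method `grant` are hypothetical. Capping a grant at whatever is left in the pool is an assumption of this sketch; the text only states that a work quota is allocated while remaining quota exists.

```python
class QuotaPool:
    """Hypothetical pool tracking the quota not yet handed to any node."""

    def __init__(self, total_quota: int, initially_allocated: int):
        # Remaining quota = total quota minus the work quota already allocated.
        self.remaining = total_quota - initially_allocated

    def grant(self, rate_per_second: int, preset_seconds: int = 60) -> int:
        """Grant a work quota matching the node's reachable response amount."""
        wanted = rate_per_second * preset_seconds
        granted = min(wanted, self.remaining)  # assumption: cap at what is left
        self.remaining -= granted
        return granted


pool = QuotaPool(total_quota=1000, initially_allocated=600)
first = pool.grant(5)   # node wants 300, pool holds 400
second = pool.grant(5)  # node wants 300, pool holds only 100
third = pool.grant(5)   # pool is empty
```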
  • FIG. 4A is a schematic diagram of another allocation of the total cloud service quota according to an embodiment of the present application.
  • the slashed box indicates that the corresponding work quota of the node has been consumed. That is, after the quota on node A is used up, a further portion of the remaining total quota continues to be allocated to node A until the whole total quota has been allocated. When the quota of all nodes is exhausted, the total quota purchased by the tenant has been consumed.
  • FIG. 4B is a schematic diagram of total quota allocation of cloud services according to an embodiment of the present application.
  • the slashed box represents that the corresponding work quota of the node has been consumed.
  • the above-mentioned system can set aside, from the remaining quota, the cloud service request response amount that node A can reach within 1 minute. The system can then allocate a work quota (dark gray box in FIG. 4B) corresponding to that request response amount to node A.
  • when the quota on a node is used up and no remaining total quota is left to allocate, the node writes its current state to the shared storage service and no longer processes user requests itself; instead, it forwards the service requests it receives to other nodes that still have quota.
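  • The status bookkeeping used for forwarding can be sketched with a plain dictionary standing in for the shared storage service. The state labels and the functions `mark_exhausted` and `pick_forward_target` are hypothetical; a real deployment would use an actual shared store rather than an in-process dictionary.

```python
# Stand-in for the shared storage service: node name -> working status.
shared_state = {"A": "available", "B": "available"}


def mark_exhausted(node: str) -> None:
    """Record that a node has no work quota left and cannot respond."""
    shared_state[node] = "exhausted"


def pick_forward_target(exclude: str):
    """Return another node that can still respond, or None if there is none."""
    for name, status in shared_state.items():
        if name != exclude and status == "available":
            return name
    return None


mark_exhausted("A")
target_while_b_has_quota = pick_forward_target("A")
mark_exhausted("B")
target_when_all_exhausted = pick_forward_target("A")
```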
  • the above-mentioned system can allocate to the above-mentioned working node a work quota matching the cloud service request response amount that the working node can reach within a preset time period. The system can therefore allocate work quotas that match the processing capabilities of the nodes, so that nodes with strong processing capabilities are allocated more work quota, thereby improving the response speed of the cloud service system and improving the tenant experience.
  • an embodiment of the present application further provides a cloud service request response apparatus.
  • FIG. 5 is a schematic structural diagram of a cloud service request response apparatus according to an embodiment of the present application.
  • the above-mentioned apparatus 50 may include:
  • the obtaining module 51 is configured to obtain the total cloud service quota applied by the tenant to the above-mentioned cloud service system; wherein, the above-mentioned cloud service system includes a system constructed based on a distributed architecture;
  • the allocation module 52 is configured to allocate a work quota to the work nodes included in the distributed architecture and corresponding to the tenant based on the above-mentioned total cloud service quota, where the work quotas are used to trigger the above-mentioned work nodes to respond, according to their corresponding work quotas, to cloud service requests initiated by the tenant.
  • the above allocation module 52 is specifically configured as:
  • a work quota is allocated to the worker nodes corresponding to the above-mentioned tenants included in the above-mentioned distributed architecture.
  • the distribution module 52 described above includes:
  • a first determining module configured to determine the cloud service request response volume reached by the working node within a preset duration according to the processing capability corresponding to the working node; wherein the processing capability indicates the cloud service request response volume reached within a unit duration;
  • the allocation sub-module is configured to allocate a work quota to the above-mentioned working nodes according to the above-mentioned cloud service request responses corresponding to the above-mentioned working nodes.
  • the distribution module 52 described above includes:
  • the second determination module is configured to determine the quota weights corresponding to the working nodes included in the distributed architecture
  • the allocation sub-module is configured to allocate a work quota matching the quota weight corresponding to the above-mentioned working node to the above-mentioned working node based on a part of the quota in the above-mentioned total quota of the cloud service.
  • the above-mentioned second determining module is specifically configured as:
  • the quota weight corresponding to each work node is determined.
  • the above allocation module 52 is further configured to:
  • the work quota is allocated to the above-mentioned work node; wherein the above-mentioned remaining quota includes the quota left after deducting the already allocated work quota from the above-mentioned total cloud service quota.
  • the above allocation module 52 is specifically configured as:
  • a work quota matching the cloud service request response volume is allocated to the worker node.
  • the above allocation module 52 is specifically configured as:
  • a work quota is allocated to the work nodes included in the distributed architecture and corresponding to the tenant, so that a work node that receives a cloud service request initiated by the tenant while it still has work quota remaining provides cloud service computation in response to the request and adjusts its remaining work quota according to the quota consumed by that computation.
  • the above allocation module 52 is specifically configured as:
  • a work quota is allocated to the work nodes included in the distributed architecture and corresponding to the tenant, so that a work node that receives a cloud service request initiated by the tenant while it has no work quota remaining submits a quota application request to the cloud service system and, in the case that the total cloud service quota still remains, receives the work quota allocated by the cloud service system to the working node based on the remaining quota in order to respond to the cloud service request.
  • the above allocation module 52 is specifically configured as:
  • a work quota is allocated to the work nodes included in the above-mentioned distributed architecture and corresponding to the tenant, so that if none of the total cloud service quota remains after a work node submits a quota application request to the above-mentioned cloud service system, the cloud service request is forwarded to other working nodes with remaining work quotas for processing.
  • the above-mentioned apparatus 50 further includes:
  • the billing module is configured to bill for the cloud service requests initiated by the tenant that the above-mentioned working node responds to.
  • the above-mentioned cloud service includes an AI cloud service; the above-mentioned obtaining module 51 is specifically configured as:
  • the above allocation module 52 is specifically configured as:
  • a work quota is allocated to the work nodes included in the distributed architecture and corresponding to the tenant, and the work quotas are used to trigger the above-mentioned work nodes to respond, according to their corresponding work quotas, to AI cloud service requests initiated by the tenant.
  • the embodiments of the cloud service request response apparatus shown in the embodiments of this application may be configured on an electronic device.
  • the embodiment of the present application discloses an electronic device, and the device may include: a processor.
  • a memory configured to store processor executable instructions.
  • the processor is configured to invoke the executable instructions stored in the memory to implement the cloud service request response method shown in any of the foregoing embodiments.
  • FIG. 6 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
  • the electronic device may include a processor for executing instructions, a network interface for network connection, a memory for storing operating data for the processor, and a non-volatile memory for storing the instructions corresponding to the cloud service request response apparatus.
  • the embodiment of the cloud service request response apparatus may be implemented by software, or may be implemented by hardware or a combination of software and hardware.
  • a device in a logical sense is formed by reading the corresponding computer program instructions in the non-volatile memory into the memory for operation by the processor of the electronic device where the device is located.
  • the electronic device in which the apparatus in the embodiment is located may also include other hardware according to the actual functions of the electronic device, which is not described in further detail here.
  • the corresponding instructions of the cloud service request response apparatus may also be directly stored in the memory, which is not limited herein.
  • An embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the storage medium, and the computer program is used to execute the cloud service request response method shown in any of the foregoing embodiments.
  • one or more of the embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, one or more of the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more of the embodiments of the present application may be implemented on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein in the form of a computer program product.
  • “And/or” in the embodiments of the present application means at least one of the two.
  • “A and/or B” may include three schemes: A, B, and "A and B”.
  • Embodiments of the subject matter and functional operations described in the embodiments of this application can be implemented in digital electronic circuits, in tangibly embodied computer software or firmware, in computer hardware that can include the structures disclosed in the embodiments of this application and their structural equivalents, or in a combination of one or more of them.
  • Embodiments of the subject matter described in the embodiments of this application may be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus.
  • the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical or electromagnetic signal, which is generated to encode and transmit information to a suitable receiver device for execution by the data processing apparatus.
  • the computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of these.
  • the processes and logic flows described in the embodiments of the present application can be executed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating according to input data and generating output.
  • the processes and logic flows described above can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).
  • a computer suitable for the execution of a computer program may include, for example, a general and/or special purpose microprocessor, or any other type of central processing unit.
  • the central processing unit will receive instructions and data from read only memory and/or random access memory.
  • the basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or be operably coupled to such mass storage devices to receive data from them, send data to them, or both.
  • the computer does not have to have such a device.
  • the computer may be embedded in another device, such as a mobile phone, personal digital assistant (PDA), mobile audio or video player, game console, global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
  • Computer readable media suitable for storage of computer program instructions and data may include all forms of non-volatile memory, media, and memory devices, and may include, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • the processor and memory may be supplemented by or incorporated in special purpose logic circuitry.
  • the embodiments of the present application provide a cloud service request response method and device, an electronic device, and a storage medium.
  • the method is executed by the cloud service system.
  • the method may include acquiring the total cloud service quota applied by the tenant to the cloud service system.
  • the above cloud service system includes a system constructed based on a distributed architecture. Based on the total cloud service quota, a work quota is allocated to the work nodes corresponding to the tenants included in the distributed architecture, so that the work nodes respond to cloud service requests initiated by the tenants according to their corresponding work quotas.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer And Data Communications (AREA)

Abstract

A cloud service request response method and apparatus, an electronic device, and a storage medium. The method is applied to a cloud service system. The method may include: acquiring the total cloud service quota that a tenant applies for from the cloud service system, the cloud service system including a system built on a distributed architecture (S102); and, based on the total cloud service quota, allocating a work quota to a work node that is included in the distributed architecture and corresponds to the tenant, so that the work node responds, according to its corresponding work quota, to a cloud service request initiated by the tenant (S104). With this method, the volume of tenant cloud service request data that the cloud service system frequently reads and writes by communicating with each work node is reduced, so that the cloud service system performs network input/output operations and locking operations for reading and writing shared storage less frequently, which increases the system's response speed for cloud service requests and improves the tenant experience.
PCT/CN2021/102872 2020-11-24 2021-06-28 Cloud service request response method and apparatus, electronic device, and storage medium WO2022110796A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011331362.0A CN112395091A (zh) 2020-11-24 2020-11-24 云服务请求响应方法及装置、电子设备和存储介质
CN202011331362.0 2020-11-24

Publications (1)

Publication Number Publication Date
WO2022110796A1 true WO2022110796A1 (fr) 2022-06-02

Family

ID=74607060

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/102872 WO2022110796A1 (fr) 2020-11-24 2021-06-28 Cloud service request response method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN112395091A (fr)
WO (1) WO2022110796A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112395091A (zh) * 2020-11-24 2021-02-23 上海商汤智能科技有限公司 云服务请求响应方法及装置、电子设备和存储介质
CN114157614A (zh) * 2021-11-30 2022-03-08 上海派拉软件股份有限公司 一种资源管理方法、装置、设备及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170339008A1 (en) * 2016-05-17 2017-11-23 Microsoft Technology Licensing, Llc Distributed operational control in computing systems
CN107424001A (zh) * 2017-04-17 2017-12-01 中国工商银行股份有限公司 产品销售额度的控制方法及系统
CN109428735A (zh) * 2017-08-31 2019-03-05 中国电信股份有限公司 计费方法和计费系统
CN108446975A (zh) * 2018-03-28 2018-08-24 上海数据交易中心有限公司 一种额度管理方法及装置
CN111651339A (zh) * 2020-06-04 2020-09-11 腾讯科技(深圳)有限公司 一种请求数量的控制方法和相关装置
CN112395091A (zh) * 2020-11-24 2021-02-23 上海商汤智能科技有限公司 云服务请求响应方法及装置、电子设备和存储介质

Also Published As

Publication number Publication date
CN112395091A (zh) 2021-02-23

Similar Documents

Publication Publication Date Title
JP7127010B2 (ja) リソースの割り当て方法、装置、電子設備、コンピュータ可読媒体およびコンピュータプログラム
US9471393B2 (en) Burst-mode admission control using token buckets
KR101948502B1 (ko) 버스트 모드 제어
WO2022110796A1 (fr) Cloud service request response method and apparatus, electronic device, and storage medium
US9218221B2 (en) Token sharing mechanisms for burst-mode operations
CN107688492B (zh) 资源的控制方法、装置和集群资源管理系统
US20140379922A1 (en) Equitable distribution of excess shared-resource throughput capacity
WO2017166643A1 (fr) Procédé et dispositif pour quantifier des ressources de tâche
US11979336B1 (en) Quota-based resource scheduling
WO2019105379A1 (fr) Procédé et appareil de gestion de ressources, dispositif électronique et support de stockage
CN111506434B (zh) 一种任务处理方法、装置及计算机可读存储介质
WO2023174037A1 (fr) Procédé, appareil et système de planification de ressources, dispositif, support et produit-programme
WO2013123650A1 (fr) Procédé d'affectation de machine virtuelle et dispositif d'affectation de machine virtuelle
CN108847981A (zh) 分布式计算机云计算处理方法
Addya et al. A game theoretic approach to estimate fair cost of VM placement in cloud data center
Manikandan et al. Virtualized load balancer for hybrid cloud using genetic algorithm
CN108241535B (zh) 资源管理的方法、装置及服务器设备
US9769022B2 (en) Timeout value adaptation
CN109426561A (zh) 一种任务处理方法、装置及设备
CN110096352A (zh) 进程管理方法、装置及计算机可读存储介质
Kumar et al. Evaluation of load balancing algorithm using cloudsim
CN111427682A (zh) 任务分配方法、系统、装置及设备
CN110875934B (zh) 一种基于多租户服务的业务分组方法和装置
CN110147278A (zh) 数据处理方法及装置
CN116074541B (zh) 一种资源处理方法、系统、装置及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21896304

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21896304

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.10.2023)