CN116962396A - Computing resource allocation method and device, electronic equipment and storage medium - Google Patents

Computing resource allocation method and device, electronic equipment and storage medium

Info

Publication number
CN116962396A
Authority
CN
China
Prior art keywords
server
evaluation value
performance evaluation
target
servers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210796473.1A
Other languages
Chinese (zh)
Inventor
蔺艳斐
王自亮
闫冰
彭伟
袁惺
宋磊
冯延钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Shandong Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Shandong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Shandong Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202210796473.1A priority Critical patent/CN116962396A/en
Publication of CN116962396A publication Critical patent/CN116962396A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004Server selection for load balancing
    • H04L67/1008Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Multi Processors (AREA)

Abstract

The embodiment of the application provides a computing resource allocation method, a computing resource allocation device, electronic equipment and a storage medium, which can evaluate the performance of each server in an edge node more accurately, so that the server allocated to the current computing task is more suitable, and the user experience is improved. The method for distributing the computing resources comprises the following steps: determining a first performance evaluation value of each server in the M servers, wherein the first performance evaluation value is inversely related to a working state parameter representing the used computing resource quantity in the corresponding server; determining a second performance evaluation value of each server based on the ratio of the number of load connections of each server to the corresponding first performance evaluation value; and in response to the target computing task allocated to the target edge node, allocating computing resources of the target server, of which the second performance evaluation value meets the preset condition, to the target computing task.

Description

Computing resource allocation method and device, electronic equipment and storage medium
[ field of technology ]
The embodiment of the application relates to the technical field of communication, in particular to a computing resource allocation method, a computing resource allocation device, electronic equipment and a storage medium.
[ background Art ]
With the rapid development of the internet, and in particular the growth of video, gaming and download traffic, users' expectations for network experience keep rising. In the prior art, when a user sends an internet access request, the load balancer evaluates the performance of each server in the current edge node based on the servers' internal parameters and allocates a corresponding server to the request. However, evaluating server performance based on internal parameters alone may be inaccurate, so the load balancer may select a server with poor performance, resulting in a poor user experience.
[ application ]
The embodiment of the application provides a computing resource allocation method, a computing resource allocation device, electronic equipment and a storage medium, which can evaluate the performance of each server in an edge node more accurately, so that the server allocated to the current computing task is more suitable, and the user experience is improved.
In a first aspect, the present application provides a method for allocating computing resources, the method comprising:
determining a first performance evaluation value of each server in M servers, wherein the first performance evaluation value is inversely related to a working state parameter representing the used computing resource quantity in the corresponding server, the M servers are all positioned at a target edge node, and M is a positive integer not less than 2;
determining a second performance evaluation value of each server based on the ratio of the number of load connections of each server to the corresponding first performance evaluation value;
and responding to the target computing task distributed to the target edge node, and distributing the computing resource of the target server of which the second performance evaluation value meets the preset condition to the target computing task.
In the embodiment of the present application, the first performance evaluation value may be considered to be determined based on the amount of computing resources that the server has used, i.e., the first performance evaluation value may be considered to be related to the internal parameters of the server; the load connection quantity of the server can be regarded as the external parameter of the server, so that the external parameter of the server and the internal parameter of the server are utilized together to evaluate the performance of the server more accurately, the server distributed to the current computing task is more suitable, and the user experience is improved.
Optionally, determining the first performance evaluation value of each server in the M servers includes:
acquiring working state parameters of each server in the M servers, wherein the working state parameters comprise CPU utilization rate, memory utilization rate, disk utilization rate and network bandwidth utilization rate;
determining the unused computing resource amount of each server based on the working state parameters of each server;
determining a first performance evaluation value of each server based on the unused calculation resource quantity of each server and a corresponding preset weight value, wherein the calculation formula of the first performance evaluation value is as follows:
P_i = w_1 × (1 − C_i) + w_2 × (1 − R_i) + w_3 × (1 − D_i) + w_4 × (1 − B_i)
wherein: C_i represents the CPU utilization of the ith server, R_i represents the memory utilization of the ith server, D_i represents the disk utilization of the ith server, B_i represents the network bandwidth utilization of the ith server, w_1, w_2, w_3 and w_4 represent the weights occupied respectively by the CPU utilization, memory utilization, disk utilization and network bandwidth utilization indexes of the ith server, with w_1 = w_2 > w_3 = w_4, and P_i represents the first performance evaluation value of the ith server, i being a positive integer not exceeding M.
In the embodiment of the application, the working state parameters of each server in the M servers comprise CPU utilization rate, memory utilization rate, disk utilization rate and network bandwidth utilization rate, so that the unused computing resource amount of each server can be determined, and the first performance evaluation value of each server can be calculated from the unused computing resource amount of each server and the corresponding preset weight values.
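For illustration only, a minimal Python sketch of this first performance evaluation value is given below, assuming the weighted sum of unused resource fractions described above; the function name, parameter names and default weight values are illustrative assumptions rather than part of the original text.

```python
# Minimal sketch: weighted sum of unused resource fractions, with utilizations
# given as fractions in [0, 1]. Larger P means more idle capacity, i.e. stronger
# current performance, so P is inversely related to the working state parameters.
def first_performance_value(cpu, mem, disk, net, weights=(0.3, 0.3, 0.2, 0.2)):
    unused = (1 - cpu, 1 - mem, 1 - disk, 1 - net)
    return sum(w * u for w, u in zip(weights, unused))
```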
Optionally, determining the second performance evaluation value of each server based on the ratio of the number of load connections of each server to the corresponding first performance evaluation value includes:
calculating the target ratio of the load connection quantity of each server to the maximum load connection quantity in the M servers;
dividing the target ratio of each server by the corresponding first performance evaluation value to determine a second performance evaluation value of each server, wherein a calculation formula of the second performance evaluation value of each server is as follows:
F_i = (L_i / max(L_1, L_2, …, L_M)) / P_i
wherein: L_i represents the number of load connections of the ith server, max(L_1, L_2, …, L_M) represents the maximum number of load connections among the M servers, L_i / max(L_1, L_2, …, L_M) represents the ratio of the number of load connections of the ith server to the maximum number of load connections among the M servers, F_i represents the second performance evaluation value of the ith server, P_i represents the first performance evaluation value of the ith server, and i is a positive integer not exceeding M.
In the embodiment of the application, the connection proportion of each server is obtained by calculating the ratio of its number of load connections to the maximum number of load connections among the M servers; this connection proportion represents the relative size of the server's connection load at that moment. The second performance evaluation value of each server is then obtained from the connection proportion and the first performance evaluation value of the server, yielding an evaluation value that comprehensively measures the performance of the server at that moment.
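A minimal sketch of this second performance evaluation value is given below, assuming the connection proportion is divided by the first performance evaluation value as described above; the function name is an illustrative assumption, and the inputs are assumed to be positive.

```python
# Minimal sketch: second performance evaluation value as the connection
# proportion divided by the first performance evaluation value.
# Lower F indicates stronger comprehensive performance.
def second_performance_values(load_connections, first_values):
    max_load = max(load_connections)  # maximum number of load connections among the M servers
    return [(l / max_load) / p for l, p in zip(load_connections, first_values)]
```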
Optionally, in response to the target computing task allocated to the target edge node, allocating the computing resource of the target server for which the second performance evaluation value meets the preset condition to the target computing task includes:
and in response to the target computing task allocated to the target edge node, allocating computing resources of any one of N target servers with the second performance evaluation value lower than a first set threshold to the target computing task, wherein N is a positive integer not less than 2 and not more than M.
In the embodiment of the present application, for any server, the smaller the second performance evaluation value, the stronger the performance of that server. When selecting a server, the servers are divided by performance level using the first set threshold, and any server is selected from the plurality of servers whose second performance evaluation value is lower than the first set threshold, i.e. the servers with better performance, to take charge of the current computing task; this avoids the poor user experience caused by selecting the same server every time.
Optionally, in response to the target computing task allocated to the target edge node, allocating the computing resource of any one of the N target servers whose second performance evaluation value is lower than the first set threshold to the target computing task includes:
dividing the M servers into a first server queue, a second server queue and a third server queue based on the first set threshold, the second set threshold and second performance evaluation values of the servers, wherein the second performance evaluation value of any server in the first server queue is not greater than the first set threshold, the second performance evaluation value of any server in the second server queue is greater than the first set threshold and less than the second set threshold, the second performance evaluation value of any server in the third server queue is not less than the second set threshold, and the second set threshold is greater than the first set threshold;
in response to a target computing task allocated to the target edge node, computing resources of any target server of the first server queue are allocated to the target computing task.
In the embodiment of the application, the servers are divided into the first server queue, the second server queue and the third server queue based on the first set threshold and the second set threshold. When the load balancer selects a server, it preferentially selects one at random from the first server queue, whose servers have smaller second performance evaluation values, rather than repeatedly selecting the single server with the best performance, thereby improving the user experience.
Optionally, the method further comprises:
and if the number of servers in the first server queue is zero, distributing the computing resource of any target server in the second server queue to the target computing task.
In the embodiment of the application, the servers in the first server queue can be considered to have the highest performance, the servers in the second server queue medium performance, and the servers in the third server queue lower performance. When the number of servers in the first server queue is zero, it can be considered that there is temporarily no server with higher performance in the current target edge node, and any server can then be selected from the second server queue with medium performance, so as to ensure that the target computing task can still be responded to.
Optionally, the method further comprises:
and if the number of servers in the second server queue is zero, distributing the computing resource of any target server in the third server queue to the target computing task.
In the embodiment of the present application, if the number of servers in the second server queue is also zero, it can be considered that there is temporarily no server with medium performance in the current target edge node, and any server can then be selected from the third server queue with lower performance, so as to ensure that the target computing task can still be responded to.
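A minimal sketch of the queue partitioning and fallback selection described above is given below; the function name and the concrete threshold values passed in are illustrative assumptions, since the text does not fix them.

```python
import random

# Minimal sketch of the three-queue partition and fallback selection; the
# thresholds satisfy t1 < t2, and servers are identified by their index.
def pick_target_server(second_values, t1, t2):
    first_queue  = [i for i, f in enumerate(second_values) if f <= t1]
    second_queue = [i for i, f in enumerate(second_values) if t1 < f < t2]
    third_queue  = [i for i, f in enumerate(second_values) if f >= t2]
    for queue in (first_queue, second_queue, third_queue):
        if queue:
            # Random choice within the best non-empty queue avoids always
            # assigning new computing tasks to the same server.
            return random.choice(queue)
    return None  # no servers at this edge node
```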
In a second aspect, the present application provides an apparatus for allocating computing resources, the apparatus comprising:
a first determining unit, configured to determine a first performance evaluation value of each server in M servers, where the first performance evaluation value is inversely related to a working state parameter characterizing the used computing resource amount in the corresponding server, the M servers are all located at the same target edge node, and M is a positive integer not less than 2;
a second determining unit configured to determine a second performance evaluation value of each server based on a ratio of the number of load connections of each server to the corresponding first performance evaluation value;
and the allocation unit is used for responding to the target computing task allocated to the target edge node and allocating the computing resource of the target server of which the second performance evaluation value meets the preset condition to the target computing task.
Optionally, the first determining unit is specifically configured to:
acquiring working state parameters of each server in the M servers, wherein the working state parameters comprise CPU utilization rate, memory utilization rate, disk utilization rate and network bandwidth utilization rate;
determining the unused computing resource amount of each server based on the working state parameters of each server;
determining a first performance evaluation value of each server based on the unused calculation resource quantity of each server and a corresponding preset weight value, wherein the calculation formula of the first performance evaluation value is as follows:
P_i = w_1 × (1 − C_i) + w_2 × (1 − R_i) + w_3 × (1 − D_i) + w_4 × (1 − B_i)
wherein: C_i represents the CPU utilization of the ith server, R_i represents the memory utilization of the ith server, D_i represents the disk utilization of the ith server, B_i represents the network bandwidth utilization of the ith server, w_1, w_2, w_3 and w_4 represent the weights occupied respectively by the CPU utilization, memory utilization, disk utilization and network bandwidth utilization indexes of the ith server, with w_1 = w_2 > w_3 = w_4, and P_i represents the first performance evaluation value of the ith server, i being a positive integer not exceeding M.
Optionally, the second determining unit is specifically configured to:
calculating the target ratio of the load connection quantity of each server to the maximum load connection quantity in the M servers;
dividing the target ratio of each server by the corresponding first performance evaluation value to determine a second performance evaluation value of each server, wherein a calculation formula of the second performance evaluation value of each server is as follows:
F_i = (L_i / max(L_1, L_2, …, L_M)) / P_i
wherein: L_i represents the number of load connections of the ith server, max(L_1, L_2, …, L_M) represents the maximum number of load connections among the M servers, L_i / max(L_1, L_2, …, L_M) represents the ratio of the number of load connections of the ith server to the maximum number of load connections among the M servers, F_i represents the second performance evaluation value of the ith server, P_i represents the first performance evaluation value of the ith server, and i is a positive integer not exceeding M.
Optionally, the distribution unit includes:
and the resource allocation subunit is used for responding to the target calculation tasks allocated to the target edge nodes, and allocating the calculation resources of any one target server of N target servers with the second performance evaluation value lower than a first set threshold value to the target calculation tasks, wherein N is a positive integer not less than 2 and not more than M.
Optionally, the resource allocation subunit is specifically configured to:
dividing the M servers into a first server queue, a second server queue and a third server queue based on the first set threshold, the second set threshold and second performance evaluation values of the servers, wherein the second performance evaluation value of any server in the first server queue is not greater than the first set threshold, the second performance evaluation value of any server in the second server queue is greater than the first set threshold and less than the second set threshold, the second performance evaluation value of any server in the third server queue is not less than the second set threshold, and the second set threshold is greater than the first set threshold;
in response to a target computing task allocated to the target edge node, computing resources of any target server of the first server queue are allocated to the target computing task.
Optionally, the resource allocation subunit is further configured to:
and if the number of servers in the first server queue is zero, distributing the computing resource of any target server in the second server queue to the target computing task.
Optionally, the resource allocation subunit is further configured to:
and if the number of servers in the second server queue is zero, distributing the computing resource of any target server in the third server queue to the target computing task.
In a third aspect, an embodiment of the present application provides an electronic device, where the electronic device includes at least one processor and a memory connected to the at least one processor, where the at least one processor is configured to implement the steps of the method according to any embodiment of the first aspect when executing a computer program stored in the memory.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method according to any of the embodiments of the first aspect.
It should be understood that, the second to fourth aspects of the embodiments of the present application are consistent with the technical solutions of the first aspect of the embodiments of the present application, and the beneficial effects obtained by each aspect and the corresponding possible implementation manner are similar, and are not repeated.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present specification, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a distribution network system according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a method for allocating computing resources according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a computing resource allocation apparatus according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
[ detailed description ] of the application
For a better understanding of the technical solutions of the present specification, the following detailed description of the embodiments of the present application refers to the accompanying drawings.
It should be understood that the described embodiments are only some, but not all, of the embodiments of the present description. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present disclosure.
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the description. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
With the further development of networks, users place increasingly high demands on network experience. Referring to fig. 1, a system architecture diagram of a content delivery network (CDN) is shown. Fig. 1 includes a source station, a CDN center node, a plurality of CDN edge nodes, and a plurality of users. Each edge node includes a load balancer and a plurality of servers in communication with the load balancer.
For example, CDN edge node 1 includes a load balancer and P servers, where the value of P is not particularly limited here. After user 1 initiates an internet access request to the source station, the CDN system dispatches the request to CDN edge node 1, which is close to the user, and the load balancer in edge node 1 then selects a server in edge node 1 according to the performance evaluation results of the servers in edge node 1 to provide computing resources for the user's access.
According to the inventors' research, the performance of each server is currently evaluated only through the server's internal parameters. Evaluating server performance based on internal parameters alone may be inaccurate, so the load balancer may select a server with poor performance, resulting in a poor user experience.
In view of this, an embodiment of the present application provides a method for allocating computing resources, in which a first performance evaluation value may be considered to be determined based on an amount of computing resources that have been used by a server, that is, the first performance evaluation value may be considered to be related to an internal parameter of the server; the load connection quantity of the server can be regarded as the external parameter of the server, so that the external parameter of the server and the internal parameter of the server are utilized together to evaluate the performance of the server more accurately, the server distributed to the current computing task is more suitable, and the user experience is improved.
The following describes the technical scheme provided by the embodiment of the application with reference to the attached drawings. Referring to fig. 2, an embodiment of the present application provides a method for allocating computing resources, where the method is applied to a load balancer, and the flow of the method is described as follows:
step 101: a first performance evaluation value of each of the M servers is determined, the first performance evaluation value being inversely related to an operating state parameter characterizing an amount of used computing resources in the corresponding server.
The M servers may be considered to be located at the same target edge node, and the load balancer in the target edge node may periodically count the working state parameters of the M servers, where the working state parameters are used to characterize the amount of computing resources that the servers have currently used. The period at which the load balancer counts the working state parameters of the M servers may be, for example, 0.5 hour or 1 hour, and may be determined according to the number of users in the area where the target edge node is located: if the number of users is large, the period may be set relatively small; conversely, if the number of users is small, the period may be set relatively large.
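A minimal sketch of such periodic collection is given below; the user-count threshold, the concrete period values and the collect() callable are illustrative assumptions, as the text only states that the period shrinks when the number of users grows.

```python
import time

# Minimal sketch of periodic statistics collection by the load balancer.
def collection_period_seconds(user_count, busy_threshold=10000):
    # Many users in the area -> shorter period (0.5 h); few users -> longer period (1 h).
    return 1800 if user_count > busy_threshold else 3600

def monitor(servers, user_count, collect):
    while True:
        stats = {server: collect(server) for server in servers}  # CPU/memory/disk/bandwidth utilization
        # ... recompute the performance evaluation values from stats here ...
        time.sleep(collection_period_seconds(user_count))
```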
On this basis, the load balancer can determine the first performance evaluation value of each server according to the working state parameters of each of the M servers. It should be noted that, for any server, the smaller the amount of computing resources it has currently used, the more computing resources it has available, and the higher its current performance can be considered; conversely, the greater the amount of computing resources currently in use, the fewer computing resources are available, and the lower the current performance of the server can be considered. That is, the first performance evaluation value is inversely related to the value of the working state parameters of the server.
How the load balancer determines the corresponding first performance evaluation value based on the operating state parameters of each server is described in detail below.
In the embodiment of the application, the working state parameters of each server acquired by the load balancer comprise: CPU utilization, memory utilization, disk utilization and network bandwidth utilization. The unused computing resource amount of each server is then determined based on the working state parameters of each server, and finally the first performance evaluation value of each server is determined based on the unused computing resource amount of each server and the corresponding preset weight values. The calculation formula of the first performance evaluation value of any one server is shown in formula (1):
P_i = w_1 × (1 − C_i) + w_2 × (1 − R_i) + w_3 × (1 − D_i) + w_4 × (1 − B_i)    (1)
wherein: C_i represents the CPU utilization of the ith server, R_i represents the memory utilization of the ith server, D_i represents the disk utilization of the ith server, B_i represents the network bandwidth utilization of the ith server, w_1, w_2, w_3 and w_4 represent the weights occupied respectively by the CPU utilization, memory utilization, disk utilization and network bandwidth utilization indexes of the ith server, and P_i represents the first performance evaluation value of the ith server, i being a positive integer not exceeding M. It should be understood that the two indexes of CPU utilization and memory utilization have a relatively large influence on the performance of the server, while the two indexes of disk utilization and network bandwidth utilization have a relatively small influence, so the relationship among w_1, w_2, w_3 and w_4 may be set as: w_1 = w_2 > w_3 = w_4.
For example, the CPU utilization, memory utilization, disk utilization and network bandwidth utilization of server A, server B, server C and server D in the same edge node are obtained as: server A [20%, 40%, 25%, 28%], server B [35%, 55%, 44%, 60%], server C [15%, 22%, 24%, 18%], server D [60%, 68%, 74%, 77%]. If the weights corresponding to CPU utilization, memory utilization, disk utilization and network bandwidth utilization are 0.3, 0.3, 0.2 and 0.2 respectively, the first performance evaluation value of each server at this moment is calculated, and the calculation results are shown in the following Table 1:
TABLE 1
It can be seen from the above table that the first performance evaluation value of the server C is the largest, and the first performance evaluation value of the server D is the smallest, i.e. the performance of the server C is currently the best and the performance of the server D is the worst in terms of the dimension of the internal parameters of the server.
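The following sketch recomputes this example under the weighted-unused-resource formula assumed above; the Table 1 values themselves are not reproduced in this text, so the numbers in the comments are derived under that assumption.

```python
# Recomputing the example with the assumed formula and the 0.3, 0.3, 0.2, 0.2 weights.
servers = {
    "A": (0.20, 0.40, 0.25, 0.28),
    "B": (0.35, 0.55, 0.44, 0.60),
    "C": (0.15, 0.22, 0.24, 0.18),
    "D": (0.60, 0.68, 0.74, 0.77),
}
weights = (0.3, 0.3, 0.2, 0.2)
P = {name: sum(w * (1 - u) for w, u in zip(weights, util))
     for name, util in servers.items()}
# Under this assumption P["C"] (about 0.805) is the largest and P["D"] (about 0.314)
# the smallest, consistent with the conclusion drawn from Table 1 above.
```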
Step 102: a second performance evaluation value for each server is determined based on a ratio of the number of load connections of each server to the corresponding first performance evaluation value.
The CPU utilization, the memory utilization, the disk utilization and the network bandwidth utilization of the server may be considered as internal parameters of the server, that is, the first performance evaluation value may be considered as related to the internal parameters of the server.
As a possible implementation manner, the load balancer may also periodically count the number of load connections of the M servers, and then determine the second performance evaluation value of each server based on a ratio of the number of load connections of each server to the corresponding first performance evaluation value.
It should be noted that, for any server, the greater the number of load connections, the lower its performance; conversely, the smaller the number of load connections, the higher its performance. When the second performance evaluation value is obtained from the number of load connections and the corresponding first performance evaluation value, the number of load connections is positively correlated with the second performance evaluation value: the smaller the number of load connections and the larger the first performance evaluation value, the smaller the second performance evaluation value, and the server can be considered to have higher performance at that moment; conversely, the larger the number of load connections and the smaller the first performance evaluation value, the larger the second performance evaluation value, and the server can be considered to have lower performance at that moment.
A detailed description will be given below of how the load balancer determines a corresponding second performance evaluation value based on the first performance evaluation value of each server and the number of load connections.
As a possible implementation, first, the load balancer may calculate a target ratio of the number of load connections of each server to the maximum number of load connections in the M servers. Then, dividing the target ratio of each server by the corresponding first performance evaluation value to determine a second performance evaluation value of each server, wherein the second performance evaluation value of each server has a calculation formula:
F_i = (L_i / max(L_1, L_2, …, L_M)) / P_i
wherein: L_i represents the number of load connections of the ith server, max(L_1, L_2, …, L_M) represents the maximum number of load connections among the M servers, L_i / max(L_1, L_2, …, L_M) represents the ratio of the number of load connections of the ith server to the maximum number of load connections among the M servers, F_i represents the second performance evaluation value of the ith server, P_i represents the first performance evaluation value of the ith server, and i is a positive integer not exceeding M.
Continuing with the above example, the numbers of load connections of server A, server B, server C and server D are 35, 18, 6 and 42 respectively, and the maximum number of load connections among the servers in the same edge node is 68. The second performance evaluation value of each server at this moment is calculated, and the calculation results are shown in Table 2 below:
TABLE 2
The second performance evaluation value of the server C is the lowest, which indicates that the performance of the server C is the strongest under the comprehensive evaluation, and the second performance evaluation value of the server D is the largest, which indicates that the performance of the server D is the worst under the comprehensive evaluation.
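The following sketch recomputes this second example under the ratio formula assumed above; the Table 2 values themselves are not reproduced in this text, and the first performance evaluation values reused here are those derived in the previous sketch.

```python
# Recomputing the second example with the assumed ratio formula.
P = {"A": 0.714, "B": 0.522, "C": 0.805, "D": 0.314}  # assumed from the previous sketch
loads = {"A": 35, "B": 18, "C": 6, "D": 42}
max_load = 68  # the maximum number of load connections stated in the example
F = {name: (loads[name] / max_load) / P[name] for name in loads}
# Under these assumptions F["C"] is the lowest and F["D"] the highest, consistent
# with the comprehensive evaluation described above.
```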
Step 103: and in response to the target computing task allocated to the target edge node, allocating computing resources of the target server, of which the second performance evaluation value meets the preset condition, to the target computing task.
In the embodiment of the present application, after the second performance evaluation value of each server is obtained, if a certain computing task is allocated to the target edge node at this time, it can be determined based on the second performance evaluation values which server's computing resources are allocated to that computing task; and since the obtained second performance evaluation value is more accurate, the server allocated to the computing task can be considered more reasonable.
As a possible implementation manner, the load balancer may respond to the target computing task allocated to the target edge node, so as to allocate computing resources of the target server, of which the second performance evaluation value meets the preset condition, to the target computing task.
In some embodiments, considering that some of the M servers may fail and always report favorable values when reporting their own working state parameters and number of load connections, the load balancer may then always assign new computing tasks to the same server, causing congestion and affecting the user experience.
In the embodiment of the application, when the load balancer distributes the servers for the new computing tasks, one server can be randomly selected from a plurality of servers meeting the conditions, so that the situation that the new computing tasks are distributed to the same server all the time is avoided, and the user experience is improved.
As one possible implementation, the load balancer may respond to the target computing task allocated to the target edge node, and then allocate computing resources of any of the N target servers whose second performance assessment value is below the first set threshold to the target computing task. It is understood that N is a positive integer not less than 2 and not more than M.
For example, the load balancer may divide M servers into a first server queue, a second server queue, and a third server queue based on the first set threshold, the second set threshold, and the second performance evaluation value of each server. Here, the first set threshold is smaller than the second set threshold. The second performance assessment value of any server located in the first server queue is not greater than the first set threshold, i.e., the performance of the server located in the first server queue may be considered to be the best; the second performance assessment value of any server in the second server queue is greater than the first set threshold and less than the second set threshold, i.e., the performance of the server in the second server queue may be considered to be at a mid-level; the second performance evaluation value of any one of the servers located in the third server queue is not less than the second set threshold, i.e., the performance of the server located in the third server queue may be considered to be weaker.
A detailed description of how computing resources are allocated for new computing tasks based on different policies is provided below.
Strategy one: the priority target computing task allocates computing resources of the server with good performance.
When the target computing task is allocated to the target edge node, the load balancer can preferentially allocate the computing resources of any target server in the first server queue to the target computing task, so as to guarantee the user experience as much as possible.
For example, a random integer may be arbitrarily generated between [1, L ], where L is the total number of servers included in the first server queue, then a target server corresponding to the random integer may be obtained, and then the computing resource of the target server is allocated to the current target computing task.
In some embodiments, considering that none of the M servers in the target edge node may temporarily satisfy the condition of the first server queue, the load balancer may also allocate new computing tasks to the other server queues, thereby ensuring that new computing tasks can still be responded to.
As a possible implementation, if the load balancer determines that the number of servers in the first server queue is zero, then computing resources of any target server in the second server queue may be allocated to the target computing task.
Further, if the load balancer determines that the number of servers in the second server queue is zero, computing resources of any target server in the third server queue are allocated to the target computing task.
Strategy II: and distributing the computing resources of the server according to the importance degree of the target computing task.
The load balancer stores in advance a first correspondence between the type of computing task and the importance level and a second correspondence between the importance level and the server queue. The target importance level corresponding to the target computing task can then be determined based on the first correspondence, and the target server queue corresponding to the target importance level can be determined based on the second correspondence, where the target server queue is the first server queue, the second server queue or the third server queue. Finally, the load balancer allocates the computing resources of any server in the target server queue to the target computing task.
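A minimal sketch of policy two is given below; the concrete task types, importance levels and both correspondences are illustrative assumptions, since the text does not enumerate them.

```python
# Minimal sketch of policy two: task type -> importance level -> server queue.
TASK_TYPE_TO_LEVEL = {"video": "high", "game": "high", "web": "medium", "download": "low"}
LEVEL_TO_QUEUE = {"high": "first_server_queue", "medium": "second_server_queue", "low": "third_server_queue"}

def target_queue_for(task_type):
    level = TASK_TYPE_TO_LEVEL.get(task_type, "medium")  # first correspondence
    return LEVEL_TO_QUEUE[level]                          # second correspondence
```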
Referring to fig. 3, based on the same inventive concept, an embodiment of the present application provides a computing resource allocation apparatus, which includes: a first determination unit 201, a second determination unit 202, and an allocation unit 203.
A first determining unit 201, configured to determine a first performance evaluation value of each server in M servers, where the M servers are located at the same target edge node, M is a positive integer not less than 2, and the first performance evaluation value is inversely related to a working state parameter characterizing the amount of used computing resources in the corresponding server;
a second determining unit 202, configured to determine a second performance evaluation value of each server based on a ratio of the number of load connections of each server to the corresponding first performance evaluation value;
an allocation unit 203, configured to allocate, in response to the target computing task allocated to the target edge node, computing resources of the target server whose second performance evaluation value satisfies the preset condition to the target computing task.
Optionally, the first determining unit 201 is specifically configured to:
acquiring working state parameters of each server in the M servers, wherein the working state parameters comprise CPU utilization rate, memory utilization rate, disk utilization rate and network bandwidth utilization rate;
determining the unused computing resource amount of each server based on the working state parameters of each server;
determining a first performance evaluation value of each server based on the unused computing resource quantity of each server and a corresponding preset weight value, wherein a first performance evaluation value calculation formula is as follows:
P_i = w_1 × (1 − C_i) + w_2 × (1 − R_i) + w_3 × (1 − D_i) + w_4 × (1 − B_i)
wherein: C_i represents the CPU utilization of the ith server, R_i represents the memory utilization of the ith server, D_i represents the disk utilization of the ith server, B_i represents the network bandwidth utilization of the ith server, w_1, w_2, w_3 and w_4 represent the weights occupied respectively by the CPU utilization, memory utilization, disk utilization and network bandwidth utilization indexes of the ith server, with w_1 = w_2 > w_3 = w_4, and P_i represents the first performance evaluation value of the ith server, i being a positive integer not exceeding M.
Optionally, the second determining unit 202 is specifically configured to:
calculating the target ratio of the load connection quantity of each server to the maximum load connection quantity in the M servers;
dividing the target ratio of each server by the corresponding first performance evaluation value to determine a second performance evaluation value of each server, wherein a calculation formula of the second performance evaluation value of each server is as follows:
F_i = (L_i / max(L_1, L_2, …, L_M)) / P_i
wherein: L_i represents the number of load connections of the ith server, max(L_1, L_2, …, L_M) represents the maximum number of load connections among the M servers, L_i / max(L_1, L_2, …, L_M) represents the ratio of the number of load connections of the ith server to the maximum number of load connections among the M servers, F_i represents the second performance evaluation value of the ith server, P_i represents the first performance evaluation value of the ith server, and i is a positive integer not exceeding M.
Optionally, the distribution unit 203 includes:
and the resource allocation subunit is used for responding to the target calculation tasks allocated to the target edge nodes and allocating the calculation resources of any one target server in N target servers with the second performance evaluation value lower than the first set threshold value to the target calculation tasks, wherein N is a positive integer not less than 2 and not more than M.
Optionally, the resource allocation subunit is specifically configured to:
dividing the M servers into a first server queue, a second server queue and a third server queue based on a first set threshold, a second set threshold and second performance evaluation values of all the servers, wherein the second performance evaluation value of any server in the first server queue is not larger than the first set threshold, the second performance evaluation value of any server in the second server queue is larger than the first set threshold and smaller than the second set threshold, the second performance evaluation value of any server in the third server queue is not smaller than the second set threshold, and the second set threshold is larger than the first set threshold;
in response to the target computing task being assigned to the target edge node, computing resources of any target server of the first server queue are assigned to the target computing task.
Optionally, the resource allocation subunit is further configured to:
and if the number of servers in the first server queue is zero, distributing the computing resource of any target server in the second server queue to the target computing task.
Optionally, the resource allocation subunit is further configured to:
and if the number of servers in the second server queue is zero, distributing the computing resources of any target server in the third server queue to the target computing task.
Referring to fig. 4, based on the same inventive concept, an electronic device is provided in an embodiment of the present application, where the electronic device includes at least one processor 301, and the processor 301 is configured to execute a computer program stored in a memory, to implement the steps of the flow chart of the method for allocating computing resources according to the embodiment of the present application shown in fig. 2.
Alternatively, the processor 301 may be a central processing unit, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling program execution.
Optionally, the electronic device may further comprise a memory 302 coupled to the at least one processor 301, the memory 302 may comprise ROM, RAM and disk memory. The memory 302 is used for storing data required for the operation of the processor 301, i.e. instructions executable by the at least one processor 301, the at least one processor 301 performing the method as shown in fig. 2 by executing the instructions stored by the memory 302. Wherein the number of memories 302 is one or more. The memory 302 is shown in fig. 4, but it should be noted that the memory 302 is not an essential functional block, and is therefore shown in fig. 4 by a broken line.
The physical devices corresponding to the first determining unit 201, the second determining unit 202, and the allocating unit 203 may be the aforementioned processor 301. The electronic device may be used to perform the method provided by the embodiment shown in fig. 2. Therefore, for the functions that can be implemented by each functional module in the electronic device, reference may be made to the corresponding description in the embodiment shown in fig. 2, which is not repeated.
Embodiments of the present application also provide a computer storage medium storing computer instructions that, when executed on a computer, cause the computer to perform a method as described in fig. 2.
The foregoing description of the preferred embodiments is provided for the purpose of illustration only, and is not intended to limit the scope of the disclosure, since any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the disclosure are intended to be included within the scope of the disclosure.

Claims (10)

1. A method of computing resource allocation, the method comprising:
determining a first performance evaluation value of each server in M servers, wherein the first performance evaluation value is inversely related to a working state parameter representing used computing resource quantity in a corresponding server, the M servers are all positioned at a target edge node, and M is a positive integer not less than 2;
determining a second performance evaluation value of each server based on the ratio of the number of load connections of each server to the corresponding first performance evaluation value;
and responding to the target computing task distributed to the target edge node, and distributing the computing resource of the target server of which the second performance evaluation value meets the preset condition to the target computing task.
2. The method of claim 1, wherein determining a first performance evaluation value for each of the M servers comprises:
acquiring working state parameters of each server in the M servers, wherein the working state parameters comprise CPU utilization rate, memory utilization rate, disk utilization rate and network bandwidth utilization rate;
determining the unused computing resource amount of each server based on the working state parameters of each server;
determining a first performance evaluation value of each server based on the unused calculation resource quantity of each server and a corresponding preset weight value, wherein the calculation formula of the first performance evaluation value is as follows:
P_i = w_1 × (1 − C_i) + w_2 × (1 − R_i) + w_3 × (1 − D_i) + w_4 × (1 − B_i)
wherein: C_i represents the CPU utilization of the ith server, R_i represents the memory utilization of the ith server, D_i represents the disk utilization of the ith server, B_i represents the network bandwidth utilization of the ith server, w_1, w_2, w_3 and w_4 represent the weights occupied respectively by the CPU utilization, memory utilization, disk utilization and network bandwidth utilization indexes of the ith server, with w_1 = w_2 > w_3 = w_4, and P_i represents the first performance evaluation value of the ith server, i being a positive integer not exceeding M.
3. The method of claim 1, wherein determining a second performance evaluation value for each server based on a ratio of a number of load connections for each server to a corresponding first performance evaluation value comprises:
calculating the target ratio of the load connection quantity of each server to the maximum load connection quantity in the M servers;
dividing the target ratio of each server by the corresponding first performance evaluation value to determine a second performance evaluation value of each server, wherein a calculation formula of the second performance evaluation value of each server is as follows:
F_i = (L_i / max(L_1, L_2, …, L_M)) / P_i
wherein: L_i represents the number of load connections of the ith server, max(L_1, L_2, …, L_M) represents the maximum number of load connections among the M servers, L_i / max(L_1, L_2, …, L_M) represents the ratio of the number of load connections of the ith server to the maximum number of load connections among the M servers, F_i represents the second performance evaluation value of the ith server, P_i represents the first performance evaluation value of the ith server, and i is a positive integer not exceeding M.
4. The method of claim 1, wherein assigning computing resources of a target server for which the second performance evaluation value satisfies a preset condition to a target computing task assigned to the target edge node in response to the target computing task comprises:
and in response to the target computing task allocated to the target edge node, allocating computing resources of any one of N target servers with the second performance evaluation value lower than a first set threshold to the target computing task, wherein N is a positive integer not less than 2 and not more than M.
5. The method of claim 4, wherein assigning computing resources of any one of the N target servers for which the second performance assessment value is below a first set threshold to the target computing task in response to the target computing task assigned to the target edge node comprises:
dividing the M servers into a first server queue, a second server queue and a third server queue based on the first set threshold, the second set threshold and second performance evaluation values of the servers, wherein the second performance evaluation value of any server in the first server queue is not greater than the first set threshold, the second performance evaluation value of any server in the second server queue is greater than the first set threshold and less than the second set threshold, the second performance evaluation value of any server in the third server queue is not less than the second set threshold, and the second set threshold is greater than the first set threshold;
in response to a target computing task allocated to the target edge node, computing resources of any target server of the first server queue are allocated to the target computing task.
6. The method of claim 5, wherein the method further comprises:
and if the number of servers in the first server queue is zero, distributing the computing resource of any target server in the second server queue to the target computing task.
7. The method of claim 6, wherein the method further comprises:
and if the number of servers in the second server queue is zero, distributing the computing resource of any target server in the third server queue to the target computing task.
8. An apparatus for allocating computing resources, the apparatus comprising:
a first determining unit, configured to determine a first performance evaluation value of each server in M servers, where the first performance evaluation value is inversely related to a working state parameter characterizing the used computing resource amount in the corresponding server, the M servers are all located at the same target edge node, and M is a positive integer not less than 2;
a second determining unit configured to determine a second performance evaluation value of each server based on a ratio of the number of load connections of each server to the corresponding first performance evaluation value;
and the allocation unit is used for responding to the target computing task allocated to the target edge node and allocating the computing resource of the target server of which the second performance evaluation value meets the preset condition to the target computing task.
9. An electronic device comprising at least one processor and a memory coupled to the at least one processor, the at least one processor being configured to implement the steps of the method of any of claims 1-7 when executing a computer program stored in the memory.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any of claims 1-7.
CN202210796473.1A 2022-07-06 2022-07-06 Computing resource allocation method and device, electronic equipment and storage medium Pending CN116962396A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210796473.1A CN116962396A (en) 2022-07-06 2022-07-06 Computing resource allocation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210796473.1A CN116962396A (en) 2022-07-06 2022-07-06 Computing resource allocation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116962396A true CN116962396A (en) 2023-10-27

Family

ID=88455283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210796473.1A Pending CN116962396A (en) 2022-07-06 2022-07-06 Computing resource allocation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116962396A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination