CN114928606B - Scheduling method and system for server resources - Google Patents


Info

Publication number
CN114928606B
CN114928606B (application CN202210111861.1A)
Authority
CN
China
Prior art keywords
server
score
module
application instance
instance
Prior art date
Legal status (assumed; not a legal conclusion)
Active
Application number
CN202210111861.1A
Other languages
Chinese (zh)
Other versions
CN114928606A (en)
Inventor
曾浩
Current Assignee (listing may be inaccurate)
Shanghai Handpay Information & Technology Co ltd
Original Assignee
Shanghai Handpay Information & Technology Co ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Shanghai Handpay Information & Technology Co ltd filed Critical Shanghai Handpay Information & Technology Co ltd
Priority: CN202210111861.1A
Publication of CN114928606A
Application granted
Publication of CN114928606B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/01: Protocols
    • H04L 67/10: Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004: Server selection for load balancing
    • H04L 67/1008: Server selection for load balancing based on parameters of servers, e.g. available memory or workload


Abstract

The invention relates to the technical field of server operation and maintenance, and in particular to a method for scheduling server resources, comprising: Step S1: acquire the idle hardware resources of at least one server already executing the application instance, and the hardware resources the application instance requires; Step S2: determine whether the idle hardware resources exceed the required hardware resources; if so, return to Step S1; if not, go to Step S3; Step S3: select a server from the server resource pool using a capacity-expansion method, allocate the application instance to that server, and return to Step S1. The beneficial effects of the invention are: by acquiring the resources an application instance requires, the hardware needed to keep it running normally over a coming time period can be judged effectively, so that servers are scheduled within a short time. The number of servers running a given application instance can be increased or decreased promptly as burst traffic changes, improving the overall utilization of the server system.

Description

Scheduling method and system for server resources
Technical Field
The invention relates to the technical field of server operation and maintenance, in particular to a scheduling method of server resources.
Background
As the scale of internet services grows and actual business requires, internet enterprises often have to provide normal service to users under conditions of high concurrency and burst traffic. Concurrency, a concept from the operating-system field, refers to multiple task flows executing in alternation over a period of time. When an internet enterprise serves a large number of users within a short time, its gateway devices and back-end servers must bear a large number of access requests, which can exceed the servers' normal load limit and disrupt the internet service. Typically, as the business expands or around specific business events, the enterprise adds servers to raise the load limit and so avoid the impact of high concurrency or burst traffic. However, deploying excess servers that sit idle at ordinary times increases operation and maintenance costs. A server scheduling system that dynamically regulates server resources while the service runs normally therefore has significant economic value.
In the prior art, server scheduling usually relies on manual operation and capacity expansion takes a long time, so the overall scheduling process responds too slowly to cope effectively with burst traffic.
Disclosure of Invention
To address these problems in the prior art, a method for scheduling server resources is provided.
The specific technical scheme is as follows:
The method for scheduling server resources applies to a server resource pool in which a plurality of servers are arranged; the servers are used to execute at least one application instance, and each application instance runs on at least one server;
the scheduling method includes an allocation method for allocating the server to the running application instance, and the allocation method specifically includes:
step S1: acquiring the idle hardware resources of at least one server already executing the application instance, and the hardware resources required by the application instance;
Step S2: judging whether the idle hardware resources are larger than the required hardware resources;
if yes, returning to the step S1;
if not, turning to the step S3;
Step S3: and selecting a server from the server resource pool by adopting a capacity expansion method, distributing the application instance to the server, and returning to the step S1.
Preferably, the capacity expansion method comprises the following steps:
Step A1: acquiring idle hardware resources of a plurality of servers in a current server resource pool;
step A2: selecting at least two servers which can be used for executing the application instance according to the idle hardware resources of the servers and the required hardware resources of the application instance;
Step A3: and selecting the server with the highest score as the server for executing the instance by adopting a server scoring method.
Preferably, the step A2 further includes:
and when the number of the servers which can be used for executing the application instance is less than two, sending out early warning information to operation and maintenance personnel.
Preferably, the hardware resources include: processor idle rate, memory idle rate, hard disk idle rate, network bandwidth idle rate;
said step A3 comprises:
step A31: respectively calculating and generating a processor resource score, a memory resource score, a hard disk resource score and a network resource score according to the processor idle rate, the memory idle rate, the hard disk idle rate and the network bandwidth idle rate;
Step A32: generating a server score for the server according to the processor resource score, the memory resource score, the hard disk resource score and the network resource score;
Step A33: and sequencing a plurality of servers from high to low according to the server scores so as to output the server with the highest score.
Preferably, the scheduling method further includes stopping the application instance on the server using a first or second capacity-reduction method;
The first capacity reduction method includes:
Acquiring the hardware resources occupied by a plurality of application instances in operation and the running time of the plurality of application instances;
generating an instance hardware occupation score according to the application instance occupation hardware resources;
sorting the application instances according to the running time to generate instance sorting results;
generating an instance time score according to the instance sorting result;
generating an instance score from the instance time score and the instance hardware occupancy score;
stopping the application instance on the server according to the instance score;
the second capacity reduction method comprises the following steps: stopping the application instance from the server according to a preset instance capacity reduction rule.
A server resource scheduling system for implementing the scheduling method, which is characterized by comprising:
The acquisition module is connected with the server resource pool, acquires idle resources in the server resource pool and occupied resources of the application instance, and also acquires the access quantity of the application instance;
the analysis module is connected with the acquisition module and used for judging whether capacity expansion or capacity contraction is needed according to the occupied resources and the access quantity of the application instance;
the scheduling module is connected with the analysis module, and distributes the server to the application instance according to the output result of the analysis module, or stops the application instance from the server;
And the feedback module is connected with the acquisition module, the analysis module and the scheduling module and is used for displaying the scheduling results of the idle resources, the application examples, the occupied resources and the scheduling module.
Preferably, the acquisition module comprises:
The flow monitoring submodule is connected with external routing equipment and acquires the access quantity of the application instance from the routing equipment;
The hardware statistics sub-module is connected with the server resource pool, and acquires the hardware resources of the server from the server resource pool;
and the instance statistics sub-module is connected with the server resource pool and is used for collecting occupied resources of the application instance from the server resource pool.
Preferably, the analysis module comprises:
the rule configuration sub-module is provided with at least one early warning rule in a preset mode;
And the matching submodule is connected with the acquisition module and the rule configuration submodule, judges whether the access quantity, the occupied resources and the idle resources accord with the early warning rule or not and generates a matching result.
Preferably, the scheduling module includes:
The capacity expansion submodule is connected with the server resource pool and the analysis module, and distributes the application instance to the server according to a preset capacity expansion rule;
The capacity-reduction sub-module is connected with the server resource pool and the analysis module, and stops the application instance on the server according to a preset capacity-reduction rule;
the first notification sub-module is connected with the capacity-expansion sub-module and the capacity-reduction sub-module, and generates the scheduling result according to the output result of the capacity-expansion sub-module or the capacity-reduction sub-module.
Preferably, the feedback module includes:
the display sub-module is connected with an external display screen, and the display screen is used for displaying the idle resources, the application examples, the occupied resources and the scheduling results of the scheduling module;
and the second notification sub-module is connected with the scheduling module and external terminal equipment and is used for sending notification information to the terminal equipment according to the scheduling result of the scheduling module.
The technical scheme has the following advantages or beneficial effects: by acquiring the resources an application instance requires, the hardware needed to keep it running normally over a coming time period can be judged effectively, so that servers are scheduled within a short time. The number of servers running a given application instance can be increased or decreased promptly as burst traffic changes, improving the overall utilization of the server system.
Drawings
Embodiments of the present invention will now be described more fully with reference to the accompanying drawings. The drawings, however, are for illustration and description only and are not intended as a definition of the limits of the invention.
FIG. 1 is a schematic diagram of an allocation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a capacity expansion method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the sub-steps of step A3 according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a first method of shrinking volume according to an embodiment of the present invention;
fig. 5 is a system schematic block diagram of an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The invention is further described below with reference to the drawings and specific examples, which are not intended to be limiting.
The invention comprises the following steps:
The scheduling method of the server resources is suitable for a server resource pool, wherein a plurality of servers are arranged in the server resource pool, and the servers are used for executing at least one application instance, and the application instance runs on the at least one server;
The scheduling method includes an allocation method for allocating a server to the running application instance, as shown in fig. 1, where the allocation method specifically includes:
step S1: acquiring the idle hardware resources of at least one server already executing the application instance, and the hardware resources required by the application instance;
Step S2: judging whether the idle hardware resources are larger than the required hardware resources;
if yes, returning to the step S1;
if not, turning to the step S3;
Step S3: a server is selected from the server resource pool by adopting a capacity expansion method, the application instance is distributed to the server, and then the step S1 is returned.
Specifically, the present application provides this scheduling method to address the problem in the prior art that manual capacity expansion responds too slowly to handle burst traffic well. In implementation, the method is deployed as a software embodiment on a server or other computing device; it obtains access requests for the corresponding application instances through a connected gateway, router, or load-balancing device, estimates the resources the application instances will require over a coming time period, and expands or shrinks the set of servers executing the instances according to the prediction. An application instance typically appears as a process, service, or other computer program in a server operating system through which an internet enterprise provides internet services to users. The occupied resources of an application instance are the hardware resources it needs to run normally, including processor idle rate, memory idle rate, hard-disk idle rate, network-bandwidth idle rate, and the like. The server resource pool may in embodiments be a server cluster provided by a cloud-service provider or a private cloud built by the user. The servers in the pool need not have identical hardware; each server's specific hardware configuration and current idle hardware resources are stored in the pool as a physical-machine list, which the scheduling method uses as the basis for judgment and scheduling.
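As a rough sketch, Steps S1-S3 reduce to one decision per monitoring pass. All names, the dict-of-metrics layout, and the `expand` callback below are illustrative assumptions, not part of the patent:

```python
def allocation_step(instance_req, idle_resources, expand):
    """One pass of Steps S1-S3 for a single application instance.

    instance_req / idle_resources map metric names (e.g. "cpu", "mem")
    to required / idle amounts; expand implements the capacity-expansion
    method and returns the newly allocated server.
    """
    # Step S2: if every idle metric exceeds the requirement,
    # do nothing and return to Step S1 (the caller keeps polling).
    if all(idle_resources[k] > instance_req[k] for k in instance_req):
        return None
    # Step S3: otherwise select a server via the expansion method
    # and allocate the instance to it.
    return expand(instance_req)

# Step S1 is the caller periodically re-reading the metrics:
req = {"cpu": 0.2, "mem": 0.3}
assert allocation_step(req, {"cpu": 0.5, "mem": 0.5}, lambda r: "srv-new") is None
assert allocation_step(req, {"cpu": 0.1, "mem": 0.5}, lambda r: "srv-new") == "srv-new"
```

Note the strict comparison: idle resources merely equal to the requirement already trigger expansion, matching the "larger than" test of Step S2.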
In a preferred embodiment, as shown in fig. 2, the capacity expansion method includes:
Step A1: acquiring idle hardware resources of a plurality of servers in a current server resource pool;
step A2: selecting at least two servers which can be used for executing the application instance according to the idle hardware resources of the servers and the required hardware resources of the application instance;
step A3: a server scoring method is used to select the highest scoring server as the server for executing the instance.
In a preferred embodiment, step A2 further comprises:
When fewer than two servers are available to execute the application instance, an early-warning message is sent to operation and maintenance personnel, reminding them either to expand the hardware of the server resource pool or to scale down some of the application instances running on it.
Specifically, the capacity-expansion method includes a pre-selection step. When an application instance needs to be expanded, the physical-machine list of the server resource pool is consulted to obtain the current idle hardware resources of all servers in the pool, and the servers able to execute the application instance are screened out. The filtering conditions are conjunctive: a server's current idle hardware resources must simultaneously satisfy the processor, memory, hard-disk, and network-bandwidth idle rates required to execute the application instance. When fewer than two servers are selected, the load of the current server resource pool is close to its upper limit; a new server should be added, or some application instances should be scaled down to free up server hardware resources.
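A minimal sketch of the pre-selection step (Steps A1-A2), under the assumption that the physical-machine list is a list of dicts; the field names and the warning mechanism are illustrative:

```python
def preselect(pool, required):
    """Filter the physical-machine list for servers whose idle processor,
    memory, hard-disk and network-bandwidth rates ALL satisfy the
    instance's requirements (the conditions are conjunctive)."""
    candidates = [s for s in pool
                  if all(s["idle"][k] >= required[k] for k in required)]
    if len(candidates) < 2:
        # Fewer than two usable servers: the pool is near its load limit,
        # so operation and maintenance staff should be alerted.
        print("warning: server resource pool near capacity, notify O&M")
    return candidates

pool = [{"name": "a", "idle": {"cpu": 0.6, "mem": 0.6}},
        {"name": "b", "idle": {"cpu": 0.1, "mem": 0.6}}]
assert [s["name"] for s in preselect(pool, {"cpu": 0.3, "mem": 0.3})] == ["a"]
```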
In a preferred embodiment, the hardware resources include: processor idle rate, memory idle rate, hard disk idle rate, network bandwidth idle rate;
as shown in fig. 3, step A3 includes:
step A31: respectively calculating and generating a processor resource score, a memory resource score, a hard disk resource score and a network resource score according to the processor idle rate, the memory idle rate, the hard disk idle rate and the network bandwidth idle rate;
step A32: generating a server score for the server according to the processor resource score, the memory resource score, the hard disk resource score and the network resource score;
step A33: and sequencing the servers from high to low according to the server scores so as to output the server with the highest score.
Specifically, in one embodiment, the method for calculating the server score includes:
When the processor idle rate is greater than 50%, the processor resource score is 10 points; when the processor idle rate is less than 50% and greater than 20%, the processor resource score is 5 points; when the processor idle rate is less than 20%, the processor resource score is 0.
When the memory idle rate is greater than 50%, the memory resource score is 10 points; when the memory idle rate is less than 50% and more than 20%, the memory resource score is 5 points; when the memory idle rate is less than 20%, the memory resource score is 0.
When the hard disk idle rate is greater than 50%, the hard disk resource score is 10 points; when the hard disk idle rate is less than 50% and more than 20%, the hard disk resource score is 5 points; when the hard disk idle rate is less than 20%, the hard disk resource score is 0.
When the network bandwidth idle rate is more than 50%, the network resource score is 10 points; when the network bandwidth idle rate is less than 50% and greater than 20%, the network resource score is 5 points; when the network bandwidth idle rate is less than 20%, the network resource score is 0.
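The thresholds above translate directly into a scoring function. The text does not state how the four per-metric scores combine into the server score, so a plain sum is assumed here (Steps A31-A33); metric names are illustrative:

```python
def metric_score(idle_rate):
    """Per-metric score from the stated thresholds:
    idle rate > 50% -> 10 points, 20%-50% -> 5 points, < 20% -> 0."""
    if idle_rate > 0.5:
        return 10
    if idle_rate > 0.2:
        return 5
    return 0

def server_score(server):
    """Step A32: combine the four per-metric scores; a plain sum is
    assumed since the combination rule is not specified."""
    return sum(metric_score(server["idle"][k])
               for k in ("cpu", "mem", "disk", "net"))

def pick_server(candidates):
    """Step A33: rank servers by score and return the highest."""
    return max(candidates, key=server_score)

a = {"name": "a", "idle": {"cpu": 0.6, "mem": 0.6, "disk": 0.6, "net": 0.6}}
b = {"name": "b", "idle": {"cpu": 0.3, "mem": 0.3, "disk": 0.1, "net": 0.6}}
assert server_score(a) == 40 and server_score(b) == 20
assert pick_server([a, b])["name"] == "a"
```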
In a preferred embodiment, the scheduling method further includes stopping the application instance from the server by using a first or second capacity reduction method;
As shown in fig. 4, the first capacity reduction method includes:
Acquiring hardware resources occupied by a plurality of application instances in operation and the running time of the plurality of application instances;
Generating an instance hardware occupation score according to the application instance occupation hardware resources;
sequencing the application instances according to the running time to generate an instance sequencing result;
Generating an instance time score according to the instance sequencing result;
generating an instance score according to the instance time score and the instance hardware occupation score;
Stopping the application instance on the server according to the instance score;
the second capacity-reduction method comprises: stopping the application instance on the server according to a preset instance capacity-reduction rule.
Specifically, in actual implementation the order in which the instance time score and the instance hardware-occupancy score are generated is not limited; they may be computed in either order or simultaneously.
As an alternative embodiment, the instance time score is calculated as follows:
the application instance with the shortest running time is assigned a time score of 1, and the earliest-generated (longest-running) application instance is assigned a time score of 20.
As an alternative embodiment, the instance hardware-occupancy score is calculated as follows:
When the memory occupancy rate of the application instance exceeds 85%, the instance score is recorded as 100; when it is greater than 70% and less than 85%, the instance's hardware-occupancy score is increased by 10 points; when it is greater than 50% and less than 70%, by 5 points; when it is less than 50%, by 1 point.
When the processor occupancy rate of the application instance exceeds 85%, the instance score is recorded as 100; when it is greater than 50% and less than 85%, the hardware-occupancy score is increased by 10 points; when it is less than 50%, by 1 point.
When the hard-disk occupancy rate of the application instance exceeds 85%, the instance score is recorded as 100; when it is greater than 70% and less than 85%, the hardware-occupancy score is increased by 10 points; when it is greater than 50% and less than 70%, by 5 points; when it is less than 50%, by 1 point.
When the network-bandwidth occupancy rate of the application instance exceeds 85%, the instance score is recorded as 100; when it is greater than 50% and less than 85%, the hardware-occupancy score is increased by 10 points; when it is less than 50%, by 1 point.
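The banded rules above can be sketched as follows. Two assumptions are made explicit: any metric over 85% is modeled as pinning the instance score to 100, and the time score and hardware-occupancy score are combined by a plain sum, since the text does not state the combination rule:

```python
# Bands below the 85% cutoff, per metric: (lower bound, increment).
# Anything not matched falls through to the default +1 for occupancy < 50%.
_BANDS = {
    "mem":  [(0.70, 10), (0.50, 5)],
    "cpu":  [(0.50, 10)],
    "disk": [(0.70, 10), (0.50, 5)],
    "net":  [(0.50, 10)],
}

def hardware_occupancy_score(usage):
    """Instance hardware-occupancy score from the banded rules above."""
    if any(v > 0.85 for v in usage.values()):
        return 100  # any metric over 85% pins the instance score
    score = 0
    for metric, bands in _BANDS.items():
        inc = 1  # default: occupancy below 50%
        for lower, val in bands:
            if usage[metric] > lower:
                inc = val
                break
        score += inc
    return score

def instance_score(usage, time_score):
    """Combined score; higher-scoring instances are stopped first."""
    return hardware_occupancy_score(usage) + time_score

assert hardware_occupancy_score({"mem": 0.9, "cpu": 0.1, "disk": 0.1, "net": 0.1}) == 100
assert hardware_occupancy_score({"mem": 0.6, "cpu": 0.6, "disk": 0.4, "net": 0.4}) == 17
```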
A server resource scheduling system for implementing the above scheduling method, as shown in fig. 5, includes:
the acquisition module 1 is connected with the server resource pool A, acquires idle resources in the server resource pool A and occupied resources of the application instance, and the acquisition module 1 also acquires the access quantity of the application instance;
The analysis module 2 is connected with the acquisition module 1, and judges whether capacity expansion or capacity contraction is needed according to the occupied resources and the access amount of the application instance;
The scheduling module 3 is connected with the analysis module 2, and distributes the server to the application instance according to the output result of the analysis module 2, or stops the application instance from the server;
The feedback module 4 is connected with the acquisition module 1, the analysis module 2 and the scheduling module 3, and displays the idle resources, the application instances, the occupied resources, and the scheduling results of the scheduling module.
In a preferred embodiment, the acquisition module 1 comprises:
the flow monitoring sub-module 11, the flow monitoring sub-module 11 is connected with an external routing device B, and the access quantity of the application instance is obtained from the routing device B;
the hardware statistics sub-module 12 is connected with the server resource pool A, and the hardware statistics sub-module acquires the hardware resources of the server from the server resource pool A;
The instance statistics sub-module 13, the instance statistics sub-module 13 is connected with the server resource pool A, and the instance statistics sub-module 13 collects the occupied resources of the application instance from the server resource pool A.
In a preferred embodiment, the analysis module 2 comprises:
The rule configuration sub-module 21, at least one early warning rule is preset in the rule configuration sub-module 21;
the matching submodule 22 is connected with the acquisition module 1 and the rule configuration submodule 21, judges whether the access quantity, occupied resources and idle resources accord with the early warning rule or not, and generates a matching result.
In a preferred embodiment, the scheduling module 3 comprises:
the capacity expansion sub-module 31, the capacity expansion sub-module 31 connects the server resource pool A and the analysis module 2, the capacity expansion sub-module 31 distributes the application instance to the server according to a preset capacity expansion rule;
The contraction sub-module 32, the contraction sub-module 32 connects the server resource pool A and the analysis module 2, and the contraction sub-module stops the application instance from the server according to a preset contraction rule;
The first notification sub-module 33, the first notification sub-module 33 connects the capacity expansion sub-module 31 and the capacity contraction sub-module 32, and the first notification sub-module 33 generates a scheduling result according to the output result of the capacity expansion sub-module 31 or the output result of the capacity contraction sub-module 32.
In a preferred embodiment, the feedback module 4 comprises:
the display sub-module 41 is connected with an external display screen, and the display screen is used for displaying idle resources, application examples, occupied resources and scheduling results of the scheduling module;
And the second notification sub-module 42 is connected with the scheduling module 3 and the external terminal equipment C, and is used for sending notification information to the terminal equipment according to the scheduling result of the scheduling module.
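As an illustrative sketch only (the patent describes system modules, not this API; every name below is an assumption), the acquisition, analysis, scheduling and feedback modules could be wired together like this:

```python
class SchedulingSystem:
    """Minimal wiring of the four modules described above."""

    def __init__(self, acquire, rules, expand, shrink, notify):
        self.acquire = acquire  # acquisition module: gathers metrics
        self.rules = rules      # analysis module: preset early-warning rules
        self.expand = expand    # scheduling: capacity-expansion sub-module
        self.shrink = shrink    # scheduling: capacity-reduction sub-module
        self.notify = notify    # feedback module: reports scheduling results

    def tick(self):
        metrics = self.acquire()
        for rule in self.rules:        # matching sub-module: each rule maps
            action = rule(metrics)     # metrics to "expand", "shrink" or None
            if action == "expand":
                self.notify(self.expand(metrics))
            elif action == "shrink":
                self.notify(self.shrink(metrics))

events = []
system = SchedulingSystem(
    acquire=lambda: {"load": 0.9},
    rules=[lambda m: "expand" if m["load"] > 0.8 else None],
    expand=lambda m: "allocated new server",
    shrink=lambda m: "stopped an instance",
    notify=events.append,
)
system.tick()
assert events == ["allocated new server"]
```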
The invention has the following beneficial effects: by acquiring the resources an application instance requires, the hardware needed to keep it running normally over a coming time period can be judged effectively, so that servers are scheduled within a short time. The number of servers running a given application instance can be increased or decreased promptly as burst traffic changes, improving the overall utilization of the server system.
The foregoing is merely illustrative of the preferred embodiments of the present invention and is not intended to limit the embodiments and scope of the present invention, and it should be appreciated by those skilled in the art that equivalent substitutions and obvious variations may be made using the description and illustrations of the present invention, and are intended to be included in the scope of the present invention.

Claims (7)

1. The scheduling method of the server resources is characterized by being suitable for a server resource pool, wherein a plurality of servers are arranged in the server resource pool, and the servers are used for executing at least one application instance, and the application instance runs on at least one server;
the scheduling method includes an allocation method for allocating the server to the running application instance, and the allocation method specifically includes:
step S1: acquiring idle hardware resources of at least one server which has executed the application instance and the hardware resources required by the application instance;
Step S2: judging whether the idle hardware resources are larger than the required hardware resources;
if yes, returning to the step S1;
if not, turning to the step S3;
Step S3: selecting a server from the server resource pool by adopting a capacity expansion method, distributing the application instance to the server, and returning to the step S1;
the capacity expansion method comprises the following steps:
step A1: acquiring the idle hardware resources of the plurality of servers in the current server resource pool;
step A2: selecting at least two servers available for executing the application instance according to the idle hardware resources of the servers and the hardware resources required by the application instance;
step A3: selecting the server with the highest score as the server for executing the application instance by a server scoring method;
the hardware resources comprise: a processor idle rate, a memory idle rate, a hard disk idle rate and a network bandwidth idle rate;
step A3 comprises:
step A31: calculating and generating a processor resource score, a memory resource score, a hard disk resource score and a network resource score from the processor idle rate, the memory idle rate, the hard disk idle rate and the network bandwidth idle rate respectively;
step A32: generating a server score for the server from the processor resource score, the memory resource score, the hard disk resource score and the network resource score;
step A33: sorting the plurality of servers by server score from high to low, so as to output the server with the highest score;
in step A32, the method for calculating the server score comprises:
when the processor idle rate is greater than 50%, the processor resource score is 10 points; when the processor idle rate is less than 50% and greater than 20%, the processor resource score is 5 points; when the processor idle rate is less than 20%, the processor resource score is 0 points;
when the memory idle rate is greater than 50%, the memory resource score is 10 points; when the memory idle rate is less than 50% and greater than 20%, the memory resource score is 5 points; when the memory idle rate is less than 20%, the memory resource score is 0 points;
when the hard disk idle rate is greater than 50%, the hard disk resource score is 10 points; when the hard disk idle rate is less than 50% and greater than 20%, the hard disk resource score is 5 points; when the hard disk idle rate is less than 20%, the hard disk resource score is 0 points;
when the network bandwidth idle rate is greater than 50%, the network resource score is 10 points; when the network bandwidth idle rate is less than 50% and greater than 20%, the network resource score is 5 points; when the network bandwidth idle rate is less than 20%, the network resource score is 0 points;
the scheduling method further comprises stopping the application instance on a server by a first capacity reduction method or a second capacity reduction method;
the first capacity reduction method comprises:
acquiring the hardware resources occupied by a plurality of running application instances and the running times of the plurality of application instances;
generating an instance hardware occupancy score according to the hardware resources occupied by the application instances;
sorting the application instances by running time to generate an instance sorting result;
generating an instance time score according to the instance sorting result;
generating an instance score from the instance time score and the instance hardware occupancy score;
stopping the application instance on the server according to the instance score;
the second capacity reduction method comprises: stopping the application instance on the server according to a preset instance capacity reduction rule.
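The tiered scoring rule of steps A31 to A33 above can be sketched in code. The following is a minimal illustration only; the function names, dictionary keys and sample data are assumptions made for the sketch and do not appear in the patent:

```python
# Illustrative sketch of the server scoring method in claim 1 (steps A31-A33).
# Each resource scores 10/5/0 points by idle rate; the server score is the sum.

def resource_score(idle_rate: float) -> int:
    """Score one resource by its idle rate: >50% -> 10, 20%-50% -> 5, <20% -> 0."""
    if idle_rate > 0.5:
        return 10
    if idle_rate > 0.2:
        return 5
    return 0

def server_score(server: dict) -> int:
    """Sum the four per-resource scores (processor, memory, hard disk, bandwidth)."""
    return sum(resource_score(server[k])
               for k in ("cpu_idle", "mem_idle", "disk_idle", "net_idle"))

def pick_server(candidates: list) -> dict:
    """Step A33: order candidates by score, high to low, and return the top one."""
    return max(candidates, key=server_score)

servers = [
    {"name": "s1", "cpu_idle": 0.60, "mem_idle": 0.30, "disk_idle": 0.10, "net_idle": 0.70},
    {"name": "s2", "cpu_idle": 0.55, "mem_idle": 0.60, "disk_idle": 0.40, "net_idle": 0.25},
]
print(pick_server(servers)["name"])  # s2: 10+10+5+5 = 30 beats s1: 10+5+0+10 = 25
```

Since each of the four resources contributes 0, 5 or 10 points, a server score always falls between 0 and 40; how ties are broken is not specified by the claim.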
2. The scheduling method according to claim 1, wherein step A2 further comprises:
when the number of servers available for executing the application instance is less than two, sending early warning information to operation and maintenance personnel.
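The first capacity reduction method of claim 1 combines an instance time score (from a runtime ranking) with an instance hardware occupancy score. The patent fixes neither the score scales nor how the two are combined, so everything below is an assumed sketch; names and weights are illustrative:

```python
# Illustrative sketch of the first capacity reduction method in claim 1: rank
# instances by running time, score their hardware occupancy, combine the two,
# and stop the highest-scoring instance. Scales and weights are assumptions.

def instance_scores(instances: list) -> dict:
    """instances: dicts with 'name', 'occupied' (fraction 0-1), 'runtime_s'."""
    # Instance sorting result: longest-running first (step "sorting by running time").
    by_runtime = sorted(instances, key=lambda i: i["runtime_s"], reverse=True)
    scores = {}
    for rank, inst in enumerate(by_runtime):
        time_score = len(instances) - rank       # assumed: longer runtime -> higher
        hw_score = round(inst["occupied"] * 10)  # assumed: more occupied -> higher
        scores[inst["name"]] = time_score + hw_score
    return scores

insts = [
    {"name": "a", "occupied": 0.8, "runtime_s": 3600},
    {"name": "b", "occupied": 0.3, "runtime_s": 7200},
]
scores = instance_scores(insts)
victim = max(scores, key=scores.get)  # the instance selected to be stopped
```

Under these assumed weights the heavily occupied instance "a" outranks the longer-running "b"; a real deployment would tune the weighting to its own reclamation policy.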
3. A scheduling system for server resources, configured to implement the scheduling method according to any one of claims 1 to 2, comprising:
an acquisition module, connected with the server resource pool, which acquires the idle resources in the server resource pool and the resources occupied by the application instance, and also acquires the access quantity of the application instance;
an analysis module, connected with the acquisition module, which judges whether capacity expansion or capacity reduction is needed according to the occupied resources and the access quantity of the application instance;
a scheduling module, connected with the analysis module, which allocates a server to the application instance, or stops the application instance on a server, according to the output result of the analysis module;
and a feedback module, connected with the acquisition module, the analysis module and the scheduling module, which displays the idle resources, the application instances, the occupied resources and the scheduling results of the scheduling module.
4. The scheduling system according to claim 3, wherein the acquisition module comprises:
a traffic monitoring sub-module, connected with an external routing device, which acquires the access quantity of the application instance from the routing device;
a hardware statistics sub-module, connected with the server resource pool, which acquires the hardware resources of the servers from the server resource pool;
and an instance statistics sub-module, connected with the server resource pool, which collects the resources occupied by the application instance from the server resource pool.
5. The scheduling system according to claim 3, wherein the analysis module comprises:
a rule configuration sub-module, in which at least one early warning rule is preset;
and a matching sub-module, connected with the acquisition module and the rule configuration sub-module, which judges whether the access quantity, the occupied resources and the idle resources conform to the early warning rule, and generates a matching result.
6. The scheduling system according to claim 3, wherein the scheduling module comprises:
a capacity expansion sub-module, connected with the server resource pool and the analysis module, which allocates the application instance to a server according to a preset capacity expansion rule;
a capacity reduction sub-module, connected with the server resource pool and the analysis module, which stops the application instance on a server according to a preset capacity reduction rule;
and a first notification sub-module, connected with the capacity expansion sub-module and the capacity reduction sub-module, which generates the scheduling result according to the output result of the capacity expansion sub-module or of the capacity reduction sub-module.
7. The scheduling system according to claim 6, wherein the feedback module comprises:
a display sub-module, connected with an external display screen, the display screen being used for displaying the idle resources, the application instances, the occupied resources and the scheduling results of the scheduling module;
and a second notification sub-module, connected with the scheduling module and an external terminal device, which sends notification information to the terminal device according to the scheduling result of the scheduling module.
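The module pipeline of claims 3 to 6 (acquisition feeds analysis, analysis drives scheduling, feedback reports the result) can be sketched as follows. Class names, the rule thresholds and the metric format are all assumptions for illustration; the patent only specifies the modules and their connections, not concrete rules:

```python
# Minimal sketch of the scheduling system in claims 3-6. The 80%/20% thresholds
# stand in for the preset early warning rules, which the patent does not fix.

class AcquisitionModule:
    def collect(self, pool: dict) -> dict:
        # In the patent this gathers idle resources, occupied resources and the
        # access quantity from the server pool and the external routing device.
        return {"occupied_ratio": pool["occupied"] / pool["capacity"]}

class AnalysisModule:
    def __init__(self, expand_at: float = 0.8, shrink_at: float = 0.2):
        self.expand_at = expand_at  # assumed rule: expand above 80% occupancy
        self.shrink_at = shrink_at  # assumed rule: shrink below 20% occupancy

    def decide(self, metrics: dict) -> str:
        r = metrics["occupied_ratio"]
        if r > self.expand_at:
            return "expand"
        if r < self.shrink_at:
            return "shrink"
        return "hold"

class SchedulingModule:
    def apply(self, decision: str) -> str:
        # Stands in for the capacity expansion / capacity reduction sub-modules.
        return f"scheduling result: {decision}"

class FeedbackModule:
    def report(self, result: str) -> None:
        print(result)  # stands in for the display and notification sub-modules

pool = {"occupied": 90, "capacity": 100}
metrics = AcquisitionModule().collect(pool)
decision = AnalysisModule().decide(metrics)
FeedbackModule().report(SchedulingModule().apply(decision))  # prints: scheduling result: expand
```

The key design point the claims encode is the separation of measurement, decision and action: the analysis module only produces a matching result, and only the scheduling module touches the server pool.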
CN202210111861.1A 2022-01-29 2022-01-29 Scheduling method and system for server resources Active CN114928606B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210111861.1A CN114928606B (en) 2022-01-29 2022-01-29 Scheduling method and system for server resources


Publications (2)

Publication Number Publication Date
CN114928606A CN114928606A (en) 2022-08-19
CN114928606B true CN114928606B (en) 2024-04-23

Family

ID=82804541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210111861.1A Active CN114928606B (en) 2022-01-29 2022-01-29 Scheduling method and system for server resources

Country Status (1)

Country Link
CN (1) CN114928606B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522115A (en) * 2018-10-11 2019-03-26 平安科技(深圳)有限公司 Resource allocation method, electronic equipment and storage medium
CN110442428A (en) * 2019-08-02 2019-11-12 北京智芯微电子科技有限公司 The coordination approach of Docker container
CN111064802A (en) * 2019-12-26 2020-04-24 北京奇艺世纪科技有限公司 Network request processing method and device, electronic equipment and storage medium
CN111988429A (en) * 2020-09-01 2020-11-24 深圳壹账通智能科技有限公司 Algorithm scheduling method and system
CN112799596A (en) * 2021-02-03 2021-05-14 联想(北京)有限公司 Capacity expansion control method and device for storage resources and electronic equipment

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US8161168B2 (en) * 2006-03-18 2012-04-17 Metafluent, Llc JMS provider with plug-able business logic
US9507643B2 (en) * 2011-03-02 2016-11-29 Radware, Ltd. Techniques for virtualization of application delivery controllers
US9760428B1 (en) * 2013-12-19 2017-09-12 Amdocs Software Systems Limited System, method, and computer program for performing preventative maintenance in a network function virtualization (NFV) based communication network
CN110022337A (en) * 2018-01-09 2019-07-16 阿里巴巴集团控股有限公司 Resource regulating method, device, equipment and system
CN111522639B (en) * 2020-04-16 2022-11-01 南京邮电大学 Multidimensional resource scheduling method under Kubernetes cluster architecture system



Similar Documents

Publication Publication Date Title
US7523454B2 (en) Apparatus and method for routing a transaction to a partitioned server
CN110198344A (en) A kind of resource regulating method and system
CN107295090B (en) Resource scheduling method and device
US20150222525A1 (en) Dynamic Rerouting of Service Requests Between Service Endpoints for Web Services in a Composite Service
US20140068056A1 (en) Computer cluster with objective-based resource sharing
US20210255899A1 (en) Method for Establishing System Resource Prediction and Resource Management Model Through Multi-layer Correlations
CN112764920B (en) Edge application deployment method, device, equipment and storage medium
JP2012099062A (en) Service cooperation system and information processing system
US8356098B2 (en) Dynamic management of workloads in clusters
CN111381957B (en) Service instance refined scheduling method and system for distributed platform
CN110716808A (en) Service processing method, device, computer equipment and storage medium
Petrovska et al. Features of the distribution of computing resources in cloud systems
CN114928606B (en) Scheduling method and system for server resources
CN108810992B (en) Resource control method and device for network slice
CN113726856A (en) Light interaction method and system for regulating and controlling comprehensive data of picture based on micro-service
Zhang et al. PRMRAP: A proactive virtual resource management framework in cloud
US10893015B2 (en) Priority topic messaging
CN114726860B (en) Load balancing system and load balancing method for streaming media transmission
CN116546028A (en) Service request processing method and device, storage medium and electronic equipment
CN113590317B (en) Offline service scheduling method, device, medium and computing equipment
CN111683133B (en) Service flow limiting method based on micro-service architecture and related device
CN104899072A (en) Fine-grained resource dispatching system and fine-grained resource dispatching method based on virtualization platform
CN115269202A (en) Resource management system, method and storage medium for federated cluster
CN114090256A (en) Application delivery load management method and system based on cloud computing
CN113271335A (en) System for managing and controlling operation of cloud computing terminal and cloud server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant