CN115269120A - NUMA node scheduling method, device, equipment and storage medium of virtual machine - Google Patents

Info

Publication number
CN115269120A
CN115269120A
Authority
CN
China
Prior art keywords
node
virtual machine
numa
load
target
Prior art date
Legal status
Pending
Application number
CN202210916907.7A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee
Jiangsu Anchao Cloud Software Co Ltd
Original Assignee
Jiangsu Anchao Cloud Software Co Ltd
Priority date
Filing date
Publication date
Application filed by Jiangsu Anchao Cloud Software Co Ltd filed Critical Jiangsu Anchao Cloud Software Co Ltd
Priority to CN202210916907.7A
Publication of CN115269120A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/4557Distribution of virtual machine instances; Migration and load balancing

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The application relates to a NUMA node scheduling method, device, equipment and storage medium of a virtual machine, in the technical field of computers. The method comprises the following steps: determining the target number of NUMA nodes required by a target virtual machine; selecting the target number of NUMA nodes from the available NUMA nodes and binding them to the target virtual machine; and monitoring the load condition of the NUMA nodes bound to the target virtual machine, and switching the binding relationship of the target virtual machine from a first node to a second node when the load of the first node is unbalanced, where the load condition includes both CPU load and memory load. Because the scheme considers CPU load and memory load simultaneously when balancing the NUMA nodes, an unbalanced distribution of computing resources is avoided as far as possible and the service performance of the target virtual machine is improved.

Description

NUMA node scheduling method, device, equipment and storage medium of virtual machine
Technical Field
The invention relates to the technical field of computers, and in particular to a NUMA node scheduling method, device, equipment and storage medium of a virtual machine.
Background
Non-uniform memory access (NUMA) is a computer memory design for multiprocessor systems, in which memory access time depends on the location of the memory relative to the processor.
With NUMA technology, tens or even hundreds of CPUs can be combined in one server. The basic feature of a NUMA server is that it has multiple CPU modules, each consisting of multiple CPUs (e.g., four) and having its own local memory, I/O slots, and so on. Because the nodes are connected and exchange information through an interconnect module, each CPU can access the memory of the whole system. Naturally, the speed of accessing local memory is much higher than the speed of accessing remote memory (the memory of other nodes in the system), which is the origin of the term "non-uniform memory access". A virtual machine (Virtual Machine), in turn, is a complete computer system with full hardware functionality that is simulated by software and runs in a completely isolated environment. The creation and use of a virtual machine rest on the hardware of a physical machine: the physical CPUs and memory of the physical machine serve as the vCPUs and memory of the virtual machine. When NUMA nodes are assigned to a virtual machine, the least-loaded NUMA node is typically assigned to it.
Allocating computing resources to virtual machines in this way, however, easily leads to an unbalanced distribution of computing resources and, consequently, poor service performance.
Disclosure of Invention
The application provides a NUMA node scheduling method, device, equipment and storage medium of a virtual machine, which improve the service performance of a target virtual machine.
In one aspect, a method for NUMA node scheduling for a virtual machine is provided, the method comprising:
determining a target number of NUMA nodes required by a target virtual machine;
selecting a target number of NUMA nodes from each NUMA node, and binding the target number of NUMA nodes with the target virtual machine;
monitoring the load condition of a NUMA node bound with the target virtual machine, and switching the binding relationship between the target virtual machine and the first node to a second node when the load of the first node is unbalanced; the load condition includes a CPU load and a memory load.
In yet another aspect, an apparatus for NUMA node scheduling for a virtual machine is provided, the apparatus comprising:
a target number determination module to determine a target number of NUMA nodes required by a target virtual machine;
a node determining module, configured to select a target number of NUMA nodes from the NUMA nodes and bind them to the target virtual machine;
the load balancing module is used for monitoring the load condition of the NUMA node bound with the target virtual machine and switching the binding relationship between the target virtual machine and the first node to a second node when the load of the first node is unbalanced; the load condition includes a CPU load and a memory load.
In a possible implementation manner, the load balancing module is further configured to,
when detecting that at least one of the CPU load and the memory load of the first node exceeds a target threshold, determining that load imbalance exists in the first node, and switching the binding relationship between the target virtual machine and the first node to a second node.
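The "at least one load exceeds a target threshold" check can be sketched as follows (an illustrative sketch; the threshold value and the 0–1 load representation are assumptions, not specified by the patent):

```python
def is_load_unbalanced(cpu_load, mem_load, target_threshold=0.8):
    """A first node is considered load-unbalanced as soon as at least one
    of its CPU load or memory load exceeds the target threshold."""
    return cpu_load > target_threshold or mem_load > target_threshold
```

When this returns true for a bound node, the scheduler would proceed to select a second node and switch the binding.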
In a possible implementation manner, the load balancing module is further configured to,
acquiring each candidate node except the first node in a first physical machine corresponding to the first node;
and selecting a second node according to the load condition of the candidate node so as to switch the binding relationship between the target virtual machine and the first node to the second node.
In a possible implementation manner, the load balancing module is further configured to select, from the candidate nodes, the candidate node with the best average load condition; the average load condition is the average of the CPU occupancy rate and the memory occupancy rate;
and when the remaining load resources of this best candidate node are higher than the resources occupied by the target virtual machine on the first node, determine the best candidate node as the second node.
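The second-node selection just described can be sketched as follows (a minimal illustration; the node dictionary layout and field names are assumptions, not part of the patent):

```python
def avg_load(node):
    # Average load condition: mean of CPU occupancy and memory occupancy.
    return (node["cpu_used"] / node["cpu_total"] +
            node["mem_used"] / node["mem_total"]) / 2

def pick_second_node(candidates, vm_cpu, vm_mem):
    """Among the candidate nodes of the same physical machine, pick the
    one with the best (lowest) average load, accepted only if its
    remaining resources cover the VM's footprint on the first node
    (vm_cpu cores, vm_mem GB). Returns None if no candidate fits."""
    if not candidates:
        return None
    best = min(candidates, key=avg_load)
    free_cpu = best["cpu_total"] - best["cpu_used"]
    free_mem = best["mem_total"] - best["mem_used"]
    if free_cpu >= vm_cpu and free_mem >= vm_mem:
        return best
    return None  # fall back to neighbouring physical machines
```

Returning None corresponds to the case where the search widens to adjacent physical machines, as described in the next implementation.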
In one possible implementation manner, the load balancing module is further configured to,
when the residual load resources of the optimal candidate node are lower than a resource threshold value, selecting a secondary candidate node from adjacent physical machines of the first physical machine;
and selecting a second node according to the load condition of the secondary candidate node so as to switch the binding relationship between the target virtual machine and the first node to the second node.
In one possible implementation, the node determining module is further configured to,
querying the non-over-provisioned nodes among the NUMA nodes;
and when non-over-provisioned nodes exist among the NUMA nodes, selecting a target number of NUMA nodes from the non-over-provisioned nodes and binding them to the target virtual machine through cgroup.
In one possible implementation manner, the node determination module is further configured to,
and when no non-over-provisioned node exists among the NUMA nodes, selecting a target number of NUMA nodes from the over-provisioned NUMA nodes and binding them to the target virtual machine through cgroup.
In a possible implementation manner, the node determining module is further configured to sort the non-over-provisioned NUMA nodes according to load condition, and select the target number of NUMA nodes with the best average load condition.
In a possible implementation manner, the apparatus further includes a database storage module, configured to determine each NUMA node in each physical machine;
and detecting the load condition of each NUMA node according to a specified period, and storing the detection result into a load database.
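The periodic load detection of the database storage module can be sketched as follows (illustrative only; the sampling callback, record layout, and in-memory "database" are assumptions standing in for the load database):

```python
import time

def sample_numa_loads(read_node_load, node_ids, period_s, rounds, load_db):
    """Detect the load condition of each NUMA node once per period and
    append the samples to `load_db` (a list standing in for the load
    database). `read_node_load(nid)` returns (cpu_load, mem_load)."""
    for r in range(rounds):
        now = time.time()
        for nid in node_ids:
            cpu, mem = read_node_load(nid)
            load_db.append({"ts": now, "node": nid, "cpu": cpu, "mem": mem})
        if r < rounds - 1:
            time.sleep(period_s)  # the specified detection period
```

In a real deployment the records would be written to a persistent load database rather than a list, and the loop would run as a background task.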
In yet another aspect, a computer device is provided that includes a processor and a memory having stored therein at least one instruction that is loaded and executed by the processor to implement the NUMA node scheduling method for a virtual machine described above.
In yet another aspect, a computer-readable storage medium is provided having stored therein at least one instruction that is loaded and executed by a processor to implement the above-described NUMA node scheduling method for a virtual machine.
In yet another aspect, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from a computer readable storage medium by a processor of a computer device, and the computer instructions are executed by the processor to cause the computer device to execute the NUMA node scheduling method of the virtual machine.
The technical scheme provided by the application can comprise the following beneficial effects:
Before NUMA nodes are allocated to a target virtual machine, the target number of NUMA nodes required by the target virtual machine is first calculated; the target number of NUMA nodes is then selected from the available NUMA nodes and bound to the target virtual machine. After binding is completed, the load condition of the bound NUMA nodes is monitored in real time, and when load imbalance is detected on a first node, the binding relationship of the target virtual machine is switched from the first node to a second node. Because the scheme considers CPU load and memory load simultaneously when balancing the NUMA nodes, an unbalanced distribution of computing resources is avoided as far as possible and the service performance of the target virtual machine is improved.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the detailed description are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present application, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram showing the structure of a NUMA node scheduling system for a virtual machine in accordance with an exemplary embodiment.
FIG. 2 is a method flow diagram illustrating a method for NUMA node scheduling for a virtual machine in accordance with an illustrative embodiment.
FIG. 3 is a method flow diagram illustrating a NUMA node scheduling method for a virtual machine in accordance with an exemplary embodiment.
Fig. 4 shows a schematic diagram of virtual machine random resource scheduling according to an embodiment of the present application.
Fig. 5 is a schematic diagram illustrating that a virtual machine binds a single resource according to an embodiment of the present application.
Fig. 6 is a schematic diagram illustrating that a virtual machine binds multiple resources according to an embodiment of the present application.
FIG. 7 shows a target virtual machine NUMA node selection flow diagram in accordance with an embodiment of the present application.
Fig. 8 is a schematic diagram illustrating a load database scheme according to an embodiment of the present application.
FIG. 9 illustrates a NUMA node scheduling apparatus for a virtual machine according to an embodiment of the present application.
FIG. 10 is a schematic diagram of a computer device provided in accordance with an exemplary embodiment of the present application.
Detailed Description
The technical solutions of the present application will be described clearly and completely with reference to the accompanying drawings, and it is to be understood that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be understood that "indication" mentioned in the embodiments of the present application may be a direct indication, an indirect indication, or an indication of an association relationship. For example, "A indicates B" may mean that A directly indicates B (e.g., B may be obtained from A); that A indirectly indicates B (e.g., A indicates C, and B may be obtained from C); or that there is an association between A and B.
In the description of the embodiments of the present application, the term "correspond" may indicate a direct or indirect correspondence between two items, an association between them, or relationships such as indicating and being indicated, or configuring and being configured.
In the embodiment of the present application, "predefining" may be implemented by pre-saving a corresponding code, table or other means that can be used to indicate related information in a device (for example, including a terminal device and a network device), and the present application is not limited to a specific implementation manner thereof.
Before describing the various embodiments shown herein, several concepts related to the present application will be described.
1) Virtual Machine (Virtual Machine)
A virtual machine refers to a complete computer system with complete hardware system functionality, which is emulated by software, running in a completely isolated environment. The work that can be done in a physical computer can be implemented in a virtual machine. When creating a virtual machine in a computer, it is necessary to use a part of the hard disk and the memory capacity of the physical machine as the hard disk and the memory capacity of the virtual machine. Each virtual machine has an independent CMOS, hard disk and operating system, and can be operated like a physical machine.
2) Non-Uniform Memory Access architecture (NUMA, non Uniform Memory Access)
NUMA (Non Uniform Memory Access) technology allows many servers to behave as a single system while retaining the advantages of small systems in ease of programming and management. At the same time, the higher demands that applications such as e-commerce place on memory access make NUMA a challenge for complex architecture design.
Non-uniform memory access (NUMA) is a computer memory design for multiprocessor systems, in which memory access time depends on the location of the memory relative to the processor. Under NUMA, a processor accesses its own local memory faster than non-local memory (memory local to another processor, or memory shared between processors).
NUMA attempts to address this by providing separate memory for each processor, avoiding the performance penalty that occurs when multiple processors access the same memory. For applications involving scattered data (common in servers and server-like applications), NUMA can improve performance over a single shared memory by a factor of roughly n, where n is approximately the number of processors (or separate memories).
FIG. 1 is a schematic diagram illustrating the structure of a NUMA node scheduling system for a virtual machine in accordance with an illustrative embodiment. The system runs in a target server 110, which contains a virtual machine 111 and NUMA nodes 112.
Alternatively, the target server 110 may establish a communication connection with the terminal 120 through a wired or wireless network.
Optionally, the terminal 120 may send a user request to the target server, and after the target server 110 receives the user request, a corresponding target virtual machine may be created in the target server according to the user request, so as to implement a function requested to be executed by the user.
Optionally, the terminal is a terminal device having data processing and data storage functions, such as a smartphone, tablet computer, notebook computer, or desktop computer, but is not limited thereto; there may be one terminal or multiple terminals.
Optionally, the target server 110 may be a cloud server providing basic cloud computing services such as cloud databases, cloud computing, cloud functions, cloud storage, web services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
Optionally, the target server 110 and the terminal 120 may be connected through a communication network. Alternatively, the communication network may be a wired network or a wireless network.
Optionally, the wireless or wired network described above uses standard communication technologies and/or protocols. The network is typically the Internet, but may be any other network, including but not limited to a local area network, a metropolitan area network, a wide area network, a mobile network, a wired or wireless network, a private network, or any combination of virtual private networks. In some embodiments, data exchanged over the network is represented using technologies and/or formats such as hypertext markup language and extensible markup language. All or some of the links may also be encrypted using conventional encryption techniques such as secure sockets layer, transport layer security, virtual private networks, and internet protocol security. In other embodiments, custom and/or dedicated data communication techniques may also be used in place of, or in addition to, the techniques described above.
FIG. 2 is a method flow diagram illustrating a NUMA node scheduling method for a virtual machine in accordance with an illustrative embodiment. The method is performed by a computer device, which may be a target server 110 in a NUMA node scheduling system for a virtual machine as shown in FIG. 1. As shown in FIG. 2, the NUMA node scheduling method for a virtual machine can include the steps of:
at step 201, a target number of NUMA nodes needed by a target virtual machine is determined.
In this embodiment, before the target server creates the target virtual machine, it first needs to determine the computing resources (e.g., CPU resources and memory resources) the target virtual machine requires. Since the target server uses a NUMA design, its memory and CPUs are distributed across the NUMA nodes; therefore, after determining the required computing resources, the target server can determine, according to the amount of resources contained in each NUMA node, how many NUMA nodes are needed to support the operation of the target virtual machine.
At step 202, a target number of NUMA nodes is selected from the NUMA nodes and bound to the target virtual machine.
After the number of nodes (i.e., the target number) required for supporting the operation of the target virtual machine is calculated, the target server may perform screening in each NUMA node, select a NUMA node of the target number, and bind with the target virtual machine, thereby supporting the operation of the target virtual machine.
Optionally, in order to ensure the service capability of the target virtual machine, the target server may select a NUMA node with a smaller resource occupancy rate from the NUMA nodes, and select a NUMA node with a target number from the NUMA nodes with a smaller resource occupancy rate to bind with the target virtual machine.
For example, the target server may sort the NUMA nodes according to CPU occupancy and memory occupancy, and select a target number of NUMA nodes whose CPU occupancy and memory occupancy are both in the lowest N% (e.g., 50%) to bind to the target virtual machine.
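The lowest-N% selection can be sketched as follows (an illustrative sketch; the node representation and the tie-breaking by average occupancy are assumptions):

```python
def select_bind_candidates(nodes, target, fraction=0.5):
    """Keep the nodes whose CPU occupancy AND memory occupancy both fall
    in the lowest `fraction` of all nodes, then return up to `target` of
    the least-loaded of them (by average occupancy)."""
    k = max(1, int(len(nodes) * fraction))
    low_cpu = {n["id"] for n in sorted(nodes, key=lambda n: n["cpu"])[:k]}
    low_mem = {n["id"] for n in sorted(nodes, key=lambda n: n["mem"])[:k]}
    eligible = [n for n in nodes if n["id"] in low_cpu and n["id"] in low_mem]
    eligible.sort(key=lambda n: (n["cpu"] + n["mem"]) / 2)
    return eligible[:target]
```

Note that fewer than `target` nodes may qualify on both metrics; a real scheduler would then relax the fraction or fall back to over-provisioned nodes, as the description below explains.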
Step 203, monitoring the load condition of the NUMA node bound with the target virtual machine, and switching the binding relationship between the target virtual machine and the first node to a second node when the load of the first node is unbalanced; the load condition includes the CPU load and the memory load.
After the target virtual machine is created in the target server and the target number of NUMA nodes are bound to it, the target server needs to monitor the load condition of the bound NUMA nodes and detect whether load imbalance exists on a first node bound to the target virtual machine, that is, whether the load of the first node exceeds the expected load.
When it does, the target virtual machine is imposing a large burden on the first node: the CPU and memory of the first node can no longer support the operation of the target virtual machine well, and continuing to support it may also affect the other work the first node needs to process, possibly causing the first node to crash.
Therefore, to avoid this situation, when at least one of the CPU load and the memory load of the first node exceeds the threshold, the target server releases the binding between the target virtual machine and the first node, selects a second node, and rebinds the target virtual machine to the second node, thereby replacing the first node with the second node.
In summary, before NUMA nodes are allocated to a target virtual machine, the target number of NUMA nodes it requires is first calculated; the target number of NUMA nodes is then selected from the available NUMA nodes and bound to the target virtual machine. After binding is completed, the load condition of the bound NUMA nodes is monitored in real time, and when load imbalance is detected on the first node, the binding relationship of the target virtual machine is switched from the first node to the second node. Because the scheme considers CPU load and memory load simultaneously when balancing the NUMA nodes, an unbalanced distribution of computing resources is avoided as far as possible and the service performance of the target virtual machine is improved.
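The three steps of the method in FIG. 2 can be combined into one scheduling pass, sketched below (illustrative only; homogeneous node capacities, the load representation, and the spare-node fallback are assumptions, not claimed by the patent):

```python
import math

def schedule_vm(vm_cpu, vm_mem, nodes, loads, threshold=0.8):
    """Combined sketch of steps 201-203: compute the target number of
    NUMA nodes, bind the least-loaded ones, then switch any bound node
    whose CPU or memory load exceeds the threshold to the best spare.
    `nodes` maps node id -> (cpu_capacity, mem_capacity); `loads` maps
    node id -> (cpu_load, mem_load), both in 0.0-1.0."""
    node_cpu, node_mem = next(iter(nodes.values()))  # homogeneous nodes assumed
    target = max(math.ceil(vm_cpu / node_cpu), math.ceil(vm_mem / node_mem))
    ordered = sorted(nodes, key=lambda i: sum(loads[i]) / 2)  # best avg load first
    bound = ordered[:target]
    spares = [i for i in ordered if i not in bound]
    for idx, nid in enumerate(bound):
        cpu, mem = loads[nid]
        if (cpu > threshold or mem > threshold) and spares:
            bound[idx] = spares.pop(0)  # switch binding to a second node
    return target, bound
```

A production scheduler would of course re-evaluate the loads continuously rather than in a single pass.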
FIG. 3 is a method flow diagram illustrating a NUMA node scheduling method for a virtual machine in accordance with an illustrative embodiment. The method is performed by a computer device, which may be the target server 110 in a NUMA node scheduling system of a virtual machine as shown in FIG. 1, in conjunction with the terminal 120. As shown in FIG. 3, the NUMA node scheduling for the virtual machine can include the steps of:
At step 301, the target number of NUMA nodes needed by the target virtual machine is determined.
In one possible implementation, a user sends a user request to the target server through the terminal 120 (e.g., a user request instructing the target server to create the target virtual machine), and when the target server 110 receives the user request, it can create a corresponding target virtual machine according to the request.
The user request includes the target amount of resources needed to create the target virtual machine.
In a possible implementation manner, after the target server receives the user request, the user request is analyzed to obtain a target resource amount, and the target number of the NUMA nodes required by the target virtual machine is determined according to the target resource amount and the resource configuration size of each NUMA node.
For example, when the target resource amount indicates that the target virtual machine requires 36 cores and 128G of memory, and each NUMA node is configured with 24 cores and 64G of memory, at least two NUMA nodes are required to provide computing resources in order to support the normal operation of the target virtual machine.
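The target-number calculation in this example amounts to taking a ceiling over both resource dimensions (a minimal sketch; the function name is illustrative):

```python
import math

def target_node_count(vm_cpu, vm_mem, node_cpu, node_mem):
    """Number of NUMA nodes needed so that both the CPU demand and the
    memory demand of the virtual machine are covered."""
    return max(math.ceil(vm_cpu / node_cpu), math.ceil(vm_mem / node_mem))
```

For the figures above, 36 cores fit in two 24-core nodes while 128G needs two 64G nodes, so the target number is two.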
Step 302, select a target number of NUMA nodes in each NUMA node to bind to the target virtual machine.
In an embodiment of the application, after the target number of NUMA nodes is determined, the target number of NUMA nodes can be bound to the target virtual machine through cgroup. Please refer to FIG. 4, which illustrates a schematic diagram of random virtual machine resource scheduling according to an embodiment of the present application. As shown in FIG. 4, in the prior art, resource scheduling is not restricted by default when a virtual machine is created; the virtual machine's CPU may then be allocated from NUMA0 while its memory is allocated from NUMA1, so the CPU must access resources across NUMA nodes, resulting in lower performance.
According to the embodiment of the application, after the target number of NUMA nodes needed by the target virtual machine is determined, these NUMA nodes are bound directly to the target virtual machine through cgroup, which reduces how often the CPU accesses resources across NUMA nodes and improves the service performance of the target virtual machine. Please refer to FIG. 5, which illustrates a schematic diagram of a virtual machine bound to a single resource according to an embodiment of the present application. As shown in FIG. 5, the physical machine is configured with 48 cores and 256G of memory and has two NUMA nodes, each with 24 cores and 128G of memory. A virtual machine with a 12-core, 64G specification is created on the physical machine. The virtual machine is confined to NUMA0 through cgroup, so both the CPU and the memory it applies for come from NUMA0, and the performance of the virtual machine is highest.
Referring to FIG. 6, a schematic diagram of a virtual machine bound to multiple resources according to an embodiment of the present application is shown. The physical machine is configured with 48 cores and 256G of memory and has two NUMA nodes, each with 24 cores and 128G of memory. A virtual machine with a 36-core, 64G specification is created on the physical machine. The virtual machine is confined to NUMA0 and NUMA1 through cgroup, with each of the two nodes carrying 18 cores and 32G of the load, so the CPU and memory applied for by the virtual machine come from both NUMA0 and NUMA1.
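Binding a virtual machine to specific NUMA nodes via cgroup typically means writing a CPU list and a memory-node list to the cpuset controller. The following sketch only computes those values (it does not touch the filesystem); the cgroup v1 cpuset file names and the contiguous per-node CPU numbering for a 48-core, two-node host are assumptions:

```python
def cpuset_config(numa_nodes, cpus_per_node=24):
    """Return the cpuset.cpus / cpuset.mems values that would pin a VM's
    cgroup to the given NUMA nodes (e.g. [0, 1] on a two-node, 48-core
    host where node N owns CPUs N*24 .. N*24+23)."""
    ranges = []
    for node in numa_nodes:
        first = node * cpus_per_node
        ranges.append(f"{first}-{first + cpus_per_node - 1}")
    return {
        "cpuset.cpus": ",".join(ranges),                       # allowed CPUs
        "cpuset.mems": ",".join(str(n) for n in numa_nodes),   # allowed memory nodes
    }
```

In practice these values would be written into the VM's cgroup directory (e.g. under the cpuset hierarchy), and real hosts may interleave CPU numbering across nodes, so the per-node ranges should be read from the topology rather than computed.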
In one possible implementation, non-over-allocated nodes are queried in each NUMA node; and when each NUMA node has non-super-matched nodes, selecting a target number of NUMA nodes from the non-super-matched nodes, and binding the non-super-matched nodes with the target virtual machine through cgroup.
When a target virtual machine needs to be created and the target number of NUMA nodes that the target virtual machine needs to allocate has been determined, then the target number of NUMA nodes needs to be selected among the NUMA nodes. To ensure performance of the target virtual machine, the target server may preferentially select NUMA nodes among the nodes that are not over-provisioned. The non-over-allocated node may be defined as a NUMA node, and any one of the bound CPU resource and memory resource does not exceed a maximum value.
That is, the target server preferentially chooses the less loaded nodes among the NUMA nodes to run the target virtual machine.
Further, when no non-over-provisioned node exists among the NUMA nodes, the target number of NUMA nodes is selected from the over-provisioned NUMA nodes and bound to the target virtual machine through cgroup.
That is, when no non-over-provisioned node exists among the NUMA nodes, the target number of NUMA nodes is selected from the over-provisioned NUMA nodes.
In one possible implementation, when the number of non-over-provisioned nodes is smaller than the target number, the non-over-provisioned nodes may be bound to the target virtual machine first, and the remainder selected from the over-provisioned nodes, until the target virtual machine is bound to the target number of NUMA nodes.
When non-over-provisioned nodes exist among the NUMA nodes, in one possible implementation, the non-over-provisioned NUMA nodes are sorted by load condition, and the target number of NUMA nodes with the best average load is selected.
That is, when the target server detects that non-over-provisioned nodes exist, it sorts the non-over-provisioned NUMA nodes by load and selects the NUMA nodes with the best average load. For example, after obtaining the non-over-provisioned nodes, the target server may average the CPU occupancy and memory occupancy of each node, sort by that average, and select the target number of best-ranked NUMA nodes.
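The ranking just described can be sketched as follows; the dictionary field names are illustrative assumptions, not an API defined by the text.

```python
def pick_least_loaded(nodes, target_count):
    """Sort non-over-provisioned NUMA nodes by the mean of CPU occupancy
    and memory occupancy, and return the target number of best ones."""
    ranked = sorted(nodes, key=lambda n: (n["cpu_pct"] + n["mem_pct"]) / 2)
    return ranked[:target_count]
```

For instance, with nodes averaging 60%, 25%, and 50% load, asking for two nodes returns the 25% and 50% nodes, in that order.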
When the target number of NUMA nodes must be selected from the over-provisioned nodes for binding to the target virtual machine, in one possible implementation, the target server may determine the over-provisioning ratios of the CPU and memory requested by the virtual machines bound to each over-provisioned node. For example, when over-provisioned node A has 36 CPU cores and 128G of memory, and the virtual machines bound to it request a total of 72 cores and 192G, node A's CPU over-provisioning ratio is 72/36 = 2 and its memory over-provisioning ratio is 192/128 = 1.5.
After the CPU and memory over-provisioning ratios of each over-provisioned node are obtained, optionally, the target number of NUMA nodes with the lowest mean over-provisioning ratio is selected and bound to the target virtual machine.
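A sketch of that ratio computation and selection, reproducing the worked example above (the tuple/dict shapes are assumptions):

```python
def overprovision_ratios(capacity, bound_vms):
    """Ratio of the resources requested by the VMs bound to a NUMA node
    to the node's capacity, for CPU and memory separately."""
    cpu = sum(vm["cores"] for vm in bound_vms) / capacity["cores"]
    mem = sum(vm["mem_gb"] for vm in bound_vms) / capacity["mem_gb"]
    return cpu, mem


def pick_least_overprovisioned(nodes, target_count):
    # nodes: list of (capacity, bound_vms) pairs; rank by the mean of the
    # two ratios, so the least over-committed nodes come first.
    def mean_ratio(item):
        cpu, mem = overprovision_ratios(*item)
        return (cpu + mem) / 2
    return sorted(nodes, key=mean_ratio)[:target_count]
```

With node A's capacity of 36 cores / 128G and bound VMs totaling 72 cores / 192G, `overprovision_ratios` returns `(2.0, 1.5)`, matching the example in the text.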
Reference is made to FIG. 7, which illustrates a NUMA node selection flow for the target virtual machine according to an embodiment of the present application. As shown in fig. 7, when a virtual machine is created, its specification is obtained first; the specification is input by the user at creation time, for example 4 cores and 8G of memory. It is then judged whether a non-over-provisioned node exists in the cluster: if so, a suitable node is selected directly from the non-over-provisioned nodes to create the virtual machine; otherwise, a suitable node is selected from the over-provisioned nodes.
After the virtual machine is created, the CPU and memory load of each NUMA node may be monitored to implement load balancing, with migration or swapping on the same physical machine preferred and migration or swapping across physical machines used only afterwards, as shown in step 303.
Step 303: monitor the load condition of the NUMA nodes bound to the target virtual machine; when at least one of the CPU load and the memory load of a first node is detected to exceed a target threshold, determine that the first node is load-imbalanced and switch the binding relationship between the target virtual machine and the first node to a second node.
In one possible implementation, each candidate node other than the first node is obtained on the first physical machine corresponding to the first node; a second node is then selected according to the load conditions of the candidate nodes, so that the binding relationship between the target virtual machine and the first node can be switched to the second node.
When the target server detects that the first node's load is imbalanced, it can query the other candidate nodes on the first physical machine where the first node resides and judge, from their load conditions, whether the first node can be replaced. When a candidate node can replace the first node, for example when the best candidate node's load is good and satisfies a first condition (e.g., both its CPU occupancy and its memory occupancy are below a first threshold), the binding relationship between the target virtual machine and the first node is switched directly to that node as the second node.
Further, when the best candidate node's load is only moderate, i.e. it does not satisfy the first condition but satisfies a second condition (e.g., its CPU occupancy and memory occupancy are above the first threshold but below a second threshold), a candidate virtual machine that occupies fewer resources than the target virtual machine may be selected on that best candidate node, and the best candidate node and the first node exchange virtual machines: the binding relationship between the best candidate node and the candidate virtual machine is switched to the first node, and the binding relationship between the target virtual machine and the first node is switched to the best candidate node.
At this point the best candidate node is the second node. After the candidate virtual machine and the target virtual machine are exchanged, on the one hand, the resource occupancy of the first node is reduced; on the other hand, the second node binds the target virtual machine while releasing the candidate virtual machine, so the net growth in its resource occupancy is limited; and in a further aspect, the first node and the second node are on the same physical machine, which reduces the resources consumed by the virtual machine migration.
When the best candidate node is chosen from the candidate nodes, in one possible implementation, the target server selects the candidate node with the best average load condition, where the average load condition is the mean of the CPU occupancy and the memory occupancy; when the remaining load resources of that node exceed the resources the target virtual machine occupies on the first node, it is determined to be the second node.
When the load of the best candidate node is poor (e.g., both its CPU occupancy and its memory occupancy exceed the second threshold), every NUMA node on the first physical machine is approaching saturation. In one possible implementation, when the remaining load resources of the best candidate node fall below a resource threshold, secondary candidate nodes are selected from physical machines adjacent to the first physical machine, and a second node is selected according to their load conditions, so that the binding relationship between the target virtual machine and the first node can be switched to the second node.
That is, when no second node can be found on the first physical machine for the first node to migrate or exchange the target virtual machine, an adjacent physical machine of the first physical machine is selected and the second node is sought there, implementing migration or exchange of virtual machines across physical machines.
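The three-way decision described in this step (migrate locally, swap locally, or fall back to a neighbouring physical machine) can be sketched as follows. The thresholds `t1 < t2` correspond to the first and second conditions above; the field names and the return convention are assumptions for illustration.

```python
def choose_rebalance_action(candidates, remote_candidates, t1, t2):
    """Pick a second node for an imbalanced first node: migrate when the
    best local candidate is lightly loaded (first condition), swap a
    smaller VM when it is moderately loaded (second condition), otherwise
    fall back to a node on a neighbouring physical machine."""
    def avg(n):  # average of CPU and memory occupancy
        return (n["cpu_pct"] + n["mem_pct"]) / 2

    best = min(candidates, key=avg) if candidates else None
    if best is not None:
        if best["cpu_pct"] < t1 and best["mem_pct"] < t1:
            return "migrate", best          # first condition met
        if best["cpu_pct"] < t2 and best["mem_pct"] < t2:
            return "swap", best             # second condition met
    # every local node is near saturation: go cross-physical-machine
    best_remote = min(remote_candidates, key=avg) if remote_candidates else None
    return "migrate_remote", best_remote
```

With `t1=50` and `t2=80`, a local candidate at 30/20% load triggers a direct migration, one at 60/70% triggers a VM exchange, and one at 90/95% forces the search onto an adjacent physical machine.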
Based on the above scheme, optionally, the target server may further determine each NUMA node on each physical machine, detect the load condition of each NUMA node at a specified period, and store the detection results in a load database.
Please refer to fig. 8, which illustrates a schematic diagram of a load database scheme according to an embodiment of the present application. As shown in fig. 8, in the scheme above, each time a virtual machine is created the load of every NUMA node in the current physical machine cluster must be fetched before deciding on which NUMA node the virtual machine starts, which is slow and inefficient. This can be improved as follows: a program runs on each physical machine, periodically (for example, every 5 s) queries the NUMA load on that machine, and stores the data in a load database. When the next virtual machine is created, the database information makes it possible to quickly decide on which physical machine's NUMA node to create it.
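One sampling round of that per-host collector, plus the placement-time query, might look like this. The patent only specifies a periodic query (e.g. every 5 s) and "a load database"; the use of sqlite, the schema, and the function names are assumptions.

```python
import sqlite3
import time


def record_numa_loads(con, host, sample_fn):
    """Store one sampling round of per-NUMA-node load in the load
    database. `sample_fn` yields (node, cpu_pct, mem_pct) tuples for the
    local physical machine; a scheduler would call this every period."""
    con.execute("""CREATE TABLE IF NOT EXISTS numa_load
                   (ts REAL, host TEXT, node INTEGER,
                    cpu_pct REAL, mem_pct REAL)""")
    ts = time.time()
    for node, cpu, mem in sample_fn():
        con.execute("INSERT INTO numa_load VALUES (?, ?, ?, ?, ?)",
                    (ts, host, node, cpu, mem))
    con.commit()


def least_loaded(con):
    """Placement query: from the most recent samples, return the NUMA
    node with the lowest average of CPU and memory occupancy."""
    return con.execute("""SELECT host, node, (cpu_pct + mem_pct) / 2 AS avg
                          FROM numa_load
                          ORDER BY ts DESC, avg ASC
                          LIMIT 1""").fetchone()
```

Creation-time placement then reads the database instead of polling every physical machine in the cluster, which is the speed-up fig. 8 describes.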
It should be noted that the scheme shown in the embodiments of the present application may also be applied to other processes that, like a virtual machine, occupy CPU and memory (for example, containers), to regulate their resource configuration.
In summary, before a NUMA node is allocated to a target virtual machine, the target number of NUMA nodes required by the target virtual machine is calculated; then that number of NUMA nodes is selected and bound to the target virtual machine; after binding, the load condition of the bound NUMA nodes is monitored in real time, and when a first node is detected to be load-imbalanced, the binding relationship between the target virtual machine and the first node is switched to a second node. Because the scheme considers both CPU load and memory load when balancing NUMA nodes, it avoids unbalanced allocation of computing resources as far as possible and improves the service performance of the target virtual machine.
Refer to FIG. 9, which illustrates a NUMA node scheduling apparatus for a virtual machine according to an embodiment of the present application. The apparatus comprises:
a target number determination module 901, configured to determine a target number of NUMA nodes required by a target virtual machine;
a node determination module 902, configured to select the target number of NUMA nodes from among the NUMA nodes and bind them to the target virtual machine;
a load balancing module 903, configured to monitor the load condition of the NUMA nodes bound to the target virtual machine, and switch the binding relationship between the target virtual machine and a first node to a second node when the first node is load-imbalanced; the load condition includes a CPU load and a memory load.
In one possible implementation, the load balancing module is further configured to,
when detecting that at least one of the CPU load and the memory load of the first node exceeds a target threshold, determine that the first node is load-imbalanced, and switch the binding relationship between the target virtual machine and the first node to a second node.
In one possible implementation, the load balancing module is further configured to,
acquire each candidate node other than the first node on a first physical machine corresponding to the first node;
and select a second node according to the load conditions of the candidate nodes, so as to switch the binding relationship between the target virtual machine and the first node to the second node.
In one possible implementation, the load balancing module is further configured to select a best candidate node with the best average load condition from the candidate nodes, the average load condition being the mean of the CPU occupancy and the memory occupancy;
and when the remaining load resources of the best candidate node exceed the resources occupied by the target virtual machine on the first node, determine the best candidate node as the second node.
In one possible implementation, the load balancing module is further configured to,
when the remaining load resources of the best candidate node are below a resource threshold, select secondary candidate nodes from physical machines adjacent to the first physical machine;
and select a second node according to the load conditions of the secondary candidate nodes, so as to switch the binding relationship between the target virtual machine and the first node to the second node.
In one possible implementation, the node determination module is further configured to,
query the non-over-provisioned nodes among the NUMA nodes;
and when non-over-provisioned nodes exist, select the target number of NUMA nodes from the non-over-provisioned nodes and bind them to the target virtual machine through cgroup.
In one possible implementation, the node determination module is further configured to,
when no non-over-provisioned node exists among the NUMA nodes, select the target number of NUMA nodes from the over-provisioned NUMA nodes and bind them to the target virtual machine through cgroup.
In one possible implementation, the node determination module is further configured to sort the non-over-provisioned NUMA nodes by load condition and select the target number of NUMA nodes with the best average load.
In one possible implementation, the apparatus further includes a database storage module, configured to determine each NUMA node on each physical machine;
and detect the load condition of each NUMA node at a specified period, and store the detection results in a load database.
In summary, before a NUMA node is allocated to a target virtual machine, the target number of NUMA nodes required by the target virtual machine is calculated; then that number of NUMA nodes is selected and bound to the target virtual machine; after binding, the load condition of the bound NUMA nodes is monitored in real time, and when a first node is detected to be load-imbalanced, the binding relationship between the target virtual machine and the first node is switched to a second node. Because the scheme considers both CPU load and memory load when balancing NUMA nodes, it avoids unbalanced allocation of computing resources as far as possible and improves the service performance of the target virtual machine.
Refer to fig. 10, which is a schematic diagram of a computer device including a memory and a processor, the memory storing a computer program that is executed by the processor to implement the method according to an exemplary embodiment of the present application.
The processor may be a Central Processing Unit (CPU). The processor may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination thereof.
The memory, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the methods in the embodiments of the present application. The processor executes the non-transitory software programs, instructions, and modules stored in the memory to perform the various functional applications and data processing of the processor, that is, to implement the method in the above method embodiments.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor, and the like. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be coupled to the processor via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
In an exemplary embodiment, a computer readable storage medium is also provided for storing at least one computer program, which is loaded and executed by a processor to implement all or part of the steps of the above method. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product or a computer program is also provided, which comprises computer instructions, which are stored in a computer readable storage medium. The computer instructions are read by a processor of the computer device from a computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform all or part of the steps of the method shown in any one of the embodiments of fig. 2 or fig. 3.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements that have been described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (12)

1. A NUMA node scheduling method for a virtual machine, the method comprising:
determining a target number of NUMA nodes required by a target virtual machine;
selecting the target number of NUMA nodes from among the NUMA nodes, and binding them to the target virtual machine;
monitoring the load condition of the NUMA nodes bound to the target virtual machine, and switching the binding relationship between the target virtual machine and a first node to a second node when the first node is load-imbalanced; the load condition includes a CPU load and a memory load.
2. The method of claim 1, wherein the switching the binding relationship of the target virtual machine and the first node to the second node when the load imbalance exists in the first node comprises:
when detecting that at least one of the CPU load and the memory load of the first node exceeds a target threshold, determining that the first node is load-imbalanced, and switching the binding relationship between the target virtual machine and the first node to a second node.
3. The method of claim 2, wherein switching the binding relationship of the target virtual machine to the first node to the second node comprises:
acquiring each candidate node other than the first node on a first physical machine corresponding to the first node;
and selecting a second node according to the load conditions of the candidate nodes, so as to switch the binding relationship between the target virtual machine and the first node to the second node.
4. The method of claim 3, wherein selecting the second node according to the load condition of the candidate node comprises:
selecting a best candidate node with the best average load condition from the candidate nodes; the average load condition is the mean of the CPU occupancy and the memory occupancy;
and when the remaining load resources of the best candidate node exceed the resources occupied by the target virtual machine on the first node, determining the best candidate node as the second node.
5. The method of claim 4, further comprising:
when the remaining load resources of the best candidate node are below a resource threshold, selecting secondary candidate nodes from physical machines adjacent to the first physical machine;
and selecting a second node according to the load conditions of the secondary candidate nodes, so as to switch the binding relationship between the target virtual machine and the first node to the second node.
6. The method of any of claims 1-5, wherein selecting the target number of NUMA nodes to bind to the target virtual machine comprises:
querying the non-over-provisioned nodes among the NUMA nodes;
and when non-over-provisioned nodes exist among the NUMA nodes, selecting the target number of NUMA nodes from the non-over-provisioned nodes, and binding them to the target virtual machine through cgroup.
7. The method of claim 6, further comprising:
and when no non-over-provisioned node exists among the NUMA nodes, selecting the target number of NUMA nodes from the over-provisioned NUMA nodes, and binding them to the target virtual machine through cgroup.
8. The method of claim 6, wherein selecting the target number of NUMA nodes comprises:
and sorting the non-over-provisioned NUMA nodes by load condition, and selecting the target number of NUMA nodes with the best average load.
9. The method of any one of claims 1 to 5, further comprising:
determining each NUMA node in each physical machine;
and detecting the load condition of each NUMA node according to a specified period, and storing the detection result into a load database.
10. A NUMA node scheduling apparatus for a virtual machine, the apparatus comprising:
a target number determination module to determine a target number of NUMA nodes required by a target virtual machine;
a node determination module, configured to select the target number of NUMA nodes from among the NUMA nodes, and bind them to the target virtual machine;
a load balancing module, configured to monitor the load condition of the NUMA nodes bound to the target virtual machine, and switch the binding relationship between the target virtual machine and a first node to a second node when the first node is load-imbalanced; the load condition includes a CPU load and a memory load.
11. A computer device comprising a processor and a memory having stored therein at least one instruction that is loaded and executed by the processor to implement a NUMA node scheduling method for a virtual machine as claimed in any one of claims 1 to 9.
12. A computer readable storage medium having stored therein at least one instruction which is loaded and executed by a processor to implement a NUMA node scheduling method for a virtual machine as claimed in any one of claims 1 to 9.
CN202210916907.7A 2022-08-01 2022-08-01 NUMA node scheduling method, device, equipment and storage medium of virtual machine Pending CN115269120A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210916907.7A CN115269120A (en) 2022-08-01 2022-08-01 NUMA node scheduling method, device, equipment and storage medium of virtual machine


Publications (1)

Publication Number Publication Date
CN115269120A true CN115269120A (en) 2022-11-01

Family

ID=83746627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210916907.7A Pending CN115269120A (en) 2022-08-01 2022-08-01 NUMA node scheduling method, device, equipment and storage medium of virtual machine

Country Status (1)

Country Link
CN (1) CN115269120A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090313445A1 (en) * 2008-06-11 2009-12-17 Vmware, Inc. System and Method for Improving Memory Locality of Virtual Machines
US20130073730A1 (en) * 2011-09-20 2013-03-21 International Business Machines Corporation Virtual machine placement within a server farm
CN103036800A (en) * 2012-12-14 2013-04-10 北京高森明晨信息科技有限公司 Virtual machine load balancing system, balancing panel points and balancing method
CN104123171A (en) * 2014-06-10 2014-10-29 浙江大学 Virtual machine migrating method and system based on NUMA architecture
CN104375897A (en) * 2014-10-27 2015-02-25 西安工程大学 Cloud computing resource scheduling method based on minimum relative load imbalance degree
CN111078363A (en) * 2019-12-18 2020-04-28 深信服科技股份有限公司 NUMA node scheduling method, device, equipment and medium for virtual machine
CN114371911A (en) * 2021-12-28 2022-04-19 天翼云科技有限公司 Virtual machine scheduling method and device, electronic equipment and readable storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XU Huixiang et al.: "Research on balanced scheduling methods for virtual machine cloud computing resources", Electronic Technology & Software Engineering, no. 2022 *
Qingdao Yinggu Education Technology Co., Ltd.: "Cloud Computing and Virtualization Technology (designated textbook of the applied new-engineering innovative talent cultivation plan for institutions of higher education; '13th Five-Year' curriculum reform textbook for cloud computing and big data majors)", Xidian University Press, pages 126-127 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20221101
