CN115705247A - Process running method and related equipment - Google Patents


Info

Publication number
CN115705247A
Authority
CN
China
Prior art keywords
resource
target
numa node
numa
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110937787.4A
Other languages
Chinese (zh)
Inventor
林星
陈渊
王宇超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202110937787.4A
Priority to PCT/CN2022/090190
Publication of CN115705247A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

An embodiment of the application discloses a process running method applied to a computer system that includes a target NUMA node and a controller. The controller obtains resource allocation information that instructs the target NUMA node to run a plurality of processes using the computing resources of a plurality of processor cores, where each process, while running, may use the computing resources of any of those processor cores. The controller then runs the processes on the target NUMA node according to the resource allocation information. In this manner, each process can run on any processor core selected within the NUMA node, so different processes can share the same processor core, which improves the utilization of each processor core in the NUMA node and reduces resource waste.

Description

Process running method and related equipment
Technical Field
Embodiments of this application relate to the field of computer technology, and in particular to a process running method and related equipment.
Background
With the development of computing technology, the hardware resources of computer devices have grown increasingly abundant. In particular, computer devices that must process large numbers of process tasks often contain multiple processors and multiple memories.
A computer device may use a non-uniform memory access (NUMA) architecture to organize its processors and memories into multiple NUMA nodes. Memory access latency varies: a processor accesses memory within its own NUMA node faster than memory on another NUMA node. To reduce such cross-node access while a process runs, each process is therefore generally bound to a designated processor, isolating and constraining each process.
Most processes require a non-integer number of processor cores at runtime, such as 0.5, 0.8, or 1.2. The required number is generally rounded up: a process that needs 0.5 processor cores is allocated 1 core, and a process that needs 1.2 processor cores is allocated 2 cores. Under this allocation mode, most processes do not fully use the allocated cores, which wastes substantial resources.
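The waste under ceil-based binding can be quantified with a short sketch (a minimal illustration with hypothetical demand values, not code from the application):

```python
import math

def allocated_cores(demands):
    """Ceil-based binding: each process is granted whole, dedicated cores."""
    return sum(math.ceil(d) for d in demands)

# Hypothetical fractional core demands for three processes
demands = [0.5, 0.8, 1.2]

used = sum(demands)                   # 2.5 cores of actual work
reserved = allocated_cores(demands)   # 1 + 1 + 2 = 4 cores set aside
waste = reserved - used               # 1.5 cores idle yet unusable by others
print(reserved, waste)
```

With these demands, 1.5 of the 4 reserved cores (37.5%) sit idle yet cannot be used by any other process, which is the waste the shared-core scheme below aims to recover.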
Disclosure of Invention
The embodiment of the application provides a process running method and related equipment, which are used for improving the utilization rate of each processor core in a NUMA node and reducing the waste of resources.
In a first aspect, an embodiment of this application provides a method for running a process. The method is applied to a computer system that includes a target NUMA node and a controller. The controller acquires resource allocation information that instructs the target NUMA node to run a plurality of processes using the computing resources of a plurality of processor cores, where each process, while running, may use the computing resources of any of those processor cores. The controller runs the processes on the target NUMA node according to the resource allocation information.
The number of processor cores used to run processes in the target NUMA node is not limited in this embodiment. That is, the "plurality of processor cores" may be some or all of the processor cores in the target NUMA node; this is not specifically limited here.
It should be noted that, for ease of description, this application quantifies the computing resources a process uses while running as the share of processor cores the process occupies. For example, if a process uses the computing resources corresponding to 1.5 processor cores while running, the process is said to occupy 1.5 processor cores.
This application describes the method of running a process for a target NUMA node, but the same method can be applied to the other NUMA nodes in the computer system with the same technical effect.
In this way, each process can run on any of the processor cores selected in the NUMA node, so different processes can run on the same processor core. This improves the utilization of each processor core in the NUMA node and reduces resource waste. Moreover, each process runs only on the NUMA node allocated to it, which avoids cross-node resource access and improves process running efficiency.
Further, while a process actually runs, its resource demand typically fluctuates within a range. For example, a process may at times occupy more processor cores than its standard-state requirement, in which case it can complete its work using the computing resources of idle processor cores in the current NUMA node. Conversely, when a process occupies fewer cores than its standard-state requirement, the computing resources it releases can be used by other processes. The method of running processes in this application therefore satisfies the resource demands of processes in different running states and further improves the resource utilization of the computer device.
Based on the first aspect, in an optional implementation, the plurality of processor cores may be all of the processor cores in the target NUMA node; that is, every processor core in the target NUMA node is used to run the plurality of processes, and each process, while running, may use the computing resources of any processor core in the node.
Based on the first aspect, in an alternative implementation, only some of the processor cores in the NUMA node may be shared, so that the remaining unshared processor cores can be used to run other specific processes, which then have independent processor cores available. Specifically, a certain number of processor cores (the first processor core) may be selected in the target NUMA node, and the computing resources of the first processor core are isolated from, rather than shared with, those of the other processor cores in the node. The resource allocation information in this application is also used to instruct the target NUMA node to run a specific process using the first processor core, whose computing resources can be used only by that specific process. The specific process therefore has an independent processor core available while running, does not need the computing resources of other cores, and its independent core's resources are not used by other processes, which guarantees the specific process's resource demands during running.
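The split between shared and exclusive cores can be sketched as follows (a minimal illustration; the function name and core numbering are assumptions, not from the application):

```python
def partition_cores(all_cores, exclusive):
    """Split a NUMA node's cores into an exclusive set (reserved for
    specific processes, like the 'first processor core' above) and a
    shared pool usable by every other process."""
    exclusive = set(exclusive)
    shared = [c for c in all_cores if c not in exclusive]
    return sorted(exclusive), shared

# Hypothetical 8-core node with cores 0 and 1 reserved for specific processes
reserved, pool = partition_cores(range(8), exclusive=[0, 1])
print(reserved, pool)   # [0, 1] [2, 3, 4, 5, 6, 7]
```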
Based on the first aspect, in an optional implementation, the plurality of processes in the target NUMA node includes a target process. When the target process needs to run, the controller determines, according to the resource allocation information, the plurality of processor cores in the target NUMA node that are available for running it. The controller then determines an idle processor core among them, i.e., a core in which some or all of the computing resources are unused, and runs the target process on that idle processor core.
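Idle-core selection as described above might look like the following sketch (the data structure is an assumption: each core's occupancy is tracked as a fraction of one core):

```python
def pick_idle_core(cores, demand):
    """cores maps core_id -> fraction already in use (capacity 1.0 each).
    Return the first core whose unused share can host `demand`, else None."""
    for core_id, used in sorted(cores.items()):
        if 1.0 - used >= demand:
            return core_id
    return None

# Hypothetical occupancy of three shareable cores in the target NUMA node
occupancy = {0: 0.9, 1: 0.3, 2: 0.0}
print(pick_idle_core(occupancy, 0.5))   # core 1 is the first with >= 0.5 idle
```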
Based on the first aspect, in an optional implementation, the computer system includes a plurality of NUMA nodes, and each NUMA node may use the process running method of this application. Before running a process, the controller must allocate each process to a suitable NUMA node. The allocation logic the controller uses is the same for every process; the following describes the allocation flow using a first process among the plurality of processes as an example.
When the controller allocates a NUMA node to each process, it first acquires the current allocable resource information of each of the plurality of NUMA nodes, where the allocable resource information includes a first computing resource and a first memory resource, and then computes the ratio of the first computing resource to the first memory resource as that NUMA node's first ratio. A NUMA node's allocable resource information records its computing resources (number of processor cores) and memory resources (memory capacity), which together determine how many resources currently remain in each NUMA node for allocation to processes.
The controller also needs the resource demand information of each process, which indicates the computing resources (number of processor cores) and memory resources (memory capacity) the process must occupy while running. In actual operation a process's resource demand fluctuates within a range, so this application takes the demand in each process's standard running state as the basis for allocation. The controller acquires the current resource demand information of each process; for the first process, this information includes a first computing resource demand and a first memory resource demand, and the ratio between the two is the second ratio.
The controller compares each NUMA node's first ratio with the second ratio of the first process and selects, from the plurality of NUMA nodes, the node with the smallest difference as the target NUMA node. In other words, among all the NUMA nodes, the target NUMA node is the one whose first ratio differs least from the first process's second ratio.
In this way, the controller allocates processes to NUMA nodes under uniform allocation logic, and each process is placed on the NUMA node whose allocable resource ratio is closest to the process's demand ratio. Allocating the process therefore disturbs the node's allocable resource ratio the least, keeping the NUMA node well suited to subsequent allocations.
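The ratio-matching step above can be sketched as follows (node names, units, and the tuple layout are illustrative assumptions):

```python
def select_numa_node(nodes, demand):
    """nodes: {name: (allocable_cores, allocable_mem_gib)};
    demand: (cores, mem_gib) for the process in its standard state.
    Pick the node whose first ratio is closest to the demand's second ratio."""
    second_ratio = demand[0] / demand[1]

    def gap(item):
        cores, mem = item[1]
        first_ratio = cores / mem          # the node's "first ratio"
        return abs(first_ratio - second_ratio)

    return min(nodes.items(), key=gap)[0]

# Two hypothetical nodes: ratios 8/64 = 0.125 and 4/8 = 0.5
nodes = {"node0": (8, 64), "node1": (4, 8)}
proc = (1.5, 4)                            # demand ratio 1.5/4 = 0.375
print(select_numa_node(nodes, proc))       # node1's ratio is closer
```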
Based on the first aspect, in an optional implementation, after the first process is allocated to the target NUMA node, the node's allocable resources decrease accordingly, so the allocable resource information of the target NUMA node must be updated before the node takes part in subsequent allocation rounds. The updated allocable resource information of the target NUMA node includes a second computing resource, which is the difference between the first computing resource and the first computing resource demand. After the update the target NUMA node remains available for allocation to other processes, and subsequent allocations are computed from the node's latest allocable resource information.
In this way, after each process is allocated to its NUMA node, the node's allocable resource information is updated promptly. This keeps the allocable resource information current and lets the NUMA node continue to participate in subsequent allocation rounds.
Based on the first aspect, in an optional implementation, the updated allocable resource information includes a second memory resource, where the second memory resource is the difference between the first memory resource and the first memory resource demand.
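The update step can be sketched as follows (the tuple layout and function name are illustrative assumptions):

```python
def update_allocatable(node, demand):
    """Subtract a process's demand from a node's allocable resources.
    Both arguments are (cores, mem_gib) tuples; the result holds the
    'second computing resource' and 'second memory resource' used by
    later allocation rounds."""
    cores, mem = node
    need_cores, need_mem = demand
    if need_cores > cores or need_mem > mem:
        raise ValueError("node cannot satisfy the demand")
    return (cores - need_cores, mem - need_mem)

print(update_allocatable((8, 64), (1.5, 4)))   # (6.5, 60)
```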
Based on the first aspect, in an optional implementation, the plurality of processes further includes a second process, and after the controller allocates the first process to the target NUMA node, the target NUMA node is also available for allocation of other processes (including the second process).
Specifically, similar to the allocation of the first process, the controller obtains the allocable resource information of each NUMA node in the plurality of NUMA nodes, where the allocable resource information includes a second computing resource and a second memory resource, and computes the ratio between them as a third ratio; for the target NUMA node, this is the updated allocable resource information. Two cases are worth noting. If the second process is allocated immediately after the first process, only the target NUMA node's allocable resource information has changed; for every other NUMA node, the first computing resource and first memory resource are the same as the second computing resource and second memory resource. If, instead, other processes (not including the second process) are allocated between the allocation of the first process and that of the second process, then every NUMA node that participated in those allocations has changed allocable resource information, i.e., its first computing resource and first memory resource differ from its second computing resource and second memory resource; only the NUMA nodes to which no process was allocated in the interim keep their allocable resource information unchanged.
Therefore, in this application, the first computing resource and the second computing resource may have the same or different values, and likewise for the first memory resource and the second memory resource.
The controller obtains the resource demand information of the second process, which includes a second computing resource demand and a second memory resource demand; the ratio between the two is a fourth ratio. According to the difference between the third ratio and the fourth ratio of each NUMA node, the controller selects the node with the smallest difference as the target NUMA node. Once the target NUMA node for the second process is determined, the second process is allocated to it so that the node can run the second process.
In this way, the same NUMA node can be allocated to multiple different processes, which improves the node's resource utilization.
Based on the first aspect, in an optional implementation, the first computing resource demand is expressed as M processor cores, where M is a positive number that may be fractional. In other words, the number of processor cores a process requires may be an integer or a non-integer; the method of running processes works either way.
In a second aspect, an embodiment of the present application provides a method for running a process. The method is applied to a computer system that includes a target non-uniform memory access (NUMA) node and a controller, where the target NUMA node includes a plurality of processor cores. The method includes: when a first process is to be run in the target NUMA node, the controller determines a target processor core from the plurality of processor cores, where the target processor core includes a first computing resource and a second computing resource, the first computing resource is already used to run a second process, and the second computing resource is idle. The controller then runs the first process using the second computing resource of the target processor core.
In this manner, a process waiting to run in the NUMA node can be allocated to a processor core that is already running other processes. In other words, the same processor core can run multiple different processes at the same time, avoiding the situation where a core's remaining idle resources cannot be used by other processes once one process has been placed on it. This improves the utilization of each processor core in the NUMA node and reduces resource waste. Moreover, each process runs only on the NUMA node allocated to it, which avoids cross-node resource access and improves process running efficiency.
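The per-core bookkeeping implied by the second aspect might be sketched like this (class and attribute names are assumptions for illustration):

```python
class Core:
    """One processor core whose capacity (1.0) can be shared by processes."""

    def __init__(self, core_id, capacity=1.0):
        self.core_id = core_id
        self.capacity = capacity
        self.used = 0.0          # the "first computing resource" in use

    @property
    def idle(self):              # the "second computing resource"
        return self.capacity - self.used

    def run(self, demand):
        if demand > self.idle:
            raise RuntimeError("not enough idle share on this core")
        self.used += demand

core = Core(0)
core.run(0.6)   # a second process already occupies 0.6 of the core
core.run(0.3)   # the first process fits in the remaining idle share
print(round(core.idle, 2))   # 0.1
```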
Based on the second aspect, in an optional implementation, the plurality of processor cores may be all of the processor cores in the target NUMA node; that is, every processor core in the target NUMA node is used to run the plurality of processes, and the computing resources of each processor core in the node may be used to run each of the processes.
Based on the second aspect, in an alternative implementation, only some of the processor cores in the NUMA node may be shared, so that the remaining unshared processor cores can be used to run other specific processes, which then have independent processor cores available. Specifically, a certain number of processor cores (the first processor core) may be selected in the target NUMA node, and the computing resources of the first processor core are isolated from, rather than shared with, those of the other processor cores in the node. The resource allocation information in this application is also used to instruct the target NUMA node to run a specific process using the first processor core, whose computing resources can be used only by that specific process. The specific process therefore has an independent processor core available while running, does not need the computing resources of other cores, and its independent core's resources are not used by other processes, which guarantees its resource demands during running.
In a third aspect, an embodiment of the present application provides a computer device, including:
an acquiring unit, configured to acquire resource allocation information, where the resource allocation information is used to instruct the target NUMA node to run a plurality of processes using the computing resources corresponding to a plurality of processor cores, and each process, while running, may use the computing resources of any of the plurality of processor cores;
and the running unit is used for running a plurality of processes on the target NUMA node according to the resource allocation information.
In an alternative embodiment based on the third aspect, the plurality of processor cores are all processor cores in the target NUMA node.
Based on the third aspect, in an optional embodiment, the target NUMA node further includes a first processor core, the resource allocation information is further used to instruct the target NUMA node to run a specific process using the first processor core, and the computing resource corresponding to the first processor core can only be used by the specific process.
Based on the third aspect, in an optional implementation manner, the multiple processes include a target process, and the running unit is specifically configured to:
determining a plurality of processor cores according to the resource allocation information;
determining an idle processor core from a plurality of processor cores;
and running the target process on the idle processor core.
Based on the third aspect, in an optional implementation, the computer device includes a plurality of NUMA nodes, the plurality of processes includes a first process, and the computer device further includes a determining unit;
the device comprises an acquiring unit and a processing unit, wherein the acquiring unit is further used for acquiring distributable resource information of each NUMA node in a plurality of NUMA nodes, the distributable resource information comprises a first computing resource and a first memory resource, and the ratio between the first computing resource and the first memory resource is a first ratio;
the acquiring unit is further configured to acquire resource demand information of the first process, where the resource demand information includes a first computing resource demand and a first memory resource demand, and a ratio between the first computing resource demand and the first memory resource demand is a second ratio;
the determining unit is configured to determine, according to a difference between a first ratio and a second ratio corresponding to each NUMA node in the plurality of NUMA nodes, a NUMA node with a smallest difference as a target NUMA node from the plurality of NUMA nodes, where the target NUMA node is used to run a first process.
Based on the third aspect, in an optional implementation, the computer device further includes:
an updating unit, configured to update the allocable resource information of the target NUMA node according to the resource demand information of the first process to obtain updated allocable resource information, where the updated allocable resource information includes a second computing resource, and the second computing resource is the difference between the first computing resource and the first computing resource demand.
Based on the third aspect, in an optional implementation, the updated allocable resource information includes a second memory resource, where the second memory resource is the difference between the first memory resource and the first memory resource demand.
In an optional implementation manner based on the third aspect, the plurality of processes further includes a second process;
the acquiring unit is further configured to acquire allocable resource information of each NUMA node in the plurality of NUMA nodes, where the allocable resource information includes a second computing resource and a second memory resource, the ratio between the second computing resource and the second memory resource is a third ratio, and the allocable resource information of the target NUMA node is the updated allocable resource information;
the acquiring unit is further configured to acquire resource demand information of the second process, where the resource demand information includes a second computing resource demand and a second memory resource demand, and the ratio between the second computing resource demand and the second memory resource demand is a fourth ratio;
and the determining unit is further configured to determine, according to the difference between the third ratio and the fourth ratio corresponding to each NUMA node in the plurality of NUMA nodes, the NUMA node with the smallest difference as the target NUMA node, where the target NUMA node is used to run the second process.
In an alternative embodiment based on the third aspect, the first computing resource requirement is expressed as M processor cores, where M is a positive number including a fractional number.
In a fourth aspect, an embodiment of the present application provides a computer device, where the computer device includes a target non-uniform memory access NUMA node, the target NUMA node includes a plurality of processor cores, and the computer device includes:
a determining unit, configured to determine a first process, where the first process is a process to be run in the target NUMA node;
the determining unit is further used for determining a target processor core from the plurality of processor cores, wherein the target processor core comprises a first computing resource and a second computing resource, the first computing resource is used for running a second process, and the second computing resource is an idle resource;
an execution unit to execute the first process using a second computing resource of the target processor core.
In an optional embodiment according to the fourth aspect, the plurality of processor cores are all processor cores in the target NUMA node.
In an optional embodiment according to the fourth aspect, the target NUMA node further includes a first processor core, the first processor core is configured to run a specific process, and a computing resource corresponding to the first processor core is usable only by the specific process.
In a fifth aspect, an embodiment of the present application provides a computer device, where the computer device includes a target non-uniform memory access NUMA node, and the target NUMA node includes multiple processors;
a plurality of processors to provide computing resources for a target NUMA node;
the target NUMA node is to run a plurality of processes using computing resources corresponding to the plurality of processors, and the computing resources corresponding to each of the plurality of processors may be used while running each process.
In a sixth aspect, an embodiment of the present application provides a computer device, where the computer device includes a target non-uniform memory access NUMA node and a controller, and the target NUMA node includes multiple processors;
a plurality of processors to provide computing resources for a target NUMA node;
the target NUMA node is to run a plurality of processes using computing resources corresponding to the plurality of processors,
the controller is used for determining a first process, wherein the first process is a process to be operated in a plurality of processes;
the controller is further used for determining a target processor core from the plurality of processor cores, wherein the target processor core comprises a first computing resource and a second computing resource, the first computing resource is used for running a second process, and the second computing resource is an idle resource;
the controller is further configured to run the first process using the second computing resource of the target processor core.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program that, when run on a computer, causes the computer to perform the process running method according to any one of the above aspects.
In an eighth aspect, an embodiment of the present application provides a computer program product or a computer program comprising computer instructions that, when run on a computer, cause the computer to perform the process running method according to any one of the above aspects.
In a ninth aspect, an embodiment of the present application provides a chip system that includes a processor configured to implement the functions recited in the above aspects, for example, to transmit or process the data and/or information recited in the above methods. In one possible design, the chip system further includes a memory for storing the program instructions and data necessary for the server or the communication device. The chip system may consist of a chip, or may include a chip and other discrete devices.
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings in the following description show only embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic architecture diagram of a NUMA system 100 provided by an embodiment of the present application;
FIG. 2 is a system framework diagram of a method for running a process according to an embodiment of the present application;
fig. 3 is a schematic flowchart of an operation process according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating process allocation according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of another computer device according to an embodiment of the present application.
Detailed Description
The embodiments of the present application provide a process running method and related devices, which are used to improve the utilization rate of each processor core in a NUMA node and reduce the waste of resources.
The embodiments of the present application are described below with reference to the drawings. The terminology used in the description of the embodiments is for the purpose of describing particular embodiments only and is not intended to limit the application. As those skilled in the art will appreciate, as technology develops and new scenarios emerge, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
The terms "first," "second," "third," "fourth," and the like in the description, the claims, and the drawings, if any, are used to distinguish between similar elements and not necessarily to describe a particular sequence or chronological order. It is to be understood that the data so used are interchangeable under appropriate circumstances, so that the embodiments described herein can, for example, be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, so that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such a process, method, article, or apparatus.
First, an application scenario of the present application is described. The process running method in the embodiments of the present application may be applied to a NUMA-based computer system. Referring to fig. 1, fig. 1 is a schematic architecture diagram of a NUMA system 100 according to an embodiment of the present application. NUMA system 100 may be a multi-socket system. As shown in fig. 1, NUMA system 100 includes a socket 101a and a socket 101b. Socket 101a and socket 101b may be collectively referred to herein as sockets and may be used to mount a central processing unit (CPU). The sockets may be communicatively coupled to each other by an interconnect 104. Illustratively, each socket may be connected to each of the other sockets via a point-to-point QuickPath Interconnect (QPI) link. It should be noted that QPI is one interconnect architecture; the interconnection between sockets in the embodiments of the present application may also be implemented by other interconnect architectures, such as other point-to-point architectures, ring architectures, or bus architectures, which are not limited here. The number of sockets depicted in NUMA system 100 is merely an example, and those skilled in the art will appreciate that there may be different numbers of sockets. For example, NUMA system 100 may include six, four, or fewer sockets, or sixteen, thirty-two, or more sockets.
A socket may include a plurality of nodes, each having its own CPUs and memory, connected and communicating via the interconnect 104. As shown in fig. 1, socket 101a includes node 102a and node 102b, and socket 101b includes node 103a and node 103b; each node includes one memory and six CPUs. It should be noted that the numbers of nodes and CPUs depicted in NUMA system 100 are merely examples, and those skilled in the art will appreciate that each socket may include other numbers of nodes and each node may include other numbers of CPUs.
The NUMA system shown in FIG. 1 is intended for a computer device configured with a plurality of processors and a plurality of memories. In such a computer device, the time required for a processor in a NUMA node to access memory within that node is much less than the time required to access memory on other NUMA nodes. In the prior art, each process is generally bound to a corresponding processor to run, so that the processes are isolated and limited from one another. That is, each process can only run on the processor to which it is bound; it cannot run on other processors in the local NUMA node or on processors of other NUMA nodes. Conversely, each processor can only be used by the process bound to it, and cannot be used by other processes in the local NUMA node or by processes on other NUMA nodes. However, most processes require a non-integer number of processor cores, such as 0.5, 0.8, or 1.2, and for such demands the number of processor cores allocated to the process is generally rounded up. For example, when a process needs 0.5 processor cores, 1 processor core is allocated to it; when a process needs 1.2 processor cores, 2 processor cores are allocated to it. Under this allocation, the computing resources of a processor are often not fully utilized after it is bound to a process. Furthermore, a process is generally not always at its peak resource demand; its demand rises and falls across different time periods. For example, a process may be sized for 1.2 processor cores at allocation time, but during actual operation its demand may fall below expectations, so that it occupies only 0.8 processor cores. Since the process is bound to 2 processor cores, 1.2 (2 − 0.8 = 1.2) processor cores of computing resources are wasted.
Therefore, under this resource allocation mode, the existing resources are not fully utilized, and more resources are wasted than expected.
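The waste described above can be illustrated with a short sketch (the function name and capacity model here are illustrative, not part of the claimed method):

```python
import math

def bound_core_waste(demand_cores: float, actual_usage: float) -> float:
    """Compute the cores wasted when a process is bound to whole cores.

    demand_cores: the (possibly fractional) core demand the process was
        sized for at allocation time.
    actual_usage: the cores the process actually uses while running.
    """
    allocated = math.ceil(demand_cores)  # demand is rounded up to whole cores
    return allocated - actual_usage

# A process sized for 1.2 cores is bound to ceil(1.2) = 2 cores; if it
# only uses 0.8 cores at runtime, 1.2 cores of capacity sit idle.
waste = bound_core_waste(1.2, 0.8)
```

The rounding-up step is what the binding scheme forces, and the gap between the bound cores and the actual usage is exactly the waste the embodiments aim to reduce.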
In view of this, the embodiments of the present application provide a method for running processes, which reasonably allocates each process to a corresponding NUMA node for running. This reduces the above-mentioned cross-node access while processes run and, on the other hand, improves the utilization rate of each processor core in the NUMA node and reduces the waste of resources.
To enable each process to run reasonably and efficiently on a local NUMA node, in the embodiments of the present application each process needs to be allocated to a suitable NUMA node before running, so that the computing resources (number of processor cores) and storage resources (memory capacity) in each NUMA node can meet the running needs of all processes in the node, while avoiding the waste caused by allocating surplus computing and storage resources to a process. Referring to fig. 2, fig. 2 is a system framework diagram of a method for running a process according to an embodiment of the present application. As shown in fig. 2, the system framework mainly includes a hardware resource layer (the processor cores and memory), an application resource management layer, a resource partitioning rule and allocation policy technology layer, and an application layer. The division of work among the layers is as follows:
Hardware resource layer: this layer is the set of all processor cores and memory in the computer device. Specifically, all processor cores and memory in the computer device are partitioned into NUMA nodes. A NUMA node represents the combination of processor cores and the memory closest to them, that is, a group of processor cores together with their local memory; this grouping is determined by the hardware resources of the computer device. With the hardware resources fixed, the processor cores and memory of a node are fixed; the time for a processor core in the node to access the memory of the same node is the shortest, and the access efficiency is the highest. Processors and memory in different nodes can still access each other, and the access time depends on the distance between the memory and the processor.
Application resource management layer: this layer performs a second-level division of the processor cores and memory in the hardware resources, that is, it determines an allocation path according to the calculation result of the resource partitioning rule and allocation policy technology layer, thereby arranging the resources used by each process in the application layer. The application resource management layer needs to acquire the current allocable resource information of each NUMA node and the resource demand information of each process, and the resource partitioning rule and allocation policy technology layer calculates the corresponding resource allocation principle.
Resource partitioning rule and allocation policy technology layer: to improve the efficiency of processor cores accessing memory and the utilization rate of the processors, this layer provides various resource arrangement and resource usage policies. Specifically, it may calculate the optimal resource allocation principle according to the resource demand information of each process and the allocable resources of each NUMA node, so as to allocate a corresponding NUMA node to each process.
Application layer: this layer contains the processes corresponding to the services; each process consumes certain computing resources and storage resources when running.
Next, the method for running a process proposed in the present application is explained. Referring to fig. 3, fig. 3 is a schematic flowchart of a method for running a process according to an embodiment of the present application. As shown in fig. 3, the method for running a process in the embodiment of the present application includes:
301. The controller acquires the allocable resource information of each NUMA node;
The method for running a process provided by the present application is applied to a computer device (such as a server) configured with a plurality of processor cores and a plurality of memories. The processor cores and memories are divided into a plurality of NUMA nodes under a NUMA system. Each NUMA node includes several processor cores and memory; the number of processor cores and the memory capacity may differ between NUMA nodes, and the deployment of computing resources (number of processor cores) and storage resources (memory capacity) in each NUMA node may be configured according to actual needs, which is not limited here.
Further, the computer device includes a controller, which may be an operating system of the computer device, and the functions of the application resource management layer and the resource partitioning rule and allocation policy technology layer as shown in fig. 2 may be executed by the controller of the computer device.
In the embodiment of the present application, before the processes are run, the controller needs to allocate each process to a suitable NUMA node, so that the computing resources (number of processor cores) and storage resources (memory capacity) in each NUMA node can meet the running needs of all processes in the node, while avoiding allocating surplus computing and storage resources to the processes. In the course of allocating a corresponding NUMA node to each process, the controller needs to acquire the allocable resource information of each NUMA node. In this embodiment, the allocable resource information of a NUMA node includes its computing resources (number of processor cores) and storage resources (memory capacity), which determine how many resources currently remain in each NUMA node for allocation to the processes.
It should be noted that, for convenience of description, the computing resources a process uses while running are quantified in this application as the share of processor cores the process occupies. For example, if the computing resources used by a process while running correspond to 1.5 processor cores, the process is said to occupy 1.5 processor cores.
For example, take three NUMA nodes, namely a node A, a node B, and a node C. Suppose that, in the acquired allocable resource information of each NUMA node, the allocable resource information of node A is 10 processor cores and 100G of memory capacity, indicating that node A currently has 10 processor cores and 100G of memory capacity remaining for allocation to the processes; the allocable resource information of node B is 15 processor cores and 120G of memory capacity, indicating that node B currently has 15 processor cores and 120G of memory capacity remaining; and the allocable resource information of node C is 8 processor cores and 100G of memory capacity, indicating that node C currently has 8 processor cores and 100G of memory capacity remaining.
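The allocable resource information in this example can be sketched as a simple data structure (the structure and field names are illustrative assumptions, not part of the claimed method):

```python
from dataclasses import dataclass

@dataclass
class NodeResources:
    """Allocable resources of one NUMA node (illustrative structure)."""
    cores: float      # remaining allocable processor cores
    memory_gb: float  # remaining allocable memory capacity, in GB

# The three example nodes from the text above:
nodes = {
    "A": NodeResources(cores=10, memory_gb=100),
    "B": NodeResources(cores=15, memory_gb=120),
    "C": NodeResources(cores=8, memory_gb=100),
}
```

The controller would keep one such record per node and refresh it as allocations are made, which is the bookkeeping the later steps rely on.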
302. The controller acquires resource demand information of each process;
Process: when a program is executed, it changes from a binary file on disk into data in computer memory, values in registers, instructions in a stack, opened files, and various pieces of state information of the computer device. The sum of the execution environment in which such a program runs is a process.
Statically, a process is a program; once running, it becomes the sum of the data and state in the computer, which is the dynamic representation of the process.
A container constrains and modifies the dynamic representation of a process to create a "boundary" for it, so that the resources, files, state, or configuration accessed by the process are constrained by the container at runtime. In other words, a container is a special process: each container has its own independent process space and is isolated from other processes.
Combining the above descriptions of processes and containers, the embodiments of the present application do not limit the dynamic representation of a process at runtime; that is, a process in the method provided by the present application may exist as an ordinary program or in a container form, which is not specifically limited here.
The controller also needs to acquire the resource demand information of each process when allocating corresponding NUMA nodes to the processes. The resource demand information of a process indicates the computing resources (number of processor cores) and storage resources (memory capacity) the process requires when running. Of course, in actual operation, the resource demand of a process often fluctuates within a certain range; in the present application, the resource demand of each process in its standard running state is used as the reference for allocating resources.
For example, take a process No. 1, a process No. 2, and a process No. 3. Suppose that, in the acquired resource demand information of each process, the resource demand information of process No. 1 is 1.5 processor cores and 10G of memory capacity, indicating that process No. 1 needs to consume 1.5 processor cores and 10G of memory capacity when running; the resource demand information of process No. 2 is 1.8 processor cores and 15G of memory capacity, indicating that process No. 2 needs to consume 1.8 processor cores and 15G of memory capacity when running; and the resource demand information of process No. 3 is 2 processor cores and 20G of memory capacity, indicating that process No. 3 needs to consume 2 processor cores and 20G of memory capacity when running.
It should be noted that, in the embodiment of the present application, the timing relationship between step 301 and step 302 is not limited, and the controller may first execute step 301 and then execute step 302; step 302 may be executed first, and then step 301 may be executed, which is not limited herein.
303. Allocate a corresponding NUMA node to each process;
In the embodiment of the present application, after the allocable resource information of each NUMA node and the resource demand information of each process are acquired, the processes can be allocated to the NUMA nodes. Theoretically, a process can be allocated to any NUMA node whose allocable resources meet the resource demand of the process. In practical applications, however, the number of processes in a computer device is large, so the processes need to be prioritized and NUMA nodes allocated one by one in order of priority. Moreover, the allocable resource information differs between NUMA nodes, and the resource demands differ between processes. For example, if the allocable resource information of a certain NUMA node is 6 processor cores and 8G of memory capacity, and processes with a large processor-core demand but a small memory demand are allocated to it, it is likely that all processor cores of the node will be occupied while much of its memory capacity cannot be put to use. Therefore, in the present application, each process is allocated to a NUMA node according to a certain allocation standard, avoiding the waste of computing and storage resources caused by unreasonable allocation.
Next, the allocation logic for allocating processes to NUMA nodes in the embodiment of the present application is described. Referring to fig. 4, fig. 4 is a schematic flowchart of process allocation in the embodiment of the present application. As shown in fig. 4, the process allocation in the embodiment of the present application includes:
3031. Determine the priority of each process;
In practical applications, the number of processes in a computer device is large, so the processes need to be prioritized and NUMA nodes allocated one by one in order of priority. Specifically, in the present application, the priority of a process may be determined by its memory demand: the larger the memory demand, the higher the priority. The priority may also be determined by the process's processor-core demand: the larger the processor-core demand, the higher the priority. Alternatively, a priority relation table may be preset, and when a NUMA node needs to be allocated to a process, the table is searched to determine the priority of the process. It should be understood that, in practical applications, the criteria for determining process priority may also be set according to actual needs, which is not limited here.
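A minimal sketch of the memory-demand priority rule described above (the field names and the ordering helper are illustrative assumptions):

```python
def order_by_memory_demand(processes):
    """Order processes for allocation: larger memory demand first.

    `processes` is a list of dicts with a "memory_gb" demand field
    (field names are illustrative).
    """
    return sorted(processes, key=lambda p: p["memory_gb"], reverse=True)

# The three example processes from the text, ordered for allocation:
procs = [
    {"name": "No. 1", "cores": 1.5, "memory_gb": 10},
    {"name": "No. 2", "cores": 1.8, "memory_gb": 15},
    {"name": "No. 3", "cores": 2.0, "memory_gb": 20},
]
ordered = order_by_memory_demand(procs)
```

Swapping the sort key to `p["cores"]` gives the processor-core-demand variant also mentioned in the text.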
In the following, the allocation flow is described by taking as an example the allocation of a target NUMA node to a first process and a second process, where the first process has a higher priority than the second process.
3032. Calculating the ratio of computing resources to storage resources in the plurality of NUMA nodes;
through step 301, the controller obtains current allocable resource information of each NUMA node in the NUMA nodes, where the allocable resource information includes a first computing resource and a first memory resource, and further calculates a ratio between the first computing resource and the first memory resource of each NUMA node as a first ratio of the NUMA node.
For example, if the first computing resource of a NUMA node is 10 processor cores and its first memory resource is 100G of memory capacity, the current first ratio of the NUMA node is 10:100 = 0.1; if the first computing resource of a NUMA node is 20 processor cores and its first memory resource is 100G of memory capacity, the current first ratio of the NUMA node is 20:100 = 0.2.
3033. Calculate the ratio of the computing resources required by each process to its storage resources;
Through step 302, the controller obtains the current resource demand information of each process, where the resource demand information includes a first computing resource demand and a first memory resource demand, and further calculates the ratio of the first computing resource demand to the first memory resource demand of each process as the second ratio of that process.
Since the first process has the higher priority and a NUMA node is allocated to it first, the second ratio corresponding to the first process needs to be acquired.
For example, if the first resource demand of the first process is 2.5 processor cores and 25G of memory, the second ratio of the first process is 2.5:25 = 0.1; if the first resource demand of the second process is 2 processor cores and 10G of memory, the second ratio of the second process is 2:10 = 0.2.
It should be noted that the timing relationship between step 3032 and step 3033 is not limited in this embodiment of the application: the controller may execute step 3032 first and then step 3033, or execute step 3033 first and then step 3032, which is not limited here.
3034. Allocate each process to a corresponding NUMA node;
Because the priority of the first process is higher than that of the second process, the first process is allocated to the target NUMA node first. Specifically, for the target NUMA node allocated to the first process, the first computing resource of the target NUMA node should meet the first computing resource demand of the first process, and the first memory resource of the target NUMA node should meet the first memory resource demand of the first process. Further, having obtained the first ratio of each NUMA node in step 3032, the controller compares the first ratio of each NUMA node with the second ratio of the first process, and selects the NUMA node with the smallest difference as the target NUMA node. In other words, among all the NUMA nodes, the difference between the first ratio of the target NUMA node and the second ratio of the first process is the smallest.
Illustratively, assume that the computer device has three NUMA nodes A, B, and C, the first ratio of node A is 0.2, the first ratio of node B is 0.15, the first ratio of node C is 0.25, and the second ratio of the first process is 0.1. After comparison, the difference between the first ratio of node B and the second ratio of the first process (i.e., |0.1 − 0.15| = 0.05) is the smallest, so node B can be determined to be the target NUMA node corresponding to the first process.
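The ratio-matching selection in this example can be sketched as follows (the helper name is illustrative; the feasibility check that a node's remaining resources actually cover the process's absolute demand is omitted for brevity):

```python
def pick_target_node(node_ratios, process_ratio):
    """Select the NUMA node whose cores:memory ratio (first ratio) is
    closest to the process's demand ratio (second ratio), i.e. the node
    with the smallest absolute difference between the two ratios."""
    return min(node_ratios, key=lambda n: abs(node_ratios[n] - process_ratio))

# The example from the text: node B's ratio 0.15 is closest to the
# first process's ratio 0.1 (|0.1 - 0.15| = 0.05).
target = pick_target_node({"A": 0.2, "B": 0.15, "C": 0.25}, 0.1)
```

Matching the shapes of supply and demand in this way tends to drain a node's cores and memory at similar rates, which is the stated goal of avoiding stranded resources.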
After the target NUMA node to which the first process corresponds is determined, the first process can then be assigned to the target NUMA node so that the target NUMA node can be used to run the first process.
After the first process is allocated to the target NUMA node, the allocable resources of the target NUMA node are correspondingly reduced. Therefore, in order to continue using the target NUMA node in subsequent process allocation, its allocable resource information needs to be updated. The updated allocable resource information of the target NUMA node includes a second computing resource and a second memory resource, where the second computing resource is the difference between the first computing resource and the first computing resource demand, and the second memory resource is the difference between the first memory resource and the first memory resource demand.
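The update described above amounts to deducting the process's demand from the node's allocable resources; a minimal sketch (names are illustrative):

```python
def deduct_allocation(node, demand):
    """Update a node's allocable resources after a process is placed on it:
    the second (updated) resource is the first resource minus the demand."""
    node["cores"] -= demand["cores"]
    node["memory_gb"] -= demand["memory_gb"]
    return node

# Node B from the example (15 cores, 120G) after receiving the first
# process (2.5 cores, 25G): 12.5 cores and 95G remain allocable.
node_b = deduct_allocation({"cores": 15, "memory_gb": 120},
                           {"cores": 2.5, "memory_gb": 25})
```

Subsequent allocations would then recompute the node's first ratio from these updated values.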
After its allocable resource information is updated, the target NUMA node remains available for allocation to other processes (including the second process), and subsequent process allocation uses the latest allocable resource information of the target NUMA node for calculation. Because the priority of the first process is higher than that of the second process, a corresponding NUMA node is allocated to the second process after the resources of the first process have been allocated. The present application describes the case where the second process, like the first process, is allocated to the target NUMA node.
Specifically, similar to the foregoing allocation of the first process, the controller obtains the allocable resource information of each NUMA node among the plurality of NUMA nodes, where the allocable resource information includes a second computing resource and a second memory resource, and calculates the ratio between the second computing resource and the second memory resource as the third ratio. For the target NUMA node, this is the updated allocable resource information. It should be noted that, after the first process is allocated to the target NUMA node, the next allocation may be directly for the second process. In that case, among the NUMA nodes of the computer device, only the allocable resource information of the target NUMA node has changed; the other NUMA nodes have not been used to allocate other processes, so their allocable resource information is unchanged, that is, their first computing resources and first memory resources are the same as their second computing resources and second memory resources. On the other hand, after the first process is allocated to the target NUMA node, corresponding NUMA nodes may first be allocated to other processes (not including the second process) before it is the second process's turn. In that case, every NUMA node that participated in the resource allocation of those other processes has changed allocable resource information, that is, its first computing resource and first memory resource differ from its second computing resource and second memory resource; only the NUMA nodes to which no process was allocated during that period keep their allocable resource information unchanged.
Therefore, in the present application, the value of the first computing resource and the value of the second computing resource may be the same or different; the value of the first memory resource and the value of the second memory resource may be the same or different.
The controller obtains the resource demand information of the second process, where the resource demand information includes a second computing resource demand and a second memory resource demand, and the ratio between the second computing resource demand and the second memory resource demand is the fourth ratio. From the plurality of NUMA nodes, the controller determines the NUMA node with the smallest difference between its third ratio and the fourth ratio as the target NUMA node. After the target NUMA node corresponding to the second process is determined, the second process can be allocated to the target NUMA node, so that the target NUMA node can be used to run the second process.
It should be understood that steps 3031 to 3034 describe the process allocation flow in the present application, and this flow is applicable to any process. If new process tasks are added to the computer device while subsequent processes are running, the new processes may also be allocated using the flow described in steps 3031 to 3034, which is not repeated here.
In this embodiment of the application, each process in the computer device may be allocated to a corresponding NUMA node to run through the process allocation flows shown in steps 301 to 303. For the flow of allocating other processes to respective NUMA nodes, please refer to the description of step 301 to step 303 specifically, which is not described herein again.
304. Share the computing resources in each NUMA node among the processes in the node;
After each process is allocated to a corresponding NUMA node, the controller may obtain resource allocation information, where the resource allocation information indicates that each process in a NUMA node may, when running, use the computing resources corresponding to any processor core in that NUMA node. Accordingly, the controller runs the corresponding processes on each NUMA node according to the resource allocation information.
Taking the example that the plurality of processes distributed to the target NUMA node include the target process, when the target process needs to be run, the controller determines a plurality of processor cores, which can be used for running the target process, in the target NUMA node according to the resource distribution information. Further, the controller determines an idle processor core from the plurality of processor cores, wherein a part or all of the computing resources in the idle processor core are not used, and the controller may run the target process on the idle processor core.
In the embodiment of the application, each process is not bound to a certain fixed processor core to run, but the computing resources in the NUMA node are shared to each process in the node, and each process can use the computing resources of any processor core of the NUMA node to complete running. On the other hand, each process only runs on the NUMA node allocated to the process, so that the condition that a certain process accesses resources across nodes is avoided, and the running efficiency of the process is improved.
In other words, in the embodiment of the present application, the same processor core may be used by multiple processes while they run. Taking a first process requesting to run in the target NUMA node as an example, the controller may determine a target processor core from the plurality of processor cores, the target processor core including a first computing resource and a second computing resource, where the first computing resource is already being used to run a second process and the second computing resource is an idle resource. The controller may run the first process using the second computing resource of the target processor core. In this manner, a process to be run in the NUMA node can be allocated to a processor core that is already running other processes. That is, in the present application, the same processor core can run multiple different processes at the same time, avoiding the situation where the idle resources remaining on a processor core after it takes on one process cannot be used by other processes. This improves the utilization rate of each processor core in the NUMA node and reduces the waste of resources.
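One possible sketch of placing a process on a partially used core, as described above (the capacity model of one unit per core and the helper name are illustrative assumptions):

```python
def find_shared_core(used_shares, demand_share):
    """Return the index of a core whose idle share covers `demand_share`,
    or None if no single core has enough idle capacity.

    `used_shares[i]` is the fraction of core i already used by running
    processes; each core has a capacity of 1.0 in this model.
    """
    for i, used in enumerate(used_shares):
        if 1.0 - used >= demand_share:
            return i
    return None

# Core 0 already runs a process using 0.6 of its capacity; a new process
# needing 0.3 cores can share core 0 instead of claiming a fresh core.
slot = find_shared_core([0.6, 0.0], 0.3)
```

The point of the scheme is visible in the example: without sharing, the 0.4 idle share on core 0 would be stranded and a whole extra core would be consumed.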
Further, during actual running, the resource demand of a process often fluctuates within a certain range. For example, the number of processor cores occupied by some processes may exceed the resource requirement of those processes in the standard running state; in that case, the processes may use the computing resources of other idle processor cores in the current NUMA node to complete their run. Conversely, if the number of processor cores occupied by some processes falls below the resource requirement in the standard running state, the computing resources released by those processes may be used by other processes. Therefore, the process running method of the present application can satisfy the resource requirements of processes in different running states, further improving the resource utilization of the computer device.
Further, in practical applications, only a portion of the processor cores in the NUMA node may be shared, so that the remaining, non-shared processor cores can be used to run specific processes, each of which then has an independent processor core available. Specifically, in this embodiment of the present application, a certain number of processor cores (that is, a first processor core) may be selected in the target NUMA node, and the computing resources of the first processor core are not shared with, but isolated from, the computing resources of the other processor cores in the target NUMA node. The resource allocation information in the present application is also used to instruct the target NUMA node to run a specific process using the first processor core, and the computing resource corresponding to the first processor core can be used only by that specific process. In this way, the specific process has an independent processor core (the first processor core) available while it runs, does not need to use the computing resources of other processor cores, and its independent computing resources are not used by other processes, which guarantees the resource requirement of the specific process during running.
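The core-isolation design can be sketched as a policy table that maps each reserved core to the only process allowed to use it; the `IsolationPolicy` class and the name-based lookup are hypothetical simplifications, since the patent only requires that reserved and shared computing resources be isolated from each other.

```python
class IsolationPolicy:
    """Sketch: some cores of a NUMA node are shared by all processes,
    while reserved ("first") cores are isolated for specific processes."""

    def __init__(self, shared_cores, reserved):
        self.shared = set(shared_cores)
        self.reserved = dict(reserved)       # core_id -> specific process

    def cores_usable_by(self, process: str):
        own = {cid for cid, p in self.reserved.items() if p == process}
        # A specific process runs only on its reserved core(s); every
        # other process shares the non-reserved pool.
        return own if own else set(self.shared)
```

For example, with cores 0 to 2 shared and core 3 reserved for a process named "db", only "db" may use core 3, and "db" does not draw on the shared pool.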
On the basis of the embodiments corresponding to fig. 2 to fig. 4, and in order to better implement the above solution of the embodiments of the present application, related equipment for implementing the solution is provided below. Specifically, referring to fig. 5, fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application, where the computer device includes:
an obtaining unit 501, configured to obtain resource allocation information, where the resource allocation information is used to instruct a target NUMA node to run multiple processes using computing resources corresponding to multiple processor cores, and when each process is run, the computing resource corresponding to each processor core in the multiple processor cores may be used;
a running unit 502, configured to run multiple processes on the target NUMA node according to the resource allocation information.
In one possible design, the plurality of processor cores are all processor cores in the target NUMA node.
In one possible design, the target NUMA node further includes a first processor core, the resource allocation information is further to indicate that the target NUMA node uses the first processor core to run a particular process, and that computing resources corresponding to the first processor core are usable only by the particular process.
In a possible design, the multiple processes include a target process, and the running unit 502 is specifically configured to:
determining the plurality of processor cores according to the resource allocation information; determining an idle processor core from the plurality of processor cores; and running the target process on the idle processor core.
In one possible design, the computer device includes a plurality of NUMA nodes, the plurality of processes includes a first process, the computer device further includes a determining unit 503;
the obtaining unit 501 is further configured to obtain allocable resource information of each NUMA node in the plurality of NUMA nodes, where the allocable resource information includes a first computing resource and a first memory resource, and a ratio between the first computing resource and the first memory resource is a first ratio;
the obtaining unit 501 is further configured to obtain resource demand information of the first process, where the resource demand information includes a first computing resource demand and a first memory resource demand, and a ratio between the first computing resource demand and the first memory resource demand is a second ratio;
a determining unit 503, configured to determine, according to a difference between the first ratio and the second ratio corresponding to each NUMA node in the plurality of NUMA nodes, a NUMA node with a smallest difference as a target NUMA node from the plurality of NUMA nodes, where the target NUMA node is used to run the first process.
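The selection rule above (compute, for each NUMA node, the difference between its allocable-resource ratio and the process's demand ratio, then pick the node with the smallest difference) can be sketched as below; the dict-based node records and the field names `cpu`, `mem`, and `name` are hypothetical simplifications.

```python
def pick_numa_node(nodes, demand_cpu, demand_mem):
    """Return the name of the NUMA node whose allocable CPU:memory ratio
    (the "first ratio") is closest to the process's demanded CPU:memory
    ratio (the "second ratio")."""
    second_ratio = demand_cpu / demand_mem
    return min(
        nodes,
        key=lambda n: abs(n["cpu"] / n["mem"] - second_ratio),
    )["name"]
```

A process demanding 2 cores and 8 GB (ratio 0.25) is matched to a node with 4 allocable cores and 16 GB (also 0.25) rather than to a node with 8 cores and 8 GB (ratio 1.0), since the former leaves resources in better proportion for later processes.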
In one possible design, the computer device further includes:
an updating unit 504, configured to update allocable resource information of the target NUMA node according to the resource demand information of the first process to obtain updated allocable resource information, where the updated allocable resource information includes a second computing resource, and the second computing resource is a difference between the first computing resource and the first computing resource demand.
In one possible design, the updated allocable resource information includes a second memory resource, where the second memory resource is a difference between the first memory resource and the first memory resource demand.
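After the first process is placed, the node's allocable resources are reduced by the process's demand, yielding the updated allocable resource information (second resource = first resource minus first demand). A minimal sketch, using a hypothetical dict representation of a node's allocable resources:

```python
def update_allocable(node, demand_cpu, demand_mem):
    """Subtract the placed process's demand from the node's allocable
    resources and return the updated record."""
    return {
        "name": node["name"],
        "cpu": node["cpu"] - demand_cpu,   # second computing resource
        "mem": node["mem"] - demand_mem,   # second memory resource
    }
```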
In one possible design, the plurality of processes further includes a second process;
the obtaining unit 501 is further configured to obtain allocable resource information of each NUMA node in the plurality of NUMA nodes, where the allocable resource information includes a second computing resource and a second memory resource, a ratio between the second computing resource and the second memory resource is a third ratio, and the allocable resource information of the target NUMA node is updated allocable resource information;
the obtaining unit 501 is further configured to obtain resource demand information of a second process, where the resource demand information includes a second computing resource demand and a second memory resource demand, and a ratio between the second computing resource demand and the second memory resource demand is a fourth ratio;
a determining unit 503, configured to determine, according to a difference between the third ratio and the fourth ratio corresponding to each NUMA node in the plurality of NUMA nodes, a NUMA node with a smallest difference as a target NUMA node from the plurality of NUMA nodes, where the target NUMA node is used to run the second process.
In one possible design, the first computing resource demand is expressed as M processor cores, where M is a positive number and may be a non-integer (fractional) value.
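Because M may be fractional, a demand can be decomposed into a number of whole cores plus a partial share of one more core that the process uses alongside other processes under the shared-core model. The helper below is an illustrative assumption, not part of the claimed method:

```python
import math


def split_core_demand(m: float):
    """Split a possibly fractional core demand M into (whole cores,
    fractional remainder), e.g. M = 2.5 -> (2, 0.5): two full cores
    plus half of one more, shareable core."""
    whole = math.floor(m)
    return whole, round(m - whole, 6)
```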
It should be noted that, the contents of information interaction, execution process, and the like between the modules/units in the computer device are based on the same concept as the method embodiments corresponding to fig. 2 to fig. 4 in the present application, and specific contents may refer to the description in the foregoing method embodiments in the present application, and are not described herein again.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a computer device provided in this embodiment. The computer device described in the embodiment corresponding to fig. 5 may be deployed on the computer device 600 to implement the functions of the controller in the embodiment corresponding to fig. 3 or fig. 4. Specifically, the computer device 600 is implemented by one or more servers and may vary considerably with configuration or performance. It may include one or more central processing units (CPUs) 622 (for example, one or more processors), a memory 632, and one or more storage media 630 (for example, one or more mass storage devices) storing an application program 642 or data 644. The memory 632 and the storage medium 630 may be transient or persistent storage. The program stored in the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations on the computer device. Still further, the central processing unit 622 may be configured to communicate with the storage medium 630 to execute, on the computer device 600, the series of instruction operations in the storage medium 630.
The computer device 600 may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, and/or one or more operating systems 641, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
An embodiment of the present application also provides a computer program product that, when run on a computer, causes the computer to perform the steps performed by the controller in the method described in the embodiments shown in fig. 3 or fig. 4.
An embodiment of the present application also provides a computer-readable storage medium that stores a program for signal processing; when the program runs on a computer, the computer is caused to perform the steps performed by the controller in the method described in the embodiments shown in fig. 3 or fig. 4.
It should be noted that the above-described apparatus embodiments are merely schematic. The units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided in the present application, the connection relationship between modules indicates that there is a communication connection between them, which may be implemented as one or more communication buses or signal lines.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus necessary general-purpose hardware, and certainly also by special-purpose hardware including application-specific integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components, and the like. Generally, any function performed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structures used to implement the same function may vary: analog circuits, digital circuits, dedicated circuits, and so on. For the present application, however, a software implementation is preferable in most cases. Based on such an understanding, the technical solution of the present application, or the portion of it that contributes over the prior art, may be embodied in the form of a software product. The software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a training device, or a network device) to perform the methods of the embodiments of the present application.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device such as a training device or a data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)), among others.

Claims (21)

1. A method of running a process, the method applied to a computer system including a target non-uniform memory access (NUMA) node and a controller, the method comprising:
the controller acquires resource allocation information, wherein the resource allocation information is used for indicating the target NUMA node to use computing resources corresponding to a plurality of processor cores to run a plurality of processes, and when each process is run, the computing resources corresponding to each of the plurality of processor cores can be used;
and the controller runs the plurality of processes on the target NUMA node according to the resource allocation information.
2. The method of claim 1, wherein the plurality of processor cores are all of the processor cores in the target NUMA node.
3. The method of claim 1, wherein the target NUMA node further comprises a first processor core, wherein the resource allocation information is further to indicate that the target NUMA node uses the first processor core to run a particular process, and wherein the computing resource corresponding to the first processor core is usable only by the particular process.
4. The method of claim 1, 2, or 3, wherein the plurality of processes comprises a target process, and wherein the controller running the plurality of processes on the target NUMA node based on the resource allocation information comprises:
the controller determines the plurality of processor cores according to the resource allocation information;
the controller determines an idle processor core from the plurality of processor cores;
the controller runs the target process on the idle processor core.
5. The method of claim 1, 2, or 3, wherein the computer system comprises a plurality of NUMA nodes, the plurality of processes comprises a first process, and before the controller obtains the resource allocation information, the method further comprises:
the controller obtains allocable resource information of each NUMA node in a plurality of NUMA nodes, wherein the allocable resource information comprises a first computing resource and a first memory resource, and the ratio between the first computing resource and the first memory resource is a first ratio;
the controller acquires resource demand information of the first process, wherein the resource demand information comprises a first computing resource demand and a first memory resource demand, and the ratio of the first computing resource demand to the first memory resource demand is a second ratio;
the controller determines, from the plurality of NUMA nodes, a NUMA node with the smallest difference as the target NUMA node according to a difference between the first ratio and the second ratio corresponding to each NUMA node in the plurality of NUMA nodes, where the target NUMA node is used to run the first process.
6. A method of running a process, the method applied to a computer system, the computer system including a target non-uniform memory access (NUMA) node and a controller, the target NUMA node including a plurality of processor cores, the method comprising:
the controller determines a first process, wherein the first process is a process to be run in the target NUMA node;
the controller determines a target processor core from the plurality of processor cores, wherein the target processor core comprises a first computing resource and a second computing resource, the first computing resource is used for running a second process, and the second computing resource is an idle resource;
the controller runs the first process using the second computing resource of the target processor core.
7. The method in accordance with claim 6, wherein the plurality of processor cores are all processor cores in the target NUMA node.
8. The method of claim 6, wherein the target NUMA node further comprises a first processor core, wherein the first processor core is to run a particular process, and wherein computing resources corresponding to the first processor core are only available for use by the particular process.
9. A computer device, characterized in that the computer device comprises:
the non-uniform memory access NUMA node comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring resource allocation information which is used for indicating a target non-uniform memory access NUMA node to use computing resources corresponding to a plurality of processor cores to run a plurality of processes, and when each process is run, the computing resources corresponding to each of the plurality of processor cores can be used;
and the running unit is used for running the processes on the target NUMA node according to the resource allocation information.
10. The computer device in accordance with claim 9, wherein the plurality of processor cores are all processor cores in the target NUMA node.
11. The computer device of claim 9, wherein the target NUMA node further includes a first processor core, wherein the resource allocation information is further to instruct the target NUMA node to run a particular process using the first processor core, and wherein the computing resource corresponding to the first processor core is usable only by the particular process.
12. The computer device according to claim 9, 10, or 11, wherein the plurality of processes comprises a target process, and the running unit is specifically configured to:
determining the plurality of processor cores according to the resource allocation information;
determining an idle processor core from the plurality of processor cores;
running the target process on the idle processor core.
13. The computer device of claim 9, 10, or 11, wherein the computer device comprises a plurality of NUMA nodes, the plurality of processes comprises a first process, and the computer device further comprises a determining unit;
the obtaining unit is further configured to obtain allocable resource information of each NUMA node in the plurality of NUMA nodes, where the allocable resource information includes a first computing resource and a first memory resource, and a ratio between the first computing resource and the first memory resource is a first ratio;
the acquiring unit is further configured to acquire resource demand information of the first process, where the resource demand information includes a first computing resource demand and a first memory resource demand, and a ratio between the first computing resource demand and the first memory resource demand is a second ratio;
the determining unit is configured to determine, according to a difference between the first ratio and the second ratio corresponding to each NUMA node in the plurality of NUMA nodes, a NUMA node with a smallest difference from among the plurality of NUMA nodes as the target NUMA node, where the target NUMA node is configured to run the first process.
14. A computer device, wherein the computer device comprises a target non-uniform memory access (NUMA) node, wherein the target NUMA node comprises a plurality of processor cores, and wherein the computer device comprises:
a determining unit, configured to determine a first process, where the first process is a process to be run in the target NUMA node;
the determining unit is further configured to determine a target processor core from the plurality of processor cores, where the target processor core includes a first computing resource and a second computing resource, the first computing resource is used to run a second process, and the second computing resource is an idle resource;
an execution unit to execute the first process using the second computing resource of the target processor core.
15. The computer device in accordance with claim 14, wherein the plurality of processor cores are all processor cores in the target NUMA node.
16. The computer device of claim 14, wherein the target NUMA node further comprises a first processor core, wherein the first processor core is to run a particular process, and wherein computing resources corresponding to the first processor core are only available for use by the particular process.
17. A computer device comprising a target non-uniform memory access (NUMA) node, the target NUMA node comprising a plurality of processors;
the plurality of processors to provide computing resources for the target NUMA node;
the target NUMA node is to run a plurality of processes using computing resources corresponding to the plurality of processors, and the computing resources corresponding to each of the plurality of processors may be used while each process is running.
18. A computer device comprising a target non-uniform memory access (NUMA) node and a controller, the target NUMA node comprising a plurality of processors;
the plurality of processors to provide computing resources for the target NUMA node;
the target NUMA node is used for running a plurality of processes by using computing resources corresponding to the plurality of processors;
the controller is configured to determine a first process, where the first process is a process to be run among the plurality of processes;
the controller is further configured to determine a target processor core from the plurality of processors, where the target processor core includes a first computing resource and a second computing resource, the first computing resource is used to run a second process, and the second computing resource is an idle resource;
the controller is further configured to run the first process using the second computing resource of the target processor core.
19. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any of claims 1 to 5, or which, when executed by a processor, implements the method of any of claims 6 to 8.
20. A computer program product having stored therein computer readable instructions for implementing the method of any one of claims 1 to 5 when executed by a processor or for implementing the method of any one of claims 6 to 8 when executed by a processor.
21. A chip system, characterized in that the chip system comprises at least one processor, and when program instructions are executed by the at least one processor, the method according to any one of claims 1 to 5 is performed, or the method according to any one of claims 6 to 8 is performed.
CN202110937787.4A 2021-08-16 2021-08-16 Process running method and related equipment Pending CN115705247A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110937787.4A CN115705247A (en) 2021-08-16 2021-08-16 Process running method and related equipment
PCT/CN2022/090190 WO2023020010A1 (en) 2021-08-16 2022-04-29 Process running method, and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110937787.4A CN115705247A (en) 2021-08-16 2021-08-16 Process running method and related equipment

Publications (1)

Publication Number Publication Date
CN115705247A true CN115705247A (en) 2023-02-17

Family

ID=85180393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110937787.4A Pending CN115705247A (en) 2021-08-16 2021-08-16 Process running method and related equipment

Country Status (2)

Country Link
CN (1) CN115705247A (en)
WO (1) WO2023020010A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116483013A (en) * 2023-06-19 2023-07-25 成都实时技术股份有限公司 High-speed signal acquisition system and method based on multichannel collector

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117389749B (en) * 2023-12-12 2024-03-26 深圳市吉方工控有限公司 Task processing method, device, equipment and storage medium based on double mainboards

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018032519A1 (en) * 2016-08-19 2018-02-22 华为技术有限公司 Resource allocation method and device, and numa system
CN107479976A (en) * 2017-08-14 2017-12-15 郑州云海信息技术有限公司 A kind of multiprogram example runs lower cpu resource distribution method and device simultaneously
CN110597639B (en) * 2019-09-23 2021-07-30 腾讯科技(深圳)有限公司 CPU distribution control method, device, server and storage medium
JP7440739B2 (en) * 2019-11-25 2024-02-29 富士通株式会社 Information processing device and parallel computing program
CN112486679A (en) * 2020-11-25 2021-03-12 北京浪潮数据技术有限公司 Pod scheduling method, device and equipment for kubernets cluster

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116483013A (en) * 2023-06-19 2023-07-25 成都实时技术股份有限公司 High-speed signal acquisition system and method based on multichannel collector
CN116483013B (en) * 2023-06-19 2023-09-05 成都实时技术股份有限公司 High-speed signal acquisition system and method based on multichannel collector

Also Published As

Publication number Publication date
WO2023020010A1 (en) 2023-02-23

Similar Documents

Publication Publication Date Title
CN108701059B (en) Multi-tenant resource allocation method and system
US8261281B2 (en) Optimizing allocation of resources on partitions of a data processing system
JP3944175B2 (en) Dynamic processor reallocation between partitions in a computer system.
CN108519917B (en) Resource pool allocation method and device
JP2004062911A (en) System for managing allocation of computer resource
CN111966500A (en) Resource scheduling method and device, electronic equipment and storage medium
CN110221920B (en) Deployment method, device, storage medium and system
CN108900626B (en) Data storage method, device and system in cloud environment
WO2023020010A1 (en) Process running method, and related device
JP7506096B2 (en) Dynamic allocation of computing resources
CN110990154A (en) Big data application optimization method and device and storage medium
WO2020108337A1 (en) Cpu resource scheduling method and electronic equipment
CN114116173A (en) Method, device and system for dynamically adjusting task allocation
CN113204421A (en) Serverless co-distribution of functions and storage pools
CN107590000B (en) Secondary random resource management method/system, computer storage medium and device
CN116360973A (en) Data processing system and method of operation thereof
CN116302327A (en) Resource scheduling method and related equipment
WO2022063273A1 (en) Resource allocation method and apparatus based on numa attribute
US7426622B2 (en) Rapid locality selection for efficient memory allocation
CN115202859A (en) Memory expansion method and related equipment
CN117632457A (en) Method and related device for scheduling accelerator
CN112416538B (en) Multi-level architecture and management method of distributed resource management framework
CN111796932A (en) GPU resource scheduling method
KR101989033B1 (en) Appratus for managing platform and method for using the same
CN116483536B (en) Data scheduling method, computing chip and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication