WO2012163024A1 - Memory management method and apparatus for multi-step non-uniform memory access NUMA architecture - Google Patents

Memory management method and apparatus for multi-step non-uniform memory access NUMA architecture

Info

Publication number
WO2012163024A1
WO2012163024A1 (application PCT/CN2011/081440)
Authority
WO
WIPO (PCT)
Prior art keywords
memory
node
node group
group
usage status
Prior art date
Application number
PCT/CN2011/081440
Other languages
French (fr)
Chinese (zh)
Inventor
章晓峰
王伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to PCT/CN2011/081440 priority Critical patent/WO2012163024A1/en
Priority to CN2011800021962A priority patent/CN102439570A/en
Publication of WO2012163024A1 publication Critical patent/WO2012163024A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F 12/0646 Configuration or reconfiguration
    • G06F 2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F 2212/25 Using a specific main memory architecture
    • G06F 2212/254 Distributed memory
    • G06F 2212/2542 Non-uniform memory access [NUMA] architecture

Definitions

  • The present invention relates to the field of memory management technologies, and in particular to a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture.
  • Non-Uniform Memory Access (NUMA)
  • In the prior art, the memory hierarchy is determined during system initialization from the memory topology recorded in the basic input/output system (BIOS), and an allocation principle is fixed before any memory is allocated: either the proximity principle (efficiency first) or the uniform-distribution principle (bandwidth first). Under the proximity principle, memory is preferentially allocated on the node with the smallest access delay; if that node's memory is full, the system first tries to release memory there, and if the request still cannot be satisfied, it tries the node with the second-smallest access delay, and so on until the allocation succeeds. Under the uniform-distribution principle, memory is spread evenly across all nodes.
  • In a multi-hop memory architecture the memory access latency is non-uniformly distributed; assume, for example, access delays of 100 ns, 140 ns, 500 ns, 900 ns, and 1040 ns.
  • If a separate node is set up for each hop, neither the nearest-allocation principle nor the even-allocation principle alone can balance efficiency and bandwidth. With efficiency first (nearest allocation), an allocation that fails on the 100 ns node must first attempt to release memory there before falling back to the 140 ns node, even though the difference between 100 ns and 140 ns has little effect on performance; the allocation overhead is therefore large and only a single node's bandwidth can be used.
  • With bandwidth first (uniform distribution), a large number of cross-node accesses is required, which hurts system efficiency.
  • The object of the invention is a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture that exploit the characteristics of such an architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.
  • A memory management method for a multi-step non-uniform memory access NUMA architecture comprises:
  • when the system is initialized, determining node groups according to each node's memory access delay information and the user configuration information;
  • acquiring the memory usage status of each node group and of the nodes within each group, the memory usage status including a memory usage ratio and an idle-status indication;
  • when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory;
  • within the selected node group, allocating memory to nodes in the group according to the memory usage status of those nodes.
  • A memory management apparatus for a multi-step non-uniform memory access NUMA architecture comprises:
  • a node group setting module, configured to determine node groups according to each node's memory access delay information and user configuration information when the system is initialized;
  • a memory usage monitoring module, configured to acquire the memory usage status of each node group and of the nodes in each group, the memory usage status including a memory usage ratio and an idle-status indication;
  • a node group selection module, configured, when the system initiates a node memory allocation request, to select the idle node group with the smallest memory access delay according to the memory usage status of each node group;
  • a memory allocation module, configured, within the selected node group, to allocate memory to nodes in the group according to their memory usage status.
  • As can be seen from the technical solution provided above, the method includes: determining node groups during system initialization according to each node's memory access delay information and user configuration information; acquiring the memory usage status of each node group and of the nodes within each group, the status including a memory usage ratio and an idle-status indication; when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay; and, within the selected node group, allocating memory to its nodes according to their memory usage status.
  • FIG. 1 is a schematic flowchart of a memory management method for a multi-step non-uniform memory access NUMA architecture according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an original memory topology in a specific example according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a topology structure of an improved node group in a specific example according to an embodiment of the present invention.
  • FIG. 4 is another schematic structural diagram of an improved node group topology in a specific example according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a memory management apparatus according to an embodiment of the present invention.
  • An embodiment of the invention provides a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture; the method and apparatus exploit the characteristics of the multi-step non-uniform memory access architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.
  • FIG. 1 is a schematic flowchart of the memory management method for a multi-step non-uniform memory access NUMA architecture according to an embodiment of the present invention. The method includes:
  • Step 11: When the system is initialized, determine the node groups according to each node's memory access delay information and the user configuration information.
  • In this step, at operating-system initialization, the Node Group structure is determined according to each node's memory access delay information and the user configuration settings. Specifically:
  • Because memory hardware sits at different physical locations, the delay for a CPU to access different memories differs, and each distinct access delay gives rise to a separate node.
  • First, each node's memory delay information is obtained; then, according to the user's configuration policy, nodes with similar delays are merged into one node group. For example, nodes whose delays differ by less than 50% can be merged into one group.
  • FIG. 2 is a schematic diagram of an original memory topology in a specific example according to an embodiment of the present invention:
  • Example 1: The solution of this embodiment can generate an optimized Node Group topology through a configuration policy (user configuration or automatic configuration at system initialization). FIG. 3 shows the improved node group topology for this example. Suppose the user configuration or system initialization policy states the rule: merge nodes whose access delays differ by no more than 50% into one node group.
  • The measured memory access delays are 100 ns, 140 ns, 900 ns, and 1040 ns.
  • Under this rule, the 100 ns and 140 ns nodes are merged into one node group, and the 900 ns and 1040 ns nodes into another, producing the topology shown in FIG. 3.
  • In another example: the access delay within each of Node A, Node B, ..., Node H is 100 ns; within Node Group AB, Node Group CD, Node Group EF, and Node Group GH it is 140 ns; within Node Group ABCD and Node Group EFGH it is 300 ns; and within Node Group ABCDEFGH it is 1040 ns.
  • Example 2: As in Example 1, this embodiment can generate an optimized topology according to the configuration policy. FIG. 4 shows another improved node group topology. Suppose the configured policy is: merge nodes whose delays differ by less than a factor of 3 into one node group, and the system memory latencies are 100 ns, 140 ns, 300 ns, and 1040 ns. Under this policy the 100 ns, 140 ns, and 300 ns nodes are merged into one node group, producing the topology shown in FIG. 4, in which the delay within Node Group ABCD and Node Group EFGH is below 300 ns and the delay within Node Group ABCDEFGH is below 1040 ns.
  • Step 12: Obtain the memory usage status of each node group and of the nodes within each group.
  • In this step, a memory usage monitoring module set up in the memory management architecture monitors the memory usage of each Node Group and of each Node within a group.
  • The memory usage status includes the memory usage ratio and an idle-status indication, as in the monitoring status shown in Table 1.
  • Step 13: When the system initiates a node memory allocation request, select, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory.
  • In this step, when the system initiates a node memory allocation request and memory needs to be allocated to a node in the system, the corresponding node group is selected first according to each node group's memory usage status; that is, memory is allocated on the idle node group with the smallest memory access delay, and the subsequent operations are then performed within the selected group.
  • Step 14: Within the selected node group, allocate memory to its nodes according to their memory usage status.
  • In this step, an appropriate memory allocation policy is chosen according to the memory usage status of the nodes in the selected group, and memory is allocated to nodes within that group.
  • If the selected memory allocation policy is bandwidth priority, the memory is distributed evenly across the nodes of the selected group according to each node's memory usage status.
  • If the selected memory allocation policy is delay priority, the memory is allocated on the node with the smallest delay in the selected group.
  • If the selected memory allocation policy is the default (Default), the memory is allocated to a randomly chosen node within the selected group.
  • By implementing the above technical solution, the characteristics of the multi-step non-uniform memory access architecture can be exploited to balance efficiency and bandwidth effectively, improving the system's memory management performance.
  • FIG. 5 is a schematic structural diagram of a memory management device according to an embodiment of the present invention, where the device includes:
  • The node group setting module is configured to determine node groups according to each node's memory access delay information and user configuration information during system initialization; see the method embodiment above for the specific implementation.
  • The memory usage monitoring module is configured to obtain the memory usage status of each node group and of the nodes in each group, the status including a memory usage ratio and an idle-status indication; see the method embodiment above.
  • The node group selection module is configured, when the system initiates a node memory allocation request, to select the idle node group with the smallest memory access delay according to the memory usage status of each node group; see the method embodiment above.
  • The memory allocation module is configured, within the selected node group, to allocate memory to nodes in the group according to their memory usage status; see the method embodiment above.
  • the memory allocation module may further include:
  • the bandwidth priority allocation module is configured to allocate the memory evenly on each node according to the memory usage status of each node in the selected node group when the selected memory allocation policy is bandwidth priority.
  • the delay priority allocation module is configured to allocate memory to the node with the smallest delay in the selected node group when the selected memory allocation policy is delay priority.
  • the default allocation module is configured to randomly allocate memory to the node within the selected node group when the selected memory allocation policy is the default Default.
  • The modules described above are divided only by functional logic; the invention is not limited to this division, so long as the corresponding functions can be implemented. Likewise, the specific names of the functional units are chosen only for ease of distinction and do not limit the protection scope of the present invention.
  • The storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
  • The method and apparatus exploit the characteristics of the multi-step non-uniform memory access architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.


Abstract

A memory management method and apparatus for a multi-step non-uniform memory access (NUMA) architecture. The method comprises: when the system is initialized, determining node groups according to each node's memory access delay information and user configuration information; obtaining the memory usage status of each node group and of the nodes within each group, the status comprising memory usage ratios and idle-state indications; when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory; and, within the selected node group, allocating memory to its nodes according to their memory usage status. The method exploits the characteristics of the multi-step non-uniform memory access architecture, effectively balancing efficiency and bandwidth and improving the memory management performance of the system.

Description

Memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture
Technical Field
The present invention relates to the field of memory management technologies, and in particular to a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture.
Background of the Invention
Currently, the non-uniform memory access (NUMA) architecture has become the mainstream system architecture in the server field. To reduce the large difference between near-end and far-end memory access costs under NUMA, the new multi-hop hierarchical NUMA architecture is being used more and more widely. In the prior art, the memory hierarchy is determined during system initialization from the memory topology recorded in the basic input/output system (BIOS), and an allocation principle is fixed before any memory is allocated: either the proximity principle (efficiency first) or the uniform-distribution principle (bandwidth first). Under the proximity principle, memory is preferentially allocated on the node with the smallest access delay; if that node's memory is full, the system first tries to release memory there, and if the request still cannot be satisfied, it tries the node with the second-smallest access delay, and so on until the allocation succeeds. Under the uniform-distribution principle, memory is spread evenly across all nodes.
In a multi-hop memory architecture the memory access latency is non-uniformly distributed; assume, for example, access delays of 100 ns, 140 ns, 500 ns, 900 ns, and 1040 ns. In the prior art, if a separate node is set up for each hop, neither the nearest-allocation principle nor the even-allocation principle alone can balance efficiency and bandwidth. With efficiency first (nearest allocation), an allocation that fails on the 100 ns node must first attempt to release memory there before falling back to the 140 ns node, even though the difference between 100 ns and 140 ns has little effect on performance; the allocation overhead is therefore large and only a single node's bandwidth can be used. With bandwidth first (uniform distribution), a large number of cross-node accesses is required, which hurts system efficiency.
Summary of the Invention
The object of the present invention is a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture that exploit the characteristics of such an architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.
A memory management method for a multi-step non-uniform memory access NUMA architecture, the method comprising:
when the system is initialized, determining node groups according to each node's memory access delay information and the user configuration information;
acquiring the memory usage status of each node group and of the nodes within each group, the memory usage status including a memory usage ratio and an idle-status indication;
when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory;
within the selected node group, allocating memory to nodes in the group according to the memory usage status of those nodes.
A memory management apparatus for a multi-step non-uniform memory access NUMA architecture, the apparatus comprising:
a node group setting module, configured to determine node groups according to each node's memory access delay information and user configuration information when the system is initialized;
a memory usage monitoring module, configured to acquire the memory usage status of each node group and of the nodes in each group, the memory usage status including a memory usage ratio and an idle-status indication;
a node group selection module, configured, when the system initiates a node memory allocation request, to select the idle node group with the smallest memory access delay according to the memory usage status of each node group;
a memory allocation module, configured, within the selected node group, to allocate memory to nodes in the group according to their memory usage status.
As can be seen from the technical solution provided above, the method includes: determining node groups during system initialization according to each node's memory access delay information and user configuration information; acquiring the memory usage status of each node group and of the nodes within each group, the status including a memory usage ratio and an idle-status indication; when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay; and, within the selected node group, allocating memory to its nodes according to their memory usage status. The method thereby exploits the characteristics of the multi-step non-uniform memory access architecture, effectively balancing efficiency and bandwidth and improving the system's memory management performance.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic flowchart of the memory management method for a multi-step non-uniform memory access NUMA architecture provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the original memory topology in a specific example of an embodiment of the present invention;
FIG. 3 is a schematic diagram of the improved node group topology in a specific example of an embodiment of the present invention;
FIG. 4 is another schematic diagram of the improved node group topology in a specific example of an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the memory management apparatus provided by an embodiment of the present invention.
Mode for Carrying Out the Invention
An embodiment of the present invention provides a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture; the method and apparatus exploit the characteristics of the multi-step non-uniform memory access architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings. FIG. 1 is a schematic flowchart of the memory management method for a multi-step non-uniform memory access NUMA architecture provided by an embodiment of the present invention. The method includes:
Step 11: When the system is initialized, determine the node groups according to each node's memory access delay information and the user configuration information.
In this step, at operating-system initialization, the Node Group structure is determined according to each node's memory access delay information and the user configuration settings. Specifically:
Because memory hardware sits at different physical locations, the delay for a CPU to access different memories differs, and each distinct access delay gives rise to a separate node. First, each node's memory delay information is obtained; then, according to the user's configuration policy, nodes with similar delays are merged into one node group. For example, nodes whose delays differ by less than 50% can be merged into one group.
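The grouping step just described can be sketched in a few lines. This is a hypothetical illustration, not the patented implementation: the function name `group_nodes`, the node names, and the `max_ratio` parameter are all assumptions; latencies within the given ratio of a group's smallest latency are merged.

```python
def group_nodes(latencies_ns, max_ratio=1.5):
    """Merge nodes with similar access latencies into node groups.

    latencies_ns: dict mapping node name -> access latency in ns.
    max_ratio: latencies within this ratio of the group's smallest
               latency count as "similar" (1.5 == a 50% difference).
    """
    groups = []
    current = []
    for node, lat in sorted(latencies_ns.items(), key=lambda kv: kv[1]):
        # Compare against the smallest latency in the current group.
        if current and lat > latencies_ns[current[0]] * max_ratio:
            groups.append(current)  # gap too large: start a new group
            current = []
        current.append(node)
    if current:
        groups.append(current)
    return groups

# Matching the 50% rule from the text: 100 ns and 140 ns merge,
# 900 ns and 1040 ns merge, and the two pairs stay separate.
print(group_nodes({"A": 100, "B": 140, "C": 900, "D": 1040}))
# → [['A', 'B'], ['C', 'D']]
```

With `max_ratio=3.0` the same function reproduces Example 2 below (100 ns, 140 ns, and 300 ns collapse into one group, 1040 ns stays apart), under the same assumptions.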
The following specific example illustrates this. FIG. 2 shows the original memory topology in this example.
Suppose the original memory topology under the multi-hop architecture is as shown in FIG. 2: the access delay within each of Node A, Node B, ..., Node H is 100 ns; within Node Group AB, Node Group CD, Node Group EF, and Node Group GH it is 140 ns; within Node Group ABCD and Node Group EFGH it is 900 ns; and within Node Group ABCDEFGH it is 1040 ns. In the prior art, each distinct access delay yields a separate node, producing the original memory topology of FIG. 2. The differences between 100 ns and 140 ns, and between 900 ns and 1040 ns, are small, so the original, overly complex hierarchy hurts allocation efficiency and adds management complexity.
Example 1: The solution of this embodiment can generate the optimized Node Group topology 1 through a configuration policy (user configuration or automatic configuration at system initialization). FIG. 3 shows the improved node group topology for this example. Suppose the user configuration or system initialization policy states the rule: merge nodes whose access delays differ by no more than 50% into one node group. The measured memory access delays are 100 ns, 140 ns, 900 ns, and 1040 ns; under this rule the 100 ns and 140 ns nodes are merged into one node group and the 900 ns and 1040 ns nodes into another, producing the topology shown in FIG. 3, in which the delay within each of Node Group AB, Node Group CD, Node Group EF, and Node Group GH is below 140 ns and the delay within Node Group ABCDEFGH is below 1040 ns.
As another example: the access delay within each of Node A, Node B, ..., Node H is 100 ns; within Node Group AB, Node Group CD, Node Group EF, and Node Group GH it is 140 ns; within Node Group ABCD and Node Group EFGH it is 300 ns; and within Node Group ABCDEFGH it is 1040 ns.
Example 2: As in Example 1, this embodiment can generate the optimized topology 2 according to the configuration policy. FIG. 4 shows another improved node group topology. Suppose the configured policy is: merge nodes whose delays differ by less than a factor of 3 into one node group, and the system memory latencies are 100 ns, 140 ns, 300 ns, and 1040 ns. Under this policy the 100 ns, 140 ns, and 300 ns nodes are merged into one node group, producing the topology shown in FIG. 4, in which the delay within Node Group ABCD and Node Group EFGH is below 300 ns and the delay within Node Group ABCDEFGH is below 1040 ns.
Step 12: Obtain the memory usage status of each node group and of the nodes within each group.
In this step, a memory usage monitoring module set up in the memory management architecture monitors the memory usage of each Node Group and of each Node within a group. The memory usage status includes the memory usage ratio and an idle-status indication, as in the monitoring status shown in Table 1 below:
|                    | Node ABCD | Node EFGH | Node AB | Node CD | Node EF | Node GH | ... | Node G | Node H |
| Memory usage ratio | 40%       | 60%       | 20%     | 60%     | 80%     | 40%     | ... | 10%    | 25%    |

Table 1
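For illustration, the Table 1 data could be held as a simple in-memory structure by a monitoring module. This is a sketch under assumptions: the dictionary layout, the naming, and the 90% idle threshold are not specified in the text.

```python
# Hypothetical representation of the Table 1 monitoring data:
# memory usage ratio per node group and per node.
usage = {
    "Node Group ABCD": 0.40, "Node Group EFGH": 0.60,
    "Node Group AB": 0.20, "Node Group CD": 0.60,
    "Node Group EF": 0.80, "Node Group GH": 0.40,
    "Node G": 0.10, "Node H": 0.25,
}

IDLE_THRESHOLD = 0.90  # assumed policy: below this ratio counts as idle


def is_idle(name):
    """Idle-status indication derived from the usage ratio."""
    return usage[name] < IDLE_THRESHOLD


print(is_idle("Node Group EF"))  # → True (80% used, still "idle" here)
```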
Step 13: When the system initiates a node memory allocation request, select, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory.
In this step, when the system initiates a node memory allocation request and memory needs to be allocated to a node in the system, the corresponding node group is selected first according to each node group's memory usage status; that is, memory is allocated on the idle node group with the smallest memory access delay, and the subsequent operations are then performed within the selected group.
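Step 13 amounts to filtering on the idle-status indication and minimizing over latency. A minimal sketch, assuming a per-group record with `latency_ns`, `usage`, and `idle` fields (all names illustrative, not from the patent):

```python
def select_group(groups):
    """Pick the idle node group with the smallest access latency.

    groups: list of dicts with 'name', 'latency_ns', 'usage' (0.0-1.0)
    and an 'idle' flag, as reported by a monitoring module.
    """
    idle = [g for g in groups if g["idle"]]
    if not idle:
        raise MemoryError("no idle node group available")
    return min(idle, key=lambda g: g["latency_ns"])


groups = [
    {"name": "AB", "latency_ns": 140, "usage": 0.95, "idle": False},
    {"name": "CD", "latency_ns": 140, "usage": 0.60, "idle": True},
    {"name": "EFGH", "latency_ns": 1040, "usage": 0.20, "idle": True},
]
# AB is fastest but not idle, so the next-fastest idle group wins.
print(select_group(groups)["name"])  # → CD
```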
步骤14:在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上。Step 14: In the selected node group, allocate memory to the nodes in the node group according to the memory usage status of the nodes in the node group.
在该步骤中,在进行上述步骤13的操作之后,就可以在所选择的节点组内,根据所述节点组内节点的内存使用状况,选择相应的内存分配策略将内存分配到所述节点组内的节点上。In this step, after the operation of step 13 above has been performed, a corresponding memory allocation policy may be selected within the selected node group according to the memory usage status of the nodes in the node group, and the memory is allocated to the nodes within the node group.
具体来说,可以包括以下几种情况:Specifically, the following situations can be included:
若选择的内存分配策略为带宽优先,则根据所选择的节点组内各节点的内存使用状况,将内存平均分配在各个节点上;If the selected memory allocation policy is bandwidth priority, the memory is evenly distributed on each node according to the memory usage status of each node in the selected node group;
若选择的内存分配策略为延迟优先,则将内存分配在所选择的节点组内延迟最小的节点上;If the selected memory allocation policy is delay priority, the memory is allocated on the node with the smallest delay in the selected node group;
若选择的内存分配策略为默认Default,则将内存在所选择的节点组内随机分配到节点上。If the selected memory allocation policy is the default Default, the memory is randomly assigned to the node within the selected node group.
通过上述技术方案的实施,就可以利用多步长非一致性内存访问架构的特点,有效兼顾效率和带宽,提高了系统的内存管理性能。Through the implementation of the above technical solution, the characteristics of the multi-step non-uniform memory access architecture can be utilized, the efficiency and bandwidth are effectively considered, and the memory management performance of the system is improved.
本发明实施方式还提供了一种针对多步长非一致性内存访问NUMA架构的内存管理装置,如图5所示为本发明实施例所提供的内存管理装置的结构示意图,所述装置包括:The embodiment of the present invention further provides a memory management device for a multi-step non-uniform memory access NUMA architecture. FIG. 5 is a schematic structural diagram of a memory management device according to an embodiment of the present invention, where the device includes:
节点组设置模块,用于在系统初始化时,根据各节点的内存访问延时信息及用户配置信息,确定节点组,具体实现方式见以上方法实施例中所述。The node group setting module is configured to determine a node group according to the memory access delay information and the user configuration information of each node during system initialization, and the specific implementation manner is as described in the foregoing method embodiment.
内存使用监控模块,用于获取各个节点组及组内节点的内存使用状况,所述内存使用状况包括内存使用比例和空闲状态指示,具体实现方式见以上方法实施例中所述。The memory usage monitoring module is configured to obtain a memory usage status of each node group and a node in the group, where the memory usage status includes a memory usage ratio and an idle state indication, and the specific implementation manner is as described in the foregoing method embodiment.
节点组选择模块,用于当系统发起节点内存分配请求时,根据各个节点组的内存使用状况,选择在内存访问延迟最小的空闲节点组上分配内存,具体实现方式见以上方法实施例中所述。The node group selection module is configured to, when the system initiates a node memory allocation request, select the idle node group with the smallest memory access delay for allocating memory according to the memory usage status of each node group; for the specific implementation, see the foregoing method embodiment.
内存分配模块,用于在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上,具体实现方式见以上方法实施例中所述。The memory allocation module is configured to allocate, within the selected node group, memory to the nodes in the node group according to the memory usage status of the nodes in the node group; for the specific implementation, see the foregoing method embodiment.
另外,在具体实现过程中,所述内存分配模块还可以包括:In addition, in a specific implementation process, the memory allocation module may further include:
带宽优先分配模块,用于当所选择的内存分配策略为带宽优先时,根据所选择的节点组内各节点的内存使用状况,将内存平均分配在各个节点上。The bandwidth priority allocation module is configured to allocate the memory evenly on each node according to the memory usage status of each node in the selected node group when the selected memory allocation policy is bandwidth priority.
或,延迟优先分配模块,用于当所选择的内存分配策略为延迟优先时,将内存分配在所选择的节点组内延迟最小的节点上。Or, the delay priority allocation module is configured to allocate memory to the node with the smallest delay in the selected node group when the selected memory allocation policy is delay priority.
或,默认分配模块,用于当所选择的内存分配策略为默认Default时,将内存在所选择的节点组内随机分配到节点上。Or, the default allocation module is configured to randomly allocate memory to the node within the selected node group when the selected memory allocation policy is the default Default.
值得注意的是,上述装置实施例中,所包括的各个模块只是按照功能逻辑进行划分的,但并不局限于上述的划分,只要能够实现相应的功能即可;另外,各功能单元的具体名称也只是为了便于相互区分,并不用于限制本发明的保护范围。It should be noted that, in the foregoing device embodiment, the included modules are divided only according to functional logic, but the division is not limited to the above as long as the corresponding functions can be implemented. In addition, the specific names of the functional units are only for convenience of distinguishing them from each other and are not intended to limit the protection scope of the present invention.
另外,本领域普通技术人员可以理解实现上述实施例方法中的全部或部分步骤是可以通过程序来指令相关的硬件完成,相应的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。In addition, a person of ordinary skill in the art may understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing relevant hardware. The corresponding program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
综上所述,通过该方法及装置就能够利用多步长非一致性内存访问架构的特点,有效兼顾效率和带宽,提高了系统的内存管理性能。In summary, the method and the device can utilize the characteristics of the multi-step non-uniform memory access architecture, effectively balancing efficiency and bandwidth, and improving the memory management performance of the system.
以上所述,仅为本发明较佳的具体实施方式,但本发明的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本发明实施例揭露的技术范围内,可轻易想到的变化或替换,都应涵盖在本发明的保护范围之内。因此,本发明的保护范围应该以权利要求的保护范围为准。The foregoing is merely a preferred specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or substitution readily conceivable by a person skilled in the art within the technical scope disclosed by the embodiments of the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

  1. 一种针对多步长非一致性内存访问NUMA架构的内存管理方法,其特征在于,所述方法包括:A memory management method for a multi-step non-uniform memory access NUMA architecture, the method comprising:
    在系统初始化时,根据各节点的内存访问延时信息及用户配置信息,确定节点组;When the system is initialized, the node group is determined according to the memory access delay information of each node and the user configuration information;
    获取各个节点组及组内节点的内存使用状况,所述内存使用状况包括内存使用比例和空闲状态指示;Obtaining a memory usage status of each node group and a node in the group, where the memory usage status includes a memory usage ratio and an idle status indication;
    当系统发起节点内存分配请求时,根据各个节点组的内存使用状况,选择在内存访问延迟最小的空闲节点组上分配内存;When the system initiates a node memory allocation request, according to the memory usage status of each node group, the memory is allocated on the idle node group with the smallest memory access delay;
    在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上。Within the selected node group, memory is allocated to nodes within the node group according to the memory usage status of the nodes in the node group.
  2. 根据权利要求1所述的方法,其特征在于,所述在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上,具体包括:The method according to claim 1, wherein allocating, within the selected node group, memory to the nodes in the node group according to the memory usage status of the nodes in the node group specifically includes:
    若选择的内存分配策略为带宽优先,则根据所选择的节点组内各节点的内存使用状况,将内存平均分配在各个节点上;If the selected memory allocation policy is bandwidth priority, the memory is evenly distributed on each node according to the memory usage status of each node in the selected node group;
    若选择的内存分配策略为延迟优先,则将内存分配在所选择的节点组内延迟最小的节点上;If the selected memory allocation policy is delay priority, the memory is allocated on the node with the smallest delay in the selected node group;
    若选择的内存分配策略为默认Default,则将内存在所选择的节点组内随机分配到节点上。If the selected memory allocation policy is the default Default, the memory is randomly assigned to the node within the selected node group.
  3. 根据权利要求1所述的方法,其特征在于,所述空闲节点组为内存分配为空的节点组。The method according to claim 1, wherein the group of idle nodes is a group of nodes whose memory allocation is empty.
  4. 根据权利要求1所述的方法,其特征在于,所述根据各节点的内存访问延时信息及用户配置信息,确定节点组,具体包括:The method according to claim 1, wherein the determining the node group according to the memory access delay information and the user configuration information of each node includes:
    获取各节点的内存访问延迟信息,将内存访问延迟差别相近的节点合并为一个节点组。Obtain the memory access delay information of each node, and merge the nodes with similar memory access delays into one node group.
  5. 根据权利要求1所述的方法,其特征在于,所述获取各个节点组及组内节点的内存使用状况,具体包括:The method according to claim 1, wherein obtaining the memory usage status of each node group and of the nodes in the group specifically includes:
    通过在系统中所设置的内存使用监控模块来获取各个节点组及组内节点的内存使用状况。The memory usage status of each node group and the nodes in the group is obtained by using the memory usage monitoring module set in the system.
  6. 根据权利要求1所述的方法,其特征在于,所述节点内存分配请求具体为:The method according to claim 1, wherein the node memory allocation request is specifically:
    将内存分配到系统内节点的请求。A request to allocate memory to a node within the system.
  7. 一种针对多步长非一致性内存访问NUMA架构的内存管理装置,其特征在于,所述装置包括:A memory management device for a multi-step non-uniform memory access NUMA architecture, characterized in that the device comprises:
    节点组设置模块,用于在系统初始化时,根据各节点的内存访问延时信息及用户配置信息,确定节点组;a node group setting module, configured to determine a node group according to memory access delay information and user configuration information of each node during system initialization;
    内存使用监控模块,用于获取各个节点组及组内节点的内存使用状况,所述内存使用状况包括内存使用比例和空闲状态指示;a memory usage monitoring module, configured to acquire a memory usage status of each node group and a node in the group, where the memory usage status includes a memory usage ratio and an idle status indication;
    节点组选择模块,用于当系统发起节点内存分配请求时,根据各个节点组的内存使用状况,选择在内存访问延迟最小的空闲节点组上分配内存;a node group selection module, configured to allocate memory on a free node group with the smallest memory access delay according to the memory usage status of each node group when the system initiates a node memory allocation request;
    内存分配模块,用于在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上。And a memory allocation module, configured to allocate, within the selected node group, memory to the nodes in the node group according to the memory usage status of the nodes in the node group.
  8. 如权利要求7所述的内存管理装置,其特征在于,所述内存分配模块包括:The memory management device of claim 7, wherein the memory allocation module comprises:
    带宽优先分配模块,用于当所选择的内存分配策略为带宽优先时,根据所选择的节点组内各节点的内存使用状况,将内存平均分配在各个节点上。The bandwidth priority allocation module is configured to allocate the memory evenly on each node according to the memory usage status of each node in the selected node group when the selected memory allocation policy is bandwidth priority.
  9. 如权利要求7所述的内存管理装置,其特征在于,所述内存分配模块包括:The memory management device of claim 7, wherein the memory allocation module comprises:
    延迟优先分配模块,用于当所选择的内存分配策略为延迟优先时,将内存分配在所选择的节点组内延迟最小的节点上。The delay priority allocation module is configured to allocate memory to the node with the smallest delay in the selected node group when the selected memory allocation policy is delay priority.
  10. 如权利要求7所述的内存管理装置,其特征在于,所述内存分配模块包括:The memory management device of claim 7, wherein the memory allocation module comprises:
    默认分配模块,用于当所选择的内存分配策略为默认Default时,将内存在所选择的节点组内随机分配到节点上。The default allocation module is used to randomly allocate memory within the selected node group to the node when the selected memory allocation policy is the default Default.
PCT/CN2011/081440 2011-10-27 2011-10-27 Memory management method and apparatus for multi-step non-uniform memory access numa architecture WO2012163024A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2011/081440 WO2012163024A1 (en) 2011-10-27 2011-10-27 Memory management method and apparatus for multi-step non-uniform memory access numa architecture
CN2011800021962A CN102439570A (en) 2011-10-27 2011-10-27 Memory management method and device aiming at multi-step length non conformance memory access numa framework


Publications (1)

Publication Number Publication Date
WO2012163024A1 true WO2012163024A1 (en) 2012-12-06

Family

ID=45986239

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/081440 WO2012163024A1 (en) 2011-10-27 2011-10-27 Memory management method and apparatus for multi-step non-uniform memory access numa architecture

Country Status (2)

Country Link
CN (1) CN102439570A (en)
WO (1) WO2012163024A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103136110B (en) 2013-02-18 2016-03-30 华为技术有限公司 EMS memory management process, memory management device and NUMA system
CN104166596B (en) * 2013-05-17 2018-06-26 华为技术有限公司 A kind of memory allocation method and node
CN103365784B (en) * 2013-06-27 2016-03-30 华为技术有限公司 The method of Memory recycle and distribution and device
CN105389211B (en) * 2015-10-22 2018-10-30 北京航空航天大学 Memory allocation method and delay perception-Memory Allocation device suitable for NUMA architecture
CN105959176B (en) * 2016-04-25 2019-05-28 浪潮(北京)电子信息产业有限公司 Consistency protocol test method and system based on Gem5 simulator
CN110245135B (en) * 2019-05-05 2021-05-18 华中科技大学 Large-scale streaming graph data updating method based on NUMA (non uniform memory access) architecture

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5345575A (en) * 1991-09-26 1994-09-06 Hewlett-Packard Company Write optimized disk storage device
CN1963788A (en) * 2005-11-08 2007-05-16 中兴通讯股份有限公司 A managing method for EMS memory
US20070226449A1 (en) * 2006-03-22 2007-09-27 Nec Corporation Virtual computer system, and physical resource reconfiguration method and program thereof
CN101158927A (en) * 2007-10-25 2008-04-09 中国科学院计算技术研究所 EMS memory sharing system, device and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7313795B2 (en) * 2003-05-27 2007-12-25 Sun Microsystems, Inc. Method and system for managing resource allocation in non-uniform resource access computer systems
EP2323036A4 (en) * 2008-08-04 2011-11-23 Fujitsu Ltd Multiprocessor system, management device for multiprocessor system, and computer-readable recording medium in which management program for multiprocessor system is recorded


Also Published As

Publication number Publication date
CN102439570A (en) 2012-05-02


Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 201180002196.2; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 11866571; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 11866571; Country of ref document: EP; Kind code of ref document: A1)