WO2012163024A1 - Memory management method and apparatus for multi-step non-uniform memory access NUMA architecture - Google Patents

Memory management method and apparatus for multi-step non-uniform memory access NUMA architecture

Info

Publication number
WO2012163024A1
WO2012163024A1 (application PCT/CN2011/081440)
Authority
WO
WIPO (PCT)
Prior art keywords
memory
node
node group
group
usage status
Prior art date
Application number
PCT/CN2011/081440
Other languages
French (fr)
Chinese (zh)
Inventor
章晓峰
王伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to PCT/CN2011/081440 priority Critical patent/WO2012163024A1/en
Priority to CN2011800021962A priority patent/CN102439570A/en
Publication of WO2012163024A1 publication Critical patent/WO2012163024A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F 12/0646 Configuration or reconfiguration
    • G06F 2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F 2212/25 Using a specific main memory architecture
    • G06F 2212/254 Distributed memory
    • G06F 2212/2542 Non-uniform memory access [NUMA] architecture

Definitions

  • The present invention relates to the field of memory management technologies, and in particular to a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture.
  • Non-Uniform Memory Access (NUMA)
  • In the prior art, the memory hierarchy is determined during system initialization from the memory topology recorded in the basic input/output system (BIOS), and an allocation principle is fixed before any memory is allocated: either the proximity principle (efficiency first) or the uniform-distribution principle (bandwidth first). Under the proximity principle, memory is preferentially allocated on the node with the smallest access delay; if that node's memory is full, the system first tries to release memory there, and if the request still cannot be satisfied, it tries the node with the second-smallest access delay, and so on until the allocation succeeds. Under the uniform-distribution principle, memory is spread evenly across all nodes.
  • In a multi-hop memory architecture the memory access latency is non-uniformly distributed; assume, for example, access delays of 100 ns, 140 ns, 500 ns, 900 ns, and 1040 ns.
  • If a separate node is set up for each hop, neither the nearest-allocation principle nor the even-allocation principle alone can balance efficiency and bandwidth. With efficiency first (nearest allocation), an allocation that fails on the 100 ns node must first attempt to release memory there before falling back to the 140 ns node, even though the difference between 100 ns and 140 ns has little effect on performance; the allocation overhead is therefore large and only a single node's bandwidth can be used.
  • With bandwidth first (uniform distribution), a large number of cross-node accesses is required, which hurts system efficiency.
  • The object of the invention is a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture that exploit the characteristics of such an architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.
  • A memory management method for a multi-step non-uniform memory access NUMA architecture comprises:
  • when the system is initialized, determining node groups according to each node's memory access delay information and the user configuration information;
  • acquiring the memory usage status of each node group and of the nodes within each group, the memory usage status including a memory usage ratio and an idle-status indication;
  • when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory;
  • within the selected node group, allocating memory to nodes in the group according to the memory usage status of those nodes.
  • A memory management apparatus for a multi-step non-uniform memory access NUMA architecture comprises:
  • a node group setting module, configured to determine node groups according to each node's memory access delay information and user configuration information when the system is initialized;
  • a memory usage monitoring module, configured to acquire the memory usage status of each node group and of the nodes in each group, the memory usage status including a memory usage ratio and an idle-status indication;
  • a node group selection module, configured, when the system initiates a node memory allocation request, to select the idle node group with the smallest memory access delay according to the memory usage status of each node group;
  • a memory allocation module, configured, within the selected node group, to allocate memory to nodes in the group according to their memory usage status.
  • As can be seen from the technical solution provided above, the method includes: determining node groups during system initialization according to each node's memory access delay information and user configuration information; acquiring the memory usage status of each node group and of the nodes within each group, the status including a memory usage ratio and an idle-status indication; when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay; and, within the selected node group, allocating memory to its nodes according to their memory usage status.
  • FIG. 1 is a schematic flowchart of a memory management method for a multi-step non-uniform memory access NUMA architecture according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an original memory topology in a specific example according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a topology structure of an improved node group in a specific example according to an embodiment of the present invention.
  • FIG. 4 is another schematic structural diagram of an improved node group topology in a specific example according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a memory management apparatus according to an embodiment of the present invention.
  • An embodiment of the invention provides a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture; the method and apparatus exploit the characteristics of the multi-step non-uniform memory access architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.
  • FIG. 1 is a schematic flowchart of the memory management method for a multi-step non-uniform memory access NUMA architecture according to an embodiment of the present invention. The method includes:
  • Step 11: When the system is initialized, determine the node groups according to each node's memory access delay information and the user configuration information.
  • In this step, at operating-system initialization, the Node Group structure is determined according to each node's memory access delay information and the user configuration settings. Specifically:
  • Because memory hardware sits at different physical locations, the delay for a CPU to access different memories differs, and each distinct access delay gives rise to a separate node.
  • First, each node's memory delay information is obtained; then, according to the user's configuration policy, nodes with similar delays are merged into one node group. For example, nodes whose delays differ by less than 50% can be merged into one group.
  • FIG. 2 is a schematic diagram of an original memory topology in a specific example according to an embodiment of the present invention:
  • Example 1: The solution of this embodiment can generate an optimized Node Group topology through a configuration policy (user configuration or automatic configuration at system initialization). FIG. 3 shows the improved node group topology for this example. Suppose the user configuration or system initialization policy states the rule: merge nodes whose access delays differ by no more than 50% into one node group.
  • The measured memory access delays are 100 ns, 140 ns, 900 ns, and 1040 ns.
  • Under this rule, the 100 ns and 140 ns nodes are merged into one node group, and the 900 ns and 1040 ns nodes into another, producing the topology shown in FIG. 3.
  • In another example: the access delay within each of Node A, Node B, ..., Node H is 100 ns; within Node Group AB, Node Group CD, Node Group EF, and Node Group GH it is 140 ns; within Node Group ABCD and Node Group EFGH it is 300 ns; and within Node Group ABCDEFGH it is 1040 ns.
  • Example 2: As in Example 1, this embodiment can generate an optimized topology according to the configuration policy. FIG. 4 shows another improved node group topology. Suppose the configured policy is: merge nodes whose delays differ by less than a factor of 3 into one node group, and the system memory latencies are 100 ns, 140 ns, 300 ns, and 1040 ns. Under this policy the 100 ns, 140 ns, and 300 ns nodes are merged into one node group, producing the topology shown in FIG. 4, in which the delay within Node Group ABCD and Node Group EFGH is below 300 ns and the delay within Node Group ABCDEFGH is below 1040 ns.
  • Step 12: Obtain the memory usage status of each node group and of the nodes within each group.
  • In this step, a memory usage monitoring module set up in the memory management architecture monitors the memory usage of each Node Group and of each Node within a group.
  • The memory usage status includes the memory usage ratio and an idle-status indication, as in the monitoring status shown in Table 1.
  • Step 13: When the system initiates a node memory allocation request, select, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory.
  • In this step, when the system initiates a node memory allocation request and memory needs to be allocated to a node in the system, the corresponding node group is selected first according to each node group's memory usage status; that is, memory is allocated on the idle node group with the smallest memory access delay, and the subsequent operations are then performed within the selected group.
  • Step 14: Within the selected node group, allocate memory to its nodes according to their memory usage status.
  • In this step, an appropriate memory allocation policy is chosen according to the memory usage status of the nodes in the selected group, and memory is allocated to nodes within that group.
  • If the selected memory allocation policy is bandwidth priority, the memory is distributed evenly across the nodes of the selected group according to each node's memory usage status.
  • If the selected memory allocation policy is delay priority, the memory is allocated on the node with the smallest delay in the selected group.
  • If the selected memory allocation policy is the default (Default), the memory is allocated to a randomly chosen node within the selected group.
  • By implementing the above technical solution, the characteristics of the multi-step non-uniform memory access architecture can be exploited to balance efficiency and bandwidth effectively, improving the system's memory management performance.
  • FIG. 5 is a schematic structural diagram of a memory management device according to an embodiment of the present invention, where the device includes:
  • The node group setting module is configured to determine node groups according to each node's memory access delay information and user configuration information during system initialization; see the method embodiment above for the specific implementation.
  • The memory usage monitoring module is configured to obtain the memory usage status of each node group and of the nodes in each group, the status including a memory usage ratio and an idle-status indication; see the method embodiment above.
  • The node group selection module is configured, when the system initiates a node memory allocation request, to select the idle node group with the smallest memory access delay according to the memory usage status of each node group; see the method embodiment above.
  • The memory allocation module is configured, within the selected node group, to allocate memory to nodes in the group according to their memory usage status; see the method embodiment above.
  • the memory allocation module may further include:
  • the bandwidth priority allocation module is configured to allocate the memory evenly on each node according to the memory usage status of each node in the selected node group when the selected memory allocation policy is bandwidth priority.
  • the delay priority allocation module is configured to allocate memory to the node with the smallest delay in the selected node group when the selected memory allocation policy is delay priority.
  • the default allocation module is configured to randomly allocate memory to the node within the selected node group when the selected memory allocation policy is the default Default.
  • The modules described above are divided only by functional logic; the invention is not limited to this division, so long as the corresponding functions can be implemented. Likewise, the specific names of the functional units are chosen only for ease of distinction and do not limit the protection scope of the present invention.
  • The storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
  • The method and apparatus exploit the characteristics of the multi-step non-uniform memory access architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.


Abstract

A memory management method and apparatus for a multi-step non-uniform memory access (NUMA) architecture. The method comprises: when the system is initialized, determining node groups according to each node's memory access delay information and user configuration information; obtaining the memory usage status of each node group and of the nodes within each group, the status comprising memory usage ratios and idle-state indications; when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory; and, within the selected node group, allocating memory to its nodes according to their memory usage status. The method exploits the characteristics of the multi-step non-uniform memory access architecture, effectively balancing efficiency and bandwidth and improving the memory management performance of the system.

Description

Memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture
Technical Field
The present invention relates to the field of memory management technologies, and in particular to a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture.
Background of the Invention
Currently, the non-uniform memory access (NUMA) architecture has become the mainstream system architecture in the server field. To reduce the large difference between near-end and far-end memory access costs under NUMA, the new multi-hop hierarchical NUMA architecture is being used more and more widely. In the prior art, the memory hierarchy is determined during system initialization from the memory topology recorded in the basic input/output system (BIOS), and an allocation principle is fixed before any memory is allocated: either the proximity principle (efficiency first) or the uniform-distribution principle (bandwidth first). Under the proximity principle, memory is preferentially allocated on the node with the smallest access delay; if that node's memory is full, the system first tries to release memory there, and if the request still cannot be satisfied, it tries the node with the second-smallest access delay, and so on until the allocation succeeds. Under the uniform-distribution principle, memory is spread evenly across all nodes.
In a multi-hop memory architecture the memory access latency is non-uniformly distributed; assume, for example, access delays of 100 ns, 140 ns, 500 ns, 900 ns, and 1040 ns. In the prior art, if a separate node is set up for each hop, neither the nearest-allocation principle nor the even-allocation principle alone can balance efficiency and bandwidth. With efficiency first (nearest allocation), an allocation that fails on the 100 ns node must first attempt to release memory there before falling back to the 140 ns node, even though the difference between 100 ns and 140 ns has little effect on performance; the allocation overhead is therefore large and only a single node's bandwidth can be used. With bandwidth first (uniform distribution), a large number of cross-node accesses is required, which hurts system efficiency.
Summary of the Invention
The object of the present invention is a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture that exploit the characteristics of such an architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.
A memory management method for a multi-step non-uniform memory access NUMA architecture, the method comprising:
when the system is initialized, determining node groups according to each node's memory access delay information and the user configuration information;
acquiring the memory usage status of each node group and of the nodes within each group, the memory usage status including a memory usage ratio and an idle-status indication;
when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory;
within the selected node group, allocating memory to nodes in the group according to the memory usage status of those nodes.
A memory management apparatus for a multi-step non-uniform memory access NUMA architecture, the apparatus comprising:
a node group setting module, configured to determine node groups according to each node's memory access delay information and user configuration information when the system is initialized;
a memory usage monitoring module, configured to acquire the memory usage status of each node group and of the nodes in each group, the memory usage status including a memory usage ratio and an idle-status indication;
a node group selection module, configured, when the system initiates a node memory allocation request, to select the idle node group with the smallest memory access delay according to the memory usage status of each node group;
a memory allocation module, configured, within the selected node group, to allocate memory to nodes in the group according to their memory usage status.
As can be seen from the technical solution provided above, the method includes: determining node groups during system initialization according to each node's memory access delay information and user configuration information; acquiring the memory usage status of each node group and of the nodes within each group, the status including a memory usage ratio and an idle-status indication; when the system initiates a node memory allocation request, selecting, according to the memory usage status of each node group, the idle node group with the smallest memory access delay; and, within the selected node group, allocating memory to its nodes according to their memory usage status. The method thereby exploits the characteristics of the multi-step non-uniform memory access architecture, effectively balancing efficiency and bandwidth and improving the system's memory management performance.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic flowchart of the memory management method for a multi-step non-uniform memory access NUMA architecture provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the original memory topology in a specific example of an embodiment of the present invention;
FIG. 3 is a schematic diagram of the improved node group topology in a specific example of an embodiment of the present invention;
FIG. 4 is another schematic diagram of the improved node group topology in a specific example of an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the memory management apparatus provided by an embodiment of the present invention.
Mode for Carrying Out the Invention
An embodiment of the present invention provides a memory management method and apparatus for a multi-step non-uniform memory access NUMA architecture; the method and apparatus exploit the characteristics of the multi-step non-uniform memory access architecture to balance efficiency and bandwidth effectively and improve the system's memory management performance.
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings. FIG. 1 is a schematic flowchart of the memory management method for a multi-step non-uniform memory access NUMA architecture provided by an embodiment of the present invention. The method includes:
Step 11: When the system is initialized, determine the node groups according to each node's memory access delay information and the user configuration information.
In this step, at operating-system initialization, the Node Group structure is determined according to each node's memory access delay information and the user configuration settings. Specifically:
Because memory hardware sits at different physical locations, the delay for a CPU to access different memories differs, and each distinct access delay gives rise to a separate node. First, each node's memory delay information is obtained; then, according to the user's configuration policy, nodes with similar delays are merged into one node group. For example, nodes whose delays differ by less than 50% can be merged into one group.
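The grouping step just described can be sketched in a few lines. This is a hypothetical illustration, not the patented implementation: the function name `group_nodes`, the node names, and the `max_ratio` parameter are all assumptions; latencies within the given ratio of a group's smallest latency are merged.

```python
def group_nodes(latencies_ns, max_ratio=1.5):
    """Merge nodes with similar access latencies into node groups.

    latencies_ns: dict mapping node name -> access latency in ns.
    max_ratio: latencies within this ratio of the group's smallest
               latency count as "similar" (1.5 == a 50% difference).
    """
    groups = []
    current = []
    for node, lat in sorted(latencies_ns.items(), key=lambda kv: kv[1]):
        # Compare against the smallest latency in the current group.
        if current and lat > latencies_ns[current[0]] * max_ratio:
            groups.append(current)  # gap too large: start a new group
            current = []
        current.append(node)
    if current:
        groups.append(current)
    return groups

# Matching the 50% rule from the text: 100 ns and 140 ns merge,
# 900 ns and 1040 ns merge, and the two pairs stay separate.
print(group_nodes({"A": 100, "B": 140, "C": 900, "D": 1040}))
# → [['A', 'B'], ['C', 'D']]
```

With `max_ratio=3.0` the same function reproduces Example 2 below (100 ns, 140 ns, and 300 ns collapse into one group, 1040 ns stays apart), under the same assumptions.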
The following specific example illustrates this. FIG. 2 shows the original memory topology in this example.
Suppose the original memory topology under the multi-hop architecture is as shown in FIG. 2: the access delay within each of Node A, Node B, ..., Node H is 100 ns; within Node Group AB, Node Group CD, Node Group EF, and Node Group GH it is 140 ns; within Node Group ABCD and Node Group EFGH it is 900 ns; and within Node Group ABCDEFGH it is 1040 ns. In the prior art, each distinct access delay yields a separate node, producing the original memory topology of FIG. 2. The differences between 100 ns and 140 ns, and between 900 ns and 1040 ns, are small, so the original, overly complex hierarchy hurts allocation efficiency and adds management complexity.
Example 1: The solution of this embodiment can generate the optimized Node Group topology 1 through a configuration policy (user configuration or automatic configuration at system initialization). FIG. 3 shows the improved node group topology for this example. Suppose the user configuration or system initialization policy states the rule: merge nodes whose access delays differ by no more than 50% into one node group. The measured memory access delays are 100 ns, 140 ns, 900 ns, and 1040 ns; under this rule the 100 ns and 140 ns nodes are merged into one node group and the 900 ns and 1040 ns nodes into another, producing the topology shown in FIG. 3, in which the delay within each of Node Group AB, Node Group CD, Node Group EF, and Node Group GH is below 140 ns and the delay within Node Group ABCDEFGH is below 1040 ns.
As another example: the access delay within each of Node A, Node B, ..., Node H is 100 ns; within Node Group AB, Node Group CD, Node Group EF, and Node Group GH it is 140 ns; within Node Group ABCD and Node Group EFGH it is 300 ns; and within Node Group ABCDEFGH it is 1040 ns.
Example 2: As in Example 1, this embodiment can generate the optimized topology 2 according to the configuration policy. FIG. 4 shows another improved node group topology. Suppose the configured policy is: merge nodes whose delays differ by less than a factor of 3 into one node group, and the system memory latencies are 100 ns, 140 ns, 300 ns, and 1040 ns. Under this policy the 100 ns, 140 ns, and 300 ns nodes are merged into one node group, producing the topology shown in FIG. 4, in which the delay within Node Group ABCD and Node Group EFGH is below 300 ns and the delay within Node Group ABCDEFGH is below 1040 ns.
Step 12: Obtain the memory usage status of each node group and of the nodes within each group.
In this step, a memory usage monitoring module set up in the memory management architecture monitors the memory usage of each Node Group and of each Node within a group. The memory usage status includes the memory usage ratio and an idle-status indication, as in the monitoring status shown in Table 1 below:
|                    | Node ABCD | Node EFGH | Node AB | Node CD | Node EF | Node GH | ... | Node G | Node H |
| Memory usage ratio | 40%       | 60%       | 20%     | 60%     | 80%     | 40%     | ... | 10%    | 25%    |

Table 1
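For illustration, the Table 1 data could be held as a simple in-memory structure by a monitoring module. This is a sketch under assumptions: the dictionary layout, the naming, and the 90% idle threshold are not specified in the text.

```python
# Hypothetical representation of the Table 1 monitoring data:
# memory usage ratio per node group and per node.
usage = {
    "Node Group ABCD": 0.40, "Node Group EFGH": 0.60,
    "Node Group AB": 0.20, "Node Group CD": 0.60,
    "Node Group EF": 0.80, "Node Group GH": 0.40,
    "Node G": 0.10, "Node H": 0.25,
}

IDLE_THRESHOLD = 0.90  # assumed policy: below this ratio counts as idle


def is_idle(name):
    """Idle-status indication derived from the usage ratio."""
    return usage[name] < IDLE_THRESHOLD


print(is_idle("Node Group EF"))  # → True (80% used, still "idle" here)
```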
Step 13: When the system initiates a node memory allocation request, select, according to the memory usage status of each node group, the idle node group with the smallest memory access delay on which to allocate memory.
In this step, when the system initiates a node memory allocation request and memory needs to be allocated to a node in the system, the corresponding node group is selected first according to each node group's memory usage status; that is, memory is allocated on the idle node group with the smallest memory access delay, and the subsequent operations are then performed within the selected group.
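Step 13 amounts to filtering on the idle-status indication and minimizing over latency. A minimal sketch, assuming a per-group record with `latency_ns`, `usage`, and `idle` fields (all names illustrative, not from the patent):

```python
def select_group(groups):
    """Pick the idle node group with the smallest access latency.

    groups: list of dicts with 'name', 'latency_ns', 'usage' (0.0-1.0)
    and an 'idle' flag, as reported by a monitoring module.
    """
    idle = [g for g in groups if g["idle"]]
    if not idle:
        raise MemoryError("no idle node group available")
    return min(idle, key=lambda g: g["latency_ns"])


groups = [
    {"name": "AB", "latency_ns": 140, "usage": 0.95, "idle": False},
    {"name": "CD", "latency_ns": 140, "usage": 0.60, "idle": True},
    {"name": "EFGH", "latency_ns": 1040, "usage": 0.20, "idle": True},
]
# AB is fastest but not idle, so the next-fastest idle group wins.
print(select_group(groups)["name"])  # → CD
```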
步骤14:在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上。Step 14: In the selected node group, allocate memory to the nodes in the node group according to the memory usage status of the nodes in the node group.
在该步骤中,在进行上述步骤13的操作之后,就可以在所选择的节点组内,根据所述节点组内节点的内存使用状况,选择相应的内存分配策略将内存分配到所述节点组内的节点上。In this step, after the operation of step 13 above has been performed, a corresponding memory allocation policy may be selected within the selected node group according to the memory usage status of the nodes in the node group, and the memory is allocated to the nodes within the node group.
具体来说,可以包括以下几种情况:Specifically, the following situations can be included:
若选择的内存分配策略为带宽优先,则根据所选择的节点组内各节点的内存使用状况,将内存平均分配在各个节点上;If the selected memory allocation policy is bandwidth priority, the memory is evenly distributed on each node according to the memory usage status of each node in the selected node group;
若选择的内存分配策略为延迟优先,则将内存分配在所选择的节点组内延迟最小的节点上;If the selected memory allocation policy is delay priority, the memory is allocated on the node with the smallest delay in the selected node group;
若选择的内存分配策略为默认Default,则将内存在所选择的节点组内随机分配到节点上。If the selected memory allocation policy is the default Default, the memory is randomly assigned to the node within the selected node group.
通过上述技术方案的实施,就可以利用多步长非一致性内存访问架构的特点,有效兼顾效率和带宽,提高了系统的内存管理性能。Through the implementation of the above technical solution, the characteristics of the multi-step non-uniform memory access architecture can be utilized, the efficiency and bandwidth are effectively considered, and the memory management performance of the system is improved.
本发明实施方式还提供了一种针对多步长非一致性内存访问NUMA架构的内存管理装置,如图5所示为本发明实施例所提供的内存管理装置的结构示意图,所述装置包括:The embodiment of the present invention further provides a memory management device for a multi-step non-uniform memory access NUMA architecture. FIG. 5 is a schematic structural diagram of a memory management device according to an embodiment of the present invention, where the device includes:
节点组设置模块,用于在系统初始化时,根据各节点的内存访问延时信息及用户配置信息,确定节点组,具体实现方式见以上方法实施例中所述。The node group setting module is configured to determine a node group according to the memory access delay information and the user configuration information of each node during system initialization, and the specific implementation manner is as described in the foregoing method embodiment.
内存使用监控模块,用于获取各个节点组及组内节点的内存使用状况,所述内存使用状况包括内存使用比例和空闲状态指示,具体实现方式见以上方法实施例中所述。The memory usage monitoring module is configured to obtain a memory usage status of each node group and a node in the group, where the memory usage status includes a memory usage ratio and an idle state indication, and the specific implementation manner is as described in the foregoing method embodiment.
节点组选择模块,用于当系统发起节点内存分配请求时,根据各个节点组的内存使用状况,选择在内存访问延迟最小的空闲节点组上分配内存,具体实现方式见以上方法实施例中所述。The node group selection module is configured to, when the system initiates a node memory allocation request, select the idle node group with the smallest memory access delay for allocating memory according to the memory usage status of each node group; for the specific implementation, see the foregoing method embodiment.
内存分配模块,用于在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上,具体实现方式见以上方法实施例中所述。The memory allocation module is configured to allocate, within the selected node group, memory to the nodes in the node group according to the memory usage status of the nodes in the node group; for the specific implementation, see the foregoing method embodiment.
另外,在具体实现过程中,所述内存分配模块还可以包括:In addition, in a specific implementation process, the memory allocation module may further include:
带宽优先分配模块,用于当所选择的内存分配策略为带宽优先时,根据所选择的节点组内各节点的内存使用状况,将内存平均分配在各个节点上。The bandwidth priority allocation module is configured to allocate the memory evenly on each node according to the memory usage status of each node in the selected node group when the selected memory allocation policy is bandwidth priority.
或,延迟优先分配模块,用于当所选择的内存分配策略为延迟优先时,将内存分配在所选择的节点组内延迟最小的节点上。Or, the delay priority allocation module is configured to allocate memory to the node with the smallest delay in the selected node group when the selected memory allocation policy is delay priority.
或,默认分配模块,用于当所选择的内存分配策略为默认Default时,将内存在所选择的节点组内随机分配到节点上。Or, the default allocation module is configured to randomly allocate memory to the node within the selected node group when the selected memory allocation policy is the default Default.
值得注意的是,上述装置实施例中,所包括的各个模块只是按照功能逻辑进行划分的,但并不局限于上述的划分,只要能够实现相应的功能即可;另外,各功能单元的具体名称也只是为了便于相互区分,并不用于限制本发明的保护范围。It should be noted that, in the foregoing device embodiment, the included modules are divided only according to functional logic, but the division is not limited to the above as long as the corresponding functions can be implemented. In addition, the specific names of the functional units are only for convenience of distinguishing them from each other and are not intended to limit the protection scope of the present invention.
另外,本领域普通技术人员可以理解实现上述实施例方法中的全部或部分步骤是可以通过程序来指令相关的硬件完成,相应的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。In addition, a person of ordinary skill in the art may understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing relevant hardware. The corresponding program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
综上所述,通过该方法及装置就能够利用多步长非一致性内存访问架构的特点,有效兼顾效率和带宽,提高了系统的内存管理性能。In summary, the method and the device can utilize the characteristics of the multi-step non-uniform memory access architecture, effectively balancing efficiency and bandwidth, and improving the memory management performance of the system.
以上所述,仅为本发明较佳的具体实施方式,但本发明的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本发明实施例揭露的技术范围内,可轻易想到的变化或替换,都应涵盖在本发明的保护范围之内。因此,本发明的保护范围应该以权利要求的保护范围为准。The foregoing is merely a preferred specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or substitution readily conceivable by a person skilled in the art within the technical scope disclosed by the embodiments of the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

  1. 一种针对多步长非一致性内存访问NUMA架构的内存管理方法,其特征在于,所述方法包括:A memory management method for a multi-step non-uniform memory access NUMA architecture, the method comprising:
    在系统初始化时,根据各节点的内存访问延时信息及用户配置信息,确定节点组;When the system is initialized, the node group is determined according to the memory access delay information of each node and the user configuration information;
    获取各个节点组及组内节点的内存使用状况,所述内存使用状况包括内存使用比例和空闲状态指示;Obtaining a memory usage status of each node group and a node in the group, where the memory usage status includes a memory usage ratio and an idle status indication;
    当系统发起节点内存分配请求时,根据各个节点组的内存使用状况,选择在内存访问延迟最小的空闲节点组上分配内存;When the system initiates a node memory allocation request, according to the memory usage status of each node group, the memory is allocated on the idle node group with the smallest memory access delay;
    在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上。Within the selected node group, memory is allocated to nodes within the node group according to the memory usage status of the nodes in the node group.
  2. 根据权利要求1所述的方法,其特征在于,所述在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上,具体包括:The method according to claim 1, wherein allocating, within the selected node group, memory to the nodes in the node group according to the memory usage status of the nodes in the node group specifically includes:
    若选择的内存分配策略为带宽优先,则根据所选择的节点组内各节点的内存使用状况,将内存平均分配在各个节点上;If the selected memory allocation policy is bandwidth priority, the memory is evenly distributed on each node according to the memory usage status of each node in the selected node group;
    若选择的内存分配策略为延迟优先,则将内存分配在所选择的节点组内延迟最小的节点上;If the selected memory allocation policy is delay priority, the memory is allocated on the node with the smallest delay in the selected node group;
    若选择的内存分配策略为默认Default,则将内存在所选择的节点组内随机分配到节点上。If the selected memory allocation policy is the default Default, the memory is randomly assigned to the node within the selected node group.
  3. 根据权利要求1所述的方法,其特征在于,所述空闲节点组为内存分配为空的节点组。The method according to claim 1, wherein the group of idle nodes is a group of nodes whose memory allocation is empty.
  4. 根据权利要求1所述的方法,其特征在于,所述根据各节点的内存访问延时信息及用户配置信息,确定节点组,具体包括:The method according to claim 1, wherein the determining the node group according to the memory access delay information and the user configuration information of each node includes:
    获取各节点的内存访问延迟信息,将内存访问延迟差别相近的节点合并为一个节点组。Obtain the memory access delay information of each node, and merge the nodes with similar memory access delays into one node group.
  5. 根据权利要求1所述的方法,其特征在于,所述获取各个节点组及组内节点的内存使用状况,具体包括:The method according to claim 1, wherein obtaining the memory usage status of each node group and of the nodes in the group specifically includes:
    通过在系统中所设置的内存使用监控模块来获取各个节点组及组内节点的内存使用状况。The memory usage status of each node group and the nodes in the group is obtained by using the memory usage monitoring module set in the system.
  6. 根据权利要求1所述的方法,其特征在于,所述节点内存分配请求具体为:The method according to claim 1, wherein the node memory allocation request is specifically:
    将内存分配到系统内节点的请求。A request to allocate memory to a node within the system.
  7. 一种针对多步长非一致性内存访问NUMA架构的内存管理装置,其特征在于,所述装置包括:A memory management device for a multi-step non-uniform memory access NUMA architecture, characterized in that the device comprises:
    节点组设置模块,用于在系统初始化时,根据各节点的内存访问延时信息及用户配置信息,确定节点组;a node group setting module, configured to determine a node group according to memory access delay information and user configuration information of each node during system initialization;
    内存使用监控模块,用于获取各个节点组及组内节点的内存使用状况,所述内存使用状况包括内存使用比例和空闲状态指示;a memory usage monitoring module, configured to acquire a memory usage status of each node group and a node in the group, where the memory usage status includes a memory usage ratio and an idle status indication;
    节点组选择模块,用于当系统发起节点内存分配请求时,根据各个节点组的内存使用状况,选择在内存访问延迟最小的空闲节点组上分配内存;a node group selection module, configured to allocate memory on a free node group with the smallest memory access delay according to the memory usage status of each node group when the system initiates a node memory allocation request;
    内存分配模块,用于在所选择的节点组内,根据所述节点组内节点的内存使用状况,将内存分配到所述节点组内的节点上。And a memory allocation module, configured to allocate, within the selected node group, memory to the nodes in the node group according to the memory usage status of the nodes in the node group.
  8. 如权利要求7所述的内存管理装置,其特征在于,所述内存分配模块包括:The memory management device of claim 7, wherein the memory allocation module comprises:
    带宽优先分配模块,用于当所选择的内存分配策略为带宽优先时,根据所选择的节点组内各节点的内存使用状况,将内存平均分配在各个节点上。The bandwidth priority allocation module is configured to allocate the memory evenly on each node according to the memory usage status of each node in the selected node group when the selected memory allocation policy is bandwidth priority.
  9. 如权利要求7所述的内存管理装置,其特征在于,所述内存分配模块包括:The memory management device of claim 7, wherein the memory allocation module comprises:
    延迟优先分配模块,用于当所选择的内存分配策略为延迟优先时,将内存分配在所选择的节点组内延迟最小的节点上。The delay priority allocation module is configured to allocate memory to the node with the smallest delay in the selected node group when the selected memory allocation policy is delay priority.
  10. 如权利要求7所述的内存管理装置,其特征在于,所述内存分配模块包括:The memory management device of claim 7, wherein the memory allocation module comprises:
    默认分配模块,用于当所选择的内存分配策略为默认Default时,将内存在所选择的节点组内随机分配到节点上。The default allocation module is used to randomly allocate memory within the selected node group to the node when the selected memory allocation policy is the default Default.
PCT/CN2011/081440 2011-10-27 2011-10-27 Memory management method and apparatus for multi-step non-uniform memory access numa architecture WO2012163024A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2011/081440 WO2012163024A1 (en) 2011-10-27 2011-10-27 Memory management method and apparatus for multi-step non-uniform memory access numa architecture
CN2011800021962A CN102439570A (en) 2011-10-27 2011-10-27 Memory management method and device aiming at multi-step length non conformance memory access numa framework


Publications (1)

Publication Number Publication Date
WO2012163024A1 true WO2012163024A1 (en) 2012-12-06

Family

ID=45986239

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/081440 WO2012163024A1 (en) 2011-10-27 2011-10-27 Memory management method and apparatus for multi-step non-uniform memory access numa architecture

Country Status (2)

Country Link
CN (1) CN102439570A (en)
WO (1) WO2012163024A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103136110B (en) 2013-02-18 2016-03-30 华为技术有限公司 EMS memory management process, memory management device and NUMA system
CN104166596B (en) * 2013-05-17 2018-06-26 华为技术有限公司 A kind of memory allocation method and node
CN103365784B (en) * 2013-06-27 2016-03-30 华为技术有限公司 The method of Memory recycle and distribution and device
CN105389211B (en) * 2015-10-22 2018-10-30 北京航空航天大学 Memory allocation method and delay perception-Memory Allocation device suitable for NUMA architecture
CN105959176B (en) * 2016-04-25 2019-05-28 浪潮(北京)电子信息产业有限公司 Consistency protocol test method and system based on Gem5 simulator
CN110245135B (en) * 2019-05-05 2021-05-18 华中科技大学 Large-scale streaming graph data updating method based on NUMA (non uniform memory access) architecture

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5345575A (en) * 1991-09-26 1994-09-06 Hewlett-Packard Company Write optimized disk storage device
CN1963788A (en) * 2005-11-08 2007-05-16 中兴通讯股份有限公司 A managing method for EMS memory
US20070226449A1 (en) * 2006-03-22 2007-09-27 Nec Corporation Virtual computer system, and physical resource reconfiguration method and program thereof
CN101158927A (en) * 2007-10-25 2008-04-09 中国科学院计算技术研究所 EMS memory sharing system, device and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7313795B2 (en) * 2003-05-27 2007-12-25 Sun Microsystems, Inc. Method and system for managing resource allocation in non-uniform resource access computer systems
EP2323036A4 (en) * 2008-08-04 2011-11-23 Fujitsu Ltd Multiprocessor system, management device for multiprocessor system, and computer-readable recording medium in which management program for multiprocessor system is recorded


Also Published As

Publication number Publication date
CN102439570A (en) 2012-05-02


Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 201180002196.2; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 11866571; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 11866571; Country of ref document: EP; Kind code of ref document: A1)