WO2019029236A1 - Memory allocation method and server - Google Patents

Memory allocation method and server

Info

Publication number
WO2019029236A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
numa
priority
node
numa node
Prior art date
Application number
PCT/CN2018/088924
Other languages
English (en)
French (fr)
Inventor
孙贝磊 (Sun Beilei)
沈胜宇 (Shen Shengyu)
徐建荣 (Xu Jianrong)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP18844447.5A (EP3605331A4)
Publication of WO2019029236A1
Priority to US16/595,920 (US11042412B2)


Classifications

    • All classifications fall under G — PHYSICS; G06 — COMPUTING; CALCULATING OR COUNTING; G06F — ELECTRIC DIGITAL DATA PROCESSING:
    • G06F 9/5016 — Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being the memory
    • G06F 12/0284 — Multiple user address space allocation, e.g. using different base addresses
    • G06F 12/0607 — Interleaved addressing
    • G06F 12/0646 — Configuration or reconfiguration of memory address allocation
    • G06F 12/08 — Addressing or allocation; relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 13/1668 — Details of memory controller
    • G06F 13/4027 — Coupling between buses using bus bridges
    • G06F 13/4282 — Bus transfer protocol, e.g. handshake; synchronisation on a serial bus
    • G06F 12/023 — Free address space management
    • G06F 2209/5013 — Indexing scheme relating to resource allocation: request control
    • G06F 2212/1024 — Providing a specific technical effect: latency reduction
    • G06F 2212/154 — Use in a specific computing environment: networked environment
    • G06F 2212/2542 — Using a specific main memory architecture: non-uniform memory access [NUMA] architecture
    • G06F 2212/502 — Control mechanisms for virtual memory, cache or TLB using adaptive policy

Definitions

  • the present application relates to the field of communications, and in particular, to a memory allocation method and a server.
  • NUMA: non-uniform memory architecture.
  • If node A and node C are the same node, the NUMA distance is 0; otherwise, the NUMA distance between A and C equals the minimum number of hops (Hops) between A and C multiplied by 20. For example, with a minimum of 2 hops between A and C, the NUMA distance is 40.
  • A. Localization strategy. This strategy aims to increase the proportion of local memory accesses, thereby reducing access latency.
  • the specific process is as follows:
  • Preferred policy. This policy specifies a series of memory nodes. When memory allocation is performed, memory is first allocated from the specified memory nodes; if the specified memory nodes have been fully allocated, memory is allocated from other nodes.
  • This memory allocation method assumes that the distance, or cost, of every NUMA hop is the same, so the NUMA distance calculation uses the NUMA hop count as its only input variable.
  • However, interconnection through an NC introduces additional access latency: two nodes interconnected through an NC have a much longer delay than two nodes directly connected through QPI, so the transmission overhead differs between different pairs of NUMA nodes. Because the NUMA distance calculation in the above memory allocation method is not aware of NCs, cross-NC accesses increase, access latency rises, and server performance decreases.
  • The embodiments of the present application provide a memory allocation method and a server, which reduce the performance loss caused by NC delay during memory allocation and improve server performance.
  • A first aspect of the embodiments of the present application provides a memory allocation method, including: a server identifies a node topology table, where the node topology table includes the connection relationships between the non-uniform memory architecture (NUMA) nodes in the server, between the NUMA nodes and the node controllers (NCs), and between the NCs; the server generates a memory access jump table for each NUMA node according to the identified node topology table, where, taking any one of the NUMA nodes as a first NUMA node, the memory access jump table of the first NUMA node includes the NC hop count and the quick path interconnect (QPI) hop count on the shortest path from the first NUMA node to each of the other NUMA nodes.
  • The server calculates a memory access priority table for each NUMA node according to the memory access jump table of each NUMA node.
  • The first memory access priority table of the first NUMA node includes the priorities with which the first NUMA node accesses the other NUMA nodes: the fewer the NC hops, the higher the priority of accessing that NUMA node, and for the same NC hop count, the fewer the QPI hops, the higher the priority. When the first NUMA node applies for memory, the server performs memory allocation according to the first memory access priority table; the higher the priority, the more preferentially memory is allocated from the NUMA node corresponding to that priority.
  • In a first implementation manner of the first aspect, the method further includes: if multiple NUMA nodes in the first memory access priority table have the same priority, the server allocates memory from the NUMA nodes of the same priority according to an interleaving strategy.
  • In a second implementation manner of the first aspect, the server generating a memory access jump table for each NUMA node according to the node topology table specifically includes: the server reads the stored node topology table; the server calculates, according to the node topology table, the shortest path from each NUMA node to each of the other NUMA nodes, where the shortest path is the path with the smallest NC hop count among the preselected shortest paths, and a preselected shortest path is a path with the minimum QPI hop count among the paths from one NUMA node to another NUMA node; the server calculates, according to the shortest path from each NUMA node to each of the other NUMA nodes, the NC hop count and the QPI hop count on each shortest path; and the server composes the NC hop counts and QPI hop counts on the shortest paths from each NUMA node to the other NUMA nodes into the memory access jump table of that NUMA node.
  • In a third implementation manner of the first aspect, the server calculating a memory access priority table for each NUMA node according to the memory access jump table of each NUMA node specifically includes: the server sorts the NUMA nodes in the memory access jump table by NC hop count from small to large, to obtain a first NUMA node sequence; for NUMA nodes in the first NUMA node sequence with the same NC hop count, the server sorts them by QPI hop count from small to large according to the memory access jump table, to obtain a second NUMA node sequence; and the server assigns priorities to the NUMA nodes in the second NUMA node sequence in order from high priority to low priority, where NUMA nodes with the same NC hop count and the same QPI hop count have the same priority.
  • In a fourth implementation manner of the first aspect, the server performing memory allocation according to the first memory access priority table specifically includes the following process. Assume the current to-be-allocated memory size is a first capacity and the current query priority is a first priority, where the first priority is a priority in the first memory access priority table; the server queries the priorities in the first memory access priority table from high to low: the server queries whether the NUMA nodes of the current query priority have free memory; if the NUMA nodes of the current query priority have no free memory, the server updates the first priority to the next priority after the current query priority and triggers the step of performing memory allocation according to this process; if only one second NUMA node among the NUMA nodes of the current query priority has free memory, of a second capacity, the server determines whether the second capacity is not smaller than the current to-be-allocated memory size; if the second capacity is not smaller than the current to-be-allocated memory size, the server allocates, from the second NUMA node to the first NUMA node, memory of the current to-be-allocated memory size, and ends the memory allocation process; if the second capacity is smaller than the current to-be-allocated memory size, the server allocates, from the second NUMA node to the first NUMA node, memory of the second capacity, updates the first capacity to the current to-be-allocated memory size minus the second capacity, updates the first priority to the next priority after the current query priority, and triggers the step of performing memory allocation according to this process.
  • In a fifth implementation manner of the first aspect, the method further includes: if more than one third NUMA node among the NUMA nodes of the current query priority has free memory, the server allocates memory from each third NUMA node by means of an interleaving strategy, the allocated memory size being a third capacity; if the third capacity is smaller than the current to-be-allocated memory size, the server updates the first capacity to the current to-be-allocated memory size minus the third capacity, updates the first priority to the next priority after the current query priority, and triggers the step of performing memory allocation according to this process.
  • In a sixth implementation manner of the first aspect, the method further includes: if none of the NUMA nodes has free memory, the server determines whether a memory release operation has been performed, where the memory release operation swaps temporarily unused memory into a hard disk buffer; if no memory release operation has been performed, the server performs a memory release operation, initializes the current to-be-allocated memory size and the current query priority, and triggers the step of performing memory allocation, starting from the highest priority in the first memory access priority table.
  • In a seventh implementation manner of the first aspect, before the server allocates memory from each third NUMA node by means of the interleaving strategy, the method further includes: the server determines whether the current to-be-allocated memory size is greater than one memory page; if the current to-be-allocated memory size is greater than one memory page, the server is triggered to allocate memory from each third NUMA node by means of the interleaving strategy; if the current to-be-allocated memory size is not greater than one memory page, the server randomly selects one third NUMA node from the third NUMA nodes for memory allocation, and ends the memory allocation process.
  • In an eighth implementation manner of the first aspect, the node topology table is an (N+M)*(N+M) matrix S, where N is the number of NUMA nodes in the server and M is the number of NCs in the server; the first N rows and N columns of the matrix S represent the NUMA nodes, the last M rows and M columns of the matrix S represent the NCs, and the value in row p, column q of the matrix S indicates the connection relationship between node p and node q, where N, M, p, and q are all positive integers.
  • A second aspect of the embodiments of the present application provides a server, including: an identification module, configured to identify a node topology table, where the node topology table includes the connection relationships between the NUMA nodes in the server, between the NUMA nodes and the NCs, and between the NCs;
  • a generating module, configured to generate, according to the node topology table identified by the identification module, a memory access jump table for each NUMA node, where the memory access jump table of the first NUMA node includes the NC hop count and the QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes, the NC hop count being the number of NCs the shortest path passes through and the QPI hop count being the number of NUMA nodes the shortest path passes through;
  • a calculation module, configured to calculate, according to the memory access jump table of each NUMA node generated by the generating module, a memory access priority table for each NUMA node, where the first memory access priority table of the first NUMA node includes the priorities with which the first NUMA node accesses the other NUMA nodes: the fewer the NC hops, the higher the priority of accessing that NUMA node, and for the same NC hop count, the fewer the QPI hops, the higher the priority; and an allocation module, configured to perform memory allocation according to the first memory access priority table calculated by the calculation module when the first NUMA node applies for memory, where the higher the priority, the more preferentially memory is allocated from the NUMA node corresponding to that priority.
  • In a first implementation manner of the second aspect, the allocation module is further configured to: when multiple NUMA nodes in the first memory access priority table have the same priority, allocate memory from these same-priority NUMA nodes by means of the interleaving strategy.
  • In a second implementation manner of the second aspect, the generating module specifically includes: a reading unit, configured to read the stored node topology table identified by the identification module; a first calculating unit, configured to calculate, according to the node topology table read by the reading unit, the shortest path from each NUMA node to each of the other NUMA nodes, where the shortest path is the path with the smallest NC hop count among the preselected shortest paths, and a preselected shortest path is a path with the minimum QPI hop count among the paths from one NUMA node to another NUMA node; a second calculating unit, configured to calculate, according to the shortest paths from each NUMA node to each of the other NUMA nodes calculated by the first calculating unit, the NC hop count and QPI hop count on each shortest path; and a composing unit, configured to compose the NC hop counts and QPI hop counts on the shortest paths from each NUMA node to the other NUMA nodes, calculated by the second calculating unit, into the memory access jump table of each NUMA node.
  • In a third implementation manner of the second aspect, the calculation module specifically includes: a first sorting unit, configured to sort the NUMA nodes in the memory access jump table by NC hop count from small to large, according to the memory access jump table of each NUMA node, to obtain a first NUMA node sequence; a second sorting unit, configured to sort, for NUMA nodes in the first NUMA node sequence with the same NC hop count, the NUMA nodes by QPI hop count from small to large, to obtain a second NUMA node sequence; and an assigning unit, configured to assign priorities to the NUMA nodes in the second NUMA node sequence in order from high priority to low priority, where NUMA nodes with the same NC hop count and the same QPI hop count have the same priority.
  • In a fourth implementation manner of the second aspect, the allocation module specifically includes: a start unit, configured to assume that the current to-be-allocated memory size is the first capacity and the current query priority is the first priority, where the first priority is a priority in the first memory access priority table, and to trigger the query unit in order of priority in the first memory access priority table from high to low; a query unit, configured to query whether the NUMA nodes of the current query priority have free memory; a first update unit, configured to, when the NUMA nodes of the current query priority have no free memory, update the first priority to the next priority after the current query priority and trigger the start unit; a first determining unit, configured to, when only one second NUMA node among the NUMA nodes of the current query priority has free memory, of a second capacity, determine whether the second capacity is not smaller than the current to-be-allocated memory size; a first allocation unit, configured to, when the second capacity is not smaller than the current to-be-allocated memory size, allocate, from the second NUMA node to the first NUMA node, memory of the current to-be-allocated memory size, and trigger an end unit; a second update unit, configured to, when the second capacity is smaller than the current to-be-allocated memory size, allocate, from the second NUMA node to the first NUMA node, memory of the second capacity, update the first capacity to the current to-be-allocated memory size minus the second capacity, update the first priority to the next priority after the current query priority, and trigger the start unit; and the end unit, configured to end the memory allocation process.
  • In a fifth implementation manner of the second aspect, the allocation module further includes: a second allocation unit, configured to, when the query unit determines that more than one third NUMA node among the NUMA nodes of the current query priority has free memory, allocate memory from each third NUMA node by means of the interleaving strategy, the allocated memory size being the third capacity; a first trigger unit, configured to trigger the end unit when the third capacity equals the current to-be-allocated memory size; and a second trigger unit, configured to, when the third capacity is smaller than the current to-be-allocated memory size, update the first capacity to the current to-be-allocated memory size minus the third capacity, update the first priority to the next priority after the current query priority, and trigger the start unit.
  • In a sixth implementation manner of the second aspect, the allocation module further includes: a second determining unit, configured to, when the query unit finds that none of the NUMA nodes has free memory, determine whether a memory release operation has been performed, where the memory release operation swaps temporarily unused memory into the hard disk buffer; and a release execution unit, configured to, when the second determining unit determines that no memory release operation has been performed, perform a memory release operation, initialize the current to-be-allocated memory size and the current query priority, and trigger the start unit.
  • In a seventh implementation manner of the second aspect, the second allocation unit specifically includes: a determining subunit, configured to, when the query unit determines that more than one third NUMA node among the NUMA nodes of the current query priority has free memory, determine whether the current to-be-allocated memory size is greater than one memory page; a first allocation subunit, configured to, when the current to-be-allocated memory size is greater than one memory page, allocate memory from each third NUMA node by means of the interleaving strategy; and a second allocation subunit, configured to, when the current to-be-allocated memory size is not greater than one memory page, randomly select one third NUMA node from the third NUMA nodes for memory allocation and trigger the end unit.
  • A third aspect of the embodiments of the present application provides a computer readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the methods described in the above aspects.
  • A fourth aspect of the embodiments of the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the methods described in the above aspects.
  • It can be seen that the embodiments of the present application have the following advantages: the server identifies a node topology table that records not only the connection relationships between NUMA nodes but also those between NUMA nodes and NCs and between NCs; according to the node topology table, the server generates a memory access jump table for each NUMA node, which contains not only the QPI hop count but also the NC hop count on the shortest path to each of the other NUMA nodes; the server then calculates each NUMA node's memory access priorities from its memory access jump table, using the NC hop count as an important parameter in the priority calculation, where the fewer the NC hops, the higher the access priority. When a NUMA node applies for memory, memory is allocated according to the memory access priority table; the higher a node's priority, the more preferentially memory is allocated from it. Using the NC hop count as an important parameter of the priority calculation reduces the chance of allocating memory across an NC, thereby reducing the access latency introduced by NCs and improving server performance.
  • FIG. 1 is a schematic diagram of an application scenario of a memory allocation method according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a memory allocation method in an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of generating a memory access jump table according to an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of calculating the NC hop count and QPI hop count on a shortest path in an embodiment of the present application;
  • FIG. 5 is a schematic flowchart of calculating memory access priorities in an embodiment of the present application;
  • FIG. 6 is a schematic flowchart of assigning priorities in an embodiment of the present application;
  • FIG. 7 is a schematic flowchart of performing memory allocation according to priority in the embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a server in an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a generating module in an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a computing module in an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a distribution module in an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a server in an embodiment of the present application.
  • Although the terms first, second, etc. may be used to describe various NUMA nodes or priorities in the embodiments of the present application, the NUMA nodes and priorities should not be limited by these terms; these terms are only used to distinguish NUMA nodes or priorities from each other.
  • For example, without departing from the scope of the embodiments of the present application, a first NUMA node may also be referred to as a second NUMA node, and likewise a second NUMA node may be referred to as a first NUMA node; similarly, a second priority may also be referred to as a third priority, and so on, which is not limited in the embodiments of the present application.
  • FIG. 1 is a schematic diagram of an application scenario of the memory allocation method, where the server includes NUMA nodes 1 to 5 connected through a high-speed interconnection network.
  • Each NUMA node includes a group of CPUs and local memory.
  • NUMA nodes may be directly connected through QPI, for example NUMA node 1 and NUMA node 2, NUMA node 1 and NUMA node 3, and NUMA node 3 and NUMA node 4; they may also be connected through an NC, such as NUMA node 2 and NUMA node 5, and NUMA node 3 and NUMA node 5.
  • FIG. 1 is only a schematic diagram; in practical applications, the number of CPUs in each NUMA node is not limited, and the server may include more or fewer NUMA nodes and more or fewer NCs, which is not limited here.
  • the server identifies a node topology table.
  • the server identifies a node topology table, which includes a connection relationship between the NUMA nodes in the server, between the NUMA node and the node controller NC, and between the NC and the NC.
  • The node topology table may be an (N+M)*(N+M) matrix S, where N is the number of NUMA nodes in the server and M is the number of NCs in the server; the first N rows and N columns of the matrix S represent NUMA nodes, the last M rows and M columns of the matrix S represent NCs, and the value in row p, column q of the matrix S indicates the connection relationship between node p and node q, where N, M, p, and q are all positive integers.
  • Table 1 is provided for understanding. Assuming there are N NUMA nodes and M NC nodes in the system, the node topology table is as shown in Table 1, where nodes numbered 0 to N-1 represent NUMA nodes and nodes numbered N to N+M-1 represent NC nodes.
  • When the node topology table is stored in the manner shown in Table 1, storing the interconnection relationships of any node p with the other nodes requires N+M bits: if the bit corresponding to the q-th node is 1, node p and node q are directly connected; otherwise, node p and node q are not connected. Therefore, with N+M nodes in the system, only (N+M)*(N+M)/8 bytes are needed to store the topology of all NUMA nodes and NCs in the entire server, as illustrated in the sketch below.
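To make the bit-packed storage concrete, here is a minimal C sketch; the node counts N_NUMA and M_NC and all identifiers are illustrative assumptions, not names from the patent:

```c
#include <stdint.h>

#define N_NUMA 4                      /* N: number of NUMA nodes (example value) */
#define M_NC   2                      /* M: number of node controllers (example value) */
#define NODES  (N_NUMA + M_NC)

/* One bit per (p, q) pair: (N+M)*(N+M) bits, i.e. (N+M)*(N+M)/8 bytes,
 * matching the storage bound stated above. */
static uint8_t topo[(NODES * NODES + 7) / 8];

/* Mark node p and node q as directly connected (callers set both directions
 * for an undirected topology). */
static void topo_set_connected(int p, int q)
{
    int bit = p * NODES + q;
    topo[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Returns 1 if node p and node q are directly connected, 0 otherwise. */
static int topo_connected(int p, int q)
{
    int bit = p * NODES + q;
    return (topo[bit / 8] >> (bit % 8)) & 1;
}
```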
  • the server generates a memory access jump table of each NUMA node according to the node topology table.
  • The server generates a memory access jump table for each NUMA node according to the identified node topology table, where the memory access jump table of the first NUMA node includes the NC hop count and QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes.
  • N memory access jump tables, one per NUMA node, may be generated; the first NUMA node may be any one of the NUMA nodes, which is not limited here.
  • There are many possible processes for generating the memory access jump table from the node topology table.
  • FIG. 3 describes the generation process of one of the memory access jump tables as an example:
  • the server reads the stored node topology table.
  • the server calculates a shortest path from each NUMA node to each of the other NUMA nodes according to the node topology table.
  • the shortest path is a path with a minimum number of NC hops in the preselected shortest path
  • the preselected shortest path is a path with the least number of QPI hops in a path from one NUMA node to another NUMA node;
  • L(p→q) denotes the shortest path from NUMA node p to NUMA node q, L(p→q) = {p, n0, …, ni, …, nI, q}.
  • the shortest path from each NUMA node to each of the other NUMA nodes can be calculated.
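As an illustration of this two-level criterion (minimum QPI hops first, then minimum NC hops among those), the following C sketch runs a Dijkstra-style search with lexicographically ordered (QPI, NC) hop costs over the topology table sketched above; the interfaces and the hop-counting convention (each node entered after the source counts as one hop) are our assumptions:

```c
#include <limits.h>

struct hops { int qpi, nc; };     /* hop counts accumulated along a path */

/* Lexicographic order: fewer QPI hops wins (the "preselected shortest
 * paths"); among equal QPI hop counts, fewer NC hops wins. */
static int hops_less(struct hops a, struct hops b)
{
    if (a.qpi != b.qpi)
        return a.qpi < b.qpi;
    return a.nc < b.nc;
}

/* Dijkstra-style search over the (N+M)-node topology graph. Each node
 * entered after the source counts as one hop: a QPI hop if it is a NUMA
 * node (number < N_NUMA), an NC hop otherwise -- our reading of the
 * hop-counting rule. dist[q] then holds (H_qpi, H_nc) of L(src->q). */
static void shortest_hops(int src, struct hops dist[NODES])
{
    int done[NODES] = {0};

    for (int v = 0; v < NODES; v++)
        dist[v] = (struct hops){ INT_MAX / 2, INT_MAX / 2 };
    dist[src] = (struct hops){ 0, 0 };

    for (int round = 0; round < NODES; round++) {
        int u = -1;
        for (int v = 0; v < NODES; v++)        /* pick closest unsettled node */
            if (!done[v] && (u < 0 || hops_less(dist[v], dist[u])))
                u = v;
        if (u < 0 || dist[u].qpi >= INT_MAX / 2)
            break;                             /* remaining nodes unreachable */
        done[u] = 1;
        for (int v = 0; v < NODES; v++) {      /* relax edges out of u */
            if (!topo_connected(u, v))
                continue;
            struct hops cand = dist[u];
            if (v < N_NUMA) cand.qpi++; else cand.nc++;
            if (hops_less(cand, dist[v]))
                dist[v] = cand;
        }
    }
}
```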
  • the server calculates an NC hop count and a QPI hop count on each shortest path according to a shortest path from each NUMA node to each of the other NUMA nodes.
  • the number of NC hops and QPI hops on each shortest path can be calculated.
  • FIG. 4 describes one of the calculation methods as an example:
  • the calculation method is based on the node topology table shown in Table 1 in step 101.
  • H_nc denotes the NC hop count and H_qpi denotes the QPI hop count, and i is used to index L(p→q), the shortest path from node p to node q, L(p→q) = {p, n0, …, ni, …, nI, q}.
  • With N NUMA nodes in the server, L(p→q)[i] denotes the node number of the i-th node on the shortest path; according to the node topology table shown in Table 1, nodes numbered 0 to N-1 are NUMA nodes and nodes numbered N and above are NCs. Therefore:
  • By walking the path and classifying each traversed node by its number, the QPI hop count H_qpi and the NC hop count H_nc of the shortest path L(p→q) are obtained. Repeating this procedure yields H_qpi and H_nc from node p to all other NUMA nodes, and then the NC hop counts and QPI hop counts on the shortest paths from every NUMA node to every other NUMA node.
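A minimal sketch of this counting rule, assuming the path is available as an array of node numbers and reusing the `struct hops` type from the previous sketch:

```c
/* Given an explicit shortest path L(p->q) = {p, n0, ..., nI, q} stored as an
 * array of node numbers, count hops the way FIG. 4 describes: every node
 * after the source numbered below N_NUMA is a NUMA node and contributes one
 * QPI hop; every node numbered N_NUMA or above is an NC and contributes one
 * NC hop. */
static struct hops count_path_hops(const int *path, int path_len)
{
    struct hops h = { 0, 0 };

    for (int i = 1; i < path_len; i++) {   /* i = 0 is the source node p */
        if (path[i] < N_NUMA)
            h.qpi++;                       /* NUMA node crossed via a QPI link */
        else
            h.nc++;                        /* node controller crossed */
    }
    return h;
}
```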
  • the server combines the NC hop count and the QPI hop count on the shortest path of each NUMA node to each of the other NUMA nodes to form a memory jump table of each NUMA node.
  • In this way, the memory access jump table of node p is formed; one way of expressing it is shown in Table 2 below:
  • the server calculates a memory access priority table of each NUMA node according to the memory jump table of each NUMA node.
  • the first access priority table of the first NUMA node includes the priority of the first NUMA node to access other NUMA nodes. If the number of NC hops is less, the priority of accessing the NUMA node is higher. If the number of NC hops is the same, the fewer the QPI hops, the higher the priority of accessing the NUMA node;
  • The server sorts the NUMA nodes in the memory access jump table by NC hop count from small to large, according to the memory access jump table of each NUMA node, to obtain a first NUMA node sequence.
  • For NUMA nodes in the first NUMA node sequence with the same NC hop count, the server sorts them by QPI hop count from small to large according to the memory access jump table, to obtain a second NUMA node sequence.
  • The server assigns priorities to the NUMA nodes in the second NUMA node sequence in order from high priority to low priority.
  • the NUMA nodes with the same number of NC hops and QPI hops have the same priority.
  • If i < N does not hold, a priority has been assigned to every NUMA node, the whole process ends, and the process jumps to step 10338 to exit; otherwise, the process jumps to step 10333.
  • S[i].Hnc denotes the NC hop count from NUMA node p to the NUMA node S[i].
  • If S[i].Hnc equals S[i-1].Hnc, the NC hop count of the i-th NUMA node is the same as that of the (i-1)-th NUMA node and the QPI hop counts need to be compared, so the process proceeds to step 10336;
  • otherwise, the NC hop count of the i-th NUMA node is larger than that of the (i-1)-th NUMA node, and the process proceeds to step 10334.
  • If the QPI hop count of the i-th NUMA node is larger than that of the (i-1)-th NUMA node, the process jumps to step 10334.
  • In this way, the priorities with which NUMA node p accesses the other NUMA nodes are calculated.
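The sorting and priority assignment described above can be sketched in C as follows; the `jump_entry` layout and the convention that priority 0 is the highest are illustrative assumptions:

```c
#include <stdlib.h>

struct jump_entry { int node, nc, qpi; };  /* one row of a node's jump table */

/* Sort key: NC hop count first, then QPI hop count, both ascending. */
static int cmp_jump(const void *a, const void *b)
{
    const struct jump_entry *x = a, *y = b;
    if (x->nc != y->nc)
        return x->nc - y->nc;
    return x->qpi - y->qpi;
}

/* Turn node p's jump table (n entries, one per other NUMA node) into an
 * access-priority table P[] indexed by node number. Here priority 0 is the
 * highest; entries with identical (nc, qpi) share one priority value. */
static void build_priorities(struct jump_entry *tbl, int n, int P[])
{
    qsort(tbl, n, sizeof *tbl, cmp_jump);
    int prio = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0 &&
            (tbl[i].nc != tbl[i - 1].nc || tbl[i].qpi != tbl[i - 1].qpi))
            prio++;                        /* strictly worse: next priority */
        P[tbl[i].node] = prio;
    }
}
```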
  • Denoting the priority with which NUMA node p accesses NUMA node q as P[q], the memory access priority table of NUMA node p is obtained; it can be expressed as shown in Table 3 below:
  • the server allocates memory according to the first access priority table. The higher the priority, the more preferentially allocates memory from the NUMA node corresponding to the priority.
  • the server allocates memory from the NUMA nodes of the same priority according to the interleaving policy.
  • Currently, a server allocates memory to applications through a localization policy or an interleaving strategy.
  • Although the localization policy can reduce the delay overhead caused by remote access, excessive local memory access reduces memory access parallelism and may cause access congestion, whose delay overhead can be greater than that of accessing remote memory. If memory is allocated only through the interleaving strategy, memory access parallelism is maximized, but the number of remote memory accesses increases greatly: local memory accounts for only 1/N of the allocated memory, where N is the number of NUMA nodes, which makes the remote access delay problem prominent.
  • In this embodiment, memory allocation is performed according to the priorities in the memory access priority table: locality is allocated preferentially, while the interleaving strategy among same-priority nodes takes memory access parallelism into account. This not only reduces the number of remote memory accesses but also increases memory access parallelism, reduces memory congestion, and improves system performance.
  • Assume the current to-be-allocated memory size is the first capacity and the current query priority is the first priority, where the first priority is a priority in the first memory access priority table; allocation then proceeds from the highest priority downward.
  • Step 1: the server queries whether the NUMA nodes of the current query priority have free memory.
  • If none of the NUMA nodes has free memory, the server determines whether a memory release operation has been performed, where the memory release operation swaps temporarily unused memory into the hard disk buffer.
  • If the NUMA nodes of the current query priority have no free memory, the server updates the first priority to the next priority after the current query priority and triggers step 1.
  • If only one second NUMA node of the current query priority has free memory, of a second capacity, the server determines whether the second capacity is not smaller than the current to-be-allocated memory size. If the second capacity is not smaller than the current to-be-allocated memory size, the server allocates, from the second NUMA node to the first NUMA node, memory of the current to-be-allocated memory size, and ends the memory allocation process. If the second capacity is smaller than the current to-be-allocated memory size, the server allocates, from the second NUMA node to the first NUMA node, memory of the second capacity, updates the first capacity to the current to-be-allocated memory size minus the second capacity, updates the first priority to the next priority after the current query priority, and triggers step 1.
  • If more than one third NUMA node of the current query priority has free memory, the server may first determine whether the current to-be-allocated memory size is greater than one memory page; if it is not greater than one memory page, the server randomly selects one third NUMA node from the third NUMA nodes for memory allocation and ends the memory allocation process. Otherwise, the server allocates memory from each third NUMA node by means of the interleaving strategy, the allocated memory size being the third capacity; if the third capacity is smaller than the current to-be-allocated memory size, the server updates the first capacity to the current to-be-allocated memory size minus the third capacity, updates the first priority to the next priority after the current query priority, and triggers step 1.
  • If there is no free memory on the NUMA nodes with priority P, the process proceeds to step 10404.
  • The Swap operation swaps temporarily unused memory pages into the hard disk buffer.
  • In step 10407, the memory allocation fails, and after the corresponding processing the process proceeds to step 10423.
  • A memory node set S[X] is constructed, comprising all memory nodes that have priority P and free memory, where X indicates that there are X memory nodes of priority P in total.
  • If X equals 1, only one memory node with priority P is available for allocating memory; y is used to index the set S, and the process proceeds to step 10411.
  • If X is greater than 1, the process jumps to step 10415.
  • If the free memory on S[y] is greater than MR (the size of memory still to be allocated), the process proceeds to step 10424.
  • If X is greater than 1, multiple memory nodes with priority P are available for allocation; in this case, memory is allocated from each memory node with priority P by polling, one allocation unit at a time.
  • If MR is not less than 4 KB (one memory page), the process proceeds to step 10417; otherwise, the required memory is less than or equal to one page, and the process proceeds to step 10423.
  • If y < X, not all memory nodes with priority P have been polled yet, and the process proceeds to step 10419.
  • If Flag equals 1, memory was allocated at least once during the last polling round, indicating that the memory nodes with priority P still have free memory, so the process jumps to step 10415 to start the next polling round.
  • If MR is less than or equal to 4 KB (one memory page), only one memory node needs to be selected from the priority-P set S to allocate enough memory.
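Putting the allocation flow together, the following C sketch walks a node's priority table from highest to lowest priority, takes memory from a single candidate directly, and interleaves page by page across several same-priority candidates. `node_free_bytes` and `alloc_from_node` are hypothetical helpers the patent does not define, and the swap/release retry of steps 10404-10407 is left to the caller:

```c
#include <stddef.h>

#define PAGE_SZ 4096u   /* one memory page, the 4 KB unit used in FIG. 7 */

/* Hypothetical per-node helpers; their interfaces are assumptions. */
size_t node_free_bytes(int node);
void   alloc_from_node(int node, size_t bytes);

/* Allocate `need` bytes for a requesting node, given its priority table P[]
 * (0 = highest) with n_prio priority levels. One candidate at a level: take
 * what it has. Several candidates: interleave page by page, unless the
 * remainder fits in one page, in which case a single node is picked.
 * Returns 1 on success, 0 if memory ran out (the caller may then perform a
 * swap/release operation and retry from the highest priority). */
static int numa_alloc(const int P[], int n_prio, size_t need)
{
    for (int prio = 0; prio < n_prio && need > 0; prio++) {
        int cand[N_NUMA], x = 0;           /* set S of priority-P nodes */
        for (int q = 0; q < N_NUMA; q++)
            if (P[q] == prio && node_free_bytes(q) > 0)
                cand[x++] = q;
        if (x == 0)
            continue;                      /* no free memory at this level */

        if (x == 1 || need <= PAGE_SZ) {
            int q = cand[0];               /* single node (or pick one at random) */
            size_t avail = node_free_bytes(q);
            size_t take = avail < need ? avail : need;
            alloc_from_node(q, take);
            need -= take;
            continue;
        }

        /* Poll the same-priority nodes, one page per node per round, while
         * any of them still has free memory (the Flag test in FIG. 7). */
        int progress = 1;
        while (need > 0 && progress) {
            progress = 0;
            for (int y = 0; y < x && need > 0; y++) {
                if (node_free_bytes(cand[y]) == 0)
                    continue;
                size_t take = need < PAGE_SZ ? need : PAGE_SZ;
                alloc_from_node(cand[y], take);
                need -= take;
                progress = 1;
            }
        }
    }
    return need == 0;
}
```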
  • In summary, in this embodiment the server identifies a node topology table that records not only the connection relationships between NUMA nodes but also those between NUMA nodes and NCs and between NCs. According to the node topology table, the server generates a memory access jump table for each NUMA node, which contains not only the QPI hop count but also the NC hop count on the shortest path to each of the other NUMA nodes. The server then calculates each NUMA node's memory access priorities from its memory access jump table, using the NC hop count as an important parameter: the fewer the NC hops, the higher the memory access priority.
  • an embodiment of the server in the embodiment of the present application includes:
  • the identification module 801 is configured to identify a node topology table, where the node topology table includes a connection relationship between the NUMA nodes in the server, between the NUMA node and the NC, and between the NC and the NC.
  • The generating module 802 is configured to generate, according to the node topology table identified by the identification module 801, a memory access jump table for each NUMA node, where the memory access jump table of the first NUMA node includes the NC hop count and the QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes; the NC hop count is the number of NCs the shortest path passes through, and the QPI hop count is the number of NUMA nodes the shortest path passes through.
  • The calculation module 803 is configured to calculate, according to the memory access jump table of each NUMA node generated by the generating module 802, a memory access priority table for each NUMA node, where the first memory access priority table of the first NUMA node includes the priorities with which the first NUMA node accesses the other NUMA nodes: the fewer the NC hops, the higher the priority of accessing that NUMA node, and for the same NC hop count, the fewer the QPI hops, the higher the priority;
  • The allocation module 804 is configured to perform memory allocation according to the first memory access priority table calculated by the calculation module 803 when the first NUMA node applies for memory: the higher the priority, the more preferentially memory is allocated from the NUMA node corresponding to that priority.
  • In this embodiment, the identification module 801 identifies a node topology table that records not only the connection relationships between NUMA nodes but also those between NUMA nodes and NCs and between NCs; the generating module 802 generates a memory access jump table for each NUMA node, containing not only the QPI hop count but also the NC hop count on the shortest path to each of the other NUMA nodes; the calculation module 803 calculates each NUMA node's memory access priorities from its memory access jump table, using the NC hop count as an important parameter, where the fewer the NC hops, the higher the priority; and the allocation module 804 allocates memory according to the memory access priority table, where the higher the priority, the more preferentially memory is allocated from the corresponding NUMA node. Using the NC hop count as an important parameter of the priority calculation reduces the chance of allocating memory across an NC, thereby reducing the access latency introduced by NCs and improving server performance.
  • The allocation module 804 is further configured to: when multiple NUMA nodes in the first memory access priority table have the same priority, allocate memory from these same-priority NUMA nodes by means of the interleaving policy.
  • The allocation module 804 thus performs memory allocation according to the priorities in the memory access priority table, preferentially allocating locally while the interleaving strategy takes memory access parallelism into account, which not only reduces the number of remote memory accesses but also increases memory access parallelism, reduces memory congestion, and improves system performance.
  • the generating module 802 may specifically include:
  • the reading unit 901 is configured to read the stored node topology table that is identified by the identification module 801;
  • the first calculating unit 902 is configured to calculate a shortest path from each NUMA node to each of the other NUMA nodes according to the node topology table read by the reading unit 901, where the shortest path is the number of NC hops in the preselected shortest path The least path, the preselected shortest path is the path with the smallest number of QPI hops in the path from one NUMA node to another NUMA node;
  • a second calculating unit 903 configured to calculate, according to the shortest path of each NUMA node calculated by the first calculating unit 902 to each of the other NUMA nodes, an NC hop count and a QPI hop count on each shortest path;
  • The composing unit 904 is configured to compose the NC hop counts and QPI hop counts on the shortest paths from each NUMA node to the other NUMA nodes, calculated by the second calculating unit 903, into the memory access jump table of each NUMA node.
  • In this embodiment, the first calculating unit 902 calculates the shortest paths, the second calculating unit 903 calculates the NC hop count and QPI hop count on each shortest path, and the composing unit 904 composes them into each NUMA node's memory access jump table, thereby realizing the generation of the memory access jump table for each NUMA node.
  • the calculating module 803 may specifically include:
  • The first sorting unit 1001 is configured to sort the NUMA nodes in the memory access jump table by NC hop count from small to large, according to the memory access jump table of each NUMA node, to obtain a first NUMA node sequence;
  • The second sorting unit 1002 is configured to sort, for the NUMA nodes in the first NUMA node sequence with the same NC hop count, the NUMA nodes by QPI hop count from small to large according to the memory access jump table, to obtain a second NUMA node sequence;
  • the assigning unit 1003 is configured to sequentially prioritize the NUMA nodes in the second NUMA node sequence according to the order of priority from high to low, wherein the NUMA nodes with the same number of NC hops and QPI hops have the same priority .
  • the first sorting unit 1001 and the second sorting unit 1002 first sort the NUMA nodes in the memory access jump table, and the assigning unit 1003 assigns priorities in order, thereby improving the efficiency of priority assignment.
  • the foregoing allocating module 804 may specifically include:
  • The start unit 1101 is configured to assume that the current to-be-allocated memory size is the first capacity and the current query priority is the first priority, where the first priority is a priority in the first memory access priority table, and to trigger the query unit 1102 in order of priority in the first memory access priority table from high to low;
  • the query unit 1102 is configured to query whether there is free memory in the NUMA node of the current query priority
  • the first updating unit 1103 is configured to: when the NUMA node of the current query priority has no free memory, update the first priority to the next priority of the current query priority, triggering the starting unit 1101;
  • The first determining unit 1104 is configured to, when only one second NUMA node among the NUMA nodes of the current query priority has free memory, of a second capacity, determine whether the second capacity is not smaller than the current to-be-allocated memory size;
  • a first allocation unit 1105 configured to: when the second capacity is not less than the current memory size to be allocated, allocate, by the second NUMA node, the memory that is the size of the current memory to be allocated, for the first NUMA node, And triggering the end unit 1107;
  • a second updating unit 1106, configured to: when the second capacity is smaller than the current memory size to be allocated, allocate, by the second NUMA node, the memory of the second capacity to the first NUMA node, and update the The first capacity is the current memory size to be allocated minus the second capacity, and the first priority is updated to be the next priority of the current query priority, and the starting unit 1101 is triggered.
  • the ending unit 1107 is configured to end the memory allocation process.
  • the allocating module 804 may further include:
  • a second allocation unit 1108, configured to, when the query unit 1102 determines that more than one third NUMA node among the NUMA nodes of the current query priority has free memory, allocate memory from each third NUMA node by means of the interleaving strategy, the allocated memory size being the third capacity;
  • the first triggering unit 1109 is configured to trigger the ending unit 1107 when the third capacity is equal to the current memory size to be allocated;
  • The second triggering unit 1110 is configured to, when the third capacity is smaller than the current to-be-allocated memory size, update the first capacity to the current to-be-allocated memory size minus the third capacity, update the first priority to the next priority after the current query priority, and trigger the start unit 1101.
  • the allocating module 804 may further include:
  • The second determining unit 1111 is configured to, when the query unit 1102 finds that none of the NUMA nodes has free memory, determine whether a memory release operation has been performed, where the memory release operation swaps temporarily unused memory into the hard disk buffer;
  • the release execution unit 1112 is configured to: when the second determining unit 1111 determines that the memory release operation is not performed, perform a memory release operation, initialize a current memory size to be allocated, and a current query priority, and trigger the start unit 1101 .
  • the second allocating unit 1108 may specifically include:
  • a determining subunit, configured to determine, when the query unit 1102 determines that more than one third NUMA node exists among the NUMA nodes of the current query priority, whether the current to-be-allocated memory size is greater than one memory page;
  • a first allocation subunit configured to allocate memory from each third NUMA node by using an interleaving strategy when the current memory size to be allocated is greater than one memory page;
  • a second allocation subunit configured to: when the current to-be-allocated size is not greater than one memory page, randomly select a third NUMA node from the third NUMA nodes for memory allocation, and trigger the ending unit 1107.
  • The allocation module 804 performs memory allocation according to the memory access priority table and, when priorities are the same, allocates according to the interleaving strategy. This reduces the memory access delay caused by NCs and improves server performance, and at the same time not only reduces the number of remote memory accesses but also increases memory access parallelism, reduces memory congestion, and improves system performance.
  • the server in the embodiment of the present application is described above from the perspective of the unitized functional entity.
  • The following describes the server in the embodiment of the present application from the perspective of hardware processing. Referring to FIG. 12, another embodiment of the server 1200 in the embodiment of the present application includes:
  • the input device 1201, the output device 1202, the processor 1203, and the memory 1204 (wherein the number of the processors 1203 in the server 1200 may be one or more, and one processor 1203 in FIG. 12 is taken as an example).
  • the input device 1201, the output device 1202, the processor 1203, and the memory 1204 may be connected by a bus or other means, wherein the bus connection is taken as an example in FIG.
  • The set of CPUs in all the NUMA nodes of the server in the scenario shown in FIG. 1 constitutes the processor 1203 in this embodiment, and the set of local memory in all the NUMA nodes constitutes the memory 1204 in this embodiment.
  • the processor 1203 is configured to perform the following steps by calling an operation instruction stored in the memory 1204:
  • Identifying a node topology table where the node topology table includes a connection relationship between the NUMA nodes in the server, between the NUMA node and the NC, and between the NC and the NC;
  • generating, according to the node topology table, a memory access jump table for each NUMA node, where the memory access jump table of the first NUMA node includes the NC hop count and the QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes, the NC hop count being the number of NCs the shortest path passes through and the QPI hop count being the number of NUMA nodes the shortest path passes through;
  • calculating, according to the memory access jump table of each NUMA node, a memory access priority table for each NUMA node; and,
  • when the first NUMA node applies for memory, performing memory allocation according to the first memory access priority table, where the higher the priority, the more preferentially memory is allocated from the NUMA node corresponding to that priority.
  • the processor 1203 is further configured to perform the following steps:
  • if multiple NUMA nodes in the first memory access priority table have the same priority, memory is allocated from the NUMA nodes of the same priority according to the interleaving strategy.
  • When performing the step of generating the memory access jump table of each NUMA node according to the node topology table, the processor 1203 specifically performs the following steps: reading the stored node topology table; calculating, according to the node topology table, the shortest path from each NUMA node to each of the other NUMA nodes; calculating the NC hop count and QPI hop count on each shortest path; and composing the NC hop counts and QPI hop counts on the shortest paths from each NUMA node to the other NUMA nodes into the memory access jump table of that NUMA node.
  • When the processor 1203 performs the step of calculating the memory access priority table of each NUMA node according to the memory access jump table of each NUMA node, the following steps are specifically performed:
  • the NUMA nodes in the memory access jump table are sorted according to the order of the number of NC hops in the memory jump table of each NUMA node, and the first NUMA node sequence is obtained;
  • for NUMA nodes with the same NC hop count in the first NUMA node sequence, the QPI hop counts in the memory access jump table are sorted from small to large to obtain a second NUMA node sequence;
  • the NUMA nodes in the second NUMA node sequence are given priority in order of priority from high to low, wherein the NUMA nodes having the same number of NC hops and QPI hops have the same priority.
  • When the processor 1203 performs the step of performing memory allocation according to the first memory access priority table, the following steps are specifically performed:
  • assuming the current to-be-allocated memory size is the first capacity and the current query priority is the first priority, where the first priority is a priority in the first memory access priority table, memory is allocated according to the following process, querying priorities in the first memory access priority table from high to low:
  • querying whether the NUMA nodes of the current query priority have free memory; if the NUMA nodes of the current query priority have no free memory, the first priority is updated to the next priority after the current query priority, and the step of performing memory allocation according to this process is triggered;
  • if only one second NUMA node of the current query priority has free memory, of a second capacity, it is determined whether the second capacity is not smaller than the current to-be-allocated memory size;
  • if the second capacity is not smaller than the current to-be-allocated memory size, memory of the current to-be-allocated memory size is allocated from the second NUMA node to the first NUMA node, and the memory allocation process ends;
  • if the second capacity is smaller than the current to-be-allocated memory size, memory of the second capacity is allocated from the second NUMA node to the first NUMA node, the first capacity is updated to the current to-be-allocated memory size minus the second capacity, the first priority is updated to the next priority after the current query priority, and the step of performing memory allocation according to this process is triggered.
  • The processor 1203 further performs the following steps: if more than one third NUMA node of the current query priority has free memory, memory is allocated from each third NUMA node by means of the interleaving strategy, the allocated memory size being the third capacity; if the third capacity equals the current to-be-allocated memory size, the memory allocation process ends; if the third capacity is smaller than the current to-be-allocated memory size, the first capacity is updated to the current to-be-allocated memory size minus the third capacity, the first priority is updated to the next priority after the current query priority, and the step of performing memory allocation according to this process is triggered.
  • After performing the step of querying whether the NUMA nodes of the current query priority have free memory, the processor 1203 further performs the following steps: if none of the NUMA nodes has free memory, determining whether a memory release operation has been performed, where the memory release operation swaps temporarily unused memory into the hard disk buffer; if no memory release operation has been performed, performing a memory release operation, initializing the current to-be-allocated memory size and the current query priority, and triggering the step of performing memory allocation according to this process.
  • Before performing the step of allocating memory from each third NUMA node by means of the interleaving strategy, the processor 1203 further performs the following steps: determining whether the current to-be-allocated memory size is greater than one memory page; if it is greater than one memory page, triggering the step of allocating memory from each third NUMA node by means of the interleaving strategy; if it is not greater than one memory page, randomly selecting one third NUMA node from the third NUMA nodes for memory allocation, and ending the memory allocation process.
  • the node topology table is a matrix S of order (N+M)*(N+M), where N is the number of NUMA nodes in the server and M is the number of NCs in the server; the first N columns and N rows of the matrix S represent NUMA nodes, the last M columns and M rows of the matrix S represent NCs, and the value in row p, column q of the matrix S represents the connection relationship between node p and node q, where N, M, p, and q are all positive integers.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the apparatus embodiments described above are merely illustrative.
  • the division into units is only a division by logical function;
  • in actual implementation there may be other divisions: for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
  • the units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place, or they may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • such a computer-readable storage medium includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments of this application disclose a memory allocation method and a server, which are used to reduce the performance loss caused by NC latency during memory allocation and to improve server performance. The method in the embodiments of this application includes: a server identifies a node topology table that records not only the connection relationships between NUMA nodes but also those between NUMA nodes and NCs and between NCs; based on this node topology table, the server generates a memory access jump table for each NUMA node, which contains not only the QPI hop count but also the NC hop count on the shortest path to each of the other NUMA nodes; the server then calculates each NUMA node's memory access priorities from its jump table, taking the NC hop count as an important parameter of the priority calculation, where fewer NC hops mean a higher priority; when a NUMA node applies for memory, memory is allocated according to this priority table, with memory allocated preferentially from the NUMA nodes of higher priority.

Description

Memory allocation method and server
Technical field
This application relates to the communications field, and in particular, to a memory allocation method and a server.
Background
Limited by hardware chip technology, the computing capability of a single central processing unit (CPU) tends to saturate. Therefore, to obtain higher computing performance, servers tend to increase their computing performance by increasing the number of processors. High-performance servers generally adopt a non-uniform memory architecture (NUMA), in which multiple nodes are connected by a high-speed interconnection network and each node consists of a group of CPUs and local memory. When a node accesses its local memory, the access latency is low and performance is high; when it accesses remote memory, the access latency is relatively high, which degrades performance. Therefore, to improve system performance, a current memory allocator preferentially allocates local memory and only then considers remote memory.
To ease the bandwidth limitation of memory access, Intel replaced the traditional front side bus (FSB) with the Quick Path Interconnect (QPI), a packet-based serial high-speed point-to-point connection protocol. However, because each node provides only three QPI interfaces, the number of nodes that can be interconnected in one server is limited. To solve this problem, the node controller (NC) was introduced. An NC provides more node interfaces (NI), so NCs can be used to expand the number of interconnected nodes in a server and thereby reduce cross-node memory access latency.
The method currently provided in the kernel for traditional NUMA memory allocation is as follows:
1. NUMA distances are generated during system initialization, where:
if it is the local node, the NUMA distance is 0;
if node A and node B are interconnected, that is, A and B are adjacent, the NUMA distance is 20;
if node A and node C are not interconnected, the NUMA distance between A and C equals the minimum number of hops between A and C multiplied by 20.
2. Memory is allocated according to the node topology, using one of the following three allocation strategies:
A. Local strategy: this strategy aims to increase the amount of local memory access and thereby reduce access latency. The specific procedure is as follows:
(1) check whether the local node has enough memory;
(2) if the local node has enough memory, allocate memory preferentially from the local node;
(3) if the local node does not have enough memory, search the nodes in ascending order of their NUMA distance from this node for one with enough memory, and allocate the corresponding memory from it.
B. Preferred strategy: this strategy designates a series of memory nodes; during memory allocation, memory is first allocated from the designated memory nodes, and once they are exhausted, memory is allocated from other nodes.
C. Interleaved strategy: this strategy aims to increase memory access parallelism. Under this strategy, the system allocates memory from the memory nodes in turn, in round-robin order by node number.
However, this memory allocation method assumes that the distance or cost of every NUMA hop is the same, so the NUMA distance calculation takes the NUMA hop count as its only input variable. In fact, compared with a direct QPI connection, interconnection through an NC introduces additional memory access latency: the latency between two nodes interconnected through an NC is far greater than that between two nodes interconnected directly through QPI. As a result, the transmission cost differs between different NUMA node pairs. Because the NUMA distance calculation in the above method is unaware of NCs, cross-NC memory accesses increase, memory access latency grows, and server performance degrades.
Summary
Embodiments of this application provide a memory allocation method and a server, to reduce the performance loss caused by NC latency during memory allocation and to improve server performance.
A first aspect of the embodiments of this application provides a memory allocation method, including: a server identifies a node topology table, where the node topology table includes the connection relationships between the non-uniform memory architecture NUMA nodes in the server, between the NUMA nodes and node controllers NC, and between NCs; according to the identified node topology table, the server generates a memory access jump table for each NUMA node, where, taking any one of the NUMA nodes as a first NUMA node, the memory access jump table of the first NUMA node includes the NC hop count and the Quick Path Interconnect QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes, the NC hop count being the number of NCs the shortest path passes through and the QPI hop count being the number of NUMA nodes the shortest path passes through; according to the memory access jump table of each NUMA node, the server calculates a memory access priority table for each NUMA node, where the first memory access priority table of the first NUMA node includes the priorities with which the first NUMA node accesses the other NUMA nodes: the fewer the NC hops, the higher the priority of accessing a NUMA node, and for equal NC hop counts, the fewer the QPI hops, the higher the priority; when the first NUMA node applies for memory, the server performs memory allocation according to the first memory access priority table, where the higher the priority, the more preferentially memory is allocated from the NUMA nodes of that priority.
In a possible design, in a first implementation of the first aspect of the embodiments of this application, the method further includes: if multiple NUMA nodes in the first memory access priority table have the same priority, the server allocates memory from these NUMA nodes of the same priority in an interleaved manner.
In a possible design, in a second implementation of the first aspect, the server generating the memory access jump table of each NUMA node according to the node topology table specifically includes: the server reads the stored node topology table; according to the node topology table, it calculates the shortest path from each NUMA node to each of the other NUMA nodes, where the shortest path is the path with the fewest NC hops among the preselected shortest paths, and a preselected shortest path is a path with the fewest QPI hops among the paths from one NUMA node to another; according to the shortest paths from each NUMA node to the other NUMA nodes, the server calculates the NC hop count and QPI hop count on each shortest path; the server combines the NC hop counts and QPI hop counts on the shortest paths from each NUMA node to the other NUMA nodes into the memory access jump table of that NUMA node.
In a possible design, in a third implementation of the first aspect, the server calculating the memory access priority table of each NUMA node according to the memory access jump tables specifically includes: the server sorts the NUMA nodes in the jump table in ascending order of NC hop count, obtaining a first NUMA node sequence; for NUMA nodes with equal NC hop counts in the first sequence, the server sorts them in ascending order of QPI hop count, obtaining a second NUMA node sequence; the server assigns priorities to the NUMA nodes in the second sequence in order from high to low, where NUMA nodes with identical NC hop counts and QPI hop counts have the same priority.
In a possible design, in a fourth implementation of the first aspect, the server performing memory allocation according to the first memory access priority table specifically includes: assuming that the current size of memory to be allocated is a first capacity and the current query priority is a first priority, the first priority being one priority in the first memory access priority table, the server performs memory allocation in descending order of priority in the first memory access priority table according to the following procedure:
the server queries whether the NUMA nodes of the current query priority have free memory; if the NUMA nodes of the current query priority have no free memory, the server updates the first priority to the next priority after the current query priority and triggers the step of performing memory allocation according to this procedure; if exactly one second NUMA node among the NUMA nodes of the current query priority has free memory of a second capacity, the server determines whether the second capacity is not smaller than the current size of memory to be allocated; if the second capacity is not smaller than the current to-be-allocated size, the server allocates memory of the current to-be-allocated size from the second NUMA node to the first NUMA node and ends the allocation procedure; if the second capacity is smaller than the current to-be-allocated size, the server allocates memory of the second capacity from the second NUMA node to the first NUMA node, updates the first capacity to the current to-be-allocated size minus the second capacity, updates the first priority to the next priority after the current query priority, and triggers the step of performing memory allocation according to this procedure.
In a possible design, in a fifth implementation of the first aspect, after the step of the server querying whether the NUMA nodes of the current query priority have free memory, the method further includes: if more than one third NUMA node among the NUMA nodes of the current query priority has free memory, the server allocates memory from the third NUMA nodes by means of the interleaving strategy, the allocated memory size being a third capacity; if the third capacity equals the current to-be-allocated size, the allocation procedure ends; if the third capacity is smaller, the server updates the first capacity to the current to-be-allocated size minus the third capacity, updates the first priority to the next priority after the current query priority, and triggers the step of performing memory allocation according to this procedure.
In a possible design, in a sixth implementation of the first aspect, after the step of the server querying whether the NUMA nodes of the current query priority have free memory, the method further includes: if all NUMA nodes have been queried and none has free memory, the server determines whether a memory release operation has been performed, the memory release operation meaning that temporarily unused memory is swapped out to the hard disk buffer; if no memory release operation has been performed, the server performs one, initializes the current to-be-allocated memory size and the current query priority, and triggers the step of the server performing memory allocation in descending order of priority in the first memory access priority table according to the foregoing procedure.
In a possible design, in a seventh implementation of the first aspect, before the step of the server allocating memory from the third NUMA nodes by means of the interleaving strategy, the method further includes: the server determines whether the current to-be-allocated memory size is larger than one memory page; if it is larger than one memory page, the step of the server allocating memory from the third NUMA nodes by means of the interleaving strategy is triggered; if it is not larger than one memory page, the server randomly selects one third NUMA node from the third NUMA nodes for memory allocation and ends the allocation procedure.
In a possible design, in an eighth implementation of the first aspect, the node topology table is a matrix S of order (N+M)*(N+M), where N is the number of NUMA nodes in the server and M is the number of NCs in the server; the first N columns and N rows of matrix S represent NUMA nodes, the last M columns and M rows represent NCs, and the value in row p, column q of matrix S represents the connection relationship between node p and node q, where N, M, p, and q are all positive integers.
A second aspect of the embodiments of this application provides a server, including: an identification module, configured to identify a node topology table, where the node topology table includes the connection relationships between the NUMA nodes in the server, between the NUMA nodes and NCs, and between NCs; a generation module, configured to generate, according to the node topology table identified by the identification module, a memory access jump table for each NUMA node, where the memory access jump table of a first NUMA node includes the NC hop count and the Quick Path Interconnect QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes, the NC hop count being the number of NCs the shortest path passes through and the QPI hop count being the number of NUMA nodes the shortest path passes through; a calculation module, configured to calculate, according to the memory access jump tables generated by the generation module, the memory access priority table of each NUMA node, where the first memory access priority table of the first NUMA node includes the priorities with which the first NUMA node accesses the other NUMA nodes, fewer NC hops meaning a higher priority and, for equal NC hop counts, fewer QPI hops meaning a higher priority; and an allocation module, configured to, when the first NUMA node applies for memory, perform memory allocation according to the first memory access priority table calculated by the calculation module, a higher priority meaning that memory is allocated more preferentially from the NUMA nodes of that priority.
In a possible design, in a first implementation of the second aspect of the embodiments of this application, the allocation module is further configured to: when the first memory access priority table contains multiple NUMA nodes of the same priority, allocate memory from these NUMA nodes of the same priority in an interleaved manner.
In a possible design, in a second implementation of the second aspect, the generation module specifically includes: a reading unit, configured to read the stored node topology table identified by the identification module; a first calculation unit, configured to calculate, according to the node topology table read by the reading unit, the shortest path from each NUMA node to each of the other NUMA nodes, where the shortest path is the path with the fewest NC hops among the preselected shortest paths and a preselected shortest path is a path with the fewest QPI hops among the paths from one NUMA node to another; a second calculation unit, configured to calculate, according to the shortest paths computed by the first calculation unit, the NC hop count and QPI hop count on each shortest path; and a composition unit, configured to combine the NC hop counts and QPI hop counts on the shortest paths computed by the second calculation unit into the memory access jump table of each NUMA node.
In a possible design, in a third implementation of the second aspect, the calculation module specifically includes: a first sorting unit, configured to sort the NUMA nodes in the memory access jump tables in ascending order of NC hop count, obtaining a first NUMA node sequence; a second sorting unit, configured to sort the NUMA nodes with equal NC hop counts in the first NUMA node sequence in ascending order of QPI hop count, obtaining a second NUMA node sequence; and an assignment unit, configured to assign priorities to the NUMA nodes in the second NUMA node sequence in order from high to low, where NUMA nodes with identical NC hop counts and QPI hop counts have the same priority.
In a possible design, in a fourth implementation of the second aspect, the allocation module specifically includes: a start unit, configured to assume that the current size of memory to be allocated is a first capacity and the current query priority is a first priority, the first priority being one priority in the first memory access priority table, and to trigger the query unit in descending order of priority in the first memory access priority table; the query unit, configured to query whether the NUMA nodes of the current query priority have free memory; a first update unit, configured to, when the NUMA nodes of the current query priority have no free memory, update the first priority to the next priority after the current query priority and trigger the start unit; a first judgment unit, configured to, when exactly one second NUMA node of the current query priority has free memory of a second capacity, determine whether the second capacity is not smaller than the current to-be-allocated size; a first allocation unit, configured to, when the second capacity is not smaller than the current to-be-allocated size, allocate memory of the current to-be-allocated size from the second NUMA node to the first NUMA node and trigger the end unit; a second update unit, configured to, when the second capacity is smaller than the current to-be-allocated size, allocate memory of the second capacity from the second NUMA node to the first NUMA node, update the first capacity to the current to-be-allocated size minus the second capacity, update the first priority to the next priority after the current query priority, and trigger the start unit; and the end unit, configured to end the memory allocation procedure.
In a possible design, in a fifth implementation of the second aspect, the allocation module further includes: a second allocation unit, configured to, when the query unit determines that more than one third NUMA node of the current query priority has free memory, allocate memory from the third NUMA nodes by means of the interleaving strategy, the allocated memory size being a third capacity; a first trigger unit, configured to trigger the end unit when the third capacity equals the current to-be-allocated size; and a second trigger unit, configured to, when the third capacity is smaller than the current to-be-allocated size, update the first capacity to the current to-be-allocated size minus the third capacity, update the first priority to the next priority after the current query priority, and trigger the start unit.
In a possible design, in a sixth implementation of the second aspect, the allocation module further includes: a second judgment unit, configured to, when the query unit has queried all NUMA nodes and none has free memory, determine whether a memory release operation has been performed, the memory release operation meaning that temporarily unused memory is swapped out to the hard disk buffer; and a release execution unit, configured to, when the second judgment unit determines that no memory release operation has been performed, perform a memory release operation, initialize the current to-be-allocated memory size and the current query priority, and trigger the start unit.
In a possible design, in a seventh implementation of the second aspect, the second allocation unit specifically includes: a judgment subunit, configured to, when the query unit determines that more than one third NUMA node of the current query priority has free memory, determine whether the current to-be-allocated size is larger than one memory page; a first allocation subunit, configured to allocate memory from the third NUMA nodes by means of the interleaving strategy when the current to-be-allocated size is larger than one memory page; and a second allocation subunit, configured to, when the current to-be-allocated size is not larger than one memory page, randomly select one third NUMA node from the third NUMA nodes for memory allocation and trigger the end unit.
A third aspect of the embodiments of this application provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to perform the methods described in the foregoing aspects.
A fourth aspect of the embodiments of this application provides a computer program product containing instructions that, when run on a computer, causes the computer to perform the methods described in the foregoing aspects.
It can be seen from the above technical solutions that the embodiments of this application have the following advantage: the server identifies a node topology table that contains not only the connection relationships between NUMA nodes but also those between NUMA nodes and NCs and between NCs; according to this table, the server generates a memory access jump table for each NUMA node, which contains not only the QPI hop counts but also the NC hop counts of the shortest paths to the other NUMA nodes; the server then calculates each NUMA node's memory access priorities from its jump table, taking the NC hop count as an important parameter of the priority calculation, where fewer NC hops mean a higher priority; when a NUMA node applies for memory, allocation follows this priority table, serving higher-priority NUMA nodes first. Because the NC hop count is an important parameter of the priority calculation, the chance of allocating memory across an NC is reduced during allocation, which lowers the memory access latency caused by NCs and improves server performance.
Brief description of drawings
FIG. 1 is a schematic diagram of an application scenario of the memory allocation method in an embodiment of this application;
FIG. 2 is a schematic flowchart of the memory allocation method in an embodiment of this application;
FIG. 3 is a schematic flowchart of generating a memory access jump table in an embodiment of this application;
FIG. 4 is a schematic flowchart of calculating NC hop counts and QPI hop counts in an embodiment of this application;
FIG. 5 is a schematic flowchart of calculating memory access priorities in an embodiment of this application;
FIG. 6 is a schematic flowchart of assigning priorities in an embodiment of this application;
FIG. 7 is a schematic flowchart of performing memory allocation by priority in an embodiment of this application;
FIG. 8 is a schematic structural diagram of a server in an embodiment of this application;
FIG. 9 is a schematic structural diagram of a generation module in an embodiment of this application;
FIG. 10 is a schematic structural diagram of a calculation module in an embodiment of this application;
FIG. 11 is a schematic structural diagram of an allocation module in an embodiment of this application;
FIG. 12 is a schematic structural diagram of a server in an embodiment of this application.
Detailed description of embodiments
The technical solutions in the embodiments of this application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by a person skilled in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
It should be understood that, although the terms first, second, and so on may be used in the embodiments of this application to describe NUMA nodes or priorities, the NUMA nodes or priorities should not be limited by these terms; the terms are only used to distinguish them from one another. For example, without departing from the scope of the embodiments of this application, a first NUMA node may also be called a second NUMA node and, similarly, a second NUMA node may also be called a first NUMA node; likewise, a second priority may also be called a third priority, and so on. The embodiments of this application impose no limitation on this.
The memory allocation method and system in the embodiments of this application are applied to a server with a NUMA architecture. FIG. 1 is a schematic diagram of an application scenario of the memory allocation method. The server includes NUMA nodes 1 to 5 connected by a high-speed interconnection network, and each NUMA node includes a group of CPUs and local memory. NUMA nodes may be connected directly through QPI, for example NUMA node 1 with NUMA node 2, NUMA node 1 with NUMA node 3, and NUMA node 3 with NUMA node 4, or through an NC, for example NUMA node 2 with NUMA node 5 and NUMA node 3 with NUMA node 5. It can be understood that FIG. 1 is only a schematic diagram; in practical applications the number of CPUs in each NUMA node is not limited, and the server may include more or fewer NUMA nodes as well as more or fewer NCs, which is not limited here.
Referring to FIG. 2, the memory allocation method in the embodiments of this application is described below.
101. The server identifies a node topology table.
The server identifies a node topology table, which includes the connection relationships between the NUMA nodes in the server, between the NUMA nodes and node controllers NC, and between NCs.
It can be understood that the node topology table can be represented and stored in many forms, which are not limited here.
Preferably, the node topology table may be a matrix S of order (N+M)*(N+M), where N is the number of NUMA nodes in the server and M is the number of NCs in the server; the first N columns and N rows of matrix S represent NUMA nodes, the last M columns and M rows represent NCs, and the value in row p, column q of S represents the connection relationship between node p and node q, where N, M, p, and q are all positive integers.
Refer to Table 1 for understanding. Suppose the system contains N NUMA nodes and M NC nodes in total; the node topology table is then as shown in Table 1, where numbers 0 to N-1 denote NUMA nodes and N to N+(M-1) denote NC nodes. S[p][q] denotes the value in row p, column q of the matrix: S[p][q] = 1 means that node p and node q are directly connected, and if p = q then S[p][q] = 0.
Table 1
[Table 1 appears as an image in the original publication: the (N+M)×(N+M) connection matrix S, with S[p][q] = 1 where node p and node q are directly connected and 0 otherwise.]
It can be understood that, if the node topology table is stored as shown in Table 1, then for any node p, storing its interconnection relationships with the other nodes requires N+M bits in total: if the bit corresponding to the q-th node is 1, node p and node q are directly connected; otherwise they are not connected. Therefore, if there are N+M nodes in the system, only (N+M)*(N+M)/8 bytes are needed to store the topology of all NUMA nodes and NCs in the whole server. One possible bit-packed layout is sketched below.
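As an illustration of this storage scheme, the following Python sketch shows one way the bit-packed topology table could be represented and queried. It is not code from the patent: the class name NodeTopology, its methods, and the tiny example fabric are assumptions made for this sketch, which only follows the convention above that nodes 0 to N-1 are NUMA nodes and nodes N to N+M-1 are NCs.

```python
class NodeTopology:
    """Bit-packed (N+M) x (N+M) connection matrix S from Table 1."""

    def __init__(self, n_numa: int, n_nc: int):
        self.size = n_numa + n_nc          # nodes 0..N-1 are NUMA, N..N+M-1 are NCs
        # (size*size) bits rounded up to whole bytes: (N+M)*(N+M)/8 bytes overall
        self.bits = bytearray((self.size * self.size + 7) // 8)

    def _index(self, p: int, q: int) -> int:
        return p * self.size + q           # flat bit index of S[p][q]

    def connect(self, p: int, q: int) -> None:
        """Mark nodes p and q as directly connected (S[p][q] = S[q][p] = 1)."""
        for a, b in ((p, q), (q, p)):
            i = self._index(a, b)
            self.bits[i // 8] |= 1 << (i % 8)

    def connected(self, p: int, q: int) -> bool:
        i = self._index(p, q)
        return bool(self.bits[i // 8] >> (i % 8) & 1)

# Hypothetical fabric in the spirit of FIG. 1: four NUMA nodes, one NC (node 4).
topo = NodeTopology(n_numa=4, n_nc=1)
topo.connect(0, 1)   # NUMA 0 and NUMA 1 linked directly over QPI
topo.connect(2, 4)   # NUMA 2 attached to the NC
```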
102. The server generates a memory access jump table for each NUMA node according to the node topology table.
According to the identified node topology table, the server generates a memory access jump table for each NUMA node. The memory access jump table of a first NUMA node includes the NC hop count and the Quick Path Interconnect QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes; the NC hop count is the number of NCs the shortest path passes through, and the QPI hop count is the number of NUMA nodes the shortest path passes through.
It can be understood that, if the server has N NUMA nodes, N memory access jump tables can be generated, one for each NUMA node, and the first NUMA node may be any one of the NUMA nodes; this is not limited here.
There are many possible procedures for generating the memory access jump table from the node topology table. For ease of understanding, referring to FIG. 3, one such generation procedure is described below as an example:
1021. The server reads the stored node topology table.
1022. The server calculates, according to the node topology table, the shortest path from each NUMA node to each of the other NUMA nodes.
Here the shortest path is the path with the fewest NC hops among the preselected shortest paths, and a preselected shortest path is a path with the fewest QPI hops among the paths from one NUMA node to another NUMA node.
For example, let the shortest path from any NUMA node p to any other NUMA node q be L_{p→q}, and suppose L_{p→q} = {p, n_0, ..., n_i, ..., n_I, q}, where p ≠ q. If several preselected shortest paths exist from p to q, the one with the fewest NC hops among them is selected as the shortest path (the selection formula appears as an image in the original publication).
In this way the shortest paths from NUMA node p to all other NUMA nodes can be computed; they can be written as the set {L_{p→q} | q ≠ p} (shown as a formula image in the original publication),
where L_{p→q} denotes the shortest path from NUMA node p to NUMA node q, L_{p→q} = {p, n_0, ..., n_i, ..., n_I, q}. Further, by the same method, the shortest path from every NUMA node to every other NUMA node can be calculated.
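As one possible reading of step 1022 (a sketch, not the patent's implementation), a Dijkstra-style search with a lexicographic (QPI hops, NC hops) cost realizes exactly this two-level selection: it first minimizes the QPI hop count and, among those preselected paths, the NC hop count. The function below builds on the hypothetical NodeTopology sketch above and charges each step to the kind of node it lands on.

```python
import heapq

def shortest_path(topo, n_numa, p, q):
    """Return (qpi_hops, nc_hops, path) for the selected path p -> q, or None."""
    best = {p: (0, 0)}
    heap = [(0, 0, p, [p])]                  # (qpi_hops, nc_hops, node, path)
    while heap:
        qpi, nc, node, path = heapq.heappop(heap)
        if node == q:
            return qpi, nc, path             # first pop of q is lexicographically optimal
        if (qpi, nc) > best.get(node, (qpi, nc)):
            continue                         # stale heap entry, a better cost is known
        for nxt in range(topo.size):
            if not topo.connected(node, nxt):
                continue
            # Stepping onto a NUMA node costs one QPI hop; onto an NC, one NC hop.
            cost = (qpi + 1, nc) if nxt < n_numa else (qpi, nc + 1)
            if nxt not in best or cost < best[nxt]:
                best[nxt] = cost
                heapq.heappush(heap, (cost[0], cost[1], nxt, path + [nxt]))
    return None                              # q is unreachable from p

# Hypothetical 3-node example: NUMA 0-1 via QPI, NUMA 1-2 via the NC (node 3).
demo = NodeTopology(n_numa=3, n_nc=1)
demo.connect(0, 1); demo.connect(1, 3); demo.connect(3, 2)
print(shortest_path(demo, 3, 0, 2))   # -> (2, 1, [0, 1, 3, 2])
```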
1023. The server calculates the NC hop count and the QPI hop count on each shortest path according to the shortest paths from each NUMA node to the other NUMA nodes.
From the shortest paths between the NUMA nodes, the NC hop count and QPI hop count on each shortest path can be calculated. For ease of understanding, referring to FIG. 4, one calculation method is described below as an example.
It should be noted that this calculation method is based on the node topology table shown in Table 1 of step 101.
10231. Let H_nc = 0, H_qpi = 0, i = 0.
Here H_nc denotes the NC hop count, H_qpi denotes the QPI hop count, and i is used to index L_pq, the shortest path from p to q, for example L_{p→q} = {p, n_0, ..., n_i, ..., n_I, q}.
10232. Determine whether p equals q.
If p equals q, this is a local memory access, so H_nc = 0 and H_qpi = 0; jump to step 10238. Otherwise, jump to step 10233.
10233. Let i = 1 and jump to step 10234.
10234. Determine whether L_pq[i] is smaller than N.
It should be noted that N denotes the N NUMA nodes in the server, and L_pq[i] denotes the node number of the i-th node on the shortest path L_pq. From the node topology table shown in Table 1, nodes numbered 0 to N-1 are NUMA nodes. Therefore:
if L_pq[i] < N, L_pq[i] is a NUMA node; jump to step 10235;
otherwise, L_pq[i] is an NC node; jump to step 10236.
10235. Increase H_qpi by 1 and jump to step 10237.
10236. Increase H_nc by 1 and jump to step 10237.
10237. Determine whether the next node is the destination node q.
That is, determine whether L_pq[++i] equals q. If they are equal, node q has been reached; jump to step 10238. Otherwise, node q has not been reached yet; jump to step 10234.
10238. End.
By traversing all nodes on the shortest path from node p to node q with the above procedure, the QPI hop count H_qpi and the NC hop count H_nc of the shortest path L_pq are obtained. Further, with this procedure, the QPI hop counts H_qpi and NC hop counts H_nc from node p to all other NUMA nodes can be obtained, and likewise the NC hop counts and QPI hop counts on the shortest paths from every NUMA node to every other NUMA node.
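Read as code, the FIG. 4 walk reduces to a short scan over the path. The sketch below is a transcription under one stated assumption: every node after the source p, including the destination q, is charged one hop of its kind, matching the convention of the path-search sketch above.

```python
def count_hops(path, n_numa):
    """Count (H_nc, H_qpi) along an already-computed shortest path L_pq."""
    h_qpi = h_nc = 0
    if len(path) < 2:          # p == q: local access, both counts stay 0
        return h_nc, h_qpi
    for node in path[1:]:      # skip the source node p itself
        if node < n_numa:      # ids 0..N-1 are NUMA nodes (Table 1 numbering)
            h_qpi += 1
        else:                  # ids N..N+M-1 are NCs
            h_nc += 1
    return h_nc, h_qpi

print(count_hops([0, 1, 3, 2], n_numa=3))   # -> (1, 2): one NC hop, two QPI hops
```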
1024. The server combines the NC hop counts and QPI hop counts on the shortest paths from each NUMA node to the other NUMA nodes into the memory access jump table of each NUMA node.
After the QPI hop counts H_qpi and NC hop counts H_nc from node p to all other NUMA nodes are obtained, they can be composed into the memory access jump table of node p. One possible representation is shown in Table 2.
Table 2
[Table 2 appears as an image in the original publication: the memory access jump table of node p, listing for each destination NUMA node the NC hop count H_nc and the QPI hop count H_qpi of its shortest path.]
It can be understood that the memory access jump tables of the other NUMA nodes can be composed in the same way.
103. The server calculates the memory access priority table of each NUMA node according to the memory access jump table of each NUMA node.
The first memory access priority table of the first NUMA node includes the priorities with which the first NUMA node accesses the other NUMA nodes: the fewer the NC hops, the higher the priority of accessing a NUMA node; for equal NC hop counts, the fewer the QPI hops, the higher the priority.
It can be understood that there are many ways to calculate the memory access priorities. For ease of understanding, referring to FIG. 5, one such way is described below as an example:
1031. The server sorts the NUMA nodes in the memory access jump table in ascending order of NC hop count, obtaining a first NUMA node sequence.
1032. For NUMA nodes with equal NC hop counts in the first NUMA node sequence, the server sorts them in ascending order of QPI hop count, obtaining a second NUMA node sequence.
1033. The server assigns priorities to the NUMA nodes in the second NUMA node sequence in order from high to low.
Here NUMA nodes with identical NC hop counts and QPI hop counts have the same priority.
There are many concrete ways to assign the priorities. For ease of understanding, referring to FIG. 6, one such calculation is described below as an example.
Suppose the obtained second NUMA node sequence is S[N].
10331. Let i = 1, P = 0, P[N] = {0}.
Here i is used to index S, P denotes the priority value, and P[N] records the priority of each NUMA node.
10332. Determine whether i < N.
If i < N does not hold, a priority has been generated for every NUMA node and the whole procedure ends; jump to step 10338 and exit. Otherwise, jump to step 10333.
10333. Determine whether S[i].H_nc equals S[i-1].H_nc.
Here S[i].H_nc denotes the NC hop count from NUMA node S[i] to NUMA node p.
If they are equal, the NC hop count of the i-th NUMA node equals that of the (i-1)-th NUMA node, so the QPI hop counts must be compared; jump to step 10336.
Otherwise, by the ordering rule of the second NUMA node sequence, the NC hop count of the i-th NUMA node is larger than that of the (i-1)-th; jump to step 10334.
10334. Let P[S[i]] = ++P.
Either the NC hop count of S[i] from p has increased, or (coming from step 10336) the NC hop counts are equal but the QPI hop count has increased; in both cases the value of P increases, and the numeric value of the corresponding memory access priority increases as well. Jump to step 10335.
10335. Let i = i + 1 and jump to step 10332 to calculate the memory access priority of the next NUMA node.
10336. Determine whether S[i].H_qpi equals S[i-1].H_qpi.
Here S[i].H_qpi denotes the QPI hop count from NUMA node S[i] to node p.
If S[i].H_qpi equals S[i-1].H_qpi, the i-th NUMA node has the same NC hop count and the same QPI hop count as the (i-1)-th NUMA node, so their priorities should also be equal; jump to step 10337.
Otherwise, by the ordering rule of the second NUMA node sequence, the QPI hop count of the i-th NUMA node is larger than that of the (i-1)-th; jump to step 10334.
10337. Let P[S[i]] = P.
Since the current node and the previous node have the same NC hop count and the same QPI hop count from p, the memory access priority stays unchanged; let P[S[i]] = P and jump to step 10335.
10338. End.
Through the above algorithm, the memory access priorities from NUMA node p to each of the other NUMA nodes are obtained. Writing the priority with which NUMA node p accesses NUMA node q as P[q], the memory access priority table of NUMA node p is obtained. One possible representation is shown in Table 3.
Table 3
[Table 3 appears as an image in the original publication: the memory access priority table of node p, listing the priority P[q] for each destination NUMA node q.]
It can be understood that the memory access priority tables of the other NUMA nodes can be obtained in the same way.
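The FIG. 5 and FIG. 6 flows amount to one sort plus one scan. The following sketch (a condensation, not the patent's code) sorts node p's jump table by (H_nc, H_qpi) in ascending order and hands out priority numbers, giving identical pairs the same priority; as in Table 3, a smaller P[q] means a higher memory access priority.

```python
def build_priority_table(jump_table):
    """jump_table: dict q -> (h_nc, h_qpi). Returns dict q -> priority P[q]."""
    ordered = sorted(jump_table.items(), key=lambda kv: kv[1])  # (H_nc, H_qpi) ascending
    priority, prev_cost = -1, None
    table = {}
    for q, cost in ordered:
        if cost != prev_cost:      # NC or QPI hop count grew: move to the next level
            priority += 1
            prev_cost = cost
        table[q] = priority        # equal (H_nc, H_qpi) pairs share one priority
    return table

# Hypothetical jump table for node p: q -> (H_nc, H_qpi)
print(build_priority_table({0: (0, 0), 1: (0, 1), 2: (0, 1), 3: (1, 2)}))
# -> {0: 0, 1: 1, 2: 1, 3: 2}
```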
104. When the first NUMA node applies for memory, the server performs memory allocation according to the first memory access priority table: the higher the priority, the more preferentially memory is allocated from the NUMA nodes of that priority.
Preferably, if the first memory access priority table contains multiple NUMA nodes of the same priority, the server allocates memory from these NUMA nodes of the same priority in an interleaved manner.
In the prior art, the server allocates memory for applications using the local strategy or the interleaved strategy. Although the local strategy can reduce the latency overhead of remote memory access, too many local accesses reduce access parallelism and may cause memory access congestion, whose latency cost can exceed that of remote access. If memory is allocated purely by interleaving, access parallelism is maximized, but the number of remote accesses grows greatly: local accesses then account for only 1/N of all accesses, where N is the number of NUMA nodes, so the remote access latency problem becomes prominent. In this preferred solution, memory is allocated by the priorities in the memory access priority table, and the interleaving strategy is used among nodes of equal priority. Locality is served first while parallelism is also taken into account, which not only reduces the number of remote accesses but also increases access parallelism, reduces the occurrence of access congestion, and improves system performance.
It can be understood that there are many concrete ways to allocate memory by priority. For ease of understanding, one of them is described below as an example:
1. Assume the current size of memory to be allocated is a first capacity and the current query priority is a first priority, the first priority being one priority in the first memory access priority table. The server performs memory allocation in descending order of priority in the first memory access priority table according to the following procedure.
2. The server queries whether the NUMA nodes of the current query priority have free memory.
3. If all NUMA nodes have been queried and none has free memory, the server determines whether a memory release operation has been performed, the memory release operation meaning that temporarily unused memory is swapped out to the hard disk buffer.
4. If no memory release operation has been performed, the server performs one, initializes the current to-be-allocated memory size and the current query priority, and triggers step 1.
5. If the NUMA nodes of the current query priority have no free memory, the server updates the first priority to the next priority after the current query priority and triggers step 1.
6. If exactly one second NUMA node among the NUMA nodes of the current query priority has free memory of a second capacity, the server determines whether the second capacity is not smaller than the current to-be-allocated size.
7. If the second capacity is not smaller than the current to-be-allocated size, the server allocates memory of the current to-be-allocated size from the second NUMA node to the first NUMA node and ends the allocation procedure.
8. If the second capacity is smaller than the current to-be-allocated size, the server allocates memory of the second capacity from the second NUMA node to the first NUMA node, updates the first capacity to the current to-be-allocated size minus the second capacity, updates the first priority to the next priority after the current query priority, and triggers step 1.
9. If more than one third NUMA node among the NUMA nodes of the current query priority has free memory, the server allocates memory from the third NUMA nodes by means of the interleaving strategy, the allocated size being a third capacity.
Preferably, in this case the server may first determine whether the current to-be-allocated size is larger than one memory page: if it is, the step of allocating from the third NUMA nodes by interleaving is triggered; if it is not, the server randomly selects one third NUMA node from the third NUMA nodes for allocation and ends the allocation procedure (see the sketch after this list).
10. If the third capacity equals the current to-be-allocated size, the allocation procedure ends.
11. If the third capacity is smaller than the current to-be-allocated size, the server updates the first capacity to the current to-be-allocated size minus the third capacity, updates the first priority to the next priority after the current query priority, and triggers step 1.
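To make the eleven steps concrete, here is a condensed sketch of the whole loop; it is one reading of the flow, not the patent's implementation, and it simplifies the points named in the docstring.

```python
import random

PAGE = 4096  # one memory page, the interleaving granularity used in the flow above

def allocate(priority_table, free_mem, request):
    """Grant `request` bytes by priority; returns [(node, bytes), ...] or None.

    Simplifications vs. steps 1-11: the swap-and-retry branch (steps 3-4) is
    reduced to returning None, a sub-page random pick does not re-probe other
    same-priority nodes, and a sub-page remainder left after interleaving
    spills to the next priority level instead of being placed within it.
    """
    grants, remaining = [], request
    # Smaller P value means higher priority, so an ascending sort of the
    # priority values walks the levels from highest to lowest priority.
    for level in sorted(set(priority_table.values())):
        nodes = [q for q, p in priority_table.items() if p == level and free_mem[q] > 0]
        if len(nodes) == 1:                              # steps 6-8: drain the single node
            q = nodes[0]
            take = min(free_mem[q], remaining)
            free_mem[q] -= take
            grants.append((q, take))
            remaining -= take
        elif len(nodes) > 1:                             # step 9: several candidates
            if remaining <= PAGE:                        # sub-page request: one random node
                fits = [q for q in nodes if free_mem[q] >= remaining]
                if fits:
                    q = random.choice(fits)
                    free_mem[q] -= remaining
                    return grants + [(q, remaining)]
            progress = True
            while remaining >= PAGE and progress:        # interleave page by page
                progress = False
                for q in nodes:
                    if remaining >= PAGE and free_mem[q] >= PAGE:
                        free_mem[q] -= PAGE
                        grants.append((q, PAGE))
                        remaining -= PAGE
                        progress = True
        if remaining == 0:
            return grants
    return None   # every priority exhausted: the full flow would swap and retry

pt = {0: 0, 1: 1, 2: 1, 3: 2}                           # hypothetical Table 3 for node p
fm = {0: 0, 1: 3 * PAGE, 2: 3 * PAGE, 3: 8 * PAGE}      # hypothetical free memory
print(allocate(pt, fm, 5 * PAGE))
# -> [(1, 4096), (2, 4096), (1, 4096), (2, 4096), (1, 4096)]
```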
The above describes one way of allocating memory by priority. For ease of understanding, referring to FIG. 7, this way of allocating memory by priority is described in detail below with a concrete scenario:
10401. A process running on NUMA node p needs to apply for memory of size MR bytes.
10402. Let the priority P = 0 and start allocating memory from the node with the highest memory access priority in the memory access priority table of NUMA node p.
Here P = 0 means allocating from local memory.
10403. Query whether the memory nodes of priority P have free memory.
If the NUMA nodes of priority P have no free memory, jump to step 10404; otherwise, jump to step 10409.
10404. Let P = P + 1 and jump to step 10405.
10405. Determine whether P > MAX_P.
If P > MAX_P, all NUMA nodes have been queried; jump to step 10406. Otherwise, jump to step 10403.
10406. Determine whether the system has already performed a swap operation.
The swap operation swaps temporarily unused memory pages out to the hard disk buffer. If a swap operation has already been performed, there is no memory left in the system; jump to step 10407. Otherwise, jump to step 10408.
10407. Memory allocation fails; after corresponding handling, jump to step 10425.
10408. Perform the swap operation.
There is no free memory left in the system; the swap operation swaps temporarily unused memory out to the hard disk buffer to release some memory; then jump to step 10402.
10409. Form the set S[X] of all memory nodes that have priority P and free memory, where X means there are X memory nodes of priority P in total.
10410. Determine whether X is greater than 1.
If X equals 1, only one memory node of priority P is available for allocation; with y denoting the index into set S, jump to step 10411. Otherwise X is greater than 1; jump to step 10415.
10411. Let y = 0 and z = y, then jump to step 10412.
Here y denotes the index into set S.
10412. Determine whether the free memory on S[y] is larger than MR.
If the free memory on S[y] is larger than MR, jump to step 10424; otherwise, jump to step 10413.
10413. Let y = (y+1) % X and jump to step 10414.
10414. Determine whether y >= z.
If y >= z, all memory nodes of priority P have been traversed and sufficient memory still cannot be allocated, so memory nodes of lower priority must be queried; jump to step 10404. Otherwise, jump to step 10412.
10415. Let y = -1 and Flag = 0.
X greater than 1 means that several memory nodes of priority P are available for allocation. In this case memory is allocated from the priority-P memory nodes in an interleaved, round-robin manner, one memory page (4 KB) at a time. Let y = -1 and Flag = 0, where y is the index into S and Flag indicates whether a page was allocated during this polling round: it is set to 1 if any page is allocated during the whole round; otherwise Flag stays 0, indicating that no page was allocated in the whole round and thus the priority-P memory nodes no longer have memory meeting the requirement. Jump to step 10416.
10416. Determine whether MR is not smaller than 4 KB.
If MR is not smaller than 4 KB, jump to step 10417; otherwise the memory to be allocated is at most one page; jump to step 10423.
10417. Let y = y + 1 and jump to step 10418.
10418. Determine whether y < X.
If y < X, not all priority-P memory nodes have been polled in this round; jump to step 10419. Otherwise, all priority-P memory nodes have been polled; jump to step 10422.
10419. Determine whether the free memory of S[y] is larger than 4 KB.
If the free memory of S[y] is larger than 4 KB, jump to step 10420; otherwise, the free memory on S[y] is less than one page; jump to step 10417.
10420. Allocate 4 KB of memory from S[y] and set Flag = 1, indicating that memory was allocated in this polling round; jump to step 10421.
10421. Let MR = MR - 4 KB and jump to step 10416.
10422. Determine whether Flag = 1.
If Flag equals 1, memory was allocated in the last polling round, which means the priority-P memory nodes still have free memory; jump to step 10415 to start the next round. Otherwise, no memory was allocated in the last round, which means the priority-P memory nodes have no free memory left and memory must be allocated from nodes of lower priority; jump to step 10404.
10423. Let y = rand() % X and z = y.
If MR is at most 4 KB, it suffices to select one node from the set S of priority-P memory nodes to allocate enough memory. To prevent allocations from concentrating on certain nodes when this situation occurs frequently, let y = rand() % X, which randomly selects one memory node from S for allocation, and let z = y; then jump to step 10412.
10424. Allocate memory of size MR from S[y] and jump to step 10425.
10425. End.
In this embodiment of this application, the server identifies a node topology table that contains not only the connection relationships between NUMA nodes but also those between NUMA nodes and NCs and between NCs. According to this table, the server generates a memory access jump table for each NUMA node, containing not only the QPI hop counts but also the NC hop counts of the shortest paths to the other NUMA nodes. The server then calculates each NUMA node's memory access priorities from its jump table, taking the NC hop count as an important parameter of the priority calculation: the fewer the NC hops, the higher the priority. When a NUMA node applies for memory, allocation follows the priority table, serving higher-priority NUMA nodes first. Because the NC hop count is an important parameter of the priority calculation, the chance of allocating memory across an NC is reduced during allocation, which lowers the memory access latency caused by NCs and improves server performance.
The memory allocation method in the embodiments of this application has been described above; the server in the embodiments of this application is described below. Referring to FIG. 8, an embodiment of the server includes:
an identification module 801, configured to identify a node topology table, which includes the connection relationships between the NUMA nodes in the server, between the NUMA nodes and NCs, and between NCs;
a generation module 802, configured to generate, according to the node topology table identified by the identification module 801, a memory access jump table for each NUMA node, where the memory access jump table of a first NUMA node includes the NC hop count and the QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes, the NC hop count being the number of NCs the shortest path passes through and the QPI hop count being the number of NUMA nodes the shortest path passes through;
a calculation module 803, configured to calculate, according to the jump tables generated by the generation module 802, the memory access priority table of each NUMA node, where the first memory access priority table of the first NUMA node includes the priorities with which the first NUMA node accesses the other NUMA nodes: the fewer the NC hops, the higher the priority, and for equal NC hop counts, the fewer the QPI hops, the higher the priority;
an allocation module 804, configured to, when the first NUMA node applies for memory, perform memory allocation according to the first memory access priority table calculated by the calculation module 803, where the higher the priority, the more preferentially memory is allocated from the NUMA nodes of that priority.
In this embodiment of this application, the identification module 801 identifies a node topology table that contains not only the connection relationships between NUMA nodes but also those between NUMA nodes and NCs and between NCs. The generation module 802 generates from this table a memory access jump table for each NUMA node, containing not only the QPI hop counts but also the NC hop counts of the shortest paths to the other NUMA nodes. The calculation module 803 then calculates each NUMA node's memory access priorities from its jump table, taking the NC hop count as an important parameter of the priority calculation: the fewer the NC hops, the higher the priority. When a NUMA node applies for memory, the allocation module 804 performs allocation according to the priority table, serving higher-priority NUMA nodes first. Because the NC hop count is an important parameter of the priority calculation, the chance of allocating memory across an NC is reduced, which lowers NC-induced memory access latency and improves server performance.
Preferably, as another embodiment of the server in this application, the allocation module 804 may be further configured to allocate memory from NUMA nodes of the same priority in an interleaved manner when the first memory access priority table contains multiple NUMA nodes of the same priority.
In this embodiment, the allocation module 804 allocates memory by the priorities in the memory access priority table and uses the interleaving strategy when priorities are equal, so locality is served first while access parallelism is also taken into account; this not only reduces the number of remote accesses but also increases access parallelism, reduces the occurrence of access congestion, and improves system performance.
Preferably, referring to FIG. 9 in combination with the embodiment shown in FIG. 8, as another embodiment of the server in this application, the generation module 802 may specifically include:
a reading unit 901, configured to read the stored node topology table identified by the identification module 801;
a first calculation unit 902, configured to calculate, according to the node topology table read by the reading unit 901, the shortest path from each NUMA node to each of the other NUMA nodes, where the shortest path is the path with the fewest NC hops among the preselected shortest paths and a preselected shortest path is a path with the fewest QPI hops among the paths from one NUMA node to another;
a second calculation unit 903, configured to calculate, according to the shortest paths computed by the first calculation unit 902, the NC hop count and QPI hop count on each shortest path;
a composition unit 904, configured to combine the NC hop counts and QPI hop counts on the shortest paths computed by the second calculation unit 903 into the memory access jump table of each NUMA node.
In this embodiment, the first calculation unit 902 computes the shortest paths, the second calculation unit 903 computes the NC hop counts and QPI hop counts on them, and the composition unit 904 assembles these into the jump table of each NUMA node, thereby generating the memory access jump tables of the NUMA nodes.
Preferably, referring to FIG. 10 in combination with the embodiment shown in FIG. 8, as another embodiment of the server in this application, the calculation module 803 may specifically include:
a first sorting unit 1001, configured to sort the NUMA nodes in the memory access jump tables in ascending order of NC hop count, obtaining a first NUMA node sequence;
a second sorting unit 1002, configured to sort the NUMA nodes with equal NC hop counts in the first NUMA node sequence in ascending order of QPI hop count, obtaining a second NUMA node sequence;
an assignment unit 1003, configured to assign priorities to the NUMA nodes in the second NUMA node sequence in order from high to low, where NUMA nodes with identical NC hop counts and QPI hop counts have the same priority.
In this embodiment, the first sorting unit 1001 and the second sorting unit 1002 first sort the NUMA nodes in the jump table, and the assignment unit 1003 then assigns priorities in order, which improves the efficiency of priority assignment.
Preferably, referring to FIG. 11 in combination with the embodiment shown in FIG. 8, as another embodiment of the server in this application, the allocation module 804 may specifically include:
a start unit 1101, configured to assume that the current size of memory to be allocated is a first capacity and the current query priority is a first priority, the first priority being one priority in the first memory access priority table, and to trigger the query unit 1102 in descending order of priority in the first memory access priority table;
a query unit 1102, configured to query whether the NUMA nodes of the current query priority have free memory;
a first update unit 1103, configured to, when the NUMA nodes of the current query priority have no free memory, update the first priority to the next priority after the current query priority and trigger the start unit 1101;
a first judgment unit 1104, configured to, when exactly one second NUMA node of the current query priority has free memory of a second capacity, determine whether the second capacity is not smaller than the current to-be-allocated size;
a first allocation unit 1105, configured to, when the second capacity is not smaller than the current to-be-allocated size, allocate memory of the current to-be-allocated size from the second NUMA node to the first NUMA node and trigger the end unit 1107;
a second update unit 1106, configured to, when the second capacity is smaller than the current to-be-allocated size, allocate memory of the second capacity from the second NUMA node to the first NUMA node, update the first capacity to the current to-be-allocated size minus the second capacity, update the first priority to the next priority after the current query priority, and trigger the start unit 1101;
an end unit 1107, configured to end the memory allocation procedure.
Preferably, the allocation module 804 may further include:
a second allocation unit 1108, configured to, when the query unit 1102 determines that more than one third NUMA node of the current query priority has free memory, allocate memory from the third NUMA nodes by means of the interleaving strategy, the allocated memory size being a third capacity;
a first trigger unit 1109, configured to trigger the end unit 1107 when the third capacity equals the current to-be-allocated size;
a second trigger unit 1110, configured to, when the third capacity is smaller than the current to-be-allocated size, update the first capacity to the current to-be-allocated size minus the third capacity, update the first priority to the next priority after the current query priority, and trigger the start unit 1101.
Preferably, the allocation module 804 may further include:
a second judgment unit 1111, configured to, when the query unit 1102 has queried all NUMA nodes and none has free memory, determine whether a memory release operation has been performed, the memory release operation meaning that temporarily unused memory is swapped out to the hard disk buffer;
a release execution unit 1112, configured to, when the second judgment unit 1111 determines that no memory release operation has been performed, perform a memory release operation, initialize the current to-be-allocated memory size and the current query priority, and trigger the start unit 1101.
Optionally, the second allocation unit 1108 may specifically include:
a judgment subunit, configured to, when the query unit 1102 determines that more than one third NUMA node of the current query priority has free memory, determine whether the current to-be-allocated size is larger than one memory page;
a first allocation subunit, configured to allocate memory from the third NUMA nodes by means of the interleaving strategy when the current to-be-allocated size is larger than one memory page;
a second allocation subunit, configured to, when the current to-be-allocated size is not larger than one memory page, randomly select one third NUMA node from the third NUMA nodes for memory allocation and trigger the end unit 1107.
In this embodiment, the allocation module 804 allocates memory according to the memory access priority table and, when priorities are equal, allocates by the interleaving strategy. This lowers NC-induced memory access latency and improves server performance while also taking access parallelism into account: it not only reduces the number of remote accesses but also increases access parallelism, reduces the occurrence of access congestion, and improves system performance.
The server in the embodiments of this application has been described above from the perspective of unitized functional entities; it is described below from the perspective of hardware processing. Referring to FIG. 12, another embodiment of the server 1200 in the embodiments of this application includes:
an input device 1201, an output device 1202, a processor 1203, and a memory 1204 (the server 1200 may contain one or more processors 1203; one processor 1203 is taken as an example in FIG. 12). In some embodiments of this application, the input device 1201, output device 1202, processor 1203, and memory 1204 may be connected by a bus or in other ways, with a bus connection taken as the example in FIG. 12.
It can be understood that, in the scenario shown in FIG. 1, the set of CPUs in all the NUMA nodes of the server constitutes the processor 1203 in this embodiment of this application, and the set of local memories in all the NUMA nodes constitutes the memory 1204 in this embodiment of this application.
By invoking the operation instructions stored in the memory 1204, the processor 1203 is configured to perform the following steps:
identifying a node topology table, which includes the connection relationships between the NUMA nodes in the server, between the NUMA nodes and NCs, and between NCs;
generating, according to the node topology table, a memory access jump table for each NUMA node, where the memory access jump table of a first NUMA node includes the NC hop count and the QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes, the NC hop count being the number of NCs the shortest path passes through and the QPI hop count being the number of NUMA nodes the shortest path passes through;
calculating, according to the memory access jump table of each NUMA node, the memory access priority table of each NUMA node, where the first memory access priority table of the first NUMA node includes the priorities with which the first NUMA node accesses the other NUMA nodes: the fewer the NC hops, the higher the priority, and for equal NC hop counts, the fewer the QPI hops, the higher the priority;
when the first NUMA node applies for memory, performing memory allocation according to the first memory access priority table, where the higher the priority, the more preferentially memory is allocated from the NUMA nodes of that priority.
In some embodiments of this application, the processor 1203 is further configured to perform the following step:
when the first memory access priority table contains multiple NUMA nodes of the same priority, allocating memory from these NUMA nodes of the same priority in an interleaved manner.
In some embodiments of this application, when performing the step of generating the memory access jump table of each NUMA node according to the node topology table, the processor 1203 specifically performs the following steps:
reading the stored node topology table;
calculating, according to the node topology table, the shortest path from each NUMA node to each of the other NUMA nodes, where the shortest path is the path with the fewest NC hops among the preselected shortest paths and a preselected shortest path is a path with the fewest QPI hops among the paths from one NUMA node to another;
calculating, according to the shortest paths from each NUMA node to the other NUMA nodes, the NC hop count and QPI hop count on each shortest path;
combining the NC hop counts and QPI hop counts on the shortest paths from each NUMA node to the other NUMA nodes into the memory access jump table of each NUMA node.
In some embodiments of this application, when performing the step of calculating the memory access priority table of each NUMA node according to the memory access jump table of each NUMA node, the processor 1203 specifically performs the following steps:
sorting the NUMA nodes in the memory access jump table in ascending order of NC hop count, obtaining a first NUMA node sequence;
sorting the NUMA nodes with equal NC hop counts in the first NUMA node sequence in ascending order of QPI hop count, obtaining a second NUMA node sequence;
assigning priorities to the NUMA nodes in the second NUMA node sequence in order from high to low, where NUMA nodes with identical NC hop counts and QPI hop counts have the same priority.
In some embodiments of this application, when performing the step of performing memory allocation according to the first memory access priority table, the processor 1203 specifically performs the following steps:
assuming the current size of memory to be allocated is a first capacity and the current query priority is a first priority, the first priority being one priority in the first memory access priority table, performing memory allocation in descending order of priority in the first memory access priority table according to the following procedure:
querying whether the NUMA nodes of the current query priority have free memory;
if the NUMA nodes of the current query priority have no free memory, updating the first priority to the next priority after the current query priority and triggering the step of performing memory allocation according to this procedure;
if exactly one second NUMA node of the current query priority has free memory of a second capacity, determining whether the second capacity is not smaller than the current to-be-allocated size;
if the second capacity is not smaller than the current to-be-allocated size, allocating memory of the current to-be-allocated size from the second NUMA node to the first NUMA node and ending the allocation procedure;
if the second capacity is smaller than the current to-be-allocated size, allocating memory of the second capacity from the second NUMA node to the first NUMA node, updating the first capacity to the current to-be-allocated size minus the second capacity, updating the first priority to the next priority after the current query priority, and triggering the step of performing memory allocation according to this procedure.
In some embodiments of this application, after performing the step of querying whether the NUMA nodes of the current query priority have free memory, the processor 1203 further performs the following steps:
if more than one third NUMA node of the current query priority has free memory, allocating memory from the third NUMA nodes by means of the interleaving strategy, the allocated size being a third capacity;
if the third capacity equals the current to-be-allocated size, ending the allocation procedure;
if the third capacity is smaller than the current to-be-allocated size, updating the first capacity to the current to-be-allocated size minus the third capacity, updating the first priority to the next priority after the current query priority, and triggering the step of performing memory allocation according to this procedure.
In some embodiments of this application, after performing the step of querying whether the NUMA nodes of the current query priority have free memory, the processor 1203 further performs the following steps:
if all NUMA nodes have been queried and none has free memory, determining whether a memory release operation has been performed, the memory release operation meaning that temporarily unused memory is swapped out to the hard disk buffer;
if no memory release operation has been performed, performing one, initializing the current to-be-allocated memory size and the current query priority, and triggering the step of performing memory allocation according to the foregoing procedure.
In some embodiments of this application, before performing the step of allocating memory from the third NUMA nodes by means of the interleaving strategy, the processor 1203 further performs the following steps:
determining whether the current to-be-allocated size is larger than one memory page;
if it is larger than one memory page, triggering the step of allocating memory from the third NUMA nodes by means of the interleaving strategy;
if it is not larger than one memory page, randomly selecting one third NUMA node from the third NUMA nodes for memory allocation and ending the allocation procedure.
In some embodiments of this application, the node topology table is a matrix S of order (N+M)*(N+M), where N is the number of NUMA nodes in the server and M is the number of NCs in the server; the first N columns and N rows of S represent NUMA nodes, the last M columns and M rows represent NCs, and the value in row p, column q of S represents the connection relationship between node p and node q, where N, M, p, and q are all positive integers.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing embodiments are merely intended to describe the technical solutions of this application rather than to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some of their technical features; such modifications or replacements do not make the essence of the corresponding technical solutions depart from the spirit and scope of the technical solutions of the embodiments of this application.

Claims (19)

  1. A memory allocation method, comprising:
    a server identifying a node topology table, wherein the node topology table comprises the connection relationships between the non-uniform memory architecture NUMA nodes in the server, between the NUMA nodes and node controllers NC, and between NCs;
    the server generating, according to the node topology table, a memory access jump table of each NUMA node, wherein the memory access jump table of a first NUMA node comprises the NC hop count and the Quick Path Interconnect QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes, the NC hop count is the number of NCs that the shortest path passes through, the QPI hop count is the number of NUMA nodes that the shortest path passes through, and the first NUMA node is one of the NUMA nodes;
    the server calculating, according to the memory access jump table of each NUMA node, a memory access priority table of each NUMA node, wherein a first memory access priority table of the first NUMA node comprises the priorities with which the first NUMA node accesses the other NUMA nodes, a smaller NC hop count corresponds to a higher priority of accessing a NUMA node, and for equal NC hop counts a smaller QPI hop count corresponds to a higher priority of accessing a NUMA node; and
    when the first NUMA node applies for memory, the server performing memory allocation according to the first memory access priority table, wherein the higher the priority, the more preferentially memory is allocated from the NUMA nodes corresponding to that priority.
  2. The method according to claim 1, wherein the method further comprises:
    if a plurality of NUMA nodes in the first memory access priority table have the same priority, the server allocating memory from the plurality of NUMA nodes of the same priority in an interleaved manner.
  3. The method according to claim 1 or 2, wherein the server generating the memory access jump table of each NUMA node according to the node topology table comprises:
    the server reading the stored node topology table;
    the server calculating, according to the node topology table, the shortest path from each NUMA node to each of the other NUMA nodes, wherein the shortest path is the path with the fewest NC hops among the preselected shortest paths, and a preselected shortest path is a path with the fewest QPI hops among the paths from one NUMA node to another NUMA node;
    the server calculating, according to the shortest paths from each NUMA node to the other NUMA nodes, the NC hop count and the QPI hop count on each shortest path; and
    the server combining the NC hop counts and QPI hop counts on the shortest paths from each NUMA node to the other NUMA nodes into the memory access jump table of each NUMA node.
  4. The method according to claim 3, wherein the server calculating the memory access priority table of each NUMA node according to the memory access jump table of each NUMA node comprises:
    the server sorting the NUMA nodes in the memory access jump tables in ascending order of NC hop count to obtain a first NUMA node sequence;
    for NUMA nodes with equal NC hop counts in the first NUMA node sequence, the server sorting them in ascending order of QPI hop count in the memory access jump table to obtain a second NUMA node sequence; and
    the server assigning priorities to the NUMA nodes in the second NUMA node sequence in order from high to low, wherein NUMA nodes with identical NC hop counts and QPI hop counts have the same priority.
  5. The method according to claim 2, wherein the server performing memory allocation according to the first memory access priority table comprises:
    if the current size of memory to be allocated is a first capacity and the current query priority is a first priority, the first priority being one priority in the first memory access priority table, the server performing memory allocation in descending order of priority in the first memory access priority table according to the following procedure:
    the server querying whether the NUMA nodes of the current query priority have free memory;
    if the NUMA nodes of the current query priority have no free memory, the server updating the first priority to the next priority after the current query priority and triggering the step of performing memory allocation according to the procedure;
    if exactly one second NUMA node of the current query priority has free memory of a second capacity, the server determining whether the second capacity is not smaller than the current size of memory to be allocated;
    if the second capacity is not smaller than the current size of memory to be allocated, the server allocating memory of the current to-be-allocated size from the second NUMA node to the first NUMA node and ending the memory allocation procedure; and
    if the second capacity is smaller than the current size of memory to be allocated, the server allocating memory of the second capacity from the second NUMA node to the first NUMA node, updating the first capacity to the current to-be-allocated size minus the second capacity, updating the first priority to the next priority after the current query priority, and triggering the step of performing memory allocation according to the procedure.
  6. The method according to claim 5, wherein after the step of the server querying whether the NUMA nodes of the current query priority have free memory, the method further comprises:
    if more than one third NUMA node of the current query priority has free memory, the server allocating memory from the third NUMA nodes by means of the interleaving strategy, the allocated memory size being a third capacity;
    if the third capacity equals the current size of memory to be allocated, ending the memory allocation procedure; and
    if the third capacity is smaller than the current size of memory to be allocated, the server updating the first capacity to the current to-be-allocated size minus the third capacity, updating the first priority to the next priority after the current query priority, and triggering the step of performing memory allocation according to the procedure.
  7. The method according to claim 5, wherein after the step of the server querying whether the NUMA nodes of the current query priority have free memory, the method further comprises:
    if all NUMA nodes have been queried and none has free memory, the server determining whether a memory release operation has been performed, the memory release operation meaning that temporarily unused memory is swapped out to the hard disk buffer; and
    if no memory release operation has been performed, performing a memory release operation, initializing the current to-be-allocated memory size and the current query priority, and triggering the step of the server performing memory allocation in descending order of priority in the first memory access priority table according to the procedure.
  8. The method according to claim 6, wherein before the step of the server allocating memory from the third NUMA nodes by means of the interleaving strategy, the method further comprises:
    the server determining whether the current size of memory to be allocated is larger than one memory page;
    if the current size of memory to be allocated is larger than one memory page, triggering the step of the server allocating memory from the third NUMA nodes by means of the interleaving strategy; and
    if the current size to be allocated is not larger than one memory page, the server randomly selecting one third NUMA node from the third NUMA nodes for memory allocation and ending the memory allocation procedure.
  9. The method according to claim 1 or 2, wherein the node topology table is a matrix S of order (N+M)*(N+M), wherein N is the number of NUMA nodes in the server, M is the number of NCs in the server, the first N columns and N rows of the matrix S represent NUMA nodes, the last M columns and M rows of the matrix S represent NCs, the value in row p, column q of the matrix S represents the connection relationship between node p and node q, and N, M, p, and q are all positive integers.
  10. A server, comprising:
    an identification module, configured to identify a node topology table, wherein the node topology table comprises the connection relationships between the NUMA nodes in the server, between the NUMA nodes and NCs, and between NCs;
    a generation module, configured to generate, according to the node topology table identified by the identification module, a memory access jump table of each NUMA node, wherein the memory access jump table of a first NUMA node comprises the NC hop count and the Quick Path Interconnect QPI hop count on the shortest path from the first NUMA node to each of the other NUMA nodes, the NC hop count is the number of NCs that the shortest path passes through, the QPI hop count is the number of NUMA nodes that the shortest path passes through, and the first NUMA node is one of the NUMA nodes;
    a calculation module, configured to calculate, according to the memory access jump tables generated by the generation module, the memory access priority table of each NUMA node, wherein a first memory access priority table of the first NUMA node comprises the priorities with which the first NUMA node accesses the other NUMA nodes, a smaller NC hop count corresponds to a higher priority of accessing a NUMA node, and for equal NC hop counts a smaller QPI hop count corresponds to a higher priority of accessing a NUMA node; and
    an allocation module, configured to, when the first NUMA node applies for memory, perform memory allocation according to the first memory access priority table calculated by the calculation module, wherein the higher the priority, the more preferentially memory is allocated from the NUMA nodes corresponding to that priority.
  11. The server according to claim 10, wherein the allocation module is further configured to:
    when a plurality of NUMA nodes in the first memory access priority table have the same priority, allocate memory from the plurality of NUMA nodes of the same priority in an interleaved manner.
  12. The server according to claim 10 or 11, wherein the generation module comprises:
    a reading unit, configured to read the stored node topology table identified by the identification module;
    a first calculation unit, configured to calculate, according to the node topology table read by the reading unit, the shortest path from each NUMA node to each of the other NUMA nodes, wherein the shortest path is the path with the fewest NC hops among the preselected shortest paths, and a preselected shortest path is a path with the fewest QPI hops among the paths from one NUMA node to another NUMA node;
    a second calculation unit, configured to calculate, according to the shortest paths calculated by the first calculation unit, the NC hop count and the QPI hop count on each shortest path; and
    a composition unit, configured to combine the NC hop counts and QPI hop counts on the shortest paths calculated by the second calculation unit into the memory access jump table of each NUMA node.
  13. The server according to claim 12, wherein the calculation module comprises:
    a first sorting unit, configured to sort the NUMA nodes in the memory access jump tables in ascending order of NC hop count to obtain a first NUMA node sequence;
    a second sorting unit, configured to sort the NUMA nodes with equal NC hop counts in the first NUMA node sequence in ascending order of QPI hop count in the memory access jump table to obtain a second NUMA node sequence; and
    an assignment unit, configured to assign priorities to the NUMA nodes in the second NUMA node sequence in order from high to low, wherein NUMA nodes with identical NC hop counts and QPI hop counts have the same priority.
  14. The server according to claim 11, wherein the allocation module comprises:
    a start unit, configured to, if the current size of memory to be allocated is a first capacity and the current query priority is a first priority, the first priority being one priority in the first memory access priority table, trigger the query unit in descending order of priority in the first memory access priority table;
    the query unit, configured to query whether the NUMA nodes of the current query priority have free memory;
    a first update unit, configured to, when the NUMA nodes of the current query priority have no free memory, update the first priority to the next priority after the current query priority and trigger the start unit;
    a first judgment unit, configured to, when exactly one second NUMA node of the current query priority has free memory of a second capacity, determine whether the second capacity is not smaller than the current size of memory to be allocated;
    a first allocation unit, configured to, when the second capacity is not smaller than the current size of memory to be allocated, allocate memory of the current to-be-allocated size from the second NUMA node to the first NUMA node and trigger an end unit;
    a second update unit, configured to, when the second capacity is smaller than the current size of memory to be allocated, allocate memory of the second capacity from the second NUMA node to the first NUMA node, update the first capacity to the current to-be-allocated size minus the second capacity, update the first priority to the next priority after the current query priority, and trigger the start unit; and
    the end unit, configured to end the memory allocation procedure.
  15. The server according to claim 14, wherein the allocation module further comprises:
    a second allocation unit, configured to, when the query unit determines that more than one third NUMA node of the current query priority has free memory, allocate memory from the third NUMA nodes by means of the interleaving strategy, the allocated memory size being a third capacity;
    a first trigger unit, configured to trigger the end unit when the third capacity equals the current size of memory to be allocated; and
    a second trigger unit, configured to, when the third capacity is smaller than the current size of memory to be allocated, update the first capacity to the current to-be-allocated size minus the third capacity, update the first priority to the next priority after the current query priority, and trigger the start unit.
  16. The server according to claim 14, wherein the allocation module further comprises:
    a second judgment unit, configured to, when the query unit has queried all NUMA nodes and none has free memory, determine whether a memory release operation has been performed, the memory release operation meaning that temporarily unused memory is swapped out to the hard disk buffer; and
    a release execution unit, configured to, when the second judgment unit determines that no memory release operation has been performed, perform a memory release operation, initialize the current to-be-allocated memory size and the current query priority, and trigger the start unit.
  17. The server according to claim 15, wherein the second allocation unit comprises:
    a judgment subunit, configured to, when the query unit determines that more than one third NUMA node of the current query priority has free memory, determine whether the current size of memory to be allocated is larger than one memory page;
    a first allocation subunit, configured to allocate memory from the third NUMA nodes by means of the interleaving strategy when the current size of memory to be allocated is larger than one memory page; and
    a second allocation subunit, configured to, when the current size to be allocated is not larger than one memory page, randomly select one third NUMA node from the third NUMA nodes for memory allocation and trigger the end unit.
  18. A computer-readable storage medium comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 9.
  19. A computer program product containing instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 9.
PCT/CN2018/088924 2017-08-07 2018-05-30 Memory allocation method and server WO2019029236A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18844447.5A EP3605331A4 (en) 2017-08-07 2018-05-30 STORAGE ASSIGNMENT METHOD AND SERVER
US16/595,920 US11042412B2 (en) 2017-08-07 2019-10-08 Memory allocation method and server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710669106.4A 2017-08-07 2017-08-07 Huawei Technologies Co., Ltd. Memory allocation method and server
CN201710669106.4 2017-08-07

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/595,920 Continuation US11042412B2 (en) 2017-08-07 2019-10-08 Memory allocation method and server

Publications (1)

Publication Number Publication Date
WO2019029236A1 true WO2019029236A1 (zh) 2019-02-14

Family

ID=65270873

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/088924 WO2019029236A1 (zh) 2017-08-07 2018-05-30 Memory allocation method and server

Country Status (4)

Country Link
US (1) US11042412B2 (zh)
EP (1) EP3605331A4 (zh)
CN (1) CN109388490B (zh)
WO (1) WO2019029236A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112084028A (zh) * 2020-09-07 2020-12-15 Beijing ByteDance Network Technology Co., Ltd. Memory detection method and apparatus

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113138851B (zh) * 2020-01-16 2023-07-14 Huawei Technologies Co., Ltd. Data management method, related apparatus, and system
CN111756802B (zh) * 2020-05-26 2021-09-03 Shenzhen University Method and system for scheduling data stream tasks on a NUMA platform
CN114281516A (zh) * 2020-09-27 2022-04-05 Huawei Cloud Computing Technologies Co., Ltd. NUMA-attribute-based resource allocation method and apparatus
US20210073151A1 (en) * 2020-11-18 2021-03-11 Intel Corporation Page-based remote memory access using system memory interface network device
US11860798B2 (en) 2021-01-22 2024-01-02 Nyriad, Inc. Data access path optimization
CN112860530B (zh) * 2021-01-27 2022-09-27 Sun Yat-sen University Method for improving parallelized NumPy computation performance by exploiting characteristics of the non-uniform memory access architecture
CN113468080B (zh) * 2021-06-10 2024-02-09 Shandong Yingxin Computer Technology Co., Ltd. Caching method, system, and related apparatus for all-flash metadata
CN114780325B (zh) * 2022-06-21 2022-09-30 New H3C Information Technologies Co., Ltd. PCIe device detection method and apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050238035A1 (en) * 2004-04-27 2005-10-27 Hewlett-Packard System and method for remote direct memory access over a network switch fabric
CN104166596A (zh) * 2013-05-17 2014-11-26 Huawei Technologies Co., Ltd. Memory allocation method and node
CN104166597A (zh) * 2013-05-17 2014-11-26 Huawei Technologies Co., Ltd. Method and apparatus for allocating remote memory
CN105988876A (zh) * 2015-03-27 2016-10-05 Hangzhou DPtech Technologies Co., Ltd. Memory allocation method and apparatus

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6115804A (en) * 1999-02-10 2000-09-05 International Business Machines Corporation Non-uniform memory access (NUMA) data processing system that permits multiple caches to concurrently hold data in a recent state from which data can be sourced by shared intervention
US7577813B2 (en) * 2005-10-11 2009-08-18 Dell Products L.P. System and method for enumerating multi-level processor-memory affinities for non-uniform memory access systems
US7698523B2 (en) * 2006-09-29 2010-04-13 Broadcom Corporation Hardware memory locks
US20090006804A1 (en) * 2007-06-29 2009-01-01 Seagate Technology Llc Bi-level map structure for sparse allocation of virtual storage
US8245008B2 (en) 2009-02-18 2012-08-14 Advanced Micro Devices, Inc. System and method for NUMA-aware heap memory management
CN102413509A (zh) * 2011-11-09 2012-04-11 Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences Method for constructing a delay-constrained, energy-balanced data collection tree in a WSN
CN103428088B (zh) * 2012-05-14 2018-11-06 ZTE Corporation Tree root allocation and packet processing method, and routing bridge
CN103136110B (zh) 2013-02-18 2016-03-30 Huawei Technologies Co., Ltd. Memory management method, memory management apparatus, and NUMA system
US10048871B2 (en) 2013-02-20 2018-08-14 Red Hat, Inc. Assigning pre-existing processes to select sets of non-uniform memory access (NUMA) aligned resources
US9684682B2 (en) * 2013-09-21 2017-06-20 Oracle International Corporation Sharding of in-memory objects across NUMA nodes
US9817607B1 (en) * 2014-06-20 2017-11-14 EMC IP Holding Company LLC Optimizations to avoid intersocket links
CN105677373B (zh) * 2014-11-17 2019-04-19 Hangzhou Huawei Digital Technologies Co., Ltd. Node hot-swap method and NUMA node apparatus
US20170017419A1 (en) * 2015-07-15 2017-01-19 Innovium, Inc. System And Method For Enabling High Read Rates To Data Element Lists
US9697048B2 (en) 2015-08-20 2017-07-04 Sap Se Non-uniform memory access (NUMA) database management system
CN105389211B (zh) 2015-10-22 2018-10-30 Beihang University Memory allocation method suitable for NUMA architectures and latency-aware memory allocation apparatus

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050238035A1 (en) * 2004-04-27 2005-10-27 Hewlett-Packard System and method for remote direct memory access over a network switch fabric
CN104166596A (zh) * 2013-05-17 2014-11-26 Huawei Technologies Co., Ltd. Memory allocation method and node
CN104166597A (zh) * 2013-05-17 2014-11-26 Huawei Technologies Co., Ltd. Method and apparatus for allocating remote memory
CN105988876A (zh) * 2015-03-27 2016-10-05 Hangzhou DPtech Technologies Co., Ltd. Memory allocation method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3605331A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112084028A (zh) * 2020-09-07 2020-12-15 Beijing ByteDance Network Technology Co., Ltd. Memory detection method and apparatus
CN112084028B (zh) * 2020-09-07 2022-02-25 Beijing ByteDance Network Technology Co., Ltd. Memory detection method and apparatus

Also Published As

Publication number Publication date
US11042412B2 (en) 2021-06-22
CN109388490B (zh) 2020-11-17
EP3605331A1 (en) 2020-02-05
US20200042358A1 (en) 2020-02-06
EP3605331A4 (en) 2020-04-22
CN109388490A (zh) 2019-02-26

Similar Documents

Publication Publication Date Title
WO2019029236A1 (zh) Memory allocation method and server
US20190266193A1 (en) Data processing method for bloom filter, and bloom filter
US20210191765A1 (en) Method for static scheduling of artificial neural networks for a processor
US20210075745A1 (en) Methods and apparatus for improved polling efficiency in network interface fabrics
US10331499B2 (en) Method, apparatus, and chip for implementing mutually-exclusive operation of multiple threads
US10725940B2 (en) Reallocate memory pending queue based on stall
WO2015176315A1 (zh) Hash join method, apparatus, and database management system
US11436046B2 (en) Electronic device with memory processor-based multiprocessing architecture and operation method thereof
CN116010109A (zh) Cache resource allocation method and apparatus, electronic device, and storage medium
CN113094179A (zh) Job allocation method and apparatus, electronic device, and readable storage medium
WO2020124488A1 (zh) Application process mapping method, electronic apparatus, and computer-readable storage medium
TWI690848B (zh) Memory-processor-based multiprocessing architecture and operation method thereof
JP6682848B2 (ja) Information processing apparatus, information processing method, and program
CN113886090A (zh) Memory allocation method and apparatus, device, and storage medium
CN112100446B (zh) Search method, readable storage medium, and electronic device
US11429452B2 (en) Method for distributing keys using two auxiliary hashing functions
JP6519228B2 (ja) Data placement determination apparatus, data placement determination program, and data placement determination method
JP4691348B2 (ja) Storage area management program and message management program
CN113934361A (zh) Method, device, and computer program product for managing a storage system
WO2024066676A1 (zh) Neural network model inference method and apparatus, and related device
Hajidehi et al. CUTTANA: Scalable Graph Partitioning for Faster Distributed Graph Databases and Analytics
US10817413B2 (en) Hardware-based memory management for system-on-chip (SoC) integrated circuits that identify blocks of continuous available tokens needed to store data
WO2024016863A1 (zh) Rule lookup method and apparatus, device, and computer-readable storage medium
Sem-Jacobsen et al. Efficient and contention-free virtualisation of fat-trees
WO2021115326A1 (zh) Data processing method and apparatus, electronic device, storage medium, and program product

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18844447

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018844447

Country of ref document: EP

Effective date: 20191021

NENP Non-entry into the national phase

Ref country code: DE