CN106383791A - Memory block combination method and apparatus based on non-uniform memory access architecture - Google Patents

Memory block combination method and apparatus based on non-uniform memory access architecture

Info

Publication number
CN106383791A
Authority
CN
China
Prior art keywords
node
memory
block
window
memory block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610844237.7A
Other languages
Chinese (zh)
Other versions
CN106383791B (en)
Inventor
张健 (Zhang Jian)
王梅 (Wang Mei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Polytechnic
Original Assignee
Shenzhen Polytechnic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Polytechnic filed Critical Shenzhen Polytechnic
Priority to CN201610844237.7A priority Critical patent/CN106383791B/en
Publication of CN106383791A publication Critical patent/CN106383791A/en
Application granted granted Critical
Publication of CN106383791B publication Critical patent/CN106383791B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/06Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/16Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/1652Handling requests for interconnection or transfer for access to memory bus based on arbitration in a multiprocessor architecture
    • G06F13/1657Access to multiple memories
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/25Using a specific main memory architecture
    • G06F2212/254Distributed memory
    • G06F2212/2542Non-uniform memory access [NUMA] architecture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention belongs to the technical field of cloud storage, and relates to a memory block combination method and apparatus based on a non-uniform memory access architecture. The method comprises three steps: 1) classifying the memory provided by available nodes according to node frequency, and logically connecting the memory of available nodes with the same frequency to form a memory block; 2) treating each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determining the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement, and recording the logical arrangement result in a routing table; and 3) storing the routing table in a control processor connected to the master node, and allocating, by the control processor, a global address to each memory block, so as to construct a memory cloud. The method and apparatus overcome the low efficiency of the cluster interconnection network and the heterogeneity of the different memories, and build the highest-quality non-uniform memory access cloud storage possible.

Description

Memory block combination method and apparatus based on a non-uniform memory access architecture
Technical field
The invention belongs to the technical field of cloud storage, and in particular relates to a memory block combination method and apparatus based on a non-uniform memory access architecture.
Background technology
At present, cloud storage technology within cloud computing is developing ever faster, moving from disk arrays to SSD (Solid State Drive) arrays and now to RAM (Random Access Memory) cloud storage. RAM cloud storage keeps the data of an entire application in the RAM of up to hundreds or even thousands of servers; its throughput is hundreds to thousands of times higher than that of disk-based systems, while its latency is only a few hundredths to a few thousandths of theirs. MapReduce is a technique popularized by Google in recent years with the aim of improving data access speed and eliminating latency. It handles large-scale problems, but for continuous data access it confines programs to applications built around random data access. The MapReduce distributed computing framework has two main limitations: first, writing linear communication models with MapReduce is cumbersome; second, it is a framework based wholly or largely on batch processing. The RAMCloud project announced by Stanford University builds a memory array from memory of the same type and achieves a storage capacity of more than 1 PB, but its limitation is precisely that it uses only memory of the same type.
The NUMA (Non-Uniform Memory Access) architecture, by contrast, makes it possible to combine different types of memory into memory cloud storage. However, simply connecting the memory groups through an adapter card, bus, or network does not yield an optimized memory cloud store.
Summary of the invention
The purpose of the invention is to change the existing memory cloud architecture built from same-type memory arrays and to address other related problems. To this end, a memory block combination method and apparatus based on a non-uniform memory access architecture are proposed, which can efficiently sort and merge heterogeneous, non-uniformly accessed memory, transfer the logical arrangement result to a control processor, and construct the highest-quality non-uniform memory access cloud storage possible.
To achieve the above object, the present invention adopts the following technical scheme: a memory block combination method based on a non-uniform memory access architecture, comprising the following steps:
Step 1: classify the memory provided by available nodes according to node frequency, and logically connect the memory of available nodes with the same frequency to form a memory block;
Step 2: treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement; record the logical arrangement result in a routing table;
Step 3: store the routing table in the control processor connected to the master node, and have the control processor allocate a global address to each memory block, so as to build the memory cloud.
The algorithm of the present invention is based on a NUMA and SIMD hardware environment. A node in the present invention is a network node, and an available node is a node that can provide part of its memory and is connected to the network through a NUMA card. Regarding node frequency: each server in the model and its connections has different memory, CPU, mainboard, and network interfaces, so connection speeds differ; the present invention reduces all such speed-affecting factors to the frequency of the node's memory. The master node is the available node whose total cost to all other available nodes is the lowest. Any factor affecting data transfer is regarded as connection cost. The cost from the master node to a memory block is the sum of the master node's costs to all nodes in that memory block.
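As an illustration of these definitions, the following Python sketch computes a master node and the cost from a master node to a memory block; the node names and cost values are illustrative only and are not taken from the filing:

```python
# Illustrative directed link costs between available nodes (hypothetical values).
link_cost = {
    ("A", "B"): 2, ("B", "A"): 1,
    ("A", "D"): 3, ("D", "A"): 1,
    ("D", "B"): 1, ("B", "D"): 2,  # (B, D) is assumed; not listed in Table 2 below
}

def cost(src, dst):
    """Connection cost from src to dst; a node costs nothing to reach itself."""
    return 0 if src == dst else link_cost[(src, dst)]

def master_node(nodes):
    """The master node is the available node with the lowest total cost to all other available nodes."""
    return min(nodes, key=lambda n: sum(cost(n, other) for other in nodes))

def block_cost(master, block):
    """Cost from the master node to a memory block: the sum of its costs to every node in the block."""
    return sum(cost(master, node) for node in block)

print(master_node(["A", "B", "D"]))   # -> "D" with these illustrative costs
print(block_cost("D", ["A", "B"]))    # -> 2
```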
Preferably, step 2 includes:
first selecting, by simulated annealing, one available node from the available nodes as the master node, the master node being the connection interface of the control processor;
arranging the window blocks in ascending order of the master node's connection cost to each window block, and arranging the available nodes within each window block in ascending order of the master node's connection cost to each of those nodes.
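A minimal sketch of this ordering step, reusing the cost() and block_cost() helpers above (the master node is assumed to be already selected):

```python
def arrange(master, window_blocks):
    """Order the window blocks by the master node's connection cost to each block (ascending),
    and order the available nodes inside each block by the master node's cost to them (ascending)."""
    ordered = [sorted(block, key=lambda node: cost(master, node)) for block in window_blocks]
    ordered.sort(key=lambda block: block_cost(master, block))
    return ordered
```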
Preferably, step 3 includes connecting the master node to the control processor through a bus.
In another aspect, the present invention also provides a memory block combination apparatus based on a non-uniform memory access architecture, the apparatus comprising:
a division module, configured to classify the memory provided by available nodes according to node frequency and to logically connect the memory of available nodes with the same frequency to form a memory block;
a processing module, configured to treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement, and to record the logical arrangement result in a routing table;
a building module, configured to store the routing table in the control processor connected to the master node, and to have the control processor allocate a global address to each memory block, so as to build the memory cloud.
Preferably, the processing module is further configured to first select, by simulated annealing, one available node from the available nodes as the master node, the master node being the connection interface of the control processor;
and to arrange the window blocks in ascending order of the master node's connection cost to each window block, and to arrange the available nodes within each window block in ascending order of the master node's connection cost to each of those nodes.
The memory block combination method and apparatus based on a non-uniform memory access architecture of the present invention can efficiently sort and merge heterogeneous, non-uniformly accessed memory, forming an architecture in which processors and the operating system interconnect and share a memory bus. The present invention can be applied to large-scale NUMA memory cloud storage platforms; it overcomes the low efficiency of the cluster interconnection network and the heterogeneity of the different memories, and constructs the highest-quality non-uniform memory access cloud storage possible.
Brief description of the drawings
Fig. 1 shows the RAMCloud non-uniform memory access architecture in an embodiment of the present invention;
Fig. 2 shows a potential data center node topology in an embodiment of the present invention;
Fig. 3 shows the merged memory blocks in an embodiment of the present invention;
Fig. 4 shows the window-block simulated annealing in an embodiment of the present invention;
Fig. 5 shows the number of runs and the convergence state in an embodiment of the present invention.
Specific embodiment
Embodiment 1:
A memory block combination method based on a non-uniform memory access architecture comprises the following steps:
Step 1: classify the memory provided by available nodes according to node frequency, and logically connect the memory of available nodes with the same frequency to form a memory block;
Step 2: treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement; record the logical arrangement result in a routing table;
Step 3: store the routing table in the control processor connected to the master node, and have the control processor allocate a global address to each memory block, so as to build the memory cloud.
A memory block combination apparatus based on a non-uniform memory access architecture comprises:
a division module, configured to classify the memory provided by available nodes according to node frequency and to logically connect the memory of available nodes with the same frequency to form a memory block;
a processing module, configured to treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement, and to record the logical arrangement result in a routing table;
a building module, configured to store the routing table in the control processor connected to the master node, and to have the control processor allocate a global address to each memory block, so as to build the memory cloud.
The algorithm of the present invention is based on a NUMA and SIMD hardware environment. A node in the present invention is a network node, and an available node is a node that can provide part of its memory and is connected to the network through a NUMA card. Regarding node frequency: each server in the model and its connections has different memory, CPU, mainboard, and network interfaces, so connection speeds differ; the present invention reduces all such speed-affecting factors to the frequency of the node's memory. The master node is the available node whose total cost to all other available nodes is the lowest. Any factor affecting data transfer is regarded as connection cost. The cost from the master node to a memory block is the sum of the master node's costs to all nodes in that memory block.
This embodiment can be applied to large-scale NUMA memory cloud storage platforms, using an architecture in which processor and operating system clusters interconnect and share a memory bus. This structure overcomes the low efficiency of the cluster interconnection network and the heterogeneity of the memories, greatly improves availability, and constitutes a more optimized memory cloud storage.
Embodiment 2:
A memory block combination method based on a non-uniform memory access architecture comprises the following steps:
Step 1: classify the memory provided by available nodes according to node frequency, and logically connect the memory of available nodes with the same frequency to form a memory block;
Step 2: treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement; record the logical arrangement result in a routing table;
Step 3: store the routing table in the control processor connected to the master node, and have the control processor allocate a global address to each memory block, so as to build the memory cloud.
Here, step 2 includes:
first selecting, by simulated annealing, one available node from the available nodes as the master node, the master node being the connection interface of the control processor;
arranging the window blocks in ascending order of the master node's connection cost to each window block, and arranging the available nodes within each window block in ascending order of the master node's connection cost to each of those nodes.
Step 3 includes connecting the master node to the control processor through a bus.
A memory block combination apparatus based on a non-uniform memory access architecture comprises:
a division module, configured to classify the memory provided by available nodes according to node frequency and to logically connect the memory of available nodes with the same frequency to form a memory block;
a processing module, configured to treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement, and to record the logical arrangement result in a routing table;
a building module, configured to store the routing table in the control processor connected to the master node, and to have the control processor allocate a global address to each memory block, so as to build the memory cloud.
The processing module is further configured to first select, by simulated annealing, one available node from the available nodes as the master node, the master node being the connection interface of the control processor;
and to arrange the window blocks in ascending order of the master node's connection cost to each window block, and to arrange the available nodes within each window block in ascending order of the master node's connection cost to each of those nodes.
As shown in Fig. 1, the memory cloud under the non-uniform memory access architecture includes an application library, a data center, and a control processor. The data center organizes the memory cloud according to the non-uniform memory access architecture, and the control processor manages the data center.
For the memory cloud to achieve its low-latency target, a high-performance network technology with the following characteristics is required: low latency, high bandwidth, and full-duplex bandwidth.
The algorithm of the present invention is elaborated below by means of a model:
1. Model formulation
Assumption 1: each node has memory that may differ in type from that of other nodes, for example in frequency, bus, CPU model, or running speed; in this model, all these aspects are reduced to different frequencies;
Assumption 2: according to the prior art, sorting and merging nodes by frequency yields the best performance;
Assumption 3: connecting nodes incurs different costs, and any factor affecting data transfer is treated as connection cost.
2. Model
As shown in Fig. 2, nodes A, B, C, ..., H form a connection topology in which different frequencies simulate heterogeneous, non-uniform memory. Each node contributes a certain amount of memory to the cloud, and the connections between nodes have different costs.
3. Data model and initialization
Each of the above nodes has a memory capacity and a frequency. The related data are shown in Table 1.
Table 1: Node information
For any pair of connected nodes, Table 2 lists node 1, node 2, and the corresponding connection cost.
Table 2: Node connection costs
Node 1   Node 2   Connection cost
A        B        2
B        A        1
A        D        3
D        A        1
D        B        1
This model is a cloud storage under the non-uniform memory access architecture, and access follows three rules:
(1) adjacent memory nodes must not be written at random;
(2) adjacent memory nodes must not be read at random;
(3) adjacent memory nodes must not be synchronized.
Experiments show that violating any of these rules causes a sharp drop in performance. Memory performance test data for Kingston memory show that combinations of memory with the same clock are optimal; otherwise the memory may operate in single-channel or single-bandwidth mode, so that memory access speed drops sharply.
In the memory cloud, research on optimizing sort-merge join algorithms focuses mainly on NUMA and SIMD hardware environments. A parallel sort-merge join algorithm under a non-uniform memory access architecture can be divided into three phases: a sorting phase, a partitioning phase, and a connection phase. Accordingly, the present invention merges same-type memory and finds the access node with the minimum cost; that node is interconnected directly through the processor bus, such as AMD HT (HyperTransport) or Intel QPI (QuickPath Interconnect).
The following rules are defined:
Rule 1: to obtain the best performance, the memory of the available nodes is sorted by node frequency and merged; the sorting yields the corresponding set of memory blocks, and each memory block is a set, denoted {Mbi};
Rule 2: find a master node to serve as the connection interface of the control processor, such that the total cost from the master node to the other nodes is the minimum; meanwhile, the interior of each merged memory block is not changed logically;
Rule 3: from the second memory block onward, the nodes are ordered by their cost from the master node, closest first; this group is denoted {Ai}.
According to the above rules, heterogeneous, non-uniform memory can be sorted and merged quickly and efficiently, and the node connecting to the control processor can be found; the control processor then distributes the global addresses and builds the memory cloud for application programs to access.
Taking the model shown in Fig. 2 as an example, the algorithm has three phases: sort-merge, partition, and connection.
(1) Sort and merge ---- initialization
According to the data in Table 1, the node memory is sorted and merged: the memory of nodes with the same frequency is logically connected. Four memory blocks are obtained, {Mbi} = {6, 9, 6, 2}, as shown in Fig. 3.
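A minimal sketch of this sort-and-merge step; since Table 1 is not reproduced in this text, the node capacities and frequencies below are assumptions used only to show the grouping:

```python
from collections import defaultdict

# Hypothetical Table 1 data: node -> (memory capacity in GB, memory frequency in MHz).
node_info = {
    "A": (2, 1600), "B": (4, 1333), "C": (3, 1600), "D": (1, 1066),
    "E": (5, 1333), "F": (2, 1866), "G": (1, 1600), "H": (2, 1066),
}

def sort_and_merge(node_info):
    """Group available nodes by memory frequency; each frequency group is logically
    connected into one memory block, and blocks are listed in frequency order."""
    groups = defaultdict(list)
    for node, (_, freq) in node_info.items():
        groups[freq].append(node)
    return [groups[freq] for freq in sorted(groups, reverse=True)]

blocks = sort_and_merge(node_info)   # e.g. [['F'], ['A', 'C', 'G'], ['B', 'E'], ['D', 'H']]
capacities = [sum(node_info[n][0] for n in b) for b in blocks]   # per-block capacity
```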
(2) Partition ---- window-block simulated annealing
According to the data in Table 2, the system is initialized and the data in Table 3 are obtained: the minimum-cost path from any node to every other node. If the access detail is 0, the two nodes are directly connected; otherwise it is a string describing the routed path from one node to the other. The associated data are shown in Table 3.
Table 3: Minimum cost and corresponding access path from server to server
According to Table 3, the current overhead is
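The Table 3 data are not reproduced in this text; as one way to derive them, the sketch below runs an all-pairs shortest-path pass (Floyd–Warshall, a standard choice not named in the filing) over the Table 2 link costs to obtain the minimum cost and routed path between every pair of nodes:

```python
def all_pairs_min_cost(nodes, direct_cost):
    """Floyd–Warshall over directed link costs; returns the minimum cost and the routed
    path for every node pair. A path of just [src, dst] means a direct connection."""
    INF = float("inf")
    cost = {(i, j): (0 if i == j else direct_cost.get((i, j), INF)) for i in nodes for j in nodes}
    path = {(i, j): [i, j] for i in nodes for j in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if cost[(i, k)] + cost[(k, j)] < cost[(i, j)]:
                    cost[(i, j)] = cost[(i, k)] + cost[(k, j)]
                    path[(i, j)] = path[(i, k)][:-1] + path[(k, j)]
    return cost, path

# The direct links listed in Table 2; pairs not listed are treated as having no direct link.
table2 = {("A", "B"): 2, ("B", "A"): 1, ("A", "D"): 3, ("D", "A"): 1, ("D", "B"): 1}
costs, paths = all_pairs_min_cost(["A", "B", "D"], table2)
print(costs[("B", "D")], paths[("B", "D")])   # -> 4 ['B', 'A', 'D']: routed through A
```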
The present invention applies the idea of simulated annealing, an approximate global optimization method for large search spaces.
According to Rule 1, the merged memory blocks must not be broken apart. The present invention therefore uses window blocks, each of which is treated as a single memory unit; inside a window block, the nodes can be reordered. During the process, the total cost of the current arrangement is computed and annealing proceeds. By moving the window blocks and reordering the nodes inside them, the best solution within a finite time budget is obtained.
Fig. 4a shows one possible solution: the master node is F, the coprocessor accesses the other nodes from F, and the total cost is 65.
Fig. 4b shows a better solution: the master node is B, the total cost from B to the other nodes is 27, and from the second window block onward the nodes are ordered according to Rule 3.
(3) Connect the combined memory cloud
Once the best solution is obtained, as shown in Fig. 4b, the coprocessor is connected to node B, and the routing table (similar to Table 3) is copied to and stored in the coprocessor. The coordinator then distributes a global address to each cluster.
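A minimal sketch of this connection stage; the contiguous address-range scheme below is an assumption, as the filing only states that the coordinator/control processor distributes a global address to each block (cluster):

```python
def assign_global_addresses(arranged_blocks, block_size_bytes):
    """Walk the blocks in their arranged order and give each one a contiguous
    global address range; block_size_bytes maps a block id to its capacity."""
    routing = {}
    base = 0
    for block_id in arranged_blocks:
        size = block_size_bytes[block_id]
        routing[block_id] = (base, base + size - 1)   # inclusive global address range
        base += size
    return routing
```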
For details not exhausted in this embodiment, refer to the corresponding description of Embodiment 1 above; they are not repeated here.
The simulated annealing adopted in this embodiment improves on the traditional algorithm: it not only sorts the memory blocks by cost but also sorts the nodes within each memory block at the same time. The simulated annealing is flexible and efficient; when a new node joins the memory cloud, the memory blocks and corresponding nodes in the memory cloud can be adjusted quickly, thereby constructing high-quality non-uniform memory access cloud storage.
The memory block combination method based on a non-uniform memory access architecture of this embodiment is illustrated below with a concrete application scenario, as follows:
(4) Algorithm description
According to Rule 1, initialization is performed first: Init() is called to sort and merge the nodes, producing the initial state S0 (see Table 3). Then window-block simulated annealing is applied according to Rule 2: Cost() computes and returns the cost of the current solution, and Neighbor(), as in traditional simulated annealing, produces a randomly selected neighbour of a given state. Finally, the best solution is obtained. The function Connect() connects the coprocessor to the master node and copies the routing information table (Table 3). The function AssignGlobalAddress() coordinates the allocation of the cluster memory's global addresses according to the block order.
Parameter S0 is the initial solution, Sbest is the best solution found so far, T0 is the initial temperature, α is the cooling rate, β is a constant, M represents the time until the next parameter update, and the maximum time limit is the total time of the annealing process.
The pseudo-code below gives the described memory block combination method for the non-uniform memory access architecture.
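That pseudo-code is not reproduced in this text. The following Python sketch reconstructs the loop from the functions and parameters named above (Init, Cost, Neighbor, Connect, AssignGlobalAddress; S0, Sbest, T0, α, β, M); the exact signatures, acceptance rule, and stopping test are assumptions:

```python
import math
import random

def memory_block_combination(Init, Cost, Neighbor, Connect, AssignGlobalAddress,
                             T0=100.0, alpha=0.95, beta=1.05, M=10, max_time=1000):
    """Window-block simulated annealing: Init() sorts and merges the nodes into the
    initial state S0; Neighbor() reorders the window blocks and the nodes inside them;
    Cost() returns the total connection cost of a state; the best state found is then
    connected to the control processor and given global addresses."""
    S0 = Init()
    S_current, S_best = S0, S0
    T, moves, elapsed = T0, M, 0
    while elapsed < max_time and T > 1e-3:
        for _ in range(int(moves)):              # M moves before the next parameter update
            S_new = Neighbor(S_current)          # random window-block / in-block reorder
            delta = Cost(S_new) - Cost(S_current)
            if delta < 0 or random.random() < math.exp(-delta / T):
                S_current = S_new                # accept better moves, or worse ones with e^(-delta/T)
            if Cost(S_current) < Cost(S_best):
                S_best = S_current
        T *= alpha                               # cooling rate
        moves *= beta                            # constant beta stretches the time to the next update
        elapsed += 1
    Connect(S_best)                              # attach the coprocessor to the master node, copy routing table
    AssignGlobalAddress(S_best)                  # distribute global addresses by block order
    return S_best
```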
In the algorithm, the most important function is Neighbor(), which generates a randomly selected neighbour of a given state. Inside a window block, the nodes are rearranged according to Rule 3; outside the window blocks, the window blocks themselves are rearranged.
In this model there are 8 nodes and 4 window blocks. Over many experimental runs, the overhead of the best solution finally converges to 27; the best case takes 3 runs and the worst case 15 runs, as shown in Fig. 5.

Claims (5)

1. A memory block combination method based on a non-uniform memory access architecture, characterized by comprising the following steps:
Step 1: classifying the memory provided by available nodes according to node frequency, and logically connecting the memory of available nodes with the same frequency to form a memory block;
Step 2: treating each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determining the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement, and recording the logical arrangement result in a routing table;
Step 3: storing the routing table in the control processor connected to the master node, and having the control processor allocate a global address to each memory block, so as to build the memory cloud.
2. The memory block combination method based on a non-uniform memory access architecture according to claim 1, characterized in that step 2 comprises:
first selecting, by simulated annealing, one available node from the available nodes as the master node, the master node being the connection interface of the control processor;
arranging the window blocks in ascending order of the master node's connection cost to each window block, and arranging the available nodes within each window block in ascending order of the master node's connection cost to each of those nodes.
3. The memory block combination method based on a non-uniform memory access architecture according to claim 1, characterized in that step 3 comprises connecting the master node to the control processor through a bus.
4. A memory block combination apparatus based on a non-uniform memory access architecture, characterized in that the apparatus comprises:
a division module, configured to classify the memory provided by available nodes according to node frequency and to logically connect the memory of available nodes with the same frequency to form a memory block;
a processing module, configured to treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement with the lowest connection cost, wherein the logical arrangement result includes the master node of that lowest-cost arrangement, and to record the logical arrangement result in a routing table;
a building module, configured to store the routing table in the control processor connected to the master node, and to have the control processor allocate a global address to each memory block, so as to build the memory cloud.
5. The memory block combination apparatus based on a non-uniform memory access architecture according to claim 4, characterized in that:
the processing module is further configured to first select, by simulated annealing, one available node from the available nodes as the master node, the master node being the connection interface of the control processor;
and to arrange the window blocks in ascending order of the master node's connection cost to each window block, and to arrange the available nodes within each window block in ascending order of the master node's connection cost to each of those nodes.
CN201610844237.7A 2016-09-23 2016-09-23 Memory block combination method and apparatus based on non-uniform memory access architecture Active CN106383791B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610844237.7A CN106383791B (en) 2016-09-23 2016-09-23 Memory block combination method and apparatus based on non-uniform memory access architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610844237.7A CN106383791B (en) 2016-09-23 2016-09-23 Memory block combination method and apparatus based on non-uniform memory access architecture

Publications (2)

Publication Number Publication Date
CN106383791A true CN106383791A (en) 2017-02-08
CN106383791B CN106383791B (en) 2019-07-12

Family

ID=57936804

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610844237.7A Active CN106383791B (en) 2016-09-23 2016-09-23 Memory block combination method and apparatus based on non-uniform memory access architecture

Country Status (1)

Country Link
CN (1) CN106383791B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112558869A (en) * 2020-12-11 2021-03-26 北京航天世景信息技术有限公司 Remote sensing image caching method based on big data

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030009637A1 (en) * 2001-06-21 2003-01-09 International Business Machines Corporation Decentralized global coherency management in a multi-node computer system
CN104144194A (en) * 2013-05-10 2014-11-12 中国移动通信集团公司 Data processing method and device for cloud storage system
CN104199718A (en) * 2014-08-22 2014-12-10 上海交通大学 Dispatching method of virtual processor based on NUMA high-performance network cache resource affinity
CN104506362A (en) * 2014-12-29 2015-04-08 浪潮电子信息产业股份有限公司 Method for system state switching and monitoring on CC-NUMA (cache coherent-non uniform memory access architecture) multi-node server
CN104657198A (en) * 2015-01-24 2015-05-27 深圳职业技术学院 Memory access optimization method and memory access optimization system for NUMA (Non-Uniform Memory Access) architecture system in virtual machine environment
CN104850461A (en) * 2015-05-12 2015-08-19 华中科技大学 NUMA-oriented virtual cpu (central processing unit) scheduling and optimizing method
CN105391590A (en) * 2015-12-26 2016-03-09 深圳职业技术学院 Method and system for automatically obtaining system routing table of NUMA

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030009637A1 (en) * 2001-06-21 2003-01-09 International Business Machines Corporation Decentralized global coherency management in a multi-node computer system
CN104144194A (en) * 2013-05-10 2014-11-12 中国移动通信集团公司 Data processing method and device for cloud storage system
CN104199718A (en) * 2014-08-22 2014-12-10 上海交通大学 Dispatching method of virtual processor based on NUMA high-performance network cache resource affinity
CN104506362A (en) * 2014-12-29 2015-04-08 浪潮电子信息产业股份有限公司 Method for system state switching and monitoring on CC-NUMA (cache coherent-non uniform memory access architecture) multi-node server
CN104657198A (en) * 2015-01-24 2015-05-27 深圳职业技术学院 Memory access optimization method and memory access optimization system for NUMA (Non-Uniform Memory Access) architecture system in virtual machine environment
CN104850461A (en) * 2015-05-12 2015-08-19 华中科技大学 NUMA-oriented virtual cpu (central processing unit) scheduling and optimizing method
CN105391590A (en) * 2015-12-26 2016-03-09 深圳职业技术学院 Method and system for automatically obtaining system routing table of NUMA

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112558869A (en) * 2020-12-11 2021-03-26 北京航天世景信息技术有限公司 Remote sensing image caching method based on big data

Also Published As

Publication number Publication date
CN106383791B (en) 2019-07-12

Similar Documents

Publication Publication Date Title
Ajima et al. The Tofu Interconnect D
US8117288B2 (en) Optimizing layout of an application on a massively parallel supercomputer
JPH0766718A (en) Wafer scale structure for programmable logic
WO2015171461A1 (en) Interconnect systems and methods using hybrid memory cube links
JPH06243113A (en) Calculation model mapping method for parallel computer
Wang et al. A message-passing multi-softcore architecture on FPGA for breadth-first search
CN101441616B (en) Rapid data exchange structure based on register document and management method thereof
CN109101338A (en) A kind of block chain processing framework and its method based on the extension of multichannel chip
KR20160121380A (en) Distributed file system using torus network and method for configuring and operating of the distributed file system using torus network
CN108304261B (en) Job scheduling method and device based on 6D-Torus network
CN106383791B (en) A kind of memory block combined method and device based on nonuniform memory access framework
Fernández et al. Efficient VLSI layouts for homogeneous product networks
Lin et al. A distributed resource management mechanism for a partitionable multiprocessor system
Laili et al. Parallel transfer evolution algorithm
Ravindran et al. On topology and bisection bandwidth of hierarchical-ring networks for shared-memory multiprocessors
Pietracaprina et al. Constructive deterministic PRAM simulation on a mesh-connected computer
Odendahl et al. Optimized buffer allocation in multicore platforms
Raghunath et al. Designing interconnection networks for multi-level packaging
Mackenzie et al. Comparative modeling of network topologies and routing strategies in multicomputers
JP6991446B2 (en) Packet processing device and its memory access control method
Sudheer et al. Dynamic load balancing for petascale quantum Monte Carlo applications: The Alias method
Davis IV et al. The performance analysis of partitioned circuit switched multistage interconnection networks
KEERTHI Improving the Network Traffic Performance in MapReduce for Big Data Applications through Online Algorithm
CN117201406A (en) Network signal stream transmission method and device, electronic equipment and medium
Herbordt et al. Towards scalable multicomputer communication through offline routing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant