CN106383791A - Memory block combination method and apparatus based on non-uniform memory access architecture - Google Patents
- Publication number
- CN106383791A CN106383791A CN201610844237.7A CN201610844237A CN106383791A CN 106383791 A CN106383791 A CN 106383791A CN 201610844237 A CN201610844237 A CN 201610844237A CN 106383791 A CN106383791 A CN 106383791A
- Authority
- CN
- China
- Prior art keywords
- node
- memory
- block
- window
- memory block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/06—Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1605—Handling requests for interconnection or transfer for access to memory bus based on arbitration
- G06F13/1652—Handling requests for interconnection or transfer for access to memory bus based on arbitration in a multiprocessor architecture
- G06F13/1657—Access to multiple memories
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/25—Using a specific main memory architecture
- G06F2212/254—Distributed memory
- G06F2212/2542—Non-uniform memory access [NUMA] architecture
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multi Processors (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention belongs to the technical field of cloud storage, and relates to a memory block combination method and apparatus based on a non-uniform memory access architecture. The method comprises three steps: 1) classifying the memory provided by available nodes according to node frequency, and logically connecting the memory of available nodes with the same frequency into one memory block; 2) treating each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determining the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost arrangement, and recording the logical arrangement result in a routing table; and 3) storing the routing table in the control processor connected to the master node, and having the control processor allocate a global address to each memory block, so as to construct a memory cloud. The method and apparatus overcome the inefficiency of cluster interconnection networks and the heterogeneity of different memories, and build the best possible non-uniform-access memory cloud storage.
Description
Technical field
The invention belongs to the technical field of cloud storage, and in particular relates to a memory block combination method and apparatus based on a non-uniform memory access (NUMA) architecture.
Background technology
At present, cloud storage technology in cloud computing is developing ever faster, from disk arrays to SSD (Solid State Drive) arrays, and on to today's RAM (Random Access Memory) cloud storage. RAM cloud storage keeps the data of an entire application in the RAM of up to hundreds or even thousands of servers; its throughput is hundreds to thousands of times higher than that of disk-based systems, while its latency is only a few hundredths to a few thousandths of theirs. MapReduce is a technique popularized by Google in recent years whose aim is to improve data access speed and eliminate latency problems. It handles large-scale workloads, but for continuous data access it restricts programs to applications that use random data access. The MapReduce distributed computing framework has two main limitations: first, writing linear communication models with MapReduce is cumbersome; second, it is entirely a batch-mode framework. The RAMCloud project announced by Stanford University builds a memory array from memory of the same type and achieves more than 1 PB of storage capacity, but its limitation is precisely that it uses memory of a single type.
The NUMA (Non-Uniform Memory Access) architecture offers the possibility of combining different types of memory into a memory cloud. However, if the memory groups are merely connected by adapter cards, buses, or networks, the result is far from an optimized memory cloud storage.
Summary of the invention
The purpose of the invention is to improve on existing memory cloud architectures composed of same-type memory arrays and to address other related problems. To this end, a memory block combination method and apparatus based on a non-uniform memory access architecture is proposed, which can efficiently sort and merge heterogeneous, non-uniform-access memory, transfer the logical arrangement result to a control processor, and construct the highest-quality non-uniform-access memory cloud storage possible.
To achieve the above objective, the present invention adopts the following technical scheme: a memory block combination method based on a non-uniform memory access architecture, comprising the following steps:
Step 1: classify the memory provided by available nodes according to node frequency, and logically connect the memory of available nodes with the same frequency into one memory block;
Step 2: treat each memory block as a window block; by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost arrangement; record the logical arrangement result in a routing table;
Step 3: store the routing table in the control processor connected to the master node, and have the control processor allocate a global address to each memory block, so as to build the memory cloud.
The algorithm of the present invention is based on NUMA and SIMD hardware environments. A node in the present invention is a network node; an available node is a node that can contribute part of its memory and is connected to the network through a NUMA card. Regarding node frequency: each server in the model has different memory, CPU, mainboard, and network interfaces, so its connections run at different speeds; the present invention reduces all such speed-affecting factors to a single node memory frequency. The master node is the available node whose total cost to all other available nodes is minimal. Any factor that affects data transfer is regarded as connection cost. The cost from the master node to a memory block is the sum of the costs from the master node to every node in that block.
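The two cost definitions above (master-to-block cost as a sum, and the master as the node minimizing total cost) can be sketched as follows. The function names, node names, and cost table are illustrative assumptions, not from the patent.

```python
# Hypothetical sketch of the cost definitions described above.
# The cost table is a directed map (from, to) -> connection cost.

def master_to_block_cost(cost, master, block):
    """Cost from a master node to a memory block: the sum of the
    costs from the master to every node in that block."""
    return sum(cost[(master, node)] for node in block if node != master)

def pick_master(cost, nodes):
    """The master node is the available node whose total cost to
    all other available nodes is minimal."""
    def total(n):
        return sum(cost[(n, m)] for m in nodes if m != n)
    return min(nodes, key=total)

# Tiny illustrative cost table (directed, in the spirit of Table 2).
nodes = ["A", "B", "D"]
cost = {("A", "B"): 2, ("B", "A"): 1, ("A", "D"): 3,
        ("D", "A"): 1, ("B", "D"): 2, ("D", "B"): 1}

print(pick_master(cost, nodes))                      # D (total cost 2)
print(master_to_block_cost(cost, "D", ["A", "B"]))   # 2
```

With these toy costs, node D has total cost 1 + 1 = 2 to the others, less than A (5) or B (3), so it would be chosen as master.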
Preferably, step 2 comprises:
first selecting, by simulated annealing, one available node from the available nodes as the master node, the master node being the connecting interface of the control processor;
then arranging the window blocks in ascending order of the connection cost from the master node to each window block, and arranging the available nodes within each window block in ascending order of the connection cost from the master node to each node.
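Once a master node is fixed, this preferred arrangement (blocks by ascending master-to-block cost, nodes inside each block by ascending master-to-node cost) might be sketched as below; the node names and costs are hypothetical.

```python
# Illustrative sketch of the preferred arrangement for a chosen master.

def arrange(blocks, cost, master):
    """Sort nodes inside each block, then sort the blocks, both by
    ascending connection cost from the master node."""
    def node_cost(n):
        return 0 if n == master else cost[(master, n)]
    def block_cost(b):
        return sum(node_cost(n) for n in b)
    ordered = [sorted(b, key=node_cost) for b in blocks]
    ordered.sort(key=block_cost)
    return ordered

blocks = [["C", "A"], ["E"]]
cost = {("B", "A"): 1, ("B", "C"): 4, ("B", "E"): 2}
print(arrange(blocks, cost, "B"))   # [['E'], ['A', 'C']]
```

Here block {E} (cost 2 from master B) precedes block {A, C} (cost 5), and inside the latter A (cost 1) precedes C (cost 4).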
Preferably, step 3 comprises connecting the master node to the control processor by a bus.
On the other hand, the present invention also provides a memory block combination apparatus based on a non-uniform memory access architecture, the apparatus comprising:
a division module, configured to classify the memory provided by available nodes according to node frequency and logically connect the memory of available nodes with the same frequency into one memory block;
a processing module, configured to treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost arrangement, and to record the logical arrangement result in a routing table;
a building module, configured to store the routing table in the control processor connected to the master node, and to have the control processor allocate a global address to each memory block, so as to build the memory cloud.
Preferably, the processing module is further configured to first select, by simulated annealing, one available node from the available nodes as the master node, the master node being the connecting interface of the control processor; and to arrange the window blocks in ascending order of the connection cost from the master node to each window block, and the available nodes within each window block in ascending order of the connection cost from the master node to each node.
The memory block combination method and apparatus based on a non-uniform memory access architecture of the present invention rest on an algorithm for the NUMA setting that can efficiently sort and merge heterogeneous, non-uniform-access memory, forming an architecture in which processors and the operating system interconnect and share the memory bus. The present invention can be applied to large-scale NUMA memory cloud storage platforms; it overcomes the inefficiency of cluster interconnection networks and the heterogeneity of different memories, and constructs the highest-quality non-uniform-access memory cloud storage possible.
Brief description of the drawings
Fig. 1 is the RAMCloud non-uniform memory access architecture in an embodiment of the present invention;
Fig. 2 is a potential data center node topology in an embodiment of the present invention;
Fig. 3 is the memory block merging in an embodiment of the present invention;
Fig. 4 is the window-block simulated annealing in an embodiment of the present invention;
Fig. 5 is the number of runs and convergence states in an embodiment of the present invention.
Specific embodiment
Embodiment 1:
A memory block combination method based on a non-uniform memory access architecture comprises the following steps:
Step 1: classify the memory provided by available nodes according to node frequency, and logically connect the memory of available nodes with the same frequency into one memory block;
Step 2: treat each memory block as a window block; by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost arrangement; record the logical arrangement result in a routing table;
Step 3: store the routing table in the control processor connected to the master node, and have the control processor allocate a global address to each memory block, so as to build the memory cloud.
A memory block combination apparatus based on a non-uniform memory access architecture comprises:
a division module, configured to classify the memory provided by available nodes according to node frequency and logically connect the memory of available nodes with the same frequency into one memory block;
a processing module, configured to treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost arrangement, and to record the logical arrangement result in a routing table;
a building module, configured to store the routing table in the control processor connected to the master node, and to have the control processor allocate a global address to each memory block, so as to build the memory cloud.
The algorithm of the present invention is based on NUMA and SIMD hardware environments. A node in the present invention is a network node; an available node is a node that can contribute part of its memory and is connected to the network through a NUMA card. Regarding node frequency: each server in the model has different memory, CPU, mainboard, and network interfaces, so its connections run at different speeds; the present invention reduces all such speed-affecting factors to a single node memory frequency. The master node is the available node whose total cost to all other available nodes is minimal. Any factor that affects data transfer is regarded as connection cost. The cost from the master node to a memory block is the sum of the costs from the master node to every node in that block.
This embodiment is applicable to large-scale NUMA memory cloud storage platforms, using an architecture in which the processors and the operating system cluster interconnect and share the memory bus. This structure overcomes the inefficiency of the cluster interconnection network and the heterogeneity of the memories, greatly improves availability, and constitutes a more optimized memory cloud storage.
Embodiment 2:
A memory block combination method based on a non-uniform memory access architecture comprises the following steps:
Step 1: classify the memory provided by available nodes according to node frequency, and logically connect the memory of available nodes with the same frequency into one memory block;
Step 2: treat each memory block as a window block; by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost arrangement; record the logical arrangement result in a routing table;
Step 3: store the routing table in the control processor connected to the master node, and have the control processor allocate a global address to each memory block, so as to build the memory cloud.
Step 2 comprises:
first selecting, by simulated annealing, one available node from the available nodes as the master node, the master node being the connecting interface of the control processor;
then arranging the window blocks in ascending order of the connection cost from the master node to each window block, and arranging the available nodes within each window block in ascending order of the connection cost from the master node to each node.
Step 3 comprises connecting the master node to the control processor by a bus.
A memory block combination apparatus based on a non-uniform memory access architecture comprises:
a division module, configured to classify the memory provided by available nodes according to node frequency and logically connect the memory of available nodes with the same frequency into one memory block;
a processing module, configured to treat each memory block as a window block and, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, determine the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost arrangement, and to record the logical arrangement result in a routing table;
a building module, configured to store the routing table in the control processor connected to the master node, and to have the control processor allocate a global address to each memory block, so as to build the memory cloud.
The processing module is further configured to first select, by simulated annealing, one available node from the available nodes as the master node, the master node being the connecting interface of the control processor; and to arrange the window blocks in ascending order of the connection cost from the master node to each window block, and the available nodes within each window block in ascending order of the connection cost from the master node to each node.
As shown in Fig. 1, the memory cloud under the non-uniform memory access architecture comprises an application library, a data center, and a control processor. The data center organizes the memory cloud according to the non-uniform memory access architecture, and the control processor manages the data center. For the memory cloud to reach its low-latency target, a high-performance network with the following characteristics is needed: low latency, high bandwidth, and full-duplex bandwidth.
The algorithm of the present invention is elaborated below through a model:
1. Model assumptions
Assumption 1: every node has memory, possibly of a different type from other nodes (different frequency, bus, CPU model, operating speed, etc.); in this model all such aspects are reduced to different frequencies.
Assumption 2: according to the prior art, sorting and merging nodes by frequency yields optimal performance.
Assumption 3: connecting nodes incurs different costs; any factor that affects data transfer is treated as connection cost.
2. Model
As shown in Fig. 2, nodes A, B, C, ..., H form a connection topology; different frequencies simulate the heterogeneous, non-uniform memory. Each node contributes a certain amount of memory to the cloud, and the connections between nodes have different costs.
3. Data model and initialization
Each of the above nodes has a memory capacity and a frequency. The related data are shown in Table 1.
Table 1: Node information
For any pair of connected nodes, node 1 connects to node 2 at a corresponding cost. The related data are shown in Table 2.
Node 1 | Node 2 | Connection cost |
---|---|---|
A | B | 2 |
B | A | 1 |
A | D | 3 |
D | A | 1 |
… | … | … |
D | B | 1 |
This model is a cloud storage under a non-uniform memory access architecture; accesses follow these three rules:
(1) do not write to adjacent memory nodes randomly;
(2) do not read from adjacent memory nodes randomly;
(3) do not access adjacent memory nodes asynchronously.
Experiments show that violating these rules causes performance to drop sharply. Kingston memory performance test data show that combinations of memory with identical clocks are optimal; otherwise, the memory may fall back to single-channel or single-bandwidth operation, sharply reducing memory access speed.
In a memory cloud, research on optimizing sort-merge join algorithms concentrates on NUMA and SIMD hardware environments. A parallel sort-merge join algorithm under a non-uniform memory access architecture can be divided into three phases: a sorting phase, a partitioning phase, and a connecting phase. The present invention accordingly merges same-type memory and finds the access node with minimum cost; this node interconnects directly through the processor bus, such as AMD HT (HyperTransport) or Intel QPI (QuickPath Interconnect).
We define the following rules:
Rule 1: to obtain optimal performance, sort the available nodes by frequency and merge their memory; the sorting yields a set of memory blocks, denoted {Mbi};
Rule 2: find a master node to serve as the connecting interface of the control processor, such that the total cost from the master node to all other nodes is minimal; the interior of each merged memory block is not logically changed;
Rule 3: from the second memory block onward, nodes are sorted by their cost from the master node, nearest first; this group is denoted {Ai}.
According to the above rules, heterogeneous, non-uniform memory can be sorted and merged quickly and effectively, and the node connecting the control processor found; the control processor then assigns global addresses and builds the memory cloud for application programs to access.
Taking the model shown in Fig. 2 as an example, the algorithm has three phases: sort-merge, partitioning, and connecting.
(1) Sort-merge (initialization)
According to the data in Table 1, we sort and merge the memory of the nodes, logically connecting node memory of the same frequency. Four memory blocks are obtained, {Mbi} = {6, 9, 6, 2}, as shown in Fig. 3.
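The sort-merge step above can be illustrated with a small sketch: sort nodes by memory frequency, then logically merge the memory of equal-frequency nodes into one block. The node data below are hypothetical and do not reproduce Table 1.

```python
from itertools import groupby

# Hypothetical node data: (name, capacity in GB, frequency in MHz).
nodes = [("A", 4, 1600), ("B", 2, 1333), ("C", 2, 1600),
         ("D", 2, 1333), ("E", 4, 1066)]

# Sort by frequency so groupby sees equal frequencies consecutively,
# then merge each group into one memory block.
nodes.sort(key=lambda n: n[2], reverse=True)
blocks = []
for freq, grp in groupby(nodes, key=lambda n: n[2]):
    grp = list(grp)
    blocks.append({"freq": freq,
                   "members": [n[0] for n in grp],
                   "size": sum(n[1] for n in grp)})

print([b["size"] for b in blocks])   # [6, 4, 4]
```

With these toy nodes the merge yields three blocks of sizes 6, 4, and 4 GB; the patent's model yields the four blocks {6, 9, 6, 2}.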
(2) Partitioning: window-block simulated annealing
According to the data in Table 2, we initialize the system and obtain the data in Table 3: the minimum cost from any node to every other node. If the access detail is 0, the two nodes are directly connected; otherwise it is a string giving the routed path from one node to the other. The related data are shown in Table 3.
Table 3: Minimum cost and corresponding access path from server to server
From Table 3, the total overhead of the current arrangement is obtained.
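Table 3's minimum server-to-server costs can be derived from the direct connection costs of Table 2 by an all-pairs shortest-path pass; the sketch below uses Floyd-Warshall on a hypothetical three-node subset (the full eight-node table is not reproduced).

```python
# Illustrative all-pairs shortest-path pass (Floyd-Warshall) over a
# hypothetical subset of the directed costs in Table 2.
INF = float("inf")
node_ids = ["A", "B", "D"]
direct = {("A", "B"): 2, ("B", "A"): 1, ("A", "D"): 3,
          ("D", "A"): 1, ("D", "B"): 1}

dist = {(u, v): (0 if u == v else direct.get((u, v), INF))
        for u in node_ids for v in node_ids}
via = {}   # remembers one intermediate hop for each improved pair
for k in node_ids:
    for u in node_ids:
        for v in node_ids:
            if dist[(u, k)] + dist[(k, v)] < dist[(u, v)]:
                dist[(u, v)] = dist[(u, k)] + dist[(k, v)]
                via[(u, v)] = k

print(dist[("B", "D")])        # 4 (no direct link: routed B -> A -> D)
print(via.get(("B", "D")))     # A
```

Pairs left untouched by the inner update are directly connected (the "0 access detail" case in the text); pairs with an entry in `via` correspond to Table 3's routed-path strings.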
The present invention applies the idea of simulated annealing, an approximate global optimization method for large search spaces. According to rule 1, the merged memory blocks cannot be broken apart; the present invention therefore uses window blocks, each of which is treated as one memory unit. Inside a window block, the nodes can be reordered. During the process, the total cost of the current state is computed and annealed; by moving window blocks and reordering the nodes within them, the best solution obtainable within a finite time budget is found.
Fig. 4a shows one possible solution: the master node is F, the coprocessor accesses the other nodes from F, and the total cost is 65.
Fig. 4b shows a better solution: the master node is B, the total cost from B to the other nodes is 27, and from the second window block onward the nodes are sorted according to rule 3.
(3) Connecting the combined memory cloud
When a best solution is obtained, as shown in Fig. 4b, the coprocessor is connected to node B, and the routing table (similar to Table 3) is copied into and stored in the coprocessor. The coordinator then allocates a global address to each cluster.
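The global address allocation might be sketched as follows: the coordinator walks the memory blocks in their final order and gives each one a contiguous global address range. The block sizes reuse the {6, 9, 6, 2} of Fig. 3; the function name and table layout are assumptions for illustration.

```python
# Hedged sketch of global address assignment over ordered blocks.
def assign_global_addresses(block_sizes_gb):
    """Give each memory block, in order, a contiguous global
    address range [base, limit]."""
    gb = 1 << 30
    table, base = [], 0
    for i, size in enumerate(block_sizes_gb):
        table.append({"block": i,
                      "base": base,
                      "limit": base + size * gb - 1})
        base += size * gb
    return table

routing = assign_global_addresses([6, 9, 6, 2])
print(routing[1]["base"] == 6 * (1 << 30))   # True: block 1 starts after the 6 GB block 0
```

A lookup of any global address can then be resolved to a block by a range check against this table.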
For details not exhausted in this embodiment, refer to the related description of embodiment 1 above, which is not repeated here.
The simulated annealing adopted in this embodiment improves on the traditional algorithm: it not only sorts the memory blocks by connection cost but also sorts the nodes within each memory block. The simulated annealing is flexible and efficient; when a new node joins the memory cloud, the memory blocks and their nodes can be adjusted quickly, so that a high-quality non-uniform-access memory cloud storage is constructed.
The memory block combination method based on a non-uniform memory access architecture in this embodiment is illustrated below with a concrete application scenario, as follows:
(4) Algorithm description
According to rule 1, initialization comes first: Init() sorts and merges the nodes and produces the first state S0 (see Table 3). Then, window-block simulated annealing follows rule 2. Cost() computes and returns the cost of the current solution. Neighbor() follows traditional simulated annealing: it produces a randomly selected state among the neighbors of a given state. Finally, the best solution is obtained. Connect() connects the coprocessor to the master node and copies the routing information of Table 3. AssignGlobalAddress() has the coordinator allocate global addresses to the cluster memory according to the block order.
Parameter S0 is the initial solution, Sbest is the best solution so far, T0 is the initial temperature, α is the cooling rate, β is a constant, M represents the time until the next parameter update, and the maximum time limit bounds the annealing schedule in total time.
The pseudo-code below gives the described memory block combination method for the non-uniform memory access architecture.
In the algorithm, the most important function is Neighbor(), which produces a randomly selected state among the neighbors of a given state. Inside a window block, the nodes are rearranged according to rule 3; outside the blocks, the window blocks themselves are rearranged.
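A hedged sketch of such a window-block annealing loop is given below: Neighbor() either swaps two whole window blocks or swaps two nodes inside one block, so merged blocks are never broken apart (rule 1). The cost function, node data, and the parameter values T0 and alpha are illustrative assumptions, not the patent's actual pseudo-code.

```python
import math
import random

def sa_arrange(blocks, cost_fn, T0=100.0, alpha=0.95, iters=2000):
    """Window-block simulated annealing sketch. A state is an ordering
    of window blocks plus an ordering of the nodes inside each block."""
    def neighbor(state):
        s = [list(b) for b in state]
        if random.random() < 0.5 and len(s) > 1:
            i, j = random.sample(range(len(s)), 2)   # move a whole window block
            s[i], s[j] = s[j], s[i]
        else:
            multi = [b for b in s if len(b) > 1]
            if multi:
                b = random.choice(multi)             # reorder inside one block
                i, j = random.sample(range(len(b)), 2)
                b[i], b[j] = b[j], b[i]
        return s

    cur = [list(b) for b in blocks]
    cur_c = cost_fn(cur)
    best, best_c, T = cur, cur_c, T0
    for _ in range(iters):
        cand = neighbor(cur)
        c = cost_fn(cand)
        # accept improvements always, worse states with Boltzmann probability
        if c < cur_c or random.random() < math.exp((cur_c - c) / T):
            cur, cur_c = cand, c
            if c < best_c:
                best, best_c = cand, c
        T = max(T * alpha, 1e-9)                     # cool down
    return best, best_c

# Toy cost: flatten the arrangement and charge position * rank.
rank = {"B": 0, "A": 1, "D": 2, "C": 3}
cost_fn = lambda s: sum(i * rank[n] for i, n in enumerate(n for b in s for n in b))
best, c = sa_arrange([["D", "B"], ["C", "A"]], cost_fn)
print(c <= 9)   # True: never worse than the initial arrangement
```

Because the best state is tracked separately from the current one, the returned cost can never exceed the initial cost, mirroring the convergence behavior reported below.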
In this model there are 8 nodes and 4 window blocks. Over many experimental runs, the overhead of the best solution finally converges to 27; the best case takes 3 runs and the worst 15, as shown in Fig. 5.
Claims (5)
1. A memory block combination method based on a non-uniform memory access architecture, characterized by comprising the following steps:
Step 1: classifying the memory provided by available nodes according to node frequency, and logically connecting the memory of available nodes with the same frequency into one memory block;
Step 2: treating each memory block as a window block, and determining, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost logical arrangement, and recording the logical arrangement result in a routing table;
Step 3: storing the routing table in the control processor connected to the master node, and allocating, by the control processor, a global address to each memory block, so as to build a memory cloud.
2. The memory block combination method based on a non-uniform memory access architecture according to claim 1, characterized in that step 2 comprises:
first selecting, by simulated annealing, one available node from the available nodes as the master node, wherein the master node is the connecting interface of the control processor;
arranging the window blocks in ascending order of the connection cost from the master node to each window block, and arranging the available nodes within each window block in ascending order of the connection cost from the master node to each node.
3. The memory block combination method based on a non-uniform memory access architecture according to claim 1, characterized in that step 3 comprises connecting the master node to the control processor by a bus.
4. A memory block combination apparatus based on a non-uniform memory access architecture, characterized in that the apparatus comprises:
a division module, configured to classify the memory provided by available nodes according to node frequency and logically connect the memory of available nodes with the same frequency into one memory block;
a processing module, configured to treat each memory block as a window block and determine, by adjusting the arrangement order of the window blocks and the arrangement order of the available nodes within each window block, the logical arrangement result with the lowest connection cost, wherein the logical arrangement result includes the master node of the lowest-cost logical arrangement, and to record the logical arrangement result in a routing table;
a building module, configured to store the routing table in the control processor connected to the master node, and to allocate, by the control processor, a global address to each memory block, so as to build a memory cloud.
5. The memory block combination apparatus based on a non-uniform memory access architecture according to claim 4, characterized in that:
the processing module is further configured to first select, by simulated annealing, one available node from the available nodes as the master node, wherein the master node is the connecting interface of the control processor; and
the processing module is further configured to arrange the window blocks in ascending order of the connection cost from the master node to each window block, and to arrange the available nodes within each window block in ascending order of the connection cost from the master node to each node.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610844237.7A CN106383791B (en) | 2016-09-23 | 2016-09-23 | A kind of memory block combined method and device based on nonuniform memory access framework |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106383791A true CN106383791A (en) | 2017-02-08 |
CN106383791B CN106383791B (en) | 2019-07-12 |
Family
ID=57936804
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610844237.7A Active CN106383791B (en) | 2016-09-23 | 2016-09-23 | A kind of memory block combined method and device based on nonuniform memory access framework |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106383791B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112558869A (en) * | 2020-12-11 | 2021-03-26 | 北京航天世景信息技术有限公司 | Remote sensing image caching method based on big data |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030009637A1 (en) * | 2001-06-21 | 2003-01-09 | International Business Machines Corporation | Decentralized global coherency management in a multi-node computer system |
CN104144194A (en) * | 2013-05-10 | 2014-11-12 | 中国移动通信集团公司 | Data processing method and device for cloud storage system |
CN104199718A (en) * | 2014-08-22 | 2014-12-10 | 上海交通大学 | Dispatching method of virtual processor based on NUMA high-performance network cache resource affinity |
CN104506362A (en) * | 2014-12-29 | 2015-04-08 | 浪潮电子信息产业股份有限公司 | Method for system state switching and monitoring on CC-NUMA (cache coherent-non uniform memory access architecture) multi-node server |
CN104657198A (en) * | 2015-01-24 | 2015-05-27 | 深圳职业技术学院 | Memory access optimization method and memory access optimization system for NUMA (Non-Uniform Memory Access) architecture system in virtual machine environment |
CN104850461A (en) * | 2015-05-12 | 2015-08-19 | 华中科技大学 | NUMA-oriented virtual cpu (central processing unit) scheduling and optimizing method |
CN105391590A (en) * | 2015-12-26 | 2016-03-09 | 深圳职业技术学院 | Method and system for automatically obtaining system routing table of NUMA |
- 2016-09-23 — CN CN201610844237.7A patent granted as CN106383791B (en), status: Active
Also Published As
Publication number | Publication date |
---|---|
CN106383791B (en) | 2019-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ajima et al. | The Tofu Interconnect D | |
US8117288B2 (en) | Optimizing layout of an application on a massively parallel supercomputer | |
JPH0766718A (en) | Wafer scale structure for programmable logic | |
WO2015171461A1 (en) | Interconnect systems and methods using hybrid memory cube links | |
JPH06243113A (en) | Calculation model mapping method for parallel computer | |
Wang et al. | A message-passing multi-softcore architecture on FPGA for breadth-first search | |
CN101441616B (en) | Rapid data exchange structure based on register document and management method thereof | |
CN109101338A (en) | A kind of block chain processing framework and its method based on the extension of multichannel chip | |
KR20160121380A (en) | Distributed file system using torus network and method for configuring and operating of the distributed file system using torus network | |
CN108304261B (en) | Job scheduling method and device based on 6D-Torus network | |
CN106383791B (en) | Memory block combination method and apparatus based on non-uniform memory access architecture | |
Fernández et al. | Efficient VLSI layouts for homogeneous product networks | |
Lin et al. | A distributed resource management mechanism for a partitionable multiprocessor system | |
Laili et al. | Parallel transfer evolution algorithm | |
Ravindran et al. | On topology and bisection bandwidth of hierarchical-ring networks for shared-memory multiprocessors | |
Pietracaprina et al. | Constructive deterministic PRAM simulation on a mesh-connected computer | |
Odendahl et al. | Optimized buffer allocation in multicore platforms | |
Raghunath et al. | Designing interconnection networks for multi-level packaging | |
Mackenzie et al. | Comparative modeling of network topologies and routing strategies in multicomputers | |
JP6991446B2 (en) | Packet processing device and its memory access control method | |
Sudheer et al. | Dynamic load balancing for petascale quantum Monte Carlo applications: The Alias method | |
Davis IV et al. | The performance analysis of partitioned circuit switched multistage interconnection networks | |
KEERTHI | Improving the Network Traffic Performance in MapReduce for Big Data Applications through Online Algorithm | |
CN117201406A (en) | Network signal stream transmission method and device, electronic equipment and medium | |
Herbordt et al. | Towards scalable multicomputer communication through offline routing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||