CN110704362A - Processor array local storage hybrid management technology - Google Patents

Processor array local storage hybrid management technology

Info

Publication number
CN110704362A
CN110704362A (application CN201910864444.2A; granted as CN110704362B)
Authority
CN
China
Prior art keywords
cores
shares
core
processor
array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910864444.2A
Other languages
Chinese (zh)
Other versions
CN110704362B (en)
Inventor
高剑刚
施晶晶
李宏亮
过锋
唐勇
吴铁彬
郑方
许勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Jiangnan Computing Technology Institute filed Critical Wuxi Jiangnan Computing Technology Institute
Priority to CN201910864444.2A priority Critical patent/CN110704362B/en
Publication of CN110704362A publication Critical patent/CN110704362A/en
Application granted granted Critical
Publication of CN110704362B publication Critical patent/CN110704362B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 - Digital computers in general; Data processing equipment in general
    • G06F15/16 - Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 - Interprocessor communication
    • G06F15/173 - Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306 - Intercommunication techniques
    • G06F15/17331 - Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a processor array local storage hybrid management technology, belonging to the technical field of computer architecture and processor microarchitecture. The technology comprises the following steps: S1: dividing the on-chip local data memory (LDM) of each core in the array processor into a first-type area, a second-type area and a third-type area; S2: setting the first-type area as a private storage space for holding local private data, the specific addressing of which is visible only to the application program of the owning core; S3: setting the second-type area as a shared storage space for holding data shared by a plurality of cores, the specific addressing of which is visible to the application programs of the plurality of cores; S4: setting the third-type area as a Cache storage space, which is mapped onto the whole main memory space and managed in a Cache manner, through which the application program of the core accesses main memory. The invention can be flexibly configured to match application characteristics and efficiently realizes the actual running performance of the application.

Description

Processor array local storage hybrid management technology
Technical Field
The invention belongs to the technical field of computer architecture and processor microarchitecture, and relates to a processor array local storage hybrid management technology.
Background
As the number of cores in many-core processors keeps increasing and their computing capability improves dramatically, the memory-access capability of the chip improves far more slowly than the computing capability, and the "memory wall" has become an important factor restricting chip performance. An on-chip storage hierarchy designed to match application characteristics in depth is an important technical approach to alleviating the memory-wall problem.
Efficiently realizing data sharing among the cores of a many-core processor is the key to improving the on-chip data reuse rate. However, different applications differ greatly in their demands on on-chip storage, and differences in shared working-set size and data-access mechanism have a great influence on application performance. A single, fixed mode of managing processor-core data therefore lacks adaptability.
Disclosure of Invention
In view of the above problems in the prior art, the present invention provides a processor array local storage hybrid management technology; the technical problem to be solved by the invention is how to provide a processor array local storage hybrid management technique.
The purpose of the invention can be realized by the following technical scheme:
a processor array local storage hybrid management technique, comprising the steps of:
S1: dividing the on-chip local data memory (LDM) of each core in the array processor into a first-type area, a second-type area and a third-type area;
S2: setting the first-type area as a private storage space for holding local private data, the specific addressing of which is visible only to the application program of the owning core;
S3: setting the second-type area as a shared storage space for holding data shared by a plurality of cores, the specific addressing of which is visible to the application programs of the plurality of cores;
S4: setting the third-type area as a Cache storage space, which is mapped onto the whole main memory space and managed in a Cache manner, through which the application program of the core accesses main memory.
Preferably, the capacities of the private storage space, the shared storage space and the Cache storage space are all variable.
Preferably, the shared storage space supports, through mapping, sharing of multiple granularities and shapes.
Preferably, the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 16 four-core neighborhood shares, a four-core neighborhood share being one of 16 equal partitions of the array processor, each containing four adjacent cores.
Preferably, the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 4 sixteen-core neighborhood shares, a sixteen-core neighborhood share being one of 4 equal partitions of the array processor, each containing sixteen adjacent cores.
Preferably, the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 1 sixty-four-core neighborhood share containing all sixty-four cores.
Preferably, the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 8 row shares, a row share setting one row of the array processor as a shared storage space.
Preferably, the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 8 column shares, a column share setting one column of the array processor as a shared storage space.
Preferably, the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared space includes a plurality of irregular shares, an irregular share being any sharing pattern other than the 16 four-core neighborhood shares, the 4 sixteen-core neighborhood shares, the 1 sixty-four-core neighborhood share, the 8 row shares and the 8 column shares defined above.
Preferably, after the shared storage space is configured, the cores within an irregular share are addressed in a unified manner and their memory accesses cannot exceed the range of the irregular share; if a given core within the irregular share accesses beyond that range, an exception is generated, but the execution correctness of the other cores in the share is not affected.
In the present invention, the on-chip local data memory (LDM) of each core in the array processor is first divided into a first-type area, a second-type area and a third-type area. The first-type area is set as a private storage space for holding local private data, whose specific addressing is visible only to the application program of the owning core. The second-type area is set as a shared storage space for holding data shared by a plurality of cores, whose specific addressing is visible to the application programs of those cores. The third-type area is set as a Cache storage space mapped onto the whole main memory space and managed in a Cache manner, through which the core's application program accesses main memory. The local storage can therefore hold both local private data and data shared with other cores, can be flexibly configured to match application characteristics, and efficiently realizes the actual running performance of the application.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of the LDM of the present invention;
FIG. 3 is a schematic diagram of a four-core neighborhood sharing architecture according to the present invention;
FIG. 4 is a schematic diagram of a sixteen core neighborhood sharing architecture according to the present invention;
FIG. 5 is a schematic diagram of the architecture of the sixty-four core neighborhood sharing of the present invention;
FIG. 6 is a schematic diagram of the row sharing structure of the present invention;
FIG. 7 is a schematic diagram of the structure of column sharing in the present invention;
FIG. 8 is a schematic diagram of the structure of irregular sharing in the present invention.
Detailed Description
The following are specific embodiments of the present invention, further described with reference to the drawings; the present invention is not, however, limited to these embodiments.
Referring to fig. 1 and 2, the processor array local storage hybrid management technique of the present embodiment includes the following steps:
S1: dividing the on-chip local data memory (LDM) of each core in the array processor into a first-type area, a second-type area and a third-type area;
S2: setting the first-type area as a private storage space for holding local private data, the specific addressing of which is visible only to the application program of the owning core;
S3: setting the second-type area as a shared storage space for holding data shared by a plurality of cores, the specific addressing of which is visible to the application programs of the plurality of cores;
S4: setting the third-type area as a Cache storage space, which is mapped onto the whole main memory space and managed in a Cache manner, through which the application program of the core accesses main memory.
Here, the on-chip local data memory (LDM) of each core in the array processor is first divided into a first-type area, a second-type area and a third-type area. The first-type area is then set as a private storage space for holding local private data, whose specific addressing is visible only to the application program of the owning core. The second-type area is set as a shared storage space for holding data shared by a plurality of cores, whose specific addressing is visible to the application programs of those cores. The third-type area is set as a Cache storage space mapped onto the whole main memory space and managed in a Cache manner, through which the core's application program accesses main memory. The local storage can therefore hold both local private data and data shared with other cores, can be flexibly configured to match application characteristics, and efficiently realizes the actual running performance of the application.
The capacities of the private storage space, the shared storage space and the Cache storage space may all be variable. These capacities can be flexibly adjusted according to application requirements, realizing dynamic configuration of the LDM space by the application and dynamic configuration and protection of the sharing range within the array, so that the type and sharing range of the storage space suit the application characteristics and the storage structure matches the application to the greatest extent.
The shared storage space may support sharing of multiple granularities and shapes through mapping. For the shared space, sharing of various granularities and shapes is completed through mapping in order to adapt to the working sets and affinity characteristics of different applications.
Referring to fig. 3, the array processor may be an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 16 four-core neighborhood shares, each obtained by dividing the array processor equally into 16 groups of four adjacent cores. After configuration, the cores within one four-core neighborhood share (a black frame in the figure) see a uniformly addressed shared space, and their memory accesses cannot exceed the range of the frame; that is, the cores within one four-core neighborhood share use the same addresses and may only access memory within their own share.
Referring to fig. 4, the array processor may be an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 4 sixteen-core neighborhood shares, each obtained by dividing the array processor equally into 4 groups of sixteen adjacent cores. After configuration, the cores within one sixteen-core neighborhood share (a black frame in the figure) see a uniformly addressed shared space, and their memory accesses cannot exceed the range of the frame; that is, the cores within one sixteen-core neighborhood share use the same addresses and may only access memory within their own share.
Referring to fig. 5, the array processor may be an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 1 sixty-four-core neighborhood share containing all sixty-four cores. After configuration, the sixty-four cores (the black frame in the figure) see a uniformly addressed shared space, and their memory accesses cannot exceed the range of the frame; that is, all cores in the sixty-four-core neighborhood share use the same addresses and may only access memory within the share.
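The three neighborhood granularities of figs. 3 to 5 follow one pattern: square blocks of side 2, 4 or 8 tiled over the 8 × 8 array. A hypothetical sketch of the resulting core-to-share mapping (the function name and row-major indexing convention are assumptions, not from the patent):

```python
def neighborhood_share_id(row: int, col: int, block: int) -> int:
    """Index of the neighborhood share containing core (row, col) on an
    8 x 8 array. block is the side length of the square neighborhood:
    block=2 gives the 16 four-core shares, block=4 the 4 sixteen-core
    shares, and block=8 the single sixty-four-core share."""
    assert 0 <= row < 8 and 0 <= col < 8 and block in (2, 4, 8)
    blocks_per_row = 8 // block
    # Row-major numbering of the square blocks covering the array.
    return (row // block) * blocks_per_row + (col // block)
```

Cores mapped to the same index see one uniformly addressed shared space; cores mapped to different indices cannot reach each other's share. For example, cores (0, 0) and (1, 1) land in the same four-core share, while (0, 0) and (0, 2) do not.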
Referring to fig. 6, the array processor may be an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 8 row shares, a row share setting one row of the array processor as a shared storage space. After configuration, the cores within one row share (a black frame in the figure) see a uniformly addressed shared space, and their memory accesses cannot exceed the range of the frame; that is, the cores within one row share use the same addresses and may only access memory within their own row share.
Referring to fig. 7, the array processor may be an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 8 column shares, a column share setting one column of the array processor as a shared storage space. After configuration, the cores within one column share (a black frame in the figure) see a uniformly addressed shared space, and their memory accesses cannot exceed the range of the frame; that is, the cores within one column share use the same addresses and may only access memory within their own column share.
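Unified addressing within a row (or column) share can be pictured as concatenating the shared LDM areas of the 8 member cores into one address space. The layout below, with equal-sized member areas concatenated in column order, is purely an assumption for illustration; the patent does not fix any particular layout.

```python
def locate_in_row_share(row: int, offset: int, per_core_shared: int):
    """Resolve a unified shared-space offset within one row share to
    the (row, col) core that physically holds the byte and the local
    offset inside that core's shared LDM area. Assumes the 8 member
    cores each contribute per_core_shared bytes, concatenated in
    column order (a hypothetical layout)."""
    assert 0 <= offset < 8 * per_core_shared, "access exceeds the share range"
    col = offset // per_core_shared
    return (row, col), offset % per_core_shared

# A column share would be symmetric, iterating over rows instead.
```

With 4 KiB of shared area per core, offset 5000 in row 3's shared space would resolve to core (3, 1) at local offset 904 under this assumed layout.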
Referring to fig. 8, the array processor may be an array processor with 8 × 8 cores, 8 per row and per column, and the shared space includes a plurality of irregular shares, an irregular share being any sharing pattern other than the 16 four-core neighborhood shares, the 4 sixteen-core neighborhood shares, the 1 sixty-four-core neighborhood share, the 8 row shares and the 8 column shares described above. For example, the shared storage space may include an eight-core share with 8 cores, a ten-core share with 10 cores, a two-core share with 2 cores, a three-core share with 3 cores, a four-core share with 4 cores, and a five-core share with 5 cores. After configuration, the cores within one irregular share (a black frame in the figure) see a uniformly addressed shared space, and their memory accesses cannot exceed the range of the frame; that is, the cores within one irregular share use the same addresses and may only access memory within the share. The size and shape of the storage space can be adjusted according to the capacity of the stored content, realizing dynamic configuration of the LDM space by the application and dynamic configuration and protection of the sharing range within the array, and improving the wide applicability of the LDM space.
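An irregular share has no closed-form index, so it can be represented simply as an explicit set of member cores. The concrete member sets below are invented for illustration; only the share sizes (two-core, three-core, five-core and so on) come from the text.

```python
# Hypothetical configuration: one three-core share and one five-core share,
# each given as a set of (row, col) core coordinates on the 8 x 8 array.
IRREGULAR_SHARES = [
    {(0, 0), (0, 1), (1, 0)},                   # a three-core share
    {(2, 2), (2, 3), (3, 2), (3, 3), (4, 3)},   # a five-core share
]

def share_of(core):
    """Return the index of the irregular share containing core,
    or None if the core belongs to no irregular share."""
    for i, members in enumerate(IRREGULAR_SHARES):
        if core in members:
            return i
    return None
```

Uniform addressing and range checking then operate on these member sets rather than on rectangular block boundaries.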
After the shared storage space is configured, the cores within an irregular share are addressed in a unified manner, and a core's memory accesses cannot exceed the range of the share. If a given core within the irregular share accesses beyond that range, an exception is generated, but the execution correctness of the other cores in the share is not affected, so that when one core goes wrong the others continue to work normally and greater loss is avoided. For the multiple types and capacities of shared space, the hardware on the source core checks each access against the configuration and generates an exception on an out-of-range access without affecting the execution correctness of the other cores.
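The source-side hardware check described above can be sketched as follows: each shared-space access is validated against the configured member set, and an out-of-range access raises an exception for the offending core only. The names and the exception type are hypothetical; this models the behavior, not the hardware.

```python
class ShareRangeError(Exception):
    """Raised when a core's shared-space access leaves its share."""

def check_access(source_core, target_core, share_members):
    """Source-side check against the configuration: an access that
    targets a core outside the configured share generates an
    exception, without touching the state of any other core."""
    if target_core not in share_members:
        raise ShareRangeError(
            f"core {source_core} accessed {target_core} outside its share")

# A three-core irregular share (hypothetical members).
share = {(0, 0), (0, 1), (1, 0)}
check_access((0, 0), (0, 1), share)  # in range: no exception
# An out-of-range access by (0, 0) raises ShareRangeError, while
# (0, 1) and (1, 0) keep executing correctly.
```

Raising per offending access is what isolates the fault: only the core whose access fails the check takes the exception path.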
The specific embodiments described herein are merely illustrative of the spirit of the present invention. Those skilled in the art may make various modifications or additions to the described embodiments, or substitute similar alternatives, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims (10)

1. A processor array local storage hybrid management technique, comprising the steps of:
S1: dividing the on-chip local data memory (LDM) of each core in the array processor into a first-type area, a second-type area and a third-type area;
S2: setting the first-type area as a private storage space for holding local private data, the specific addressing of which is visible only to the application program of the owning core;
S3: setting the second-type area as a shared storage space for holding data shared by a plurality of cores, the specific addressing of which is visible to the application programs of the plurality of cores;
S4: setting the third-type area as a Cache storage space, which is mapped onto the whole main memory space and managed in a Cache manner, through which the application program of the core accesses main memory.
2. The processor array local storage hybrid management technique as recited in claim 1, wherein: the capacities of the private storage space, the shared storage space and the Cache storage space are all variable.
3. The processor array local storage hybrid management technique as recited in claim 1 or 2, wherein: the shared storage space supports, through mapping, sharing of multiple granularities and shapes.
4. The processor array local storage hybrid management technique as recited in claim 3, wherein: the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 16 four-core neighborhood shares, a four-core neighborhood share being one of 16 equal partitions of the array processor, each containing four adjacent cores.
5. The processor array local storage hybrid management technique as recited in claim 3, wherein: the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 4 sixteen-core neighborhood shares, a sixteen-core neighborhood share being one of 4 equal partitions of the array processor, each containing sixteen adjacent cores.
6. The processor array local storage hybrid management technique as recited in claim 3, wherein: the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 1 sixty-four-core neighborhood share containing all sixty-four cores.
7. The processor array local storage hybrid management technique as recited in claim 3, wherein: the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 8 row shares, a row share setting one row of the array processor as a shared storage space.
8. The processor array local storage hybrid management technique as recited in claim 3, wherein: the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared storage space includes 8 column shares, a column share setting one column of the array processor as a shared storage space.
9. The processor array local storage hybrid management technique as recited in claim 3, wherein: the array processor is an array processor with 8 × 8 cores, 8 per row and per column, and the shared space includes a plurality of irregular shares, an irregular share being any sharing pattern other than 16 four-core neighborhood shares, 4 sixteen-core neighborhood shares, 1 sixty-four-core neighborhood share, 8 row shares or 8 column shares, wherein a column share sets one column of the array processor as a shared storage space, a row share sets one row of the array processor as a shared storage space, a sixteen-core neighborhood share is one of 4 equal partitions of the array processor each containing sixteen adjacent cores, a four-core neighborhood share is one of 16 equal partitions of the array processor each containing four adjacent cores, and a sixty-four-core neighborhood share contains all sixty-four cores.
10. The processor array local storage hybrid management technique as recited in claim 9, wherein: after the shared storage space is configured, the cores within an irregular share are addressed in a unified manner and their memory accesses cannot exceed the range of the irregular share; if a given core within the irregular share accesses beyond that range, an exception is generated, but the execution correctness of the other cores in the share is not affected.
CN201910864444.2A 2019-09-12 2019-09-12 Processor array local storage hybrid management method Active CN110704362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910864444.2A CN110704362B (en) 2019-09-12 2019-09-12 Processor array local storage hybrid management method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910864444.2A CN110704362B (en) 2019-09-12 2019-09-12 Processor array local storage hybrid management method

Publications (2)

Publication Number Publication Date
CN110704362A (en) 2020-01-17
CN110704362B (en) 2021-03-12

Family

ID=69195264

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910864444.2A Active CN110704362B (en) 2019-09-12 2019-09-12 Processor array local storage hybrid management method

Country Status (1)

Country Link
CN (1) CN110704362B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115114192A (en) * 2021-03-23 2022-09-27 北京灵汐科技有限公司 Memory interface, functional core, many-core system and memory data access method
WO2022199357A1 (en) * 2021-03-23 2022-09-29 北京灵汐科技有限公司 Data processing method and apparatus, electronic device, and computer-readable storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050021944A1 (en) * 2003-06-23 2005-01-27 International Business Machines Corporation Security architecture for system on chip
CN1758229A (en) * 2005-10-28 2006-04-12 中国人民解放军国防科学技术大学 Local space shared memory method of heterogeneous multi-kernel microprocessor
CN101551761A (en) * 2009-04-30 2009-10-07 浪潮电子信息产业股份有限公司 Method for sharing stream memory of heterogeneous multi-processor
CN101601017A (en) * 2007-02-28 2009-12-09 学校法人早稻田大学 The generation method and the program of storage management method, signal conditioning package, program
CN101706755A (en) * 2009-11-24 2010-05-12 中国科学技术大学苏州研究院 Caching collaboration system of on-chip multi-core processor and cooperative processing method thereof
CN102073533A (en) * 2011-01-14 2011-05-25 中国人民解放军国防科学技术大学 Multicore architecture supporting dynamic binary translation
CN105183662A (en) * 2015-07-30 2015-12-23 复旦大学 Cache consistency protocol-free distributed sharing on-chip storage framework
CN107168683A (en) * 2017-05-05 2017-09-15 中国科学院软件研究所 GEMM dense matrix multiply high-performance implementation method on the domestic many-core CPU of Shen prestige 26010


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHOU Yongbin (周永彬) et al., "Research on Optimization of the 1-D FFT Algorithm on Many-core Processors Based on Hardware/Software Co-support", Chinese Journal of Computers (《计算机学报》) *
LI Hongliang (李宏亮) et al., "Research on the Architecture of Domestic Many-core Processors for Intelligent Computing", Scientia Sinica Informationis (《中国科学:信息科学》) *


Also Published As

Publication number Publication date
CN110704362B (en) 2021-03-12

Similar Documents

Publication Publication Date Title
EP2645259B1 (en) Method, device and system for caching data in multi-node system
JP5241838B2 (en) System and method for allocating cache sectors (cache sector allocation)
CN105103144B (en) For the device and method of the self adaptive control of memory
US8656397B2 (en) Migrating groups of threads across NUMA nodes based on remote page access frequency
US20150067269A1 (en) Method for building multi-processor system with nodes having multiple cache coherency domains
CN110704362B (en) Processor array local storage hybrid management method
US20040117594A1 (en) Memory management method
CN112506823B (en) FPGA data reading and writing method, device, equipment and readable storage medium
CN111488114A (en) Reconfigurable processor architecture and computing device
JP2004038807A (en) Cache memory device and memory allocation method
US20230251903A1 (en) High bandwidth memory system with dynamically programmable distribution scheme
EP4012569A1 (en) Ai accelerator, cache memory and method of operating cache memory using the same
Huang et al. Vulnerability-aware energy optimization using reconfigurable caches in multicore systems
EP3839717B1 (en) High bandwidth memory system with crossbar switch for dynamically programmable distribution scheme
US10540286B2 (en) Systems and methods for dynamically modifying coherence domains
JP6059360B2 (en) Buffer processing method and apparatus
KR101967857B1 (en) Processing in memory device with multiple cache and memory accessing method thereof
US20200249852A1 (en) Methods for Aligned, MPU Region, and Very Small Heap Block Allocations
US11314656B2 (en) Restartable, lock-free concurrent shared memory state with pointers
CN104932990B (en) The replacement method and device of data block in a kind of cache memory
EP4120087B1 (en) Systems, methods, and devices for utilization aware memory allocation
US20240220409A1 (en) Unified flexible cache
US11847328B1 (en) Method and system for memory pool management
US11023319B2 (en) Maintaining a consistent logical data size with variable protection stripe size in an array of independent disks system
Lee et al. T-CAT: Dynamic Cache Allocation for Tiered Memory Systems with Memory Interleaving

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant