CA2864752A1 - Multi-core compute cache coherency with a release consistency memory ordering model
- Publication number
- CA2864752A1
- Authority
- CA
- Canada
- Prior art keywords
- cache
- processor
- shared
- programmable processor
- variable data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0837—Cache consistency protocols with software control, e.g. non-cacheable data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0811—Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0891—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using clearing, invalidating or resetting means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
- G06F12/0833—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means in combination with broadcast means (e.g. for invalidation or updating)
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/30—Providing cache or TLB in specific location of a processing system
- G06F2212/302—In image processor or graphics adapter
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Applications Claiming Priority (7)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201261680201P | 2012-08-06 | 2012-08-06 | |
| US61/680,201 | 2012-08-06 | | |
| US201361800441P | 2013-03-15 | 2013-03-15 | |
| US61/800,441 | 2013-03-15 | | |
| US13/958,399 | 2013-08-02 | | |
| US13/958,399 US9218289B2 (en) | 2012-08-06 | 2013-08-02 | Multi-core compute cache coherency with a release consistency memory ordering model |
| PCT/US2013/053626 WO2014025691A1 (en) | 2012-08-06 | 2013-08-05 | Multi-core compute cache coherency with a release consistency memory ordering model |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CA2864752A1 (en) | 2014-02-13 |
Family
ID=50026664
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CA2864752A (Pending), CA2864752A1 (en) | 2012-08-06 | 2013-08-05 | Multi-core compute cache coherency with a release consistency memory ordering model |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US9218289B2 |
| JP (1) | JP6062550B2 |
| KR (1) | KR101735222B1 |
| CN (1) | CN104520825B |
| CA (1) | CA2864752A1 |
| WO (1) | WO2014025691A1 |
Families Citing this family (50)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9396112B2 (en) * | 2013-08-26 | 2016-07-19 | Advanced Micro Devices, Inc. | Hierarchical write-combining cache coherence |
| KR101785301B1 (ko) * | 2013-09-27 | 2017-11-15 | 인텔 코포레이션 | Apparatus, method, and storage medium for configuring memory resources between devices |
| JP6200824B2 (ja) * | 2014-02-10 | 2017-09-20 | ルネサスエレクトロニクス株式会社 | Arithmetic control device, arithmetic control method, program, and OpenCL device |
| US20150331608A1 (en) * | 2014-05-16 | 2015-11-19 | Samsung Electronics Co., Ltd. | Electronic system with transactions and method of operation thereof |
| US9652390B2 (en) * | 2014-08-05 | 2017-05-16 | Advanced Micro Devices, Inc. | Moving data between caches in a heterogeneous processor system |
| US9495302B2 (en) * | 2014-08-18 | 2016-11-15 | Xilinx, Inc. | Virtualization of memory for programmable logic |
| CN105740164B (zh) * | 2014-12-10 | 2020-03-17 | 阿里巴巴集团控股有限公司 | Multi-core processor supporting cache coherence, read/write method, apparatus, and device |
| US20160210231A1 (en) * | 2015-01-21 | 2016-07-21 | Mediatek Singapore Pte. Ltd. | Heterogeneous system architecture for shared memory |
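| CN115082282B (zh) | 2015-06-10 | 2025-10-31 | 无比视视觉技术有限公司 | Image processor and method for processing images |
| KR102026877B1 (ko) * | 2015-06-16 | 2019-09-30 | 한국전자통신연구원 | Memory management unit and operating method thereof |
| CN105118520B (zh) * | 2015-07-13 | 2017-11-10 | 腾讯科技(深圳)有限公司 | Method and device for eliminating pop noise at the beginning of audio |
| JP6739513B2 (ja) * | 2015-07-21 | 2020-08-12 | アンペア・コンピューティング・エルエルシー | Implementation of load-acquire/store-release instructions using load/store operations with a DMB operation |
| CN105426316B (zh) * | 2015-11-09 | 2018-02-13 | 北京大学 | Racetrack memory chip with quota-based temperature control and control method thereof |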
| AU2017211781B2 (en) * | 2016-01-26 | 2021-04-22 | Icat Llc | Processor with reconfigurable algorithmic pipelined core and algorithmic matching pipelined compiler |
| FR3048526B1 (fr) * | 2016-03-07 | 2023-01-06 | Kalray | Atomic instruction with scope limited to an intermediate cache level |
| US10157134B2 (en) * | 2016-04-11 | 2018-12-18 | International Business Machines Corporation | Decreasing the data handoff interval for a reserved cache line based on an early indication of a systemwide coherence response |
| EP3249541B1 (en) * | 2016-05-27 | 2020-07-08 | NXP USA, Inc. | A data processor |
| CA3033502A1 (en) * | 2016-08-12 | 2018-02-15 | Siemens Product Lifecycle Management Software Inc. | Computer aided design with high resolution lattice structures using graphics processing units (gpu) |
| US10241911B2 (en) * | 2016-08-24 | 2019-03-26 | Hewlett Packard Enterprise Development Lp | Modification of multiple lines of cache chunk before invalidation of lines |
| US10248565B2 (en) * | 2016-09-19 | 2019-04-02 | Qualcomm Incorporated | Hybrid input/output coherent write |
| US10255181B2 (en) * | 2016-09-19 | 2019-04-09 | Qualcomm Incorporated | Dynamic input/output coherency |
| US9852202B1 (en) | 2016-09-23 | 2017-12-26 | International Business Machines Corporation | Bandwidth-reduced coherency communication |
| CN106708777A (zh) * | 2017-01-23 | 2017-05-24 | 张军 | Multi-core heterogeneous CPU-GPU-FPGA system architecture |
| JP6984148B2 (ja) * | 2017-03-22 | 2021-12-17 | 日本電気株式会社 | Computer system and cache coherence method |
| US10282811B2 (en) | 2017-04-07 | 2019-05-07 | Intel Corporation | Apparatus and method for managing data bias in a graphics processing architecture |
| US10373285B2 (en) * | 2017-04-09 | 2019-08-06 | Intel Corporation | Coarse grain coherency |
| US10409614B2 (en) | 2017-04-24 | 2019-09-10 | Intel Corporation | Instructions having support for floating point and integer data types in the same register |
| WO2019094843A1 (en) * | 2017-11-10 | 2019-05-16 | Nvidia Corporation | Systems and methods for safe and reliable autonomous vehicles |
| GB2570665B (en) * | 2018-01-31 | 2020-08-26 | Advanced Risc Mach Ltd | Address translation in a data processing apparatus |
| US10831650B2 (en) * | 2018-03-07 | 2020-11-10 | Exten Technologies, Inc. | Systems and methods for accessing non-volatile memory and write acceleration cache |
| US10929144B2 (en) | 2019-02-06 | 2021-02-23 | International Business Machines Corporation | Speculatively releasing store data before store instruction completion in a processor |
| US10908821B2 (en) | 2019-02-28 | 2021-02-02 | Micron Technology, Inc. | Use of outstanding command queues for separate read-only cache and write-read cache in a memory sub-system |
| US10970222B2 (en) * | 2019-02-28 | 2021-04-06 | Micron Technology, Inc. | Eviction of a cache line based on a modification of a sector of the cache line |
| US11288199B2 (en) | 2019-02-28 | 2022-03-29 | Micron Technology, Inc. | Separate read-only cache and write-read cache in a memory sub-system |
| US11106609B2 (en) | 2019-02-28 | 2021-08-31 | Micron Technology, Inc. | Priority scheduling in queues to access cache data in a memory sub-system |
| WO2020190814A1 (en) | 2019-03-15 | 2020-09-24 | Intel Corporation | Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format |
| CN110489356B (zh) * | 2019-08-06 | 2022-02-22 | 上海商汤智能科技有限公司 | Information processing method, apparatus, electronic device, and storage medium |
| CN112540938B (zh) * | 2019-09-20 | 2024-08-09 | 阿里巴巴集团控股有限公司 | Processor core, processor, apparatus, and method |
| US12602310B1 (en) * | 2019-10-04 | 2026-04-14 | Nvidia Corporation | Automatic stale data detection for accelerator-enabled programs |
| US11861761B2 (en) | 2019-11-15 | 2024-01-02 | Intel Corporation | Graphics processing unit processing and caching improvements |
| US11568523B1 (en) * | 2020-03-03 | 2023-01-31 | Nvidia Corporation | Techniques to perform fast fourier transform |
| US11550725B2 (en) | 2020-05-18 | 2023-01-10 | Micron Technology, Inc. | Dynamically sized redundant write buffer with sector-based tracking |
| US11301380B2 (en) * | 2020-05-18 | 2022-04-12 | Micron Technology, Inc. | Sector-based tracking for a page cache |
| US11681624B2 (en) * | 2020-07-17 | 2023-06-20 | Qualcomm Incorporated | Space and time cache coherency |
| US11636893B2 (en) * | 2020-10-19 | 2023-04-25 | Micron Technology, Inc. | Memory device with multiple row buffers |
| US11687459B2 (en) * | 2021-04-14 | 2023-06-27 | Hewlett Packard Enterprise Development Lp | Application of a default shared state cache coherency protocol |
| US12061545B1 (en) | 2022-02-10 | 2024-08-13 | Apple Inc. | Memory page manager |
| US12487927B2 (en) | 2023-09-25 | 2025-12-02 | Apple Inc. | Remote cache invalidation |
| US12468644B2 (en) | 2024-01-31 | 2025-11-11 | Apple Inc. | Invalidation of permission information stored by another processor |
| US12505043B1 (en) * | 2024-09-27 | 2025-12-23 | Intel Corporation | Methods and apparatus for timed hardware delay for reductions in instruction fetch traffic |
Family Cites Families (26)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH02100155A (ja) * | 1988-10-06 | 1990-04-12 | Nec Corp | Data processing system |
| US5553266A (en) | 1992-04-24 | 1996-09-03 | Digital Equipment Corporation | Update vs. invalidate policy for a snoopy bus protocol |
| US5533035A (en) * | 1993-06-16 | 1996-07-02 | Hal Computer Systems, Inc. | Error detection and correction method and apparatus |
| JP2916421B2 (ja) * | 1996-09-09 | 1999-07-05 | 株式会社東芝 | Cache flush device and data processing method |
| US6557082B1 (en) | 2000-03-30 | 2003-04-29 | International Business Machines Corporation | Method and apparatus for ensuring cache coherency for spawned dependent transactions in a multi-system environment with shared data storage devices |
| JP2002100155A (ja) * | 2000-09-21 | 2002-04-05 | Toshiba Corp | Magnetic disk device and data editing method |
| US6745297B2 (en) * | 2000-10-06 | 2004-06-01 | Broadcom Corporation | Cache coherent protocol in which exclusive and modified data is transferred to requesting agent from snooping agent |
| US6571321B2 (en) * | 2001-07-27 | 2003-05-27 | Broadcom Corporation | Read exclusive for fast, simple invalidate |
| US7194587B2 (en) * | 2003-04-24 | 2007-03-20 | International Business Machines Corp. | Localized cache block flush instruction |
| US7136969B1 (en) | 2003-06-17 | 2006-11-14 | Emc Corporation | Using the message fabric to maintain cache coherency of local caches of global memory |
| US7844801B2 (en) * | 2003-07-31 | 2010-11-30 | Intel Corporation | Method and apparatus for affinity-guided speculative helper threads in chip multiprocessors |
| JP4376692B2 (ja) * | 2004-04-30 | 2009-12-02 | 富士通株式会社 | Information processing device, processor, processor control method, information processing device control method, and cache memory |
| US20060026371A1 (en) * | 2004-07-30 | 2006-02-02 | Chrysos George Z | Method and apparatus for implementing memory order models with order vectors |
| KR100864834B1 (ko) * | 2007-04-30 | 2008-10-23 | 한국전자통신연구원 | Apparatus and method for data transfer between multiple processors using memory reallocation |
| US8131941B2 (en) | 2007-09-21 | 2012-03-06 | Mips Technologies, Inc. | Support for multiple coherence domains |
| WO2009050644A1 (en) | 2007-10-18 | 2009-04-23 | Nxp B.V. | Data processing system with a plurality of processors, cache circuits and a shared memory |
| US8605099B2 (en) * | 2008-03-31 | 2013-12-10 | Intel Corporation | Partition-free multi-socket memory system architecture |
| JP4631948B2 (ja) * | 2008-08-13 | 2011-02-16 | 日本電気株式会社 | Information processing device and order guarantee method |
| EP2441005A2 (en) | 2009-06-09 | 2012-04-18 | Martin Vorbach | System and method for a cache in a multi-core processor |
| US8397049B2 (en) * | 2009-07-13 | 2013-03-12 | Apple Inc. | TLB prefetching |
| US8615637B2 (en) * | 2009-09-10 | 2013-12-24 | Advanced Micro Devices, Inc. | Systems and methods for processing memory requests in a multi-processor system using a probe engine |
| EP2499576A2 (en) | 2009-11-13 | 2012-09-19 | Richard S. Anderson | Distributed symmetric multiprocessing computing architecture |
| JP5283128B2 (ja) * | 2009-12-16 | 2013-09-04 | 学校法人早稲田大学 | Method for generating code executable by a processor, method for managing storage areas, and code generation program |
| US8669990B2 (en) * | 2009-12-31 | 2014-03-11 | Intel Corporation | Sharing resources between a CPU and GPU |
| US9081501B2 (en) * | 2010-01-08 | 2015-07-14 | International Business Machines Corporation | Multi-petascale highly efficient parallel supercomputer |
| US8935475B2 (en) * | 2012-03-30 | 2015-01-13 | Ati Technologies Ulc | Cache management for memory operations |
2013
- 2013-08-02: US, application US13/958,399, patent US9218289B2 (en), status: Active
- 2013-08-05: JP, application JP2015526608A, patent JP6062550B2 (ja), status: Active
- 2013-08-05: CA, application CA2864752A, patent CA2864752A1 (en), status: Pending
- 2013-08-05: KR, application KR1020157004705A, patent KR101735222B1 (ko), status: Active
- 2013-08-05: CN, application CN201380041399.1A, patent CN104520825B (zh), status: Active
- 2013-08-05: WO, application PCT/US2013/053626, patent WO2014025691A1 (en), status: Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN104520825B (zh) | 2018-02-02 |
| JP6062550B2 (ja) | 2017-01-18 |
| WO2014025691A1 (en) | 2014-02-13 |
| JP2015524597A (ja) | 2015-08-24 |
| US20140040552A1 (en) | 2014-02-06 |
| CN104520825A (zh) | 2015-04-15 |
| KR20150040946A (ko) | 2015-04-15 |
| US9218289B2 (en) | 2015-12-22 |
| KR101735222B1 (ko) | 2017-05-24 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9218289B2 (en) | Multi-core compute cache coherency with a release consistency memory ordering model |
| US10365930B2 (en) | Instructions for managing a parallel cache hierarchy |
| US9952977B2 (en) | Cache operations and policies for a multi-threaded client |
| US20210365381A1 (en) | Microprocessor architecture having alternative memory access paths |
| US9588826B2 (en) | Shared virtual memory |
| US11107176B2 (en) | Scheduling cache traffic in a tile-based architecture |
| US8938598B2 (en) | Facilitating simultaneous submission to a multi-producer queue by multiple threads with inner and outer pointers |
| US10896128B2 (en) | Partitioning shared caches |
| EP2480985B1 (en) | Unified addressing and instructions for accessing parallel memory spaces |
| US20160210231A1 (en) | Heterogeneous system architecture for shared memory |
| US8930636B2 (en) | Relaxed coherency between different caches |
| US20080246773A1 (en) | Indexes of graphics processing objects in graphics processing unit commands |
| US20140173258A1 (en) | Technique for performing memory access operations via texture hardware |
| US7103720B1 (en) | Shader cache using a coherency protocol |
| US9971699B2 (en) | Method to control cache replacement for decoupled data fetch |
| US8788761B2 (en) | System and method for explicitly managing cache coherence |
| CN116804975A (zh) | Cache memory with per-sector cache residency controls |
| US20220066946A1 (en) | Techniques to improve translation lookaside buffer reach by leveraging idle resources |
| US20110167223A1 (en) | Buffer memory device, memory system, and data reading method |
| US11782838B2 (en) | Command processor prefetch techniques |
| US9153211B1 (en) | Method and system for tracking accesses to virtual addresses in graphics contexts |