WO2019143442A1 - Near-memory hardened compute blocks for configurable computing substrates - Google Patents

Near-memory hardened compute blocks for configurable computing substrates

Info

Publication number
WO2019143442A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
die
configurable computing
data
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2018/067067
Other languages
English (en)
French (fr)
Inventor
Nuwan Jayasena
Michael Ignatowski
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced Micro Devices Inc
Original Assignee
Advanced Micro Devices Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced Micro Devices Inc
Priority to KR1020207021744A (KR102789084B1)
Priority to EP18900755.2A (EP3740876A4)
Priority to CN201880086468.3A (CN111602124B)
Priority to JP2020536026A (JP7403457B2)
Publication of WO2019143442A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668 Details of memory controller
    • G06F13/1694 Configuration of memory controller to different memory types
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7839 Architectures of general purpose stored program computers comprising a single central processing unit with memory
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/068 Hybrid storage device

Definitions

  • Configurable platforms are being deployed in data centers and are promising candidate architectures for accelerating certain classes of workloads.
  • these configurable platforms typically only outperform graphics processing units (GPUs) and other types of compute-dense processing units for certain types of calculations (those with irregular data flows, computations on non-standard bit-width data types, etc.).
  • field-programmable gate array (FPGA) vendors have started incorporating hardened logic, hardcoded logic blocks, or hardened compute blocks (collectively “hardened logic blocks”) for a number of compute element types including central processing units (CPUs), floating point (FP) units and the like in the FPGAs.
  • a conventional technique incorporates multiple discrete devices of each kind in the system at a board level (e.g., a CPU and a GPU along with the FPGA). This technique increases system-level cost and complexity as multiple discrete devices need to be incorporated and coordinated.
  • Another conventional technique incorporates hardened logic blocks on the FPGA device. This technique requires the manufacture of many different devices with varying mixes of hardened logic block types and still runs the risk of not being optimal for any particular workload.
  • Figure 1 is an example high level block diagram of a configurable computing substrate with near-memory hardened logic in accordance with some embodiments
  • Figure 2 is an example detailed block diagram of a base die integrated with a memory stack in accordance with some embodiments
  • Figure 3 is another example block diagram of a configurable computing substrate with near-memory hardened logic in accordance with some embodiments
  • Figure 4 is yet another example block diagram of a configurable computing substrate with near-memory hardened logic in accordance with some embodiments
  • Figure 5 is an example flowchart for using a configurable computing substrate with near-memory hardened logic in accordance with some embodiments.
  • Figure 6 is a block diagram of an example device in which one or more disclosed embodiments may be implemented.
  • Configurable computing substrates have gained considerable attention recently for datacenter deployments and machine learning acceleration.
  • Configurable computing substrates can refer to a field-programmable gate array (FPGA), complex programmable logic devices (CPLD), coarse-grain reconfigurable arrays (CGRA), gate arrays and other similar platforms or devices.
  • such configurable computing substrates can be augmented with hardened logic blocks for efficient support of computations with dense arithmetic or complex control flows. That is, configurable computing substrates typically incorporate hardened logic blocks for functionality that cannot be efficiently implemented using configurable logic blocks.
  • Described herein is a configurable computing system which uses near memory and in-memory hardened logic blocks to enhance the capabilities of configurable computing substrates without requiring additional discrete components to be added to the configurable computing substrates, such as an FPGA device.
  • These hardened logic blocks can include but are not limited to, central processing units (CPUs), CPU cores, graphics processing units (GPUs), GPU cores, hard-coded accelerators, application specific integrated circuit (ASIC) blocks, execution engines, data-parallel floating-point-capable execution engines and other similar devices or logic and combinations thereof.
  • the hardened logic blocks are incorporated into memory modules.
  • the hardened logic blocks are on a base or logic die or a memory die of the memory module.
  • a separate die can be added and included as part of the memory module for implementing the hardened logic blocks.
  • the memory modules include an interface or communication logic to communicate between the configurable computing substrate and the memory module.
  • the interface between the configurable computing substrate and the memory modules can be proprietary on a condition that the interfaces are compatible to enable mix-and-match use.
  • the interface may be a standard such as Joint Electron Device Engineering Council (JEDEC) high bandwidth memory (HBM) to enable the use of commodity memory modules. Standardization of the interface would enhance flexibility by allowing use of third party memory modules.
  • the memory modules can include on-die memory such as static random access memory (SRAM) or other forms of non-configurable logic to enable more efficient processing for a variety of operations.
  • the memory modules can include a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations.
  • the memory modules can include SRAM and a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations.
  • FIG. 1 shows an example high level block diagram of a configurable platform 100 in accordance with certain implementations.
  • the configurable platform 100 includes a configurable computing substrate 110 connected to or in communication with (collectively “connected to”) memory module(s) 120.
  • the memory module(s) 120 can be 3D-stacked memory modules which have a base or logic die 130 stacked with memory dies 140.
  • the configurable computing substrate 110 writes data to the memory dies 140 via the configurable computing substrate interface 200 and the memory interface 210.
  • the hardened logic block 220 reads the data from the memory dies 140 via the memory interface 210.
  • the results from the hardened logic block 220 are written into the memory dies 140 via the memory interface 210.
  • the configurable computing substrate 110 then reads the results from the memory dies
  • the configurable computing substrate 110 and the hardened logic block 220 can use memory store and load operations to write/read the data in/out of memory and operate on it or vice versa.
  • the configurable computing substrate interface 200 between the configurable computing substrate 110 and the memory module 120 may be an industry-standard memory interface (as only loads/stores are needed from outside the memory module 120) or a proprietary one.
  • FIG. 3 shows an example high level block diagram of a configurable platform 300 in accordance with certain implementations.
  • the configurable platform 300 includes a configurable computing substrate 310 connected to or in communication with (collectively “connected to”) a memory module 320.
  • the memory module 320 can be a 3D-stacked memory module which has a base die 330 connected to a stacked memory 340.
  • the base die 330 includes a configurable computing substrate interface 350 (shown as CCS Interface 350 in Figure 3) for communicating with the configurable computing substrate 310.
  • the configurable computing substrate interface 350 is connected to a memory interface 360, which in turn is connected to the stacked memory 340.
  • the memory interface 360 is further connected to a hardened logic block 370.
  • the hardened logic block 370 can include one or more logic blocks; for ease of description, only one logic block is referred to herein.
  • the configurable computing substrate interface 350 is further connected to an on-die memory 380 within the base die 330 which is included in the address space or range of the memory module 320.
  • the on-die memory 380 can be SRAM.
  • the on-die memory 380 provides efficient processing of data when the data is sized compatibly with the capacity of the on-die memory 380 since reads and writes to the stacked memory 340 can be avoided.
  • the on-die memory 380 can be provisioned on the configurable computing substrate 310.
  • the configurable computing substrate interface 350 enables addressing of the on-die memory 380 from the hardened logic block 370 in the memory module 320.
  • the configurable computing substrate 310 writes data via the configurable computing substrate interface 350.
  • the amount of data to be communicated with the hardened logic block determines where the data is stored. If the amount of data is compatible with the capacity of the on-die memory 380, then the configurable computing substrate 310 writes the data to the on-die memory 380 (using the address space associated with the on-die memory 380) via the configurable computing substrate interface 350. If the amount of data is not compatible with the capacity of the on-die memory 380, then the configurable computing substrate 310 writes the data to the stacked memory 340 (using the address space associated with the stacked memory 340) via the configurable computing substrate interface 350 and the memory interface 360.
  • the hardened logic block 370 reads the data from either the stacked memory 340 via the memory interface 360 or from the on-die memory 380.
  • the results from the hardened logic block 370 are written into the stacked memory 340 via the memory interface 360 or in the on-die memory 380, depending on the amount of data associated with the results and using the appropriate address space.
  • the configurable computing substrate 310 then reads the results from the stacked memory 340 via the configurable computing substrate interface 350 and the memory interface 360 or from the on-die memory 380 via the configurable computing substrate interface 350, using the appropriate address space.
  • FIG. 4 shows an example high level block diagram of a configurable platform 400 in accordance with certain implementations.
  • the configurable platform 400 includes a configurable computing substrate 410 connected to or in communication with (collectively “connected to”) a memory module 420.
  • the memory module 420 can be a 3D-stacked memory module which has a base die 430 connected to a stacked memory 440.
  • the base die 430 includes a configurable computing substrate interface 450 (shown as CCS Interface 450 in Figure 4) for communicating with the configurable computing substrate 410.
  • the configurable computing substrate interface 450 is connected to a memory interface 460, which in turn is connected to the stacked memory 440.
  • the memory interface 460 is further connected to hardened logic block 470.
  • the configurable computing substrate 410 communicates with the hardened logic block 470 via the configurable computing substrate interface 450.
  • the nature or type of the computation determines how data is communicated from the configurable computing substrate 410 to the hardened logic block 470.
  • the nature of the computation can refer to streaming computations where the configurable computing substrate fabric 480 performs some step(s) of a computation and the hardened logic block 470 performs the next step(s) (or vice versa). In this instance, little data accumulates between the steps performed on each device, and data is simply passed from one device to the other as it becomes available. Consequently, having the configurable computing substrate fabric 480 right next to the hardened logic block 470 helps with the handoff in these types of computations.
  • the computation is such that the configurable computing substrate fabric 480 on the memory module 420 can be used to interface with the hardened logic block 470
  • data is passed from the configurable computing substrate 410 to the hardened logic block 470 via the configurable computing substrate interface 450 and mediated by the configurable computing substrate fabric 480 on the memory module 420.
  • the nature of the computation is such that the configurable computing substrate fabric 480 on the memory module 420 cannot be used to interface with the hardened logic block 470, then the data is stored in the stacked memory 440 via the configurable computing substrate interface 450 and the memory interface 460.
  • the hardened logic block 470 receives the data through the configurable computing substrate fabric 480 on the memory module 420 or accesses the data from the stacked memory 440 via the memory interface 460.
  • the results from the hardened logic block 470 are written into the stacked memory 440 via the memory interface 460 or communicated to the configurable computing substrate 410 via the configurable computing substrate interface 450 and mediated by the configurable computing substrate fabric 480 on the memory module 420, depending on the nature of the computation.
  • the configurable computing substrate 410 then reads the results from the stacked memory 440 via the configurable computing substrate interface 450 and the memory interface 460 or operates on them directly via the configurable computing substrate interface 450 and mediated by the configurable computing substrate fabric 480 on the memory module 420, as appropriate.
  • a configurable platform can include a configurable computing substrate connected to a memory module, where the memory module includes both an on-die memory as described in Figure 3 and a configurable computing substrate fabric as described in Figure 4.
  • Such an implementation can consider both data size and the nature of the computation with additional logic to determine priority considerations between the data size and the nature of the computation determinations.
  • the hardened logic block(s) can be implemented on a memory die or on a separate die connected to the memory module.
  • Figure 5 is an example high level flowchart 500 for data processing using a configurable platform which has a configurable computing substrate in communication with hardened logic block(s) implemented on memory module(s).
  • the configurable computing substrate writes data to the memory module (step 510).
  • the data is stored in a memory die or memory stack.
  • the data is stored in an on-die memory associated with the hardened logic block(s) when the data is compatible with the capacity of the on-die memory.
  • the data is stored in logic or storage elements within an extension of the configurable logic fabric in the memory module.
  • the hardened logic block(s) accesses the data from the memory module (step 520).
  • the data is read from a memory die or memory stack.
  • the results are read from a memory die or memory stack. In another implementation, the results are read from an on-die memory associated with the hardened logic block(s) when the size of the data associated with the results is compatible with the capacity of the on-die memory. In another implementation, the results are accessed from logic or storage elements within an extension of the configurable logic fabric in the memory module.
  • FIG. 6 is a block diagram of an example device 600 in which one or more features of the disclosure can be implemented.
  • the device 600 can include, for example, a computer, a gaming device, a handheld device, a set-top box, a television, a mobile phone, or a tablet computer.
  • the device 600 includes a processor 602, a memory 604, a storage 606, one or more input devices 608, and one or more output devices 610.
  • the device 600 can also optionally include an input driver 612 and an output driver 614. It is understood that the device 600 can include additional components not shown in Figure 6.
  • the processor 602 includes a central processing unit (CPU), a graphics processing unit (GPU), a CPU and GPU located on the same die, or one or more processor cores, wherein each processor core can be a CPU or a GPU.
  • the memory 604 can be located on the same die as the processor 602, or can be located separately from the processor 602.
  • the memory 604 includes a volatile or non-volatile memory, for example, random access memory (RAM), dynamic RAM, or a cache.
  • the storage 606 includes a fixed or removable storage, for example, a hard disk drive, a solid state drive, an optical disk, or a flash drive.
  • the input devices 608 include, without limitation, a keyboard, a keypad, a touch screen, a touch pad, a detector, a microphone, an accelerometer, a gyroscope, a biometric scanner, or a network connection (e.g., a wireless local area network card for transmission and/or reception of wireless IEEE 802 signals).
  • the output devices 610 include, without limitation, a configurable computing substrate, a display, a speaker, a printer, a haptic feedback device, one or more lights, an antenna, or a network connection (e.g., a wireless local area network card for transmission and/or reception of wireless IEEE 802 signals).
  • the input driver 612 communicates with the processor 602 and the input devices 608, and permits the processor 602 to receive input from the input devices 608.
  • the output driver 614 communicates with the processor 602 and the output devices 610, and permits the processor 602 to send output to the output devices 610. It is noted that the input driver 612 and the output driver 614 are optional components, and that the device 600 will operate in the same manner if the input driver 612 and the output driver 614 are not present.
  • a configurable computing platform includes a configurable computing substrate and a memory module connected to the configurable computing substrate.
  • the memory module including a memory die and another die.
  • the another die including a hardened logic block configured to operate on data sent by the configurable computing substrate, a configurable computing substrate interface configured to communicate with the configurable computing substrate and a memory interface configured to communicate with the memory die, the configurable computing substrate interface, and the hardened logic block.
  • the hardened logic block is at least one of: a central processing unit (CPU), a CPU core, a graphics processing unit (GPU), a GPU core, a hard-coded accelerator, an application specific integrated circuit block, an execution engine, and a data-parallel floating-point-capable execution engine.
  • the another die further includes an on-die memory configured to communicate with the configurable computing substrate interface and the hardened logic block.
  • the on-die memory is further configured to communicate with the memory interface and functions as a cache for the memory die.
  • the on-die memory is included in the address range of the memory module.
  • the data from the configurable computing substrate is written to the on-die memory when the size of the data is substantially matched with the capacity of the on-die memory.
  • results from the hardened logic block are written to the on-die memory when a data size of the results is substantially matched with the capacity of the on-die memory.
  • the die further includes a configurable computing substrate fabric configured to communicate with the configurable computing substrate interface and the hardened logic block.
  • the configurable computing substrate sends the data to the hardened logic block via the configurable computing substrate fabric based on a type of computation.
  • the die further includes an on-die memory configured to communicate with the configurable computing substrate interface and the hardened logic block and a configurable computing substrate fabric configured to communicate with the configurable computing substrate interface and the hardened logic block.
  • the die is a logic die.
  • the die is a memory die.
  • a method for computing using a configurable computing platform includes connecting a configurable computing substrate with a memory module.
  • the configurable computing substrate writes data to a memory die included in the memory module via a configurable computing substrate interface and a memory interface, the configurable computing substrate interface and the memory interface being provided on a die included in the memory module.
  • Data from the memory die is accessed by a hardened logic block via the memory interface, the hardened logic block being provided on the die.
  • results are written to the memory die by the hardened logic block via the memory interface.
  • data is written to an on-die memory via the configurable computing substrate interface when the size of the data is substantially matched with a capacity of the on-die memory, the on-die memory being provided on the die.
  • results are written to the on-die memory by the hardened logic block when a data size of the results is substantially matched with the capacity of the on-die memory.
  • a portion of the data is written to the on-die memory when the data is being written to the memory die and subsequent reads are serviced to the portion of the data.
  • the on-die memory is included in the address range of the memory module.
  • the data between the configurable computing substrate and the hardened logic block is communicated via a configurable computing substrate fabric and the configurable computing substrate interface based on a type of computation, the configurable computing substrate fabric being provided on the die.
  • the data is written to an on-die memory via the configurable computing substrate interface when the size of the data is substantially matched with the capacity of the on-die memory, the on-die memory being provided on the die.
  • results are written to the memory die by the hardened logic block when a data size of the results is substantially matched with the capacity of the on-die memory.
  • processors include, by way of example, a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) circuits, any other type of integrated circuit (IC), and/or a state machine.
  • Such processors may be manufactured by configuring a manufacturing process using the results of processed hardware description language (HDL) instructions and other intermediary data including netlists (such instructions capable of being stored on a computer readable media).
  • the results of such processing may be maskworks that are then used in a semiconductor manufacturing process to manufacture a processor which implements aspects of the embodiments.
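The write, compute, and read-back flow described for Figures 3 and 5 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented implementation: the class `MemoryModule`, the function `process`, the region names, and the doubling kernel are all hypothetical, and "compatible with the capacity of the on-die memory" is read here simply as "fits within it".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical region names standing in for two parts of the module's address space.
ON_DIE = "on_die"    # models the on-die memory (e.g. SRAM on the base die)
STACKED = "stacked"  # models the stacked memory dies


@dataclass
class MemoryModule:
    """Toy model of a memory module whose base die hosts a hardened logic block."""
    on_die_capacity: int  # capacity in elements (an assumed unit)
    kernel: Callable[[List[float]], List[float]]  # stands in for the hardened logic block
    regions: Dict[str, List[float]] = field(
        default_factory=lambda: {ON_DIE: [], STACKED: []})

    def place(self, n: int) -> str:
        # Assumption: data is "compatible" with the on-die memory when it fits.
        return ON_DIE if n <= self.on_die_capacity else STACKED

    def write(self, data: List[float]) -> str:
        # Substrate-side store: the data size selects the target region.
        region = self.place(len(data))
        self.regions[region] = list(data)
        return region

    def run(self, src: str) -> str:
        # The hardened block loads its input, computes, and stores the results.
        results = self.kernel(self.regions[src])
        dst = self.place(len(results))
        self.regions[dst] = results
        return dst

    def read(self, region: str) -> List[float]:
        # Substrate-side load of the results.
        return list(self.regions[region])


def process(module: MemoryModule, data: List[float]) -> Tuple[str, str, List[float]]:
    """Write data, let the hardened block operate on it, then read the results."""
    src = module.write(data)
    dst = module.run(src)
    return src, dst, module.read(dst)


module = MemoryModule(on_die_capacity=4, kernel=lambda xs: [2.0 * x for x in xs])
small = process(module, [1.0, 2.0, 3.0])  # fits in the on-die memory
large = process(module, [1.0] * 8)        # spills to the stacked memory
```

In this model both sides interact only through loads and stores, which mirrors the observation in the description that an industry-standard memory interface could suffice between the substrate and the memory module.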

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Human Computer Interaction (AREA)
  • Logic Circuits (AREA)
  • Memory System (AREA)
  • Stored Programmes (AREA)
PCT/US2018/067067 2018-01-16 2018-12-21 Near-memory hardened compute blocks for configurable computing substrates Ceased WO2019143442A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
KR1020207021744A KR102789084B1 (ko) 2018-01-16 2018-12-21 컨피규러블 컴퓨팅 기판을 위한 니어-메모리 강화 컴퓨팅 블록
EP18900755.2A EP3740876A4 (en) 2018-01-16 2018-12-21 LOCAL HARDENED CALCULATION BLOCKS FOR CONFIGURABLE COMPUTER SUBSTRATES
CN201880086468.3A CN111602124B (zh) 2018-01-16 2018-12-21 用于可配置计算基板的近存储器硬化计算块
JP2020536026A JP7403457B2 (ja) 2018-01-16 2018-12-21 構成可能コンピューティング基板についてのニアメモリのハード化された計算ブロック

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15/872,943 2018-01-16
US15/872,943 US10579557B2 (en) 2018-01-16 2018-01-16 Near-memory hardened compute blocks for configurable computing substrates

Publications (1)

Publication Number Publication Date
WO2019143442A1 (en) 2019-07-25

Family

ID=67213994

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/067067 Ceased WO2019143442A1 (en) 2018-01-16 2018-12-21 Near-memory hardened compute blocks for configurable computing substrates

Country Status (6)

Country Link
US (1) US10579557B2
EP (1) EP3740876A4
JP (1) JP7403457B2
KR (1) KR102789084B1
CN (1) CN111602124B
WO (1) WO2019143442A1

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114497021A (zh) * 2020-10-27 2022-05-13 安徽寒武纪信息科技有限公司 一种集成电路装置及其加工方法、电子设备和板卡
US12229069B2 (en) * 2020-10-28 2025-02-18 Intel Corporation Accelerator controller hub
US11861366B2 (en) 2021-08-11 2024-01-02 Micron Technology, Inc. Efficient processing of nested loops for computing device with multiple configurable processing elements using multiple spoke counts
CN115810016B (zh) * 2023-02-13 2023-04-28 四川大学 Method, system, storage medium, and terminal for automatic recognition of lung infection CXR images

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050166038A1 (en) * 2002-04-10 2005-07-28 Albert Wang High-performance hybrid processor with configurable execution units
US20060031659A1 (en) * 2004-08-09 2006-02-09 Arches Computing Systems Multi-processor reconfigurable computing system
US20090055596A1 (en) * 2007-08-20 2009-02-26 Convey Computer Multi-processor system having at least one processor that comprises a dynamically reconfigurable instruction set
US20120079177A1 (en) * 2008-08-05 2012-03-29 Convey Computer Memory interleave for heterogeneous computing
US8738891B1 (en) * 2004-11-15 2014-05-27 Nvidia Corporation Methods and systems for command acceleration in a video processor via translation of scalar instructions into vector instructions
US20140176187A1 (en) 2012-12-23 2014-06-26 Advanced Micro Devices, Inc. Die-stacked memory device with reconfigurable logic
US20150088948A1 (en) 2013-09-20 2015-03-26 Altera Corporation Hybrid architecture for signal processing

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0433029A (ja) * 1990-05-24 1992-02-04 Matsushita Electric Ind Co Ltd Memory device and driving method thereof
US6301696B1 (en) * 1999-03-30 2001-10-09 Actel Corporation Final design method of a programmable logic device that is based on an initial design that consists of a partial underlying physical template
US20060242611A1 (en) * 2005-04-07 2006-10-26 Microsoft Corporation Integrating programmable logic into personal computer (PC) architecture
US20070053349A1 (en) * 2005-09-02 2007-03-08 Bryan Rittmeyer Network interface accessing multiple sized memory segments
US7539967B1 (en) * 2006-05-05 2009-05-26 Altera Corporation Self-configuring components on a device
US8954685B2 (en) * 2008-06-23 2015-02-10 International Business Machines Corporation Virtualized SAS adapter with logic unit partitioning
US20100070733A1 (en) * 2008-09-18 2010-03-18 Seagate Technology Llc System and method of allocating memory locations
US8105885B1 (en) * 2010-08-06 2012-01-31 Altera Corporation Hardened programmable devices
US9448947B2 (en) * 2012-06-01 2016-09-20 Qualcomm Incorporated Inter-chip memory interface structure
US10079044B2 (en) * 2012-12-20 2018-09-18 Advanced Micro Devices, Inc. Processor with host and slave operating modes stacked with memory
US9720843B2 (en) * 2012-12-28 2017-08-01 Intel Corporation Access type protection of memory reserved for use by processor logic
US9262163B2 (en) * 2012-12-29 2016-02-16 Intel Corporation Real time instruction trace processors, methods, and systems
US9224697B1 (en) * 2013-12-09 2015-12-29 Xilinx, Inc. Multi-die integrated circuits implemented using spacer dies
US9921989B2 (en) * 2014-07-14 2018-03-20 Intel Corporation Method, apparatus and system for modular on-die coherent interconnect for packetized communication
US9870325B2 (en) * 2015-05-19 2018-01-16 Intel Corporation Common die implementation for memory devices with independent interface paths
US9698790B2 (en) * 2015-06-26 2017-07-04 Advanced Micro Devices, Inc. Computer architecture using rapidly reconfigurable circuits and high-bandwidth memory interfaces
US9767028B2 (en) * 2015-10-30 2017-09-19 Advanced Micro Devices, Inc. In-memory interconnect protocol configuration registers
US9977609B2 (en) * 2016-03-07 2018-05-22 Advanced Micro Devices, Inc. Efficient accesses of data structures using processing near memory
KR102548591B1 (ko) * 2016-05-30 2023-06-29 삼성전자주식회사 Semiconductor memory device and operating method thereof

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050166038A1 (en) * 2002-04-10 2005-07-28 Albert Wang High-performance hybrid processor with configurable execution units
US20060031659A1 (en) * 2004-08-09 2006-02-09 Arches Computing Systems Multi-processor reconfigurable computing system
US8738891B1 (en) * 2004-11-15 2014-05-27 Nvidia Corporation Methods and systems for command acceleration in a video processor via translation of scalar instructions into vector instructions
US20090055596A1 (en) * 2007-08-20 2009-02-26 Convey Computer Multi-processor system having at least one processor that comprises a dynamically reconfigurable instruction set
US20120079177A1 (en) * 2008-08-05 2012-03-29 Convey Computer Memory interleave for heterogeneous computing
US20140176187A1 (en) 2012-12-23 2014-06-26 Advanced Micro Devices, Inc. Die-stacked memory device with reconfigurable logic
US20150088948A1 (en) 2013-09-20 2015-03-26 Altera Corporation Hybrid architecture for signal processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3740876A4

Also Published As

Publication number Publication date
KR102789084B1 (ko) 2025-04-01
US10579557B2 (en) 2020-03-03
JP2021510863A (ja) 2021-04-30
EP3740876A4 (en) 2021-10-06
EP3740876A1 (en) 2020-11-25
CN111602124A (zh) 2020-08-28
CN111602124B (zh) 2024-09-20
JP7403457B2 (ja) 2023-12-22
KR20200100824A (ko) 2020-08-26
US20190220426A1 (en) 2019-07-18

Similar Documents

Publication Publication Date Title
CN109388595B (zh) High-bandwidth memory system and logic die
US10522193B2 (en) Processor with host and slave operating modes stacked with memory
US20190272100A1 (en) Stacked memory device and a memory chip including the same
US10579557B2 (en) Near-memory hardened compute blocks for configurable computing substrates
US8661207B2 (en) Method and apparatus for assigning a memory to multi-processing unit
US8539153B2 (en) System on chip and electronic system having the same
CN106814662A (zh) Accelerator controller and method of controlling accelerator logic
US12436711B2 (en) Providing fine grain access to package memory
JP2021510863A5
US20230318825A1 (en) Separately storing encryption keys and encrypted data in a hybrid memory
JP5706833B2 (ja) Non-graphics use of graphics memory
US20230418604A1 (en) Reconfigurable vector processing in a memory
US9898222B2 (en) SoC fabric extensions for configurable memory maps through memory range screens and selectable address flattening
US10198219B2 (en) Method and apparatus for en route translation in solid state graphics systems
KR20190115811A (ko) Data processing system including an expansion memory card
US9720830B2 (en) Systems and methods facilitating reduced latency via stashing in system on chips
US12498876B2 (en) Performing distributed processing using distributed memory
US12001370B2 (en) Multi-node memory address space for PCIe devices
US8782302B2 (en) Method and apparatus for routing transactions through partitions of a system-on-chip
KR20240041971A (ko) Hierarchical state save and restore for devices with various power states
US12596479B2 (en) Flexible memory system
US20230317561A1 (en) Scalable architecture for multi-die semiconductor packages
US12511100B2 (en) System-on-a-chip including soft float function circuit
CN114846455B (zh) System direct memory access engine offload

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application; Ref document number: 18900755; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase; Ref document number: 2020536026; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase; Ref country code: DE
ENP Entry into the national phase; Ref document number: 20207021744; Country of ref document: KR; Kind code of ref document: A
ENP Entry into the national phase; Ref document number: 2018900755; Country of ref document: EP; Effective date: 20200817