WO2009145608A1 - Flexible architecture data processing unit - Google Patents

Flexible architecture data processing unit

Info

Publication number
WO2009145608A1
Authority
WO
WIPO (PCT)
Prior art keywords
processing unit
data processing
fpga
switch fabric
functional elements
Prior art date
Application number
PCT/NL2008/050314
Other languages
English (en)
Inventor
Jos D. L. Haesakkers
Peter Kortekaas
Gerke Stoevelaar
Marco C. De Rooij
Original Assignee
Eonic B.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eonic B.V.
Priority to PCT/NL2008/050314
Publication of WO2009145608A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture

Definitions

  • the present invention relates to a data processing unit comprising a field programmable gate array (FPGA) and input/output interfaces, in which a number of functional elements are implemented in the FPGA.
  • FPGA field programmable gate array
  • a dynamically reconfigurable processing unit which includes a microprocessor, and an embedded Flash memory for non-volatile storage of code, data and bit-streams.
  • the embedded Flash includes a field programmable gate array (FPGA) port.
  • the reconfigurable processing unit further includes a direct memory access (DMA) channel, and an S-RAM embedded FPGA for FPGA reconfigurations.
  • DMA direct memory access
  • S-RAM embedded FPGA has an FPGA programming interface connected to the FPGA port of the Flash memory through the DMA channel.
  • the microprocessor, the embedded Flash memory, the DMA channel and the S-RAM embedded FPGA are integrated as a single chip.
  • Altera has a built-in System On Programmable Chip engine in its Quartus tool.
  • the SOPC Builder tool which is included in the Altera Quartus II software is used to define and generate the system.
  • the SOPC Builder automatically generates the interconnect logic to integrate the components in the hardware system. Selections can be made from a list of standard processor cores and components provided with the Nios II Embedded Design Suite.
  • Custom hardware can be added to accelerate system performance.
  • custom instruction logic can be added to the Nios II core which accelerates CPU performance, or a custom component which offloads tasks from the CPU can be added.
  • the NIOS microprocessor has no memory management unit (MMU) and cannot run standard Linux.
  • MMU memory management unit
  • Xilinx has an Embedded Development Kit that enables SOPC design.
  • the flexible 32-bit MicroBlaze soft processor can be implemented on any Xilinx FPGA device without any royalties.
  • Xilinx Embedded Development Kit (EDK) includes the MicroBlaze processor along with a comprehensive set of peripheral IP needed for microcontroller applications in a wide range of applications in automotive, ISM and communications markets and the Xilinx Platform Studio (XPS) tool suite to put together custom embedded processing systems in just a few minutes.
  • the MicroBlaze microprocessor has no MMU and cannot run standard Linux.
  • the present invention seeks to provide a flexible architecture data processing unit which is capable of high processing and data throughput performance.
  • a data processing unit according to the preamble defined above is provided, in which the functional elements comprise: at least one embedded processor; a microprocessor interface bus connected to the at least one embedded processor; a switch fabric connected to the microprocessor interface bus, in which the switch fabric provides a connection with FIFO compatible data sources or destinations; and a peripheral interconnection bus connected to the switch fabric.
  • the at least one embedded processor can be any suitable embedded processor, such as the LEON3 processor from Gaisler Research.
  • the microprocessor interface bus is arranged to control memory access, interrupts and configuration of multiple bus attendees, e.g. in the form of the Advanced Microcontroller Bus Architecture (AMBA) bus.
  • the peripheral interconnection bus can be any high speed standard peripheral interconnection bus, such as the PCI or PCI-X interface.
  • the data processing unit is provided in the form of a Compact PCI carrier that hosts a PCI Mezzanine Card (PMC) in a further embodiment, which allows the data processing unit to be used together with other data processing equipment, e.g. in the field of radar data processing.
  • PMC PCI Mezzanine Card
  • the functional elements further comprise at least one interface to a high-speed serial data link, such as the Powerlink, connected to the switch fabric.
  • a high-speed serial data link such as the Powerlink
  • the PowerLink is especially suited for high-speed streaming data applications, and in the present embodiment, this type of interface assures the ability to transfer large amounts of data between various data sources and data destinations.
  • the functional elements further comprise a TCP/IP Offloading Engine (TOE) connected to the microprocessor interface bus and to the switch fabric.
  • TOE TCP/IP Offloading Engine
  • the TOE may be further connected to an Ethernet interface to provide an efficient data path to the outside of the data processing unit.
  • the functional elements further comprise Large FIFOs connected to the switch fabric in a further embodiment, for interfacing with memory units external to the FPGA. In this manner, the external memory units may be used effectively as FIFOs available to the switch fabric and further elements connected to the switch fabric.
  • the functional elements further comprise a user block connected to the switch fabric (and also possibly to the microprocessor interface bus). This allows users to add additional own IP blocks to the data processing unit as part of the data path which can be controlled by the at least one embedded processor.
  • the functional elements may further comprise remote system upgrade (RSU) circuitry connected to the microprocessor interface bus, which allows the FPGA to reconfigure itself from a serial configuration flash with a selectable image, e.g. for recovery to a factory default image from malicious FPGA programming.
  • RSU remote system upgrade
  • the functional elements may further comprise a Serial Flash Controller (SFC) connected to the microprocessor interface bus and to configuration flash memory external to the FPGA. This functional element allows reprogramming of the on-board serial configuration flash, e.g. using an Ethernet or other external interface.
  • SFC Serial Flash Controller
  • the data processing unit further comprises a temperature sensor connected to the microprocessor interface bus via a peripheral bus interface. This allows temperature data to be obtained very efficiently from the data processing unit, for use in various monitoring applications.
  • a flexible architecture data processing unit is provided in the form of a PMC (PCI Mezzanine Card) carrier 10.
  • PMC PCI Mezzanine Card
  • This can be in the form of a single slot 3U Compact PCI board equipped with a PMC mezzanine interface 11, PowerLink interfaces 15 and Ethernet connections 13, 14.
  • the on-board architecture, comprising four DDR SDRAM banks 25, 31, gigabit Ethernet PHYs 26, 28, temperature measurement using a temperature sensor 42 and Serial Configuration Flash 41, is controlled by functionality in a large FPGA 1 (such as a Stratix3 FPGA).
  • the FPGA 1 is programmed to comprise a large number of functional elements.
  • the block schematic of the PMC Carrier 10, including its external connections and internal firmware IP, is shown in detail in the figure.
  • the FPGA 1 is programmed to comprise one or more data processors 2.
  • three LEON3 processors are indicated, which are all connected to a microprocessor interface bus (like the Advanced Microcontroller Bus Architecture bus) 29 using high performance bus interfaces 20 (indicated as AHB I/F 20 in the figure).
  • the LEON3 processor 2 is an example of a data processor that can be programmed into an FPGA 1.
  • This Gaisler Research product consists of user-customizable 32-bit SPARC V8 processor cores, peripheral IP-cores and associated software and development tools.
  • This processor 2 can be operated under control of an operating system such as Linux, e.g. in the form of an embedded Linux with Symmetric Multi Processor (SMP) support on an FPGA platform.
  • Linux support for LEON3 is provided through a special version of the SnapGear Embedded Linux distribution. SnapGear Linux is a full source package, containing kernel, libraries and application code for rapid development of embedded Linux systems.
  • the LEON port of SnapGear supports both MMU and non-MMU LEON configurations, as well as the optional V8 mul/div instructions and floating-point unit (FPU).
  • processors 2 provided with memory management units (MMU 22) are used.
  • MMU 22 memory management units
  • the port includes symmetric multi-processing (SMP) support for LEON3 systems with multiple processors.
  • SMP symmetric multi-processing
  • a single cross-compilation tool-chain is provided which is capable of compiling the kernel and applications for any configuration.
  • the AMBA bus 29 furthermore connects to a number of 'standard' elements in a data processing environment using an AHB I/F 20, such as: an asynchronous memory controller 4, which is in turn connected to a Flash memory unit 24 outside the FPGA 1 (e.g. a 32M x 16 Flash memory bank); a memory controller 5 (e.g. a double data rate (DDR2) controller), which is in turn connected to a DDR2 memory unit 25 outside the FPGA 1; a gigabit Ethernet media access controller (MAC 6), which is in turn connected to an Ethernet physical interface transceiver (PHY 26) outside the FPGA 1 using a Gigabit medium independent interface (GMII), linked to a first Ethernet connector 13.
  • GMII Gigabit medium independent interface
  • AHB/APB AMBA Peripheral Bus
  • I2C interface 43 in this embodiment connected to a temperature sensor 42 outside the FPGA 1
  • IRQCTRL interrupt request controller
  • first I/O block 45 connected to a number of LEDs 48 outside the FPGA 1 for signaling purposes
  • second I/O block 46 and a UART 47 connected to a combined connector 16 of the PMC Carrier 10 (combining e.g. RS232, Triggerbus, JTAG interfacing).
  • a clock unit 49 may be provided external to the FPGA 1 for providing timing signals for one or more functional units programmed into the FPGA 1.
  • a compact PCI bridge 3 may be programmed in the FPGA 1, allowing integration of CompactPCI target interface with a host computer using a cPCI connector 12.
  • the cPCI bridge 3 in the embodiment shown in the figure handles 32-bit cycles at 33 or 66 MHz. Via this generic interface the host computer is able to control all kinds of IP blocks with single read and write accesses, transfer large amounts of data with DMA access and handle interrupts.
  • special elements are furthermore programmed into the FPGA 1.
  • a Remote System Upgrade/Serial Configuration Flash Controller connects to the AMBA bus 29.
  • the Serial Flash Controller 40 allows reprogramming of the on-board serial configuration flash unit 41 (outside of the FPGA 1).
  • the new FPGA image can be uploaded from any external interface, like Ethernet, and programmed through the FPGA itself in user mode.
  • the Remote System Upgrade circuitry in Stratix3 devices allows the device to reconfigure itself from the (serial) configuration flash unit 41 with a selectable image. This enables the use of a 'factory default image' and several user images.
  • the factory default image advantageously enables recovery from malicious FPGA programming.
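
The fallback behaviour described for the Remote System Upgrade circuitry can be sketched as a small selection routine. This is only an illustrative model: the image numbering, the `FACTORY_IMAGE` constant, the `select_boot_image` function and the validity map are hypothetical names, and the text does not say how the device decides that a user image is unusable.

```python
# Hypothetical model: image 0 is the factory default, higher indices are user images.
FACTORY_IMAGE = 0


def select_boot_image(requested, image_valid):
    """Return the index of the image to load from the configuration flash.

    requested   -- index of the image that software asked for
    image_valid -- dict mapping image index to True/False (assumed validity check)
    """
    # Load the requested user image only if it is known to be good.
    if requested != FACTORY_IMAGE and image_valid.get(requested, False):
        return requested
    # Otherwise recover by falling back to the factory default image.
    return FACTORY_IMAGE


# A corrupted user image 2 leads back to the factory default.
print(select_boot_image(2, {0: True, 1: True, 2: False}))
```

In this sketch a missing or invalid entry is treated the same way, so a bad upload can never leave the board without a loadable image.
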
  • a switch fabric 9 may be programmed in the FPGA 1, which switch fabric 9 connects FIFO compatible data sources to FIFO compatible destinations.
  • the number of input ports and output ports is configurable. Connections between input ports and output ports can be established under software control. Adding the Switch Fabric 9 provides high-speed, flexible 64-bit data routing. FIFO compatible data sources and destinations may also be provided in the FPGA 1.
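
The routing behaviour of the switch fabric can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the firmware itself: the class name `SwitchFabric`, its methods and the one-word-per-step transfer are invented for illustration; only the configurable port counts, the software-controlled connections and the 64-bit data width come from the text.

```python
from collections import deque


class SwitchFabric:
    """Toy model of a software-controlled crossbar between FIFO-compatible ports."""

    def __init__(self, num_inputs, num_outputs):
        # The number of input and output ports is configurable.
        self.inputs = [deque() for _ in range(num_inputs)]
        self.outputs = [deque() for _ in range(num_outputs)]
        self.routes = {}  # input port index -> output port index

    def connect(self, src, dst):
        """Establish a connection between an input and an output port."""
        if not (0 <= src < len(self.inputs) and 0 <= dst < len(self.outputs)):
            raise ValueError("port index out of range")
        for other_src, other_dst in self.routes.items():
            if other_dst == dst and other_src != src:
                raise ValueError("output port already in use")
        self.routes[src] = dst

    def disconnect(self, src):
        """Remove the connection of an input port, if any."""
        self.routes.pop(src, None)

    def push(self, src, word):
        """A data source writes one word, truncated to 64 bits, into its FIFO."""
        self.inputs[src].append(word & 0xFFFFFFFFFFFFFFFF)

    def step(self):
        """Move at most one word along every established connection."""
        moved = 0
        for src, dst in self.routes.items():
            if self.inputs[src]:
                self.outputs[dst].append(self.inputs[src].popleft())
                moved += 1
        return moved


fabric = SwitchFabric(num_inputs=4, num_outputs=4)
fabric.connect(0, 2)          # e.g. a receive unit routed to a large FIFO
fabric.push(0, 0xDEADBEEF)
fabric.push(1, 0x1234)        # no route for port 1: the word stays queued
fabric.step()
print(list(fabric.outputs[2]), len(fabric.inputs[1]))
```

Data on an unconnected input simply waits in its FIFO, which mirrors the idea that sources and destinations only need to be FIFO compatible and do not need to know about each other.
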
  • a first example of FIFO compatible data sources and destinations is a PowerLink interface, programmed in the FPGA 1 as a number of PowerLink Receive units (PLR 33) and PowerLink Transmit units (PLT 34).
  • the PLR's 33 and PLT's 34 are connected to the PowerLink Connector 15.
  • PowerLink is a lightweight, point-to-point, uni-directional serial protocol suitable for high-speed streaming data applications. It has the following advantages: high-speed, low protocol overhead, low gate count, and minimum data transfer latency.
  • PowerLink defines packet encapsulation at the link layer, and data serialization at the physical layer. PowerLink also provides for error checking and data recovery.
  • a second example of FIFO compatible data sources and destinations are Large FIFO's (LFIFO 30).
  • the LFIFO's 30 are connected to the Switch Fabric 9 and to external DDR2 SDRAM units 31 (e.g. three blocks of 32M x 32 DDR2 units as shown in the embodiment of the figure).
  • the LFIFO's 30 allow the external DDR2 SDRAM units 31 to be configured as standard FIFO queues.
  • the DDR FIFO or LFIFO 30 is a dual clock FIFO that stores the data into Double Data Rate SDRAM units 31.
  • the DDR FIFO 30 is configurable for different DDR SDRAM types and different data bus widths.
  • a wrapper around the DDR FIFO 30 is available to add a Built-In Self-Test (BIST) for the external DDR memory units 31.
  • BIST Built-In Self-Test
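
The idea of using external memory as a FIFO queue, with a self-test wrapper around it, can be sketched as follows. This is an illustrative model only: a Python list stands in for the DDR memory, the dual-clock aspect is not modelled, and the class name `MemoryBackedFifo` and the test patterns are assumptions rather than details taken from the text.

```python
class MemoryBackedFifo:
    """Ring-buffer FIFO whose storage is a flat memory array (stand-in for DDR)."""

    def __init__(self, depth):
        if depth <= 0:
            raise ValueError("depth must be positive")
        self.mem = [0] * depth
        self.depth = depth
        self.head = 0   # next address to read
        self.tail = 0   # next address to write
        self.count = 0  # number of words currently stored

    def write(self, word):
        """Store one word; returns False when the FIFO is full."""
        if self.count == self.depth:
            return False
        self.mem[self.tail] = word
        self.tail = (self.tail + 1) % self.depth
        self.count += 1
        return True

    def read(self):
        """Return the oldest word, or None when the FIFO is empty."""
        if self.count == 0:
            return None
        word = self.mem[self.head]
        self.head = (self.head + 1) % self.depth
        self.count -= 1
        return word

    def self_test(self):
        """Destructive memory test; only allowed while the FIFO is empty."""
        if self.count:
            raise RuntimeError("self-test would destroy queued data")
        # Alternating bit patterns to exercise every data bit in both states.
        for pattern in (0x55555555, 0xAAAAAAAA):
            for addr in range(self.depth):
                self.mem[addr] = pattern
            if any(self.mem[addr] != pattern for addr in range(self.depth)):
                return False
        # Address-in-data pass to catch locations that alias each other.
        for addr in range(self.depth):
            self.mem[addr] = addr
        return all(self.mem[addr] == addr for addr in range(self.depth))


fifo = MemoryBackedFifo(depth=4)
print(fifo.self_test())
fifo.write(10)
fifo.write(20)
print(fifo.read(), fifo.read(), fifo.read())
```

With an ideal list as storage the self-test always passes; the point of the sketch is the structure, in which the test wraps the same memory that the FIFO later uses for queued data.
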
  • a third example of a FIFO compatible data source/destination is a PCI-X host bridge 32.
  • This PCI-X host bridge 32 is implemented as an integration into the FPGA 1 of a 64-bit 133 MHz PCI-X core with both the AMBA bus 29 and the Switch Fabric 9, each using its own AHB I/F.
  • This PCI-X host bridge 32 is a host bridge which can arbitrate and communicate using the PCI or PCI-X protocol over the PMC mezzanine interface 11. It configures the PCI(X) targets and can handle up to 64-bit cycles at 133 MHz.
  • the bridge 32 is controlled via three AMBA bus interfaces: AHB master, AHB slave and APB slave.
  • the AHB master interface can initiate DMA cycles. Via the FIFO interfaces automatic high-speed DMA I/O can be done over the PCI-X bus.
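
The bus widths and clock rates quoted for the cPCI bridge 3 (32-bit at 33 or 66 MHz) and the PCI-X host bridge 32 (64-bit at 133 MHz) translate into theoretical peak transfer rates with simple arithmetic. The helper below is a sketch that assumes one transfer per clock cycle and ignores protocol overhead, arbitration and wait states, so sustained throughput will be lower; the function name is hypothetical and the nominal clock figures are used as given in the text.

```python
def peak_bandwidth_mb_per_s(bus_width_bits, clock_mhz):
    """Theoretical peak in megabytes per second: one bus-wide transfer per clock."""
    return bus_width_bits * clock_mhz // 8


# Figures taken from the description: bus width in bits, clock in MHz.
buses = {
    "cPCI 32-bit @ 33 MHz": (32, 33),
    "cPCI 32-bit @ 66 MHz": (32, 66),
    "PCI-X 64-bit @ 133 MHz": (64, 133),
}

for name, (width, clock) in buses.items():
    print(f"{name}: {peak_bandwidth_mb_per_s(width, clock)} MB/s")
```

The roughly eightfold gap between the slowest and fastest configuration indicates why bulk data is better moved over the PCI-X side and the FIFO interfaces than over the 33 MHz control path.
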
  • a fourth example of a FIFO compatible data source/destination is a TCP Offloading Engine (TOE 7), which provides for the integration of a TOE core 7 with a (gigabit) MAC 8, the AMBA bus 29 (via AHB I/F) and the Switch Fabric 9.
  • TCP Offload Engine or TOE is a technology used in network interface cards to offload processing of the entire TCP/IP stack to the network controller. It is primarily used with high-speed network interfaces, such as gigabit Ethernet and 10 gigabit Ethernet, where processing overhead of the network stack becomes significant.
  • the TOE 7 allows for very fast access to memory units 31 without loading the AMBA bus 29.
  • a fifth example of a FIFO compatible data source/destination is a User Block 35 as shown in the embodiment of the figure.
  • the user block 35 is an IP development block forming an entity inside the large FPGA 1 as part of the module interconnected with the Switch Fabric 9 and the AMBA bus 29. This allows users of the PMC carrier 10 to integrate their own IP blocks as part of the datapath, which can be controlled by software from a LEON3 processor inside the FPGA 1 or by an external host via the Compact PCI bus 12 or an Ethernet connection 13, 14.
  • Using the embodiment as shown in the figure in its entirety makes it possible, for example, to provide a direct data throughput path from a hard disk array connected to the PMC I/O connector 11 to a Gigabit Ethernet connector 14, without any influence on the performance of the processors 2. It also allows datapaths to be defined in a very flexible manner, depending on the environment and applications the PMC carrier 10 is used in.
  • the main element of the PMC carrier 10 is formed by the FPGA 1.
  • the size and further specifications of the FPGA 1 can be chosen to match the requirements of a given application. For example, more or fewer than three processors 2 may be provided, and the type, number and size of memory units 31 may also be chosen according to the specific requirements.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Microcomputers (AREA)
  • Logic Circuits (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to a data processing unit (10) comprising a field programmable gate array (FPGA, 1) and input/output interfaces (11-16). A number of functional elements are implemented in the FPGA (1), such as: at least one embedded processor (2); a microprocessor interface bus (29) connected to the at least one embedded processor (2); a switch fabric (9) connected to the microprocessor interface bus (29), in which the switch fabric (9) provides a connection with first-in, first-out (FIFO) compatible data sources or destinations (7; 30; 32; 33; 34); and a peripheral interconnection bus (32) connected to the switch fabric.
PCT/NL2008/050314 2008-05-27 2008-05-27 Flexible architecture data processing unit WO2009145608A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/NL2008/050314 WO2009145608A1 (fr) 2008-05-27 2008-05-27 Flexible architecture data processing unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/NL2008/050314 WO2009145608A1 (fr) 2008-05-27 2008-05-27 Flexible architecture data processing unit

Publications (1)

Publication Number Publication Date
WO2009145608A1 (fr) 2009-12-03

Family

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NL2008/050314 WO2009145608A1 (fr) 2008-05-27 2008-05-27 Flexible architecture data processing unit

Country Status (1)

Country Link
WO (1) WO2009145608A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997004401A2 (fr) * 1995-07-21 1997-02-06 Philips Electronics N.V. Multi-media processor architecture with high performance-density
US20050021871A1 (en) * 2003-07-25 2005-01-27 International Business Machines Corporation Self-contained processor subsystem as component for system-on-chip design

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FURBER S ED - FURBER S: "ARM System-on-chip Architecture - Architectural Support for System Development", ARM SYSTEM-ON-CHIP ARCHITECTURE, ADDISON-WESLEY, 1 January 2000 (2000-01-01), pages 207 - 223, XP002494900, ISBN: 978-0-201-67519-1 *
SERGIO SAPONARA ET AL: "FPGA-based Networking Systems for High Data-rate and Reliable In-vehicle Communications", DESIGN, AUTOMATION&TEST IN EUROPE CONFERENCE&EXHIBITION, 2007. DATE '07, IEEE, PI, 1 April 2007 (2007-04-01), pages 1 - 6, XP031092154, ISBN: 978-3-9810801-2-4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102932357A (zh) * 2012-11-07 2013-02-13 Institute of Modern Physics, Chinese Academy of Sciences Accelerator high-frequency digital low-level Ethernet communication system and communication method
JP2017513404A (ja) * 2014-04-03 2017-05-25 Huawei Technologies Co., Ltd. Field programmable gate array and communication method
US11586572B2 (en) 2014-04-03 2023-02-21 Huawei Technologies Co., Ltd. Field programmable gate array and communication method
CN104898627A (zh) * 2015-05-29 2015-09-09 Jiangsu Haida Printing and Dyeing Machinery Co., Ltd. Printing data acquisition and processing system based on PCI bus and FPGA
WO2018049235A1 (fr) * 2016-09-08 2018-03-15 Macnica Americas, Inc. FPGA offload module and processes for seamless frame-level switching of real-time media streams

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08753796

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26/05/2011)

122 Ep: pct application non-entry in european phase

Ref document number: 08753796

Country of ref document: EP

Kind code of ref document: A1