CN105930201A - Functional simulator for reconfigurable dedicated processor core
- Publication number
- CN105930201A (application CN201610262442.2A)
- Authority
- CN
- China
- Prior art keywords
- reconfigurable
- processor core
- algorithm
- computing
- module
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45545—Guest-host, i.e. hypervisor is an application program itself, e.g. VirtualBox
Abstract
The invention relates to a functional simulator for a reconfigurable dedicated processor core, comprising: an external interface module, which models the core's internal register bank and internal SRAM, receives the configuration instructions to be simulated, parses them into task information, and writes that task information into a global task queue; a control module, which models the core's internal main controller, passes the task information between modules according to the task queue, fetches the current and the upcoming computing tasks from the queue, and schedules the computation module to execute them; and a computation module, which executes a number of algorithms, outputs result data and computation status, and moves task information and result data. The benefits are faster simulation, which eases system-level debugging and optimization, and higher efficiency at lower cost.
Description
Technical field
The invention relates to a functional simulator for a reconfigurable dedicated processor core, suited to software system development and design for such a core.
Background
In traditional SoC development, software design cannot begin until the hardware is complete, which makes the overall development cycle very long. To address this, SystemC-based hardware/software co-design has become increasingly popular. Even so, once the whole system must be simulated (operating system, drivers, APIs, and applications), SystemC-based simulators tend to take a very long time, cycle-accurate ones especially. This seriously slows software development.
QEMU is a widely used open-source machine emulator and virtual machine monitor. It can emulate processors of many architectures together with the necessary subsystems, such as network cards, and it is resource-rich and fast. QEMU has two operating modes, user-mode emulation and full-system emulation. In user-mode emulation, QEMU can launch programs compiled for a different CPU. In full-system emulation, it emulates a complete system, including the CPU, peripherals, and operating system, which greatly eases testing and debugging of system source code.
Summary of the invention
The purpose of the invention is to overcome the shortcomings of the prior art described above: to provide a fast simulator for whole-system simulation of a reconfigurable dedicated processor core, and an efficient, usable platform for developing and testing APIs and software applications before the hardware development board is available. This is achieved by the following technical solutions:
In the functional simulator for a reconfigurable dedicated processor core, the core comprises a register bank, SRAM, a main controller, and a reconfiguration controller, and the simulator comprises:
an external interface module, which models the core's internal register bank and internal SRAM, receives the configuration instructions to be simulated, parses them into task information, and writes that task information into a global task queue;
a control module, which models the core's internal main controller, passes the task information between modules according to the task queue, fetches the current and the upcoming computing tasks from the queue, and schedules the computation module to execute them;
a computation module, which models the core's internal reconfiguration controller, DMA, and reconfigurable array, executes a number of algorithms, outputs result data and computation status, and moves task information and result data.
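The division of labor among the three modules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the `(algorithm, operands)` instruction layout, and the two toy algorithms are all hypothetical, and only the flow (parse, enqueue, dispatch, return result plus status) comes from the text.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    algorithm: str   # hypothetical: name of the algorithm to run
    operands: list   # hypothetical: input data for the algorithm


task_queue = deque()  # the single global task queue


class ExternalInterface:
    """Stands in for the register bank: parses instructions into tasks."""

    def write_instruction(self, instruction):
        # Assumed instruction layout: (algorithm name, operand list).
        algorithm, operands = instruction
        task_queue.append(Task(algorithm, list(operands)))


class Computation:
    """Stands in for the reconfigurable array: runs one algorithm per task."""

    ALGORITHMS = {
        "add": lambda xs: sum(xs),
        "dot": lambda xs: sum(a * b for a, b in xs),
    }

    def run(self, task):
        result = self.ALGORITHMS[task.algorithm](task.operands)
        return result, "done"  # result data plus a computation status


class Control:
    """Stands in for the main controller: pulls tasks and dispatches them."""

    def __init__(self, computation):
        self.computation = computation

    def run_all(self):
        results = []
        while task_queue:
            current = task_queue.popleft()
            results.append(self.computation.run(current))
        return results
```

Keeping the queue as the only channel between the interface and the controller means neither side needs to know how the other is implemented, which is what lets the simulator drop hardware detail that is invisible from outside the core.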
In a further design of the functional simulator, the control module comprises an operation control unit and a state control unit. The operation control unit schedules algorithms and manages the working mode according to the task information; the working modes are master mode, slave mode, and debug mode. The state control unit updates the simulator's external state according to the computation status of the computation module; the external states are idle, busy, and done, and the unit raises an interrupt according to the corresponding external state.
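A small state machine makes the state control unit's job concrete. The sketch below assumes that only the done state raises an interrupt and that the interrupt is delivered through a callback; the source says only that interrupts depend on the external state, so both choices are illustrative, as are all the names.

```python
from enum import Enum


class Status(Enum):
    IDLE = "idle"
    BUSY = "busy"
    DONE = "done"


class StateControl:
    def __init__(self, raise_interrupt):
        self.status = Status.IDLE
        self._raise_interrupt = raise_interrupt  # hypothetical callback

    def update(self, computing, finished):
        """Derive the external state from the computation module's state."""
        if finished:
            self.status = Status.DONE
            self._raise_interrupt("done")  # assumed: only DONE interrupts
        elif computing:
            self.status = Status.BUSY
        else:
            self.status = Status.IDLE
        return self.status
```

A caller would pass something like `irqs.append` as the callback and then inspect `irqs` to see which interrupts were raised.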
In a further design of the functional simulator, the external interface module comprises an internal storage unit and a register unit, which model the SRAM and the register bank respectively. In debug mode, the internal storage unit lets the outside access the SRAM inside the reconfigurable dedicated processor core, and it provides the working memory for the computation module. The register unit models the core's register bank; it also parses incoming configuration instructions and adds the resulting task information for algorithm computation.
In a further design of the functional simulator, the computation module comprises an operation-set function block that implements seventeen algorithm units: FFT/IFFT; vector autocorrelation, cross-correlation, addition/subtraction, and multiplication; matrix inversion, addition/subtraction, multiplication, dot product, and covariance; real/complex FIR; real/complex Doppler FIR; fixed-point/floating-point conversion; and complex modulus.
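Two of the simpler units in that list, fixed-point/floating-point conversion and complex modulus, are easy to sketch as entries in a dispatch table. The Q15 format, the saturation behavior, and the table keys below are assumptions made for illustration; the source does not state the fixed-point format or how units are indexed.

```python
import math

Q = 15  # assumed fractional bits; the source does not state the format


def float_to_fixed(x, q=Q):
    """Round a float to a signed 16-bit fixed-point integer, saturating."""
    scaled = round(x * (1 << q))
    return max(-(1 << 15), min((1 << 15) - 1, scaled))


def fixed_to_float(n, q=Q):
    """Interpret a signed fixed-point integer as a float."""
    return n / (1 << q)


def complex_modulus(values):
    """Magnitude of each complex sample."""
    return [math.hypot(v.real, v.imag) for v in values]


# Hypothetical dispatch table standing in for the operation-set function block.
ALGORITHM_UNITS = {
    "float2fix": float_to_fixed,
    "fix2float": fixed_to_float,
    "cmod": complex_modulus,
}
```

A controller could then run `ALGORITHM_UNITS["cmod"]([3 + 4j])` without knowing anything about the unit's internals.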
In a further design of the functional simulator, each algorithm unit in the operation-set function block contains both the computation and the data movement for its algorithm. Within the computation, each unit divides its algorithm into steps, and places the debug-mode breakpoints, according to the algorithm's complexity and data size; the control module carries out the computation by calling the individual steps.
In a further design of the functional simulator, each algorithm unit in the computation module uses the SRAM modeled by the internal storage unit as its working memory. The internal storage unit gives the computation module a pointer directly into that memory, so that algorithm units can access it directly and bypass the internal storage unit's read and write functions.
In a further design of the functional simulator, when the operation control unit schedules an algorithm unit whose computation is divided into steps, it uses a timer for periodic processing. Each time the timer reaches its set value it raises an interrupt; in response, the operation control unit schedules, through the external interface module, the operation-set function block to complete the specified step of the specified algorithm, then enters the idle state and waits for the next interrupt. This cycle repeats until the whole computation is complete.
The advantages of the invention are as follows:
1. The invention follows a functional-simulator design, omitting details of the reconfigurable dedicated processor core that are invisible from outside, which raises simulation speed.
2. The invention meets the need for whole-system simulation of the reconfigurable dedicated processor core and gives software engineers a usable development platform before the hardware development board is available.
3. The invention is built on the open-source virtual machine QEMU, so it can draw on QEMU's extensive resources and is simpler to set up.
Description of the drawings
Figure 1 is the hardware architecture diagram of the reconfigurable dedicated processor core.
Figure 2 shows the mapping between the simulator and the internal modules of the reconfigurable dedicated processor core.
Figure 3 is a schematic of the simulator's internal module hierarchy.
Figure 4 is a schematic of the running state before the algorithm units' workload is divided.
Figure 5 shows the simulator's functional test results.
Figure 6 compares simulator performance test results on the same PC.
Detailed description
The solution of the invention is described in detail below with reference to the drawings.
The simulation target of the functional simulator in this embodiment is the reconfigurable dedicated processor core whose hardware architecture is shown in Figure 1. The core consists mainly of a register bank, SRAM, a main controller, and a reconfiguration controller. It uses run-time dynamic reconfiguration: through its internal reconfigurable units it maps algorithms spatially onto the computing engine, which improves the processor's flexibility and resource utilization. The processor provides hardware acceleration for common signal-processing algorithms such as FIR, correlation, FFT/IFFT, and matrix operations. Its coarse-grained reconfigurable design changes the topology and interconnection of the internal computing units through coarse-grained static configuration, so that the algorithms share the same computing hardware, which meets the real-time requirements of signal-processing algorithms.
The functional simulator in this embodiment is built on QEMU. Following the core's three functional aspects (access, computation, and control) and omitting run-time details invisible from outside the core, the simulator is abstracted into three parts: an external interface module, a computation module, and a control module. The external interface module maps to the registers and internal SRAM of the hardware and is the part of the simulator that the outside can access; it receives the configuration instructions to be simulated, parses them into task information, and writes that task information into a global task queue. The computation module maps to the reconfiguration controller, DMA, and reconfigurable array and implements the algorithms; it executes a number of algorithms, outputs result data and computation status, and moves task information and result data. The control module maps to the main controller and controls the simulator's internal flow and state; it passes task information between modules according to the task queue, fetches the current and the upcoming computing tasks from the queue, and schedules the computation module to execute them. See Figure 2.
Further, as shown in Figure 3, the external interface module is divided into a register bank unit and an internal storage unit. The register bank unit models the register bank inside the reconfigurable dedicated processor, which comprises a device configuration register, an operation configuration register, a status register, an exception interrupt register, and a master-mode base address register. The outside can read and write these modeled registers over the bus to read status, write instructions, and so on. The unit also parses the instructions written to it and adds the computing tasks the simulator is to execute to the task queue.
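The register bank unit can be pictured as a small map of named registers in which a write to the operation configuration register has the side effect of enqueueing a task. In the sketch below the register offsets, the 32-bit width, and the bit layout of the configuration word are hypothetical; the source names the five registers but gives no addresses or encodings.

```python
from collections import deque

# Hypothetical offsets; the source names the registers but not their addresses.
DEV_CFG, OP_CFG, STATUS, EXC_IRQ, MASTER_BASE = 0x00, 0x04, 0x08, 0x0C, 0x10


class RegisterBank:
    def __init__(self, task_queue):
        self.regs = {DEV_CFG: 0, OP_CFG: 0, STATUS: 0, EXC_IRQ: 0, MASTER_BASE: 0}
        self.task_queue = task_queue

    def read(self, offset):
        return self.regs[offset]

    def write(self, offset, value):
        if offset not in self.regs:
            raise KeyError(f"no register at offset {offset:#x}")
        self.regs[offset] = value & 0xFFFFFFFF  # assumed 32-bit registers
        if offset == OP_CFG:
            # Assumed encoding: low 8 bits = algorithm id, next 16 = length.
            self.task_queue.append(
                {"algo_id": value & 0xFF, "length": (value >> 8) & 0xFFFF}
            )
```

For example, writing `(256 << 8) | 3` to `OP_CFG` would enqueue a task for algorithm 3 over 256 elements, while a write to `STATUS` changes only the register value.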
The computation module is the simulator's core computing part. It consists of the operation-set function block cal_family, which is divided internally into 17 algorithm units: FFT/IFFT; vector autocorrelation, cross-correlation, addition/subtraction, and multiplication; matrix inversion, addition/subtraction, multiplication, dot product, and covariance; real/complex FIR; real/complex Doppler FIR; fixed-point/floating-point conversion; and complex modulus. Each algorithm unit in cal_family contains both the computation and the data movement for its algorithm. Within the computation, each unit divides its algorithm into steps, and places the debug-mode breakpoints, according to the algorithm's complexity and data size. The control module carries out the computation by calling the individual steps, which both supports debug mode and keeps large-scale computations from hanging or stalling QEMU.
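One way to picture an algorithm unit that is divided into steps is a generator that yields at each step boundary, so that a caller decides when the next step runs. The sketch below uses a dot product and a fixed chunk size purely as an illustration; the function names and the chunking rule are assumptions, not the structure of cal_family.

```python
def dot_product_steps(a, b, chunk):
    """Yield the running sum after each chunk of `chunk` element pairs."""
    acc = 0
    for start in range(0, len(a), chunk):
        for x, y in zip(a[start:start + chunk], b[start:start + chunk]):
            acc += x * y
        yield acc  # step boundary: a possible debug-mode breakpoint


def run_to_completion(steps):
    """Drive every step in turn and return the final value."""
    result = None
    for result in steps:  # a controller would call one step at a time
        pass
    return result
```

Because control returns to the caller after every `yield`, the caller can stop at a boundary to inspect intermediate state, or simply let other work run before resuming.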
Each algorithm unit in the computation module uses the SRAM modeled by the internal storage unit as its working memory. The internal storage unit gives the computation module a pointer directly into that memory, so that algorithm units can access it directly and bypass the internal storage unit's read and write functions, which speeds up the algorithm units.
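The same idea can be shown in Python with a `memoryview`, which plays the role of the direct pointer: the bus-facing accessor checks permissions and bounds on every call, while algorithm units work on the underlying buffer without those checks. The class, the `external_access` gate, and the example algorithm are illustrative assumptions.

```python
class InternalMemory:
    def __init__(self, size):
        self._data = bytearray(size)
        self.external_access = False  # assumed gate on the bus-facing path

    def read(self, addr):
        """Bus-facing read with access and bounds checks."""
        if not self.external_access:
            raise PermissionError("SRAM not accessible from outside")
        if not 0 <= addr < len(self._data):
            raise IndexError(addr)
        return self._data[addr]

    def raw(self):
        """Direct view for algorithm units: no copy, no per-access checks."""
        return memoryview(self._data)


def scale_in_place(mem, factor):
    """Example algorithm unit that works through the direct view."""
    buf = mem.raw()
    for i in range(len(buf)):
        buf[i] = (buf[i] * factor) & 0xFF
```

The trade-off is the usual one: the direct view is faster because it skips the checks, so it is only handed to trusted code inside the simulator.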
Within the simulator, the control module is subdivided into an operation control unit and a state control unit. The operation control unit schedules the algorithm units in cal_family according to the tasks in the task queue, and it models scheduling in the master, slave, and debug modes of the reconfigurable dedicated processor core. The state control unit models changes to the processor's externally visible status register and the raising of exception interrupts, according to the running state of cal_family.
In slave-mode simulation, the corresponding status bits, such as the interrupt flag and the completion flag, are set after each computing task finishes. Once the external master device acknowledges the task, those status bits are cleared, and the operation control unit either continues with the next task in the task queue or enters the idle state.
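The slave-mode handshake described above amounts to a set, acknowledge, clear cycle around each task. The following sketch models it with two flags and a queue; the class and method names are hypothetical, and the rule that a new task starts only after the previous one is acknowledged is taken from the text.

```python
from collections import deque


class SlaveModeController:
    def __init__(self):
        self.queue = deque()
        self.current = None
        self.done_flag = False
        self.irq_flag = False
        self.state = "idle"

    def submit(self, task):
        self.queue.append(task)
        if self.state == "idle" and not self.done_flag:
            self._start_next()

    def _start_next(self):
        if self.queue:
            self.current = self.queue.popleft()
            self.state = "busy"
        else:
            self.current = None
            self.state = "idle"

    def finish_current(self):
        """Called when the running task completes: set the status bits."""
        self.done_flag = True
        self.irq_flag = True
        self.state = "done"

    def acknowledge(self):
        """Called when the external master confirms the task."""
        self.done_flag = False
        self.irq_flag = False
        self._start_next()
```

With two tasks submitted, the controller runs the first, waits in the done state until `acknowledge()` is called, runs the second, and returns to idle after the final acknowledgment.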
In master-mode simulation, the simulator automatically looks up operation configuration register settings in the address space pointed to by the master-mode base address register, then reads and parses the tasks and adds them to the task queue. In this mode, a batch flag determines whether the simulator responds to the external device after each task completes or only after every task in the queue has completed.
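The effect of the batch flag is easiest to see in code. The sketch below assumes one configuration word per task and a known task count, neither of which the source specifies; `execute` and `notify` are hypothetical callbacks standing in for the computation module and the response to the external device.

```python
def master_mode_run(memory, base, count, batch, execute, notify):
    """Fetch `count` config words starting at `base`, run them, notify the host."""
    # Assumed layout: one configuration word per task at consecutive addresses.
    queue = [memory[base + i] for i in range(count)]
    for index, config in enumerate(queue):
        execute(config)
        last = index == len(queue) - 1
        if not batch or last:
            notify(index)  # respond per task, or once at the end in batch mode
```

With `batch=True` and three tasks, `notify` fires once after the third task; with `batch=False` it fires after each of the three.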
In debug-mode simulation, the simulator models the breakpoint response of the reconfiguration controller inside the reconfigurable dedicated processor core. When an algorithm unit in cal_family reaches a breakpoint, the computation is interrupted and the access restriction on the internal modeled SRAM (the internal storage unit) is lifted, so that the outside can read the intermediate results held there.
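A compact way to model this behavior is a runner that executes steps until it reaches a breakpoint, then opens the modeled SRAM for reading until it is resumed. The step representation, the breakpoint indexing (a breakpoint at index k stops before step k), and the 4-byte SRAM are assumptions made for the sketch.

```python
class DebugRun:
    def __init__(self, steps, breakpoints):
        self.steps = steps                   # list of callables taking the SRAM
        self.breakpoints = set(breakpoints)  # step indices to stop before
        self.sram = bytearray(4)
        self.sram_open = False
        self.pc = 0

    def resume(self):
        """Run until the next breakpoint or the end; return True if finished."""
        self.sram_open = False
        while self.pc < len(self.steps):
            self.steps[self.pc](self.sram)
            self.pc += 1
            if self.pc in self.breakpoints:
                self.sram_open = True  # lift the access restriction
                return False
        return True

    def peek(self, addr):
        """External read, allowed only while stopped at a breakpoint."""
        if not self.sram_open:
            raise PermissionError("SRAM is only readable at a breakpoint")
        return self.sram[addr]
```

A debugger front end would loop on `resume()`, calling `peek()` to read intermediate results each time it returns `False`.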
The algorithms supported by the units in cal_family all have high time complexity and large data volumes. Each algorithm is therefore implemented, according to its specific characteristics, as N steps of roughly equal workload, so that each step carries about 1/N of the total; N must be large enough that no single step stalls the system. The control module runs the steps intermittently to complete the whole algorithm. As Figure 4 shows, system code and algorithm code alternate rapidly. The QEMU-Timer timer in QEMU provides the periodic processing: each time the timer reaches its set value it triggers an interrupt, and the operation control unit schedules cal_family through the interface to complete the specified step of the specified algorithm, then enters the idle state and waits for the next interrupt. This cycle repeats until the whole computation is complete. Dividing the workload does not reduce the total amount of computation, but it turns the computation into multiple steps, and other code can run in the gap between two steps. The whole system therefore never stalls on the simulator's algorithm execution, and it keeps running continuously and smoothly.
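The partitioning rule can be made concrete with a little arithmetic: pick the smallest N for which every step fits a per-step budget, then alternate algorithm steps with other work. The sketch below is plain Python with no QEMU calls; the per-step budget, the function names, and the use of a simple loop in place of timer interrupts are all assumptions, and `total_ops` is assumed to be positive.

```python
import math


def plan_steps(total_ops, max_ops_per_step):
    """Smallest N such that each of N near-equal steps fits the budget."""
    n = math.ceil(total_ops / max_ops_per_step)  # assumes total_ops > 0
    base, extra = divmod(total_ops, n)
    return [base + 1 if i < extra else base for i in range(n)]


def run_interleaved(step_sizes, system_work):
    """Alternate one algorithm step with one slice of other system work."""
    trace = []
    for size in step_sizes:  # each iteration models one timer expiry
        trace.append(("algo", size))
        trace.append(("system", system_work()))
    return trace
```

For 10 operations and a budget of 4 per step, `plan_steps` gives three steps of 4, 3, and 3 operations: the total is unchanged, but no single step exceeds the budget, so other code gets a turn between steps.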
For functional simulation, the functional simulator in this embodiment passed tests in three areas: correctness of algorithm execution, register reads and writes, and internal storage access. The results are shown in Figure 5. For performance, it was compared with a SystemC-based cycle-accurate simulator. Both ran in environments kept as similar as possible, without an operating system or other peripherals; algorithms and parameters were configured through API functions and the speeds were compared. The results in Figure 6 show that this simulator performs far better than the cycle-accurate simulator, especially under heavy computation.
As Figures 5 and 6 show, the invention provides a fast simulator for whole-system simulation of a reconfigurable dedicated processor core, and an efficient, usable platform for developing and testing APIs and software applications before the hardware development board is available.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610262442.2A CN105930201B (en) | 2016-04-25 | 2016-04-25 | Functional simulator for a reconfigurable dedicated processor core |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105930201A | 2016-09-07 |
CN105930201B | 2019-03-22 |
Family
ID=56837025
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610262442.2A Active CN105930201B (en) | Functional simulator for a reconfigurable dedicated processor core | 2016-04-25 | 2016-04-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105930201B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106598647A (en) * | 2016-11-09 | 2017-04-26 | 许继集团有限公司 | Intelligent apparatus development platform |
CN106951211A (en) * | 2017-03-27 | 2017-07-14 | 南京大学 | A kind of restructural fixed and floating general purpose multipliers |
CN106951394A (en) * | 2017-03-27 | 2017-07-14 | 南京大学 | A kind of general fft processor of restructural fixed and floating |
CN107608234A (en) * | 2017-09-20 | 2018-01-19 | 东南大学 | The dynamic accuracy emulation controller and method of a kind of reconfigurable system |
CN110597755A (en) * | 2019-08-02 | 2019-12-20 | 北京多思安全芯片科技有限公司 | Recombination configuration method of safety processor |
CN111488114A (en) * | 2019-01-28 | 2020-08-04 | 北京灵汐科技有限公司 | Reconfigurable processor architecture and computing device |
CN112379868A (en) * | 2020-11-12 | 2021-02-19 | 无锡沐创集成电路设计有限公司 | Programming method for network data packet processing based on reconfigurable chip |
CN112381220A (en) * | 2020-12-08 | 2021-02-19 | 厦门壹普智慧科技有限公司 | Neural network tensor processor |
CN112540888A (en) * | 2020-12-18 | 2021-03-23 | 清华大学 | Debugging method and device for large-scale reconfigurable processing unit array |
CN112580792A (en) * | 2020-12-08 | 2021-03-30 | 厦门壹普智慧科技有限公司 | Neural network multi-core tensor processor |
CN116070565A (en) * | 2023-03-01 | 2023-05-05 | 摩尔线程智能科技(北京)有限责任公司 | Method and device for simulating multi-core processor, electronic equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1609808A (en) * | 2003-10-22 | 2005-04-27 | 松下电器产业株式会社 | Simulator and simulation method |
CN103927484A (en) * | 2014-04-21 | 2014-07-16 | 西安电子科技大学宁波信息技术研究院 | Malicious program behavior capture method based on Qemu |
CN103927219A (en) * | 2014-05-04 | 2014-07-16 | 南京大学 | Accurate-period simulation model for reconfigurable special processor core and hardware architecture thereof |
-
2016
- 2016-04-25 CN CN201610262442.2A patent/CN105930201B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1609808A (en) * | 2003-10-22 | 2005-04-27 | 松下电器产业株式会社 | Simulator and simulation method |
CN103927484A (en) * | 2014-04-21 | 2014-07-16 | 西安电子科技大学宁波信息技术研究院 | Malicious program behavior capture method based on Qemu |
CN103927219A (en) * | 2014-05-04 | 2014-07-16 | 南京大学 | Accurate-period simulation model for reconfigurable special processor core and hardware architecture thereof |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106598647A (en) * | 2016-11-09 | 2017-04-26 | 许继集团有限公司 | Intelligent apparatus development platform |
CN106951211A (en) * | 2017-03-27 | 2017-07-14 | 南京大学 | A reconfigurable fixed-floating-point general-purpose multiplier |
CN106951394A (en) * | 2017-03-27 | 2017-07-14 | 南京大学 | A reconfigurable fixed-floating-point general-purpose FFT processor |
CN106951211B (en) * | 2017-03-27 | 2019-10-18 | 南京大学 | A Reconfigurable Fixed-Floating-Point Universal Multiplier |
CN107608234A (en) * | 2017-09-20 | 2018-01-19 | 东南大学 | Dynamic-precision simulation controller and method for a reconfigurable system |
CN111488114B (en) * | 2019-01-28 | 2021-12-21 | 北京灵汐科技有限公司 | Reconfigurable processor architecture and computing device |
WO2020156177A1 (en) * | 2019-01-28 | 2020-08-06 | 北京灵汐科技有限公司 | Reconfigurable processor architecture and computing device |
CN111488114A (en) * | 2019-01-28 | 2020-08-04 | 北京灵汐科技有限公司 | Reconfigurable processor architecture and computing device |
CN110597755A (en) * | 2019-08-02 | 2019-12-20 | 北京多思安全芯片科技有限公司 | Recombination configuration method of safety processor |
CN110597755B (en) * | 2019-08-02 | 2024-01-09 | 北京多思安全芯片科技有限公司 | Recombination configuration method of safety processor |
CN112379868A (en) * | 2020-11-12 | 2021-02-19 | 无锡沐创集成电路设计有限公司 | Programming method for network data packet processing based on reconfigurable chip |
CN112381220A (en) * | 2020-12-08 | 2021-02-19 | 厦门壹普智慧科技有限公司 | Neural network tensor processor |
CN112580792A (en) * | 2020-12-08 | 2021-03-30 | 厦门壹普智慧科技有限公司 | Neural network multi-core tensor processor |
CN112580792B (en) * | 2020-12-08 | 2023-07-25 | 厦门壹普智慧科技有限公司 | Neural network multi-core tensor processor |
CN112381220B (en) * | 2020-12-08 | 2024-05-24 | 厦门壹普智慧科技有限公司 | Neural network tensor processor |
CN112540888B (en) * | 2020-12-18 | 2022-08-12 | 清华大学 | Debugging method and device for large-scale reconfigurable processing unit array |
CN112540888A (en) * | 2020-12-18 | 2021-03-23 | 清华大学 | Debugging method and device for large-scale reconfigurable processing unit array |
CN116070565A (en) * | 2023-03-01 | 2023-05-05 | 摩尔线程智能科技(北京)有限责任公司 | Method and device for simulating multi-core processor, electronic equipment and storage medium |
CN116070565B (en) * | 2023-03-01 | 2023-06-13 | 摩尔线程智能科技(北京)有限责任公司 | Method and device for simulating multi-core processor, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN105930201B (en) | 2019-03-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105930201A (en) | | Functional simulator for reconfigurable dedicated processor core |
Benini et al. | | MPARM: Exploring the multi-processor SoC design space with SystemC |
Séméria et al. | | Methodology for hardware/software co-verification in C/C++ (short paper) |
Loghi et al. | | Analyzing on-chip communication in a MPSoC environment |
US6427224B1 (en) | | Method for efficient verification of system-on-chip integrated circuit designs including an embedded processor |
CN104750603A (en) | | Multi-core DSP (Digital Signal Processor) software emulator and physical layer software testing method thereof |
US8725486B2 (en) | | Apparatus and method for simulating a reconfigurable processor |
CN112580792B (en) | | Neural network multi-core tensor processor |
TW202236089A (en) | | User-space emulation framework for heterogeneous SoC design |
CN103927219A (en) | | Accurate-period simulation model for reconfigurable special processor core and hardware architecture thereof |
US12093752B2 (en) | | Processor based logic simulation acceleration and emulation system |
CN102073480A (en) | | Method for simulating cores of multi-core processor by adopting time division multiplex |
Yoo et al. | | Building fast and accurate SW simulation models based on hardware abstraction layer and simulation environment abstraction layer |
Engblom et al. | | Full-system simulation from embedded to high-performance systems |
WO2023232006A1 (en) | | Simulation device, simulation system, simulation method, and storage medium |
Poss et al. | | MGSim—A simulation environment for multi-core research and education |
Han et al. | | Multi-core architectures with dynamically reconfigurable array processors for the WiMAX physical layer |
CN101196828A (en) | | A kind of simulator and method |
Giorgi | | Exploring future many-core architectures: The TERAFLUX evaluation framework |
El-Moursy et al. | | Efficient embedded SoC hardware/software codesign using virtual platform |
Yoo et al. | | Multi-processor SoC design methodology using a concept of two-layer hardware-dependent software |
Joloboff et al. | | Virtual prototyping of embedded systems: Speed and accuracy tradeoffs |
Yeh et al. | | Optimizing the simulation speed of QEMU and SystemC-based virtual platform |
Engblom et al. | | A fully virtual multi-node 1553 bus computer system |
Su et al. | | Applying ESL in a Dual-Core SoC platform designing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||