CN105930201B - Functional simulator for a reconfigurable application-specific processor core - Google Patents

Publication number
CN105930201B
Authority
CN
China
Prior art keywords
reconfigurable
application specific
processor core
algorithm
specific processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610262442.2A
Other languages
Chinese (zh)
Other versions
CN105930201A (en)
Inventor
潘红兵
李可生
李丽
杨博
陈辉
徐天伟
陆振飞
唐海亮
何书专
李伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201610262442.2A
Publication of CN105930201A
Application granted
Publication of CN105930201B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45545 Guest-host, i.e. hypervisor is an application program itself, e.g. VirtualBox

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a functional simulator for a reconfigurable application-specific processor core, comprising: an external interface module, which simulates the functions of the register group and the internal SRAM of the reconfigurable application-specific processor core, receives the configuration instructions to be simulated, parses them into task information, and writes the task information into a global task queue; a control module, which simulates the function of the master controller inside the core, passes the task information between the modules by means of the task queue, obtains from the task queue the operation task currently to be executed and the operation task to be executed next, and schedules the operation implementation module to execute operation tasks; and an operation implementation module, which executes a number of algorithms, outputs operation result data and operation status, and performs the transfer of task information and operation result data. Beneficial effects: simulation is faster, which facilitates system-level debugging and optimization, helps to improve efficiency, and saves cost.

Description

Functional simulator for a reconfigurable application-specific processor core
Technical field
The present invention relates to a functional simulator for a reconfigurable application-specific processor core, suitable for the development and design of software systems for such a core.
Background art
In traditional SoC system development, software design can only begin after the entire hardware has been completed, which makes the overall development cycle very long. To solve this problem, hardware/software co-design methods based on SystemC have become increasingly popular. Even so, once the whole system has to be simulated (including the operating system, drivers, APIs and application programs), a SystemC-based simulator often takes a very long time, especially a cycle-accurate simulator. This greatly slows the progress of software development.
QEMU is a widely used open-source machine emulator and virtualizer. It can emulate processors of various architectures and the necessary subsystems, such as network cards; it is rich in resources and simulates quickly. QEMU has two operating modes, user mode emulation and full system emulation. In user mode emulation, QEMU can launch programs compiled for a different CPU; in full system emulation, QEMU lets the user emulate a complete system, including the CPU, peripherals and operating system, which greatly facilitates testing and debugging of system source code.
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art: it provides a fast simulator for whole-system simulation of a reconfigurable application-specific processor core, and also provides an efficient, usable platform for developing and testing APIs and software applications before a hardware development board is available. This is achieved by the following technical solution:
A functional simulator for a reconfigurable application-specific processor core, the reconfigurable application-specific processor core comprising a register group, SRAM, a master controller and a reconfigurable controller, wherein the functional simulator comprises:
an external interface module, which simulates the functions of the register group and the internal SRAM of the reconfigurable application-specific processor core, receives the configuration instructions to be simulated, parses them into task information, and writes the task information into a global task queue;
a control module, which simulates the function of the master controller inside the reconfigurable application-specific processor core, passes the task information between the modules by means of the task queue, obtains from the task queue the operation task currently to be executed and the operation task to be executed next, and schedules the operation implementation module to execute operation tasks;
an operation implementation module, which simulates the functions of the reconfigurable controller, the DMA and the reconfigurable array inside the reconfigurable application-specific processor core, executes a number of algorithms, outputs operation result data and operation status, and performs the transfer of task information and operation result data.
In a further design of the functional simulator, the control module comprises an operation control unit and a status control unit. The operation control unit schedules the algorithms and manages the working mode according to the task information; the working modes are master mode, slave mode and debug mode. The status control unit updates the externally visible state of the simulator according to the operation status of the operation implementation module; the external states are idle, busy and completed, and an interrupt is issued according to the corresponding external state.
In a further design of the functional simulator, the external interface module comprises an internal storage unit and a register unit, which simulate the SRAM and the register group respectively. In debug mode, the internal storage unit supports external access to the SRAM of the reconfigurable application-specific processor core, and it provides the working memory space for the operation implementation module. The register unit simulates the register group of the core, parses the configuration instructions, and adds the task information for algorithm operations.
In a further design of the functional simulator, the operation implementation module comprises an operation set function block implementing 17 algorithm units: FFT/IFFT, vector autocorrelation, cross-correlation, vector addition and subtraction, vector multiplication, matrix inversion, matrix addition and subtraction, matrix multiplication, dot product, covariance, real/complex FIR, real/complex Doppler FIR, fixed-point/floating-point conversion and complex modulus.
In a further design of the functional simulator, each algorithm unit in the operation set function block independently contains the computation and the data transfer for its algorithm. In implementing the operation, each algorithm unit divides the algorithm into steps, and defines the operation breakpoints used in debug mode, according to the complexity and data scale of the algorithm; the control module carries out each step by calling the specific algorithm step.
In a further design of the functional simulator, each algorithm unit in the operation implementation module uses the SRAM simulated by the internal storage unit as its working memory during computation. The internal storage unit provides the operation implementation module with a pointer for direct access to its memory, so that the module can bypass the read/write functions of the internal storage unit and access the memory directly.
In a further design of the functional simulator, the operation control unit of the control module uses a timer for periodic timed processing when it schedules the step-divided execution of an algorithm unit: each time the timer reaches its set value an interrupt is issued; in response, the operation control unit, through the external interface, schedules the operation set function block to complete the specified step of the specified algorithm and then enters the idle state to wait for the next interrupt. This repeats until the entire operation is complete.
The advantages of the present invention are as follows:
1. The invention adopts a functional-simulator design approach, stripping out details that are invisible from outside the reconfigurable application-specific processor core, which increases simulation speed;
2. The invention meets the need for whole-system simulation of the reconfigurable application-specific processor core and gives software engineers a usable development platform before a hardware development board is available;
3. The invention is implemented on the open-source virtual machine QEMU, so a rich set of existing resources can be used alongside it and the setup is simpler.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware architecture of the reconfigurable application-specific processor core.
Fig. 2 shows the mapping between the simulator and the internal modules of the reconfigurable application-specific processor core.
Fig. 3 is a schematic diagram of the module hierarchy inside the simulator.
Fig. 4 is a schematic diagram of the running state before the operation load of an algorithm unit is divided.
Fig. 5 shows the functional test results of the simulator.
Fig. 6 compares the performance test results of the simulators on the same PC.
Detailed description of the embodiments
The solution of the present invention is described in detail below with reference to the accompanying drawings.
The simulation target of the functional simulator provided in this embodiment is a reconfigurable application-specific processor core whose hardware architecture is shown in Fig. 1. The processor core consists mainly of a register group, SRAM, a master controller and a reconfigurable controller. It uses run-time dynamic reconfiguration: its internal reconfigurable units map algorithms spatially onto the compute engine, which improves the flexibility and resource utilization of the whole processor. The reconfigurable application-specific processor provides hardware acceleration for common signal processing algorithms such as FIR, correlation, FFT/IFFT and matrix operations. Its coarse-grained reconfigurable design allows the topology and interconnection of its internal arithmetic units to be changed through coarse-grained static configuration, so that the arithmetic hardware resources are reused across algorithms and the real-time requirements of signal processing algorithms are met.
The functional simulator provided in this embodiment is implemented on QEMU. Following the three functional aspects of access, operation and control, it strips out the computation details that are invisible from outside the processor core and abstracts the simulator into three parts: an external interface module, an operation implementation module and a control module. The external interface module maps the registers and internal SRAM of the hardware structure; it is the part of the simulator that can be accessed from outside. It receives the configuration instructions to be simulated, parses them into task information, and writes the task information into a global task queue. The operation implementation module maps the reconfigurable controller, the DMA and the reconfigurable array; it is where the algorithm operations are implemented. It executes a number of algorithms, outputs operation result data and operation status, and performs the transfer of task information and operation result data. The control module maps the master controller; it controls the internal flow and the state of the whole simulator. It passes the task information between modules by means of the task queue, obtains from the task queue the operation task currently to be executed and the one to be executed next, and schedules the operation implementation module to execute operation tasks; see Fig. 2.
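The three-module split described above can be illustrated with a minimal Python sketch. It is not the QEMU-based implementation of the embodiment; the class names, the text instruction format and the single `sum` algorithm are assumptions made only for this example.

```python
from collections import deque

# Hypothetical global task queue shared by the three modules.
task_queue = deque()

class ExternalInterface:
    """Stands in for the register group and internal SRAM."""
    def write_config(self, instruction):
        # Assumed instruction format: "ALGORITHM OPERAND OPERAND ..."
        name, *operands = instruction.split()
        task_queue.append({"algo": name, "data": [float(x) for x in operands]})

class ComputeModule:
    """Stands in for the reconfigurable controller, DMA and array."""
    def run(self, task):
        if task["algo"] == "sum":
            return sum(task["data"]), "done"
        return None, "error"

class ControlModule:
    """Stands in for the master controller."""
    def __init__(self, compute):
        self.compute = compute

    def dispatch_all(self):
        # Take tasks from the queue in order and schedule the compute module.
        results = []
        while task_queue:
            results.append(self.compute.run(task_queue.popleft()))
        return results

iface = ExternalInterface()
iface.write_config("sum 1 2 3")
results = ControlModule(ComputeModule()).dispatch_all()
```

The only coupling between the interface and the controller in this sketch is the queue, which mirrors the role the global task queue plays in the description.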
Further, as shown in Fig. 3, the external interface module is divided into a register group unit and an internal storage unit. The register group unit simulates the register group inside the reconfigurable application-specific processor, which comprises a device configuration register, an operation configuration register, a status register, an exception interrupt register and a master-mode base address register. External devices can read and write these simulated registers over the bus to read status, write instructions, and so on. The unit also parses the configuration instructions written in and adds the operation tasks to be executed by the simulator to the operation task queue.
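A register group unit of this kind might be sketched as follows. The register offsets and the bit layout of the operation configuration word are hypothetical; the patent names the registers but does not give their addresses or encoding.

```python
from collections import deque

# Hypothetical offsets for the five registers named in the description.
DEV_CFG, OP_CFG, STATUS, IRQ, MASTER_BASE = 0x00, 0x04, 0x08, 0x0C, 0x10

class RegisterGroup:
    def __init__(self, task_queue):
        self.regs = {DEV_CFG: 0, OP_CFG: 0, STATUS: 0, IRQ: 0, MASTER_BASE: 0}
        self.task_queue = task_queue

    def read(self, offset):
        # Bus read of a simulated register.
        return self.regs[offset]

    def write(self, offset, value):
        # Bus write; registers are treated as 32 bits wide (assumption).
        self.regs[offset] = value & 0xFFFFFFFF
        if offset == OP_CFG:
            # Assumed encoding: bits 31..24 select the algorithm,
            # bits 23..0 carry the data length.
            self.task_queue.append(
                {"algo_id": (value >> 24) & 0xFF, "length": value & 0xFFFFFF}
            )

queue = deque()
bank = RegisterGroup(queue)
bank.write(OP_CFG, 0x03000400)  # algorithm 3, length 0x400 (1024)
```

Writing the operation configuration register is what enqueues a task here, so the parsing step and the queue insertion happen in the same place, as in the paragraph above.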
The operation implementation module is the main computational part of the whole simulator. It consists of the operation set function block cal_family, which is divided internally into 17 algorithm units: FFT/IFFT, vector autocorrelation, cross-correlation, vector addition and subtraction, vector multiplication, matrix inversion, matrix addition and subtraction, matrix multiplication, dot product, covariance, real/complex FIR, real/complex Doppler FIR, fixed-point/floating-point conversion and complex modulus. Each algorithm unit in the operation set function block independently contains the computation and the data transfer for its algorithm. Within cal_family, each algorithm unit divides the algorithm into steps, and defines the operation breakpoints used in debug mode, according to the complexity and data scale of the algorithm. The control module carries out each step by calling the specific algorithm step, which supports debug mode and prevents large-scale data operations from making QEMU hang or stutter.
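One way to picture a step-divided algorithm unit is a Python generator that hands control back to its caller after every step. The sketch below uses a dot product as the example; the `chunk` parameter and the dictionary used as a registry are assumptions for illustration, and only the name `cal_family` comes from the description.

```python
def dot_product_steps(a, b, chunk):
    """Yield the running partial result after each step of `chunk` elements."""
    acc = 0.0
    for start in range(0, len(a), chunk):
        for x, y in zip(a[start:start + chunk], b[start:start + chunk]):
            acc += x * y
        yield acc  # control returns to the caller between steps

# Hypothetical registry mapping algorithm names to step generators.
cal_family = {"dot": dot_product_steps}

steps = cal_family["dot"]([1, 2, 3, 4], [1, 1, 1, 1], chunk=2)
partials = list(steps)
```

A controller that calls `next(steps)` once per scheduling slot gets exactly the behaviour described: each call performs one bounded piece of work and returns.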
Each algorithm unit in the operation implementation module uses the SRAM simulated by the internal storage unit as its working memory during computation. The internal storage unit provides the operation implementation module with a pointer for direct access to its memory, so that the module can bypass the read/write functions of the internal storage unit and access the memory directly, which speeds up the execution of the algorithm units.
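The difference between going through read/write functions and using a direct pointer can be sketched in Python with a `bytearray` and a `memoryview`, which shares the underlying buffer without copying. The class and method names are hypothetical, and native byte order is assumed for both access paths.

```python
import struct

class InternalStorage:
    def __init__(self, size):
        self._mem = bytearray(size)

    def _check(self, addr):
        # Valid word addresses are 0 .. size-4 inclusive.
        if addr not in range(0, len(self._mem) - 3):
            raise IndexError("address out of range")

    def read32(self, addr):
        # Checked access path, as used by external bus accesses.
        self._check(addr)
        return struct.unpack_from("I", self._mem, addr)[0]

    def write32(self, addr, value):
        self._check(addr)
        struct.pack_into("I", self._mem, addr, value)

    def raw(self):
        # Direct view of the same buffer as 32-bit words: no copy, no checks.
        return memoryview(self._mem).cast("I")

sram = InternalStorage(16)
sram.write32(4, 0xDEADBEEF)   # checked path
words = sram.raw()            # direct path for the algorithm units
words[0] = 7
```

Both paths see the same bytes, so a value written through the view is visible to `read32` and vice versa; the direct path simply skips the per-access function call and bounds check.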
Within the simulator, the control module is subdivided into an operation control unit and a status control unit. The operation control unit schedules the algorithm units in cal_family according to the operation tasks in the operation task queue, and supports simulation of the scheduling of the master mode, slave mode and debug mode of the reconfigurable application-specific processor core. The status control unit, according to the running state of cal_family, simulates the changes to the external status register of the reconfigurable application-specific processor and the raising of exception interrupts.
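The status control unit can be viewed as a small state machine. The sketch below is an assumption-laden illustration: the three external states (idle, busy, completed) come from the summary of the invention, while the operation-state strings, the interrupt names and the choice to return to idle on an error are invented for the example.

```python
from enum import Enum

class State(Enum):
    IDLE = 0
    BUSY = 1
    DONE = 2

class StatusControl:
    def __init__(self):
        self.state = State.IDLE
        self.interrupts = []   # interrupts raised so far, oldest first

    def update(self, op_state):
        # op_state is the state reported by the compute side (assumed strings).
        if op_state == "running":
            self.state = State.BUSY
        elif op_state == "finished":
            self.state = State.DONE
            self.interrupts.append("done_irq")
        elif op_state == "error":
            self.state = State.IDLE          # assumption: fall back to idle
            self.interrupts.append("exception_irq")
        return self.state

sc = StatusControl()
first = sc.update("running")
second = sc.update("finished")
```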
In slave-mode simulation, after each operation task ends, the corresponding status bits, such as the interrupt flag and the completion flag, are set. Once the external master device has acknowledged the task, the status bits are cleared again, and the operation controller then either continues with the next operation task or enters the idle state, depending on the tasks in the operation task queue.
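The slave-mode handshake can be sketched as follows. The flag bit positions, the class name and the returned strings are hypothetical; the sketch only shows the ordering of set, acknowledge, clear and continue.

```python
from collections import deque

IRQ_FLAG, DONE_FLAG = 0x1, 0x2  # hypothetical bit positions

class SlaveModeController:
    def __init__(self, tasks):
        self.queue = deque(tasks)
        self.status = 0
        self.completed = []

    def run_next(self):
        if self.status:
            return "waiting"      # previous task not yet acknowledged
        if not self.queue:
            return "idle"         # nothing left to execute
        self.completed.append(self.queue.popleft())
        self.status = IRQ_FLAG | DONE_FLAG   # set flags after the task ends
        return "done"

    def acknowledge(self):
        # The external master confirms the task; status bits are cleared.
        self.status = 0

ctrl = SlaveModeController(["fft", "fir"])
trace = [ctrl.run_next(), ctrl.run_next()]
ctrl.acknowledge()
trace.append(ctrl.run_next())
ctrl.acknowledge()
trace.append(ctrl.run_next())
```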
In master-mode simulation, the simulator automatically looks up the operation configuration register settings in the address space pointed to by the master-mode base address register, reads and parses them, and adds the tasks to the operation task queue. In this mode the simulator decides, according to a batch-processing flag, whether to respond to the external device after each task completes or only after all tasks in the task queue have completed.
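The effect of the batch-processing flag can be shown with a short sketch. The memory model (a dictionary keyed by address), the function name and the notification strings are assumptions; task execution itself is omitted.

```python
def run_master_mode(memory, base, count, batch):
    """Fetch `count` task entries starting at `base`; return the notifications sent."""
    notifications = []
    for i in range(count):
        task = memory[base + i]   # configuration fetched from the master-mode address space
        # (the task would be parsed, queued and executed here)
        if not batch:
            notifications.append(f"task {task} done")
    if batch:
        notifications.append(f"all {count} tasks done")
    return notifications

memory = {0x100: "fft", 0x101: "fir"}   # hypothetical task table
per_task = run_master_mode(memory, 0x100, 2, batch=False)
batched = run_master_mode(memory, 0x100, 2, batch=True)
```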
In debug-mode simulation, the simulator emulates the response to breakpoints of the reconfigurable controller inside the reconfigurable application-specific processor core. When an algorithm unit in cal_family reaches a breakpoint position, the operation is suspended and the access restriction on the internally simulated SRAM, i.e. the internal storage unit, is lifted, so that intermediate results in the internal storage area can be accessed from outside.
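Breakpoint handling of this kind can be sketched as a run loop that stops at marked steps and only then allows the simulated SRAM to be read. The class name, the doubling operation used as a stand-in algorithm step, and the use of `PermissionError` are all assumptions for illustration.

```python
class DebugRun:
    def __init__(self, data, breakpoints):
        self.sram = list(data)             # simulated SRAM with intermediate results
        self.breakpoints = set(breakpoints)
        self.step = 0
        self.sram_open = False

    def resume(self):
        """Run until the next breakpoint or the end; return True when finished."""
        self.sram_open = False
        while len(self.sram) > self.step:
            self.sram[self.step] *= 2      # stand-in for one algorithm step
            self.step += 1
            if self.step in self.breakpoints:
                self.sram_open = True      # lift the access restriction
                return False
        return True

    def peek(self):
        if not self.sram_open:
            raise PermissionError("SRAM is only readable at a breakpoint")
        return list(self.sram)

run = DebugRun([1, 2, 3, 4], breakpoints=[2])
finished = run.resume()
snapshot = run.peek()
```

Calling `run.resume()` a second time continues from the breakpoint and closes the SRAM again until the next stop.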
In the cal_family block of the operation implementation module, the algorithms supported by the algorithm units all have high time complexity and large data volumes. When each algorithm unit is implemented, its algorithm is therefore divided, according to its specific characteristics, into N steps of approximately equal operation count, so that each step contains roughly 1/N of the total operations; N must be large enough that no single step causes the system to stall. The control module executes the steps at intervals to complete the whole algorithm. As shown in Fig. 4, system code and algorithm operation alternate rapidly. Periodic timed processing uses the QEMU timer (QEMU-Timer). Each time the timer reaches its set value it triggers an interrupt; the operation control module, through the interface, schedules cal_family to complete the specified step of the specified algorithm and then enters the idle state to wait for the next interrupt. This repeats until the whole operation is complete. Although the total amount of computation is not reduced, dividing it into multiple steps lets other code run in the gaps between two steps, so the whole system does not stall while the simulator executes an algorithm, which keeps the system running continuously and smoothly.
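The division into N roughly equal steps and the timer-driven alternation can be sketched as below. This is a plain Python illustration that does not use the QEMU timer API; the function names and the event labels in the log are assumptions.

```python
def split_into_steps(total_ops, n):
    """Return n step sizes that differ by at most one and sum to total_ops."""
    base, extra = divmod(total_ops, n)
    return [base + (1 if extra > i else 0) for i in range(n)]

def run_with_timer(step_sizes):
    """Alternate one algorithm step with one slice of other system work."""
    log = []
    for i, ops in enumerate(step_sizes):
        log.append(("timer_irq", i))     # timer reaches its set value
        log.append(("algo_step", ops))   # control module runs one step
        log.append(("system", i))        # other code runs while the simulator idles
    return log

sizes = split_into_steps(10, 4)
log = run_with_timer(sizes)
```

With 10 operations and N = 4, each step holds 2 or 3 operations, i.e. roughly 1/N of the total, and the log shows other system work interleaved between consecutive algorithm steps.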
Functionally, the simulator provided in this embodiment has passed tests in three areas: correctness of algorithm operations, register reads and writes, and internal storage area access; the results are shown in Fig. 5. For performance, it was compared with a cycle-accurate simulator based on SystemC. Both were tested in environments as similar as possible, without an operating system or other peripherals; algorithms and parameters were configured through API functions and the speeds compared. The results are shown in Fig. 6: this simulator performs far better than the cycle-accurate simulator, especially when the amount of computation is large.
As Figs. 5 and 6 show, the present invention provides a fast simulator for whole-system simulation of a reconfigurable application-specific processor core, and also provides an efficient, usable platform for developing and testing APIs and software applications before a hardware development board is available.

Claims (5)

1. A functional simulator for a reconfigurable application-specific processor core, the reconfigurable application-specific processor core comprising a register group, SRAM, a master controller, a DMA, a reconfigurable array and a reconfigurable controller, characterized in that the functional simulator comprises:
an external interface module, which simulates the functions of the register group and the internal SRAM of the reconfigurable application-specific processor core, receives the configuration instructions to be simulated, parses them into task information, and writes the task information into a global task queue;
a control module, which simulates the function of the master controller inside the reconfigurable application-specific processor core, passes the task information between the modules by means of the task queue, obtains from the task queue the operation task currently to be executed and the operation task to be executed next, and schedules the operation implementation module to execute operation tasks;
an operation implementation module, which simulates the functions of the reconfigurable controller, the DMA and the reconfigurable array inside the reconfigurable application-specific processor core, executes a number of algorithms, outputs operation result data and operation status, and performs the transfer of task information and operation result data;
the operation implementation module comprises an operation set function block implementing 17 algorithm units, namely: FFT/IFFT, vector autocorrelation, cross-correlation, vector addition and subtraction, vector multiplication, matrix inversion, matrix addition and subtraction, matrix multiplication, dot product, covariance, real/complex FIR, real/complex Doppler FIR, fixed-point/floating-point conversion and complex modulus;
in the operation set function block, each algorithm unit independently contains the computation and the data transfer for its algorithm; in implementing the operation, each algorithm unit divides the algorithm into steps, and defines the operation breakpoints used in debug mode, according to the complexity and data scale of the algorithm, and the control module carries out each step by calling the specific algorithm step.
2. The functional simulator for a reconfigurable application-specific processor core according to claim 1, characterized in that: the control module comprises an operation control unit and a status control unit; the operation control unit schedules the algorithms and manages the working mode according to the task information; the status control unit updates the externally visible state of the simulator according to the operation status of the operation implementation module, the external states comprising an idle state, a busy state and a completed state, and issues an interrupt according to the corresponding external state.
3. The functional simulator for a reconfigurable application-specific processor core according to claim 1, characterized in that: the external interface module comprises an internal storage unit and a register unit, which simulate the SRAM and the register group respectively; in debug mode the internal storage unit supports external access to the SRAM of the reconfigurable application-specific processor core, and it provides the working memory space for the operation implementation module; the register unit simulates the register group of the reconfigurable application-specific processor core, parses the configuration instructions, and adds the task information for algorithm operations.
4. The functional simulator for a reconfigurable application-specific processor core according to claim 2, characterized in that: each algorithm unit in the operation implementation module uses the SRAM simulated by the internal storage unit as its working memory during computation, and the internal storage unit provides the operation implementation module with a pointer for direct access to its memory, so that the module can bypass the read/write functions of the internal storage unit and access the memory directly.
5. The functional simulator for a reconfigurable application-specific processor core according to claim 4, characterized in that: the operation control unit of the control module uses a timer for periodic timed processing when it schedules the step-divided execution of an algorithm unit: each time the timer reaches its set value an interrupt is issued; in response to the interrupt, the operation control unit, through the external interface, schedules the operation set function block to complete the specified step of the specified algorithm and then enters the idle state to wait for the next interrupt, and this repeats until the entire operation is complete.
CN201610262442.2A | priority date 2016-04-25 | filing date 2016-04-25 | Functional simulator for a reconfigurable application-specific processor core | Active | CN105930201B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201610262442.2A / CN105930201B (en) | 2016-04-25 | 2016-04-25 | Functional simulator for a reconfigurable application-specific processor core

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201610262442.2A / CN105930201B (en) | 2016-04-25 | 2016-04-25 | Functional simulator for a reconfigurable application-specific processor core

Publications (2)

Publication Number | Publication Date
CN105930201A (en) | 2016-09-07
CN105930201B (en) | 2019-03-22

Family

ID=56837025

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201610262442.2A (Active, CN105930201B (en)) | Functional simulator for a reconfigurable application-specific processor core | 2016-04-25 | 2016-04-25

Country Status (1)

Country | Link
CN (1) | CN105930201B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106598647B (en) * | 2016-11-09 | 2020-10-30 | 许继集团有限公司 | Intelligent device development platform
CN106951211B (en) * | 2017-03-27 | 2019-10-18 | 南京大学 | A kind of restructural fixed and floating general purpose multipliers
CN106951394A (en) * | 2017-03-27 | 2017-07-14 | 南京大学 | A kind of general fft processor of restructural fixed and floating
CN107608234A (en) * | 2017-09-20 | 2018-01-19 | 东南大学 | The dynamic accuracy emulation controller and method of a kind of reconfigurable system
CN111488114B (en) * | 2019-01-28 | 2021-12-21 | 北京灵汐科技有限公司 | Reconfigurable processor architecture and computing device
CN110597755B (en) * | 2019-08-02 | 2024-01-09 | 北京多思安全芯片科技有限公司 | Recombination configuration method of safety processor
CN112379868B (en) * | 2020-11-12 | 2021-06-18 | 无锡沐创集成电路设计有限公司 | Programming method for network data packet processing based on reconfigurable chip
CN112381220B (en) * | 2020-12-08 | 2024-05-24 | 厦门壹普智慧科技有限公司 | Neural network tensor processor
CN112580792B (en) * | 2020-12-08 | 2023-07-25 | 厦门壹普智慧科技有限公司 | Neural network multi-core tensor processor
CN112540888B (en) * | 2020-12-18 | 2022-08-12 | 清华大学 | Debugging method and device for large-scale reconfigurable processing unit array
CN116070565B (en) * | 2023-03-01 | 2023-06-13 | 摩尔线程智能科技(北京)有限责任公司 | Method and device for simulating multi-core processor, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2005128692A (en) * | 2003-10-22 | 2005-05-19 | Matsushita Electric Ind Co Ltd | Simulator and simulation method
CN103927484B (en) * | 2014-04-21 | 2017-03-08 | 西安电子科技大学宁波信息技术研究院 | Rogue program behavior catching method based on Qemu simulator
CN103927219A (en) * | 2014-05-04 | 2014-07-16 | 南京大学 | Accurate-period simulation model for reconfigurable special processor core and hardware architecture thereof

Also Published As

Publication number | Publication date
CN105930201A (en) | 2016-09-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant