CN106775597A - A loosely coupled parallel multi-core full-system simulator - Google Patents
A loosely coupled parallel multi-core full-system simulator
- Publication number
- CN106775597A (application CN201611108730.9A)
- Authority
- CN
- China
- Prior art keywords
- module
- simulation
- instruction
- timing
- functional
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention belongs to the field of computer technology and specifically concerns a parallel multi-core full-system simulator with a loosely coupled architecture. The simulator consists mainly of a functional simulation module, a timing simulation module, a general-purpose interface module, and a difference detection and adjustment module. The functional and timing simulation modules are loosely coupled, which reduces the interaction between them, increases the degree of parallel execution, and improves simulation performance; it also makes the simulator easy to extend on demand. In other words, the simulator offers both good scalability and high simulation performance. The invention further provides a wrong-path processing module, a shared-memory access detection and correction module, an exception and interrupt processing module, and a shared page table access detection and correction module, which together ensure the accuracy of the simulation results. The invention can be used to analyze and evaluate the runtime behavior and timing information of multi-core processors.
Description
Technical field
The invention belongs to the field of computer technology and specifically relates to a full-system simulator for parallel multi-core processors.
Background
As multi-core technology continues to develop, microprocessor architectures are becoming increasingly complex, development costs keep rising, and development cycles keep lengthening. Verifying the validity of a design is vital to improving design efficiency and success rates, and simulation is a widely used technique for this purpose. Simulators have therefore become indispensable in processor design and architecture research. On the one hand, hardware designers can extend a simulator to design and verify future processors. Designers can also use a simulator to test how each functional unit of a processor behaves when running various test programs under different configurations and, by analyzing the results, determine how strongly each functional unit affects overall processor performance. This gives a deeper understanding of processor behavior, which in turn helps improve the design and its performance. On the other hand, software developers can use a simulator platform to develop software for the corresponding hardware platform before the hardware is running, which speeds up the commercialization of both software and hardware products.
Software simulation uses software programs to simulate part or all of the functions of a processor system and provides whatever simulation results are required. In particular, software simulation can not only simulate the functions and behavior of hardware accurately, but can also flexibly modify the structure, components, and behavior of the simulated target architecture. Because software is easy to extend, it also overcomes the drawbacks of hardware verification: poor flexibility, narrow applicability, and long development cycles. Because of these characteristics, software simulation has become a vital tool that is widely used in every aspect of processor design and architecture research.
At present, mainstream architecture simulators consist mainly of two parts: a functional simulation model and a timing simulation model. A functional simulation model generally simulates only the functions of the processor hardware, without simulating the microarchitectural features of each processor component. It can therefore only be used to test whether a program produces correct results on a platform; it cannot provide performance or timing behavior data, so it supports neither in-depth analysis nor comparison of different processor designs. A timing simulation model, by contrast, can simulate, as needed, the behavior and responses of the microarchitecture of all or some of the processor's functional units in every clock cycle, including the pipeline, functional units, and storage structures. A timing simulation model not only confirms the correctness of results but also yields the processor's performance metrics during program execution, so it is widely used in related research and design.
Because of the importance of simulators and the spread of multi-core architectures, increasingly full-featured functional simulation models keep emerging, such as Simics, QEMU, and COREMU, as do new, more accurate and faster timing simulation models, such as GEMS, MPTLSim, and RAMP Gold. Current mainstream multi-core simulators generally use a tightly coupled design, in which functional simulation and timing simulation must interact in every clock cycle. On the one hand, this design makes the simulator harder to extend: adding a new functional or timing simulation model to a simulator usually takes several person-years. For example, researchers spent several years integrating the M5 and GEMS simulators (gem5), and extending PTLSim to support QEMU also took several years (MARSS). On the other hand, the excessive interaction in a tightly coupled design introduces considerable extra overhead, which not only lowers the performance of the multi-core simulator but also hinders efficient parallel execution of functional and timing simulation.
Summary of the invention
The object of the invention is to provide a full-system simulator for parallel multi-core processors that offers a high degree of parallel execution, good simulation performance, and extensibility.
The full-system simulator for parallel multi-core processors provided by the invention is a simulation system implemented in software. Specifically, the invention logically separates the functional simulation model from the timing simulation model and adopts a fully loosely coupled design: the functional simulation model is responsible only for functional simulation, and the timing simulation model is responsible only for the timing part of the simulator. When simulating an instruction, the simulator uses a function-first structure: the functional simulation model executes ahead of the timing simulation model and supplies it with the information it needs. The functional and timing simulation modules interact through an architecture-independent general-purpose interface. To ensure accuracy, the invention includes a difference detection and adjustment module that detects behavioral differences between the functional and timing simulation modules. When a difference is detected, the module makes an adjustment according to the specific cause of the difference.
The framework of the full-system simulator for parallel multi-core processors provided by the invention can be divided into four main parts: the functional simulation module, the timing simulation module, the general-purpose interface, and the difference detection and adjustment module. Specifically:
(1) The functional simulation module implements a specific functional simulation model. It is mainly responsible for executing instructions and collecting execution information from applications and the operating system. It decomposes each instruction into the architecture-independent instruction-stream and data-stream information required for timing simulation and writes that information to the general-purpose interface located between the functional and timing simulation modules, through which it reaches the timing simulation module;
(2) The timing simulation module implements a specific timing simulation model. It obtains the interface information of each instruction from the general-purpose interface, performs timing simulation on the instruction, and updates the architectural state information;
(3) The general-purpose interface contains an instruction buffer and a memory access table (Memory Access Table, MAT). The instruction buffer mainly stores the instruction-stream information passed on by the functional simulation model, while the MAT mainly stores the data-stream information of memory accesses;
(4) The difference detection and adjustment module mainly compares the behavior of functional simulation and timing simulation for inconsistencies and, depending on the cause of an inconsistency, calls the corresponding processing module to make the adjustment.
At run time, several factors can cause the behavior of functional simulation and timing simulation to diverge, which affects simulation accuracy. These factors mainly include branch misprediction, the handling of interrupts and exceptions, the order of shared data accesses, and the order of shared page table accesses; they are referred to below as accuracy-affecting factors. To improve simulation accuracy, the invention further provides a wrong-path processing module, an exception and interrupt processing module, a shared-memory access detection and correction module, and a shared page table access detection and correction module, which correct the errors introduced when an accuracy-affecting factor occurs.
The wrong-path processing module simulates the branch prediction of modern processors more faithfully. When a branch misprediction occurs in the timing simulation model, the module creates for the functional simulation module a lightweight wrong-path system whose state is identical to the current system state of the timing simulation module. Execution continues in this system along the instruction path chosen by the branch predictor. Once the mispredicted instructions have finished, the wrong-path simulation exits directly without committing, so the state of the actual pipeline is not changed. Invoking a lightweight wrong-path system in this way avoids unnecessary rollback and complex state recording.
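The wrong-path mechanism described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `MachineState` class, the toy `(register, delta)` instruction format, and the function names are all assumptions made for illustration. The only point it demonstrates is that wrong-path instructions run on a copy of the state and nothing is committed back.

```python
import copy


class MachineState:
    """Hypothetical architectural state: a register file and a program counter."""

    def __init__(self):
        self.regs = {"r1": 0}
        self.pc = 0


def execute(state, instr):
    # Toy instruction format (an assumption): (register, value to add).
    reg, delta = instr
    state.regs[reg] = state.regs.get(reg, 0) + delta
    state.pc += 1


def run_wrong_path(timing_state, wrong_path_instrs):
    # Clone the timing model's current state into a throwaway system.
    shadow = copy.deepcopy(timing_state)
    trace = []
    for instr in wrong_path_instrs:
        execute(shadow, instr)
        trace.append((shadow.pc, dict(shadow.regs)))
    # Nothing is committed: the shadow state is simply dropped on return.
    return trace
```

Because the real state is never touched, no rollback log is needed once the misprediction is resolved; a real simulator would presumably use something much cheaper than a deep copy.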
The exception and interrupt processing module handles the divergence between the execution paths of the functional and timing simulation modules when an exception or interrupt occurs. Its approach is similar to that of the wrong-path processing module. When an interrupt or exception occurs, the module creates for the functional simulation module a lightweight wrong-path system whose state is identical to the current system state of the timing simulation module, and execution continues in this system along the instruction path that follows the interrupt or exception. Once the wrong-path instructions have finished, the wrong-path simulation exits directly without committing, so the state of the actual pipeline is not changed.
The shared-memory access detection and correction module detects and corrects inconsistencies between the memory access order of the functional simulation model, which executes ahead, and that of the timing simulation model. This corrects the timing inaccuracy caused by inconsistent shared-memory access order and prevents inaccurate simulation results. The module avoids inconsistent shared-memory access order between the two models in two steps. First, while the functional simulation model executes, it records the order in which each processor core accesses memory. Then, when the timing simulation model commits a memory instruction, it checks whether the timing simulation model's access order to shared memory matches that of the functional simulation model, and thus whether the two executions conflict. If a conflict has occurred, the functional simulation model must be rolled back according to the system state of the timing simulation model.
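The two-step check above can be sketched as follows. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, accesses are identified only by core id, and any order mismatch on an address is treated as a conflict (a real checker would presumably distinguish harmless read-read reorderings).

```python
from collections import defaultdict, deque


class OrderChecker:
    """Hypothetical checker comparing functional and timing access order per address."""

    def __init__(self):
        # Physical address -> queue of core ids, in functional-model access order.
        self._order = defaultdict(deque)

    def record_functional(self, paddr, core_id):
        # Step 1: the functional model records who accessed the address, in order.
        self._order[paddr].append(core_id)

    def commit_timing(self, paddr, core_id):
        # Step 2: at commit, the timing model must match the head of the queue.
        queue = self._order[paddr]
        if queue and queue[0] == core_id:
            queue.popleft()
            return True
        # Order mismatch: the caller would roll back the functional model here.
        return False
```

For example, if the functional model records core 0 and then core 1 on the same address, a timing commit from core 1 arriving first returns `False`, which is the signal to roll back.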
The shared page table access detection and correction module simplifies the handling of conflicts in shared page table access order. When the functional simulation module encounters an MMU (Memory Management Unit) miss, it stops executing instructions and resumes only when the timing simulation module commits that instruction (that is, when it jumps to the correct MMU miss handling path). This policy keeps the functional and timing simulation modules consistent in their shared page table access order. With this approach, everything fetched before the timing simulation module handles the MMU miss comes from the wrong path, so the pipeline does not stall.
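The stall-until-commit policy can be sketched roughly as below. The `FunctionalCore` class, its instruction tuples, and the `on_timing_commit` callback are hypothetical names chosen for illustration; the sketch only shows the functional side pausing after an instruction that misses in the MMU and resuming when the timing side reports that the instruction has committed.

```python
class FunctionalCore:
    """Hypothetical functional core that pauses on an MMU miss."""

    def __init__(self, instrs):
        self.instrs = instrs      # list of (name, causes_mmu_miss)
        self.next = 0             # index of the next instruction to execute
        self.stalled_on = None    # index of the instruction that missed, if any
        self.emitted = []         # names of instructions executed so far

    def step(self):
        # Refuse to run ahead while an MMU miss is outstanding.
        if self.stalled_on is not None or self.next >= len(self.instrs):
            return False
        name, miss = self.instrs[self.next]
        self.emitted.append(name)
        if miss:
            self.stalled_on = self.next
        self.next += 1
        return True

    def on_timing_commit(self, index):
        # The timing model committed the missing instruction: resume execution.
        if self.stalled_on == index:
            self.stalled_on = None
```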
When an instruction needs to be simulated, the typical simulation flow is as follows. First, the functional simulation module executes the simulated instruction. Next, the functional simulation module passes the information about the instruction itself, together with the execution information needed for timing simulation, to the timing simulation module through the general-purpose interface. Then, when the timing simulation module is about to simulate this instruction, it reads the instruction's information from the general-purpose interface and performs timing simulation on it. During timing simulation, the difference detection and adjustment module detects events that affect simulation accuracy and, depending on the accuracy-affecting factor detected, calls the wrong-path processing module, the shared-memory access detection and correction module, the exception and interrupt processing module, or the shared page table access detection and correction module to correct the functional simulation model. This keeps the path simulated by the functional simulation model consistent with the path simulated by the timing simulation model and thereby ensures simulation accuracy. Finally, when the instruction commits in the timing simulation module, the architectural state of the timing simulation model is updated.
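The function-first flow described above can be sketched as a small producer/consumer loop. This is an illustrative sketch only: the class names, the toy `(opcode, latency)` program format, and the `capacity` parameter are assumptions, and the difference detection step is omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TraceEntry:
    pc: int
    opcode: str
    latency: int  # hypothetical: cycles the timing model charges


class FunctionalSim:
    """Executes instructions first and emits ISA-independent trace entries."""

    def __init__(self, program):
        self.program = program  # list of (opcode, latency) pairs
        self.pc = 0

    def done(self):
        return self.pc >= len(self.program)

    def step(self):
        opcode, latency = self.program[self.pc]
        entry = TraceEntry(self.pc, opcode, latency)
        self.pc += 1
        return entry


class TimingSim:
    """Consumes trace entries and accounts for cycles."""

    def __init__(self):
        self.cycle = 0
        self.committed = []

    def simulate(self, entry):
        self.cycle += entry.latency
        self.committed.append(entry.pc)  # commit updates architectural state


def run(program, capacity=2):
    interface = deque()  # stands in for the general-purpose interface
    func, timing = FunctionalSim(program), TimingSim()
    while not func.done() or interface:
        # Function-first: the functional model runs ahead until the buffer is full.
        while not func.done() and len(interface) < capacity:
            interface.append(func.step())
        timing.simulate(interface.popleft())
    return timing
```

The two simulators never call each other directly; the queue is their only point of contact, which is what makes either side replaceable.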
To ensure scalability, so that the simulator can easily be extended with a new functional simulation model or even a different instruction set architecture (ISA), the design of the general-purpose interface does not depend on architecture-specific information such as the functional simulation model or a particular instruction set. In addition, because the timing simulation model mainly simulates pipeline dependences and storage structures, the general-purpose interface carries the information that timing simulation needs for these two aspects. The main function of the general-purpose interface is to transfer instruction-stream and data-stream information between the functional and timing simulation modules.
The instruction-stream information is recorded in the instruction buffer, whose size can be set in a configuration file. Each entry of the buffer holds the information needed to perform timing simulation on one instruction and is filled in by the functional simulation model after it finishes simulating that instruction. Each entry can be regarded as the trace of the corresponding instruction and mainly contains three kinds of information: pipeline-related information (functional unit requirements, source and destination registers, and so on), memory access information, and the architectural state information that the instruction modifies.
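As a rough illustration of such an entry and a size-limited buffer, consider the sketch below. The field names (`func_unit`, `src_regs`, `state_updates`, and so on) and the `InstructionBuffer` API are assumptions for illustration; the source only specifies the three categories of information and a configurable size.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MemAccess:
    paddr: int
    is_write: bool
    size: int


@dataclass
class TraceEntry:
    pc: int
    # Pipeline-related information.
    func_unit: str
    src_regs: tuple
    dst_regs: tuple
    # Memory access information (None for non-memory instructions).
    mem: Optional[MemAccess] = None
    # Architectural state modified by the instruction.
    state_updates: dict = field(default_factory=dict)


class InstructionBuffer:
    """Bounded FIFO; capacity would come from a configuration file."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = deque()

    def push(self, entry):
        # Reject the entry when full so the functional model knows to wait.
        if len(self._entries) >= self.capacity:
            return False
        self._entries.append(entry)
        return True

    def pop(self):
        return self._entries.popleft() if self._entries else None
```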
The data-stream information is stored in a two-dimensional table, the MAT. The first dimension of the MAT is a hash table keyed by physical address, and each node in it is called a memory address node. Each memory address node holds a queue that preserves the order of the functional simulation model's accesses to memory, recording the order in which the cores access that memory address. Each node of the queue records the details of one access, including which core the access came from, whether it was a read or a write, and the resulting change to memory.
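A minimal sketch of this two-level structure, assuming a Python dictionary for the hash dimension and a deque for the per-address queue, might look as follows. The names `AccessRecord`, `MemoryAccessTable`, `record`, and `order_for` are hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRecord:
    core_id: int     # which core issued the access
    is_write: bool   # read or write
    value: int = 0   # change made to memory (meaningful for writes)


class MemoryAccessTable:
    """First dimension: hash keyed by physical address. Second: access queue."""

    def __init__(self):
        self._table = defaultdict(deque)

    def record(self, paddr, rec):
        # Called by the functional model, in its own execution order.
        self._table[paddr].append(rec)

    def order_for(self, paddr):
        # Access order for one address; empty if the address was never touched.
        return list(self._table.get(paddr, ()))
```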
The technical solution of the invention has the following advantages. The loosely coupled structure between the functional and timing simulation models reduces the interaction between them, increases the degree of parallel execution, and improves simulation performance; it also makes the simulator easy to extend on demand. Difference detection and adjustment between the functional and timing simulation models ensures the accuracy of the simulator.
Brief description of the drawings
Fig. 1: Simulator framework with the loosely coupled structure.
Fig. 2: Mispredicted path handling flow.
Fig. 3: Memory access information recording flow.
Fig. 4: Memory access order checking flow.
Fig. 5: Shared page table access order algorithm.
Fig. 6: General-purpose interface design.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the invention clearer, preferred embodiments of the invention are described in detail below with reference to the drawings and embodiments. It should first be noted that the terms and words used in this specification and the claims must not be interpreted restrictively according to their ordinary or dictionary meanings; they should be interpreted with meanings and concepts consistent with the technical idea of the invention, on the principle that the inventor may appropriately define the concepts of terms in order to describe the invention in the best way. Accordingly, the structures shown in the embodiments and drawings of this specification are merely preferred embodiments and do not fully represent the technical idea of the invention; it should therefore be understood that various equivalents and variations may exist that could replace them.
The invention uses a loosely coupled structure between the two core functions of a simulator, functional simulation and timing simulation (see Fig. 1), and the two communicate through a general-purpose interface (see Fig. 6). Based on the invention, the functional and timing simulation models of a simulator can be modified flexibly, and system performance can be improved further.
Implementation 1: Extending the simulation models
Compared with the frameworks of other simulators, the invention makes it easier to extend a simulator with different functional and timing simulation models. The following describes, with reference to Fig. 1, how functional simulation extensions and timing simulation extensions are carried out based on the invention. This embodiment can serve as a reference for extending a simulator based on the invention.
First, an example of extending the functional model. QEMU is a powerful functional simulator that supports functional simulation of different instruction sets. During simulation, QEMU translates instructions into simple, RISC-like TCG (Tiny Code Generator) instructions. For integration, it suffices to establish a mapping between TCG instructions and the general-purpose interface of the invention (see Fig. 6), which avoids the work of decoding complex instructions. Some complex instruction sets also need further optimization during integration. For example, the translation of x86 instructions into TCG instructions is too fine-grained: one x86 instruction often generates five or six TCG instructions, or even more. This does not correspond to a real x86 architecture, so the decoder of the PTLSim simulator is borrowed, and the micro-ops generated by the decoder included in PTLSim replace the TCG instructions. The new functional simulation model can then simulate a more realistic x86 architecture. To accommodate the characteristics of complex instruction sets, the extension also requires some adjustments to the timing simulation model. For example, an x86 instruction is often split into several micro-ops, and to guarantee the atomicity of the x86 instruction, these micro-ops must either all commit or have no effect at all on the system state, which requires extra support from the timing simulation model. In our implementation, we use the PC value of each micro-op to identify the group of micro-ops belonging to one x86 instruction. All the micro-ops in a group are committed if and only if every micro-op in the group can commit; if any one of them is flushed, all of them are flushed.
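The all-or-nothing commit rule for micro-op groups can be sketched as follows. This is an illustration under assumptions: the `MicroOp` fields and function names are hypothetical, and consecutive micro-ops with the same PC are treated as one x86 instruction (which would merge back-to-back executions of the same instruction, a case a real implementation must handle separately).

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class MicroOp:
    pc: int                # PC of the x86 instruction this micro-op came from
    ready: bool = False    # micro-op has finished and could commit
    flushed: bool = False  # micro-op was squashed


def group_decision(group):
    # One flushed micro-op flushes the whole x86 instruction.
    if any(u.flushed for u in group):
        return "flush"
    # Commit only when every micro-op in the group can commit.
    if all(u.ready for u in group):
        return "commit"
    return "wait"


def decide(rob):
    # rob: micro-ops in program order; equal consecutive PCs form one group.
    return [
        (pc, group_decision(list(group)))
        for pc, group in groupby(rob, key=lambda u: u.pc)
    ]
```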
Second, an example of extending the timing model. FPGA-based timing simulation is currently mainstream. The original approach requires customizing complex functional and timing simulation interfaces, which leads to a very large engineering effort. With the present method, the architecture-independent information transfer function of the invention can be customized for an FPGA timing simulation model; once this is done, the FPGA timing simulator can easily be connected to different functional simulators, such as Pin and QEMU.
For either functional or timing simulation, this extension work involved about 180 lines of code in total. For the functional simulation part, all of the work was completed by one student who was familiar with QEMU but new to GEMS, and took two months in total. Compared with earlier extension efforts that took several person-years, such as gem5 and MARSS, the invention is more efficient at adding new functional or timing simulation models.
Implementation 2: Parallel acceleration of the multi-core full-system simulator
To speed up the simulation of multi-core processors, the invention can use parallel execution to better exploit the multiple processors (cores) of the host platform. As shown in Fig. 1, the invention couples the functional and timing simulation models loosely, so during simulation the two need to interact little and infrequently. Consequently, once functional and timing simulation are parallelized, the two parallel modules communicate in a loosely coupled manner and achieve higher parallel performance than is possible with existing tightly coupled simulation frameworks.
Once the functional and timing simulation modules are parallelized, the two modules execute in parallel and communicate through the shared instruction buffer and memory access table structures. The functional simulation module adds information to the shared structures, and the timing simulation module reads information from the shared buffers; see Fig. 6 for details. During execution, to prevent incorrect simulation results caused by the two modules reading and writing the same unit of shared memory at the same time, a mutex mechanism can be added to the instruction buffer and memory access table structures, ensuring that at any moment a given unit of the shared buffers is accessed by only one module. To reduce access conflicts, the invention can refine the granularity of the mutex locks by applying one to each individual entry of the instruction buffer and the memory access table. That way, while the functional simulation module records access information in one entry, the timing simulation module can still access the information in other entries, which improves concurrency and reduces the performance overhead of communication.
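The per-entry locking idea can be sketched with Python's standard `threading` module. The `EntryLockedBuffer` class and the `producer` function are hypothetical names; the sketch shows one lock per slot, so a writer holding the lock of one entry does not block a reader of another entry.

```python
import threading


class EntryLockedBuffer:
    """Shared buffer with one mutex per entry instead of one global lock."""

    def __init__(self, size):
        self._slots = [None] * size
        self._locks = [threading.Lock() for _ in range(size)]

    def write(self, index, value):
        # Only this entry is locked; other entries stay accessible.
        with self._locks[index]:
            self._slots[index] = value

    def read(self, index):
        with self._locks[index]:
            return self._slots[index]


def producer(buf, count):
    # Stands in for the functional simulation module filling entries.
    for i in range(count):
        buf.write(i, i * i)
```

In use, the functional side would run `producer` in its own thread while the timing side calls `read` on entries it is ready to consume; the finer the lock granularity, the less the two threads contend.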
Claims (4)
1. A full-system simulator for parallel multi-core processors, characterized in that its framework is divided into four main parts: a functional simulation module, a timing simulation module, a general-purpose interface module, and a difference detection and adjustment module; wherein:
the functional simulation module is used to implement a specific functional simulation model, and is responsible for executing instructions and collecting execution information from applications and the operating system, decomposing each instruction into the architecture-independent instruction-stream and data-stream information required for timing simulation, and writing that information to the general-purpose interface between the functional simulation module and the timing simulation module, through which it is delivered to the timing simulation module;
the timing simulation module is used to implement a specific timing simulation model, and is responsible for obtaining the interface information of an instruction from the general-purpose interface, performing timing simulation on the instruction, and updating the architectural state information;
the general-purpose interface module contains an instruction buffer and a memory access table (MAT) structure; the instruction buffer is mainly used to store the instruction-stream information passed on by the functional simulation model, and the MAT is mainly used to store the data-stream information of memory accesses;
the difference detection and adjustment module is mainly responsible for comparing the behavior of functional simulation and timing simulation for inconsistencies and, according to the cause of an inconsistency, calling the corresponding processing module to make the adjustment.
2. The full-system simulator for parallel multi-core processors according to claim 1, characterized in that, for the four kinds of factors that affect accuracy while the simulator is running, the simulator further comprises a wrong-path processing module, an exception and interrupt processing module, a detection and correction module for shared memory accesses, and a detection and correction module for shared page table accesses, in order to correct the errors introduced when these factors occur; wherein:
the wrong-path processing module handles the inconsistency between the execution paths of the functional simulation module and the timing simulation module when an exception or interrupt occurs; when a branch misprediction occurs in the timing simulation model, the module creates for the functional simulation module a lightweight wrong-path system whose state is fully consistent with the current system state of the timing simulation module, and continues to execute in this system the instruction path predicted by the branch predictor; after the mispredicted instructions have completed, they are not committed and wrong-path simulation exits directly, so that the state of the actual pipeline is not changed;
the exception and interrupt processing module handles the inconsistency between the execution paths of the functional simulation module and the timing simulation module when an exception or interrupt occurs; when an interrupt or exception occurs, the module creates for the functional simulation module a lightweight wrong-path system whose state is fully consistent with the current system state of the timing simulation module, and continues to execute in this system the instruction path that follows the interrupt or exception; after the wrong-path instructions have completed, they are not committed and wrong-path simulation exits directly, so that the state of the actual pipeline is not changed;
the detection and correction module for shared memory accesses detects and corrects inconsistencies between the memory access order of the functional simulation model, which executes ahead, and that of the timing simulation model, in order to correct the timing inaccuracy caused by an inconsistent shared memory access order and to avoid inaccurate simulation results; the module avoids such inconsistencies in two steps: first, while the functional simulation model executes, the order in which each processor core accesses memory is recorded; then, when a memory instruction is committed in the timing simulation model, the module checks whether the timing simulation model accesses shared memory in the same order as the functional simulation model, thereby determining whether the two executions conflict; if a conflict occurs, the functional simulation model must be rolled back according to the system state of the timing simulation model;
the detection and correction module for shared page table accesses simplifies the handling of conflicts in the shared page table access order; when the functional simulation module encounters an MMU miss, it stops executing instructions and resumes only when the timing simulation module commits that instruction, that is, when execution jumps to the correct MMU miss handling path, so that the functional simulation module and the timing simulation module remain consistent in the order in which they access the shared page table.
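The two-step check on shared memory access order in claim 2 might be sketched as follows. This is a simplification under stated assumptions: every access to an address is treated as order-sensitive (reads and writes are not distinguished), and "rollback" is reduced to discarding the recorded order. The class `AccessOrderChecker` and its method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict


class AccessOrderChecker:
    """Compares the commit order in the timing model with the functional order."""

    def __init__(self) -> None:
        # Physical address -> core ids in the order the functional model accessed it.
        self._expected: Dict[int, Deque[int]] = {}

    def record_functional(self, addr: int, core: int) -> None:
        """Step 1: remember the access order seen during functional execution."""
        self._expected.setdefault(addr, deque()).append(core)

    def check_commit(self, addr: int, core: int) -> bool:
        """Step 2: at commit time, return True if the order matches, False on conflict."""
        queue = self._expected.get(addr)
        if queue and queue[0] == core:
            queue.popleft()
            return True
        return False

    def pending(self, addr: int) -> int:
        return len(self._expected.get(addr, ()))

    def rollback(self) -> None:
        """On conflict, drop the stale order; the functional model would then
        re-execute from the timing model's state (not modeled here)."""
        self._expected.clear()
```

A caller would invoke `check_commit` for each committed memory instruction and trigger `rollback` when it returns `False`.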
3. The full-system simulator for parallel multi-core processors according to claim 2, characterized in that, when an instruction needs to be simulated, the simulation flow of the system is as follows: first, the functional simulation module executes the instruction being simulated; then, the functional simulation module passes the information of the instruction itself and the execution information required for timing simulation to the timing simulation module through the common interface; next, when the timing simulation module is about to simulate this instruction, it reads the relevant information of the instruction from the common interface and performs timing simulation on it; during timing simulation, the difference detection and adjustment module detects events that affect simulation accuracy and, according to the accuracy-affecting factor detected, calls the wrong-path processing module, the detection and correction module for shared memory accesses, the exception and interrupt processing module, or the detection and correction module for shared page table accesses to correct the functional simulation model, ensuring that the path simulated by the functional simulation model is consistent with the path simulated by the timing simulation model and thereby ensuring simulation accuracy; finally, when the instruction has been committed in the timing simulation module, the architectural state of the timing simulation model is updated.
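The flow in claim 3 can be traced with a small sketch. It assumes lock-step processing of one instruction at a time (the claimed functional model may run ahead), models an instruction as a register write, and represents the wrong-path system as a deep copy of the architectural state that is discarded without commit. All names (`Event`, `simulate`, `run_wrong_path`) are hypothetical.

```python
import copy
from enum import Enum, auto
from typing import Callable, Dict, List, Sequence, Tuple

Instr = Tuple[str, int]  # (destination register, value written); illustrative only


class Event(Enum):
    NONE = auto()
    BRANCH_MISPREDICT = auto()
    EXCEPTION_OR_INTERRUPT = auto()
    SHARED_MEMORY_CONFLICT = auto()
    SHARED_PAGE_TABLE = auto()


def run_wrong_path(arch_state: Dict[str, int], wrong_path: Sequence[Instr]) -> Dict[str, int]:
    """Execute wrong-path instructions on a throwaway copy; nothing is committed."""
    scratch = copy.deepcopy(arch_state)
    for dst, value in wrong_path:
        scratch[dst] = value
    return scratch  # callers discard this; the real state is untouched


def simulate(
    program: Sequence[Instr],
    detect: Callable[[Instr], Event],
    wrong_path: Sequence[Instr],
) -> Tuple[Dict[str, int], List[str]]:
    arch_state: Dict[str, int] = {}
    trace: List[str] = []
    for instr in program:
        trace.append("functional_execute")  # functional model runs the instruction
        trace.append("interface_write")     # record handed over via the common interface
        trace.append("timing_read")         # timing model picks the record up
        event = detect(instr)               # difference detection during timing simulation
        if event is Event.BRANCH_MISPREDICT:
            run_wrong_path(arch_state, wrong_path)
            trace.append("wrong_path")
        elif event is not Event.NONE:
            trace.append(event.name.lower())  # placeholder for the other handlers
        dst, value = instr
        arch_state[dst] = value             # architectural state changes only at commit
        trace.append("commit")
    return arch_state, trace
```

Keeping the architectural state update at the commit step is what lets the wrong-path copy be thrown away without side effects.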
4. The full-system simulator for parallel multi-core processors according to claim 3, characterized in that the design of the common interface module does not depend on the functional simulation model or on information specific to a particular instruction set architecture; in addition, since the timing simulation model mainly simulates pipeline dependences and the storage organization, the common interface module contains the information on these two aspects that is used for timing simulation; the main function of the common interface is to transfer the instruction stream and data stream information between the functional simulation module and the timing simulation module; wherein:
the instruction stream information is recorded in the instruction buffer, whose size can be set through a configuration file; each entry of the buffer holds the information an instruction needs for timing simulation and is filled in after the functional simulation model has finished simulating an instruction; each entry can be regarded as the trace of the corresponding instruction and mainly includes three kinds of information: pipeline-related information, memory access information, and the modified architectural state information;
the data stream information is stored in a two-dimensional table, the MAT; the first dimension of the MAT is a hash table keyed by physical address, each node of which is called a memory address node; each memory address node owns a queue that preserves the order in which the functional simulation model accesses memory, recording the order in which the cores access that memory address; each node of the queue records the specific information of one access, including which core the access comes from, whether it is a read or a write operation, and the change made to memory.
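The two-dimensional MAT of claim 4 maps naturally onto a hash table of queues. The sketch below assumes Python's built-in `dict` as the hash table keyed by physical address and a `deque` as the per-address access queue; the field names of `AccessRecord` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional


@dataclass(frozen=True)
class AccessRecord:
    """One node of an access order queue (field names are illustrative)."""
    core: int
    is_write: bool
    new_value: Optional[int] = None  # change made to memory, for writes


class MemoryAccessTable:
    """First dimension: hash keyed by physical address.
    Second dimension: per-address queue of accesses in functional order."""

    def __init__(self) -> None:
        self._nodes: Dict[int, Deque[AccessRecord]] = {}

    def record(self, phys_addr: int, rec: AccessRecord) -> None:
        self._nodes.setdefault(phys_addr, deque()).append(rec)

    def order(self, phys_addr: int) -> List[int]:
        """Core ids in the order they accessed this address."""
        return [rec.core for rec in self._nodes.get(phys_addr, ())]

    def writes(self, phys_addr: int) -> List[AccessRecord]:
        return [rec for rec in self._nodes.get(phys_addr, ()) if rec.is_write]
```

Keying by physical rather than virtual address means accesses from different cores to the same location land in the same queue.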
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611108730.9A CN106775597A (en) | 2016-12-06 | 2016-12-06 | A kind of parallel multi-core total system simulator of Loosely Coupled Architecture |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106775597A true CN106775597A (en) | 2017-05-31 |
Family
ID=58874466
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611108730.9A Pending CN106775597A (en) | 2016-12-06 | 2016-12-06 | A kind of parallel multi-core total system simulator of Loosely Coupled Architecture |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106775597A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101196827A (en) * | 2007-12-28 | 2008-06-11 | 中国科学院计算技术研究所 | Parallel simulator and method |
CN101477474A (en) * | 2009-01-04 | 2009-07-08 | 中国科学院计算技术研究所 | Combined simulation system and its operation method |
CN104361183A (en) * | 2014-11-21 | 2015-02-18 | 中国人民解放军国防科学技术大学 | Microprocessor micro system structure parameter optimizing method based on simulator |
CN105094949A (en) * | 2015-08-06 | 2015-11-25 | 复旦大学 | Method and system for simulation based on instruction calculation model and feedback compensation |
CN106033368A (en) * | 2015-03-09 | 2016-10-19 | 北京大学 | A multi-core virtual machine determinacy replay method |
Non-Patent Citations (1)
Title |
---|
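Hu Yibin (胡益斌): "Research on a scalable, cycle-accurate, fast multi-core simulator" (可扩展、周期精确、快速多核模拟器研究), China Master's Theses Full-text Database, Information Science and Technology series * |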
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107589960A (en) * | 2017-08-30 | 2018-01-16 | 北京轩宇信息技术有限公司 | A kind of DSP instruction simulation methods based on register access collision detection |
CN108509373A (en) * | 2018-03-19 | 2018-09-07 | 复旦大学 | A kind of total system analog platform towards SoC research and development of software |
CN113609066A (en) * | 2021-06-25 | 2021-11-05 | 深圳大学 | Multi-core RISCV-CPU simulator based on Rust |
CN113609066B (en) * | 2021-06-25 | 2024-04-12 | 深圳大学 | Multi-core RISCV-CPU simulator based on Rust |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20170531 |