CN107562688A - A fully hardware data-driven device for on-chip multi-core communication - Google Patents
- Publication number
- CN107562688A (application number CN201710803853.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- function
- module
- driven unit
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a fully hardware data-driven device for on-chip multi-core communication. The device is arranged between the system bus and the processor and comprises a packet receiving module, a packet sending module, a data completeness check module, a relation generator, and a register group. Because the data-driven device drives processor execution, the timing overhead of a software implementation is avoided. Implementing the device entirely in hardware accelerates the data-driven mechanism and improves driving efficiency. The device is independent of the processor architecture, and in particular of the instruction set. With this device the system can support functional-language-style programming, which makes it easier to exploit the efficiency of multiple processors and to write more complex control algorithms; functions with no data dependence can run fully in parallel, so function-level concurrency and out-of-order execution are supported.
Description
Technical field
The invention belongs to the technical field of on-chip multi-core communication, and in particular relates to a fully hardware data-driven device for on-chip multi-core communication.
Background
In the multi-core era, the data exchange mechanism between processor cores has become a key factor in system performance. On the one hand, a high-performance system may contain processing cores that carry out large amounts of computation and provide powerful computing capability; on the other hand, typical applications have heavy data communication demands, and the result produced by each processor is likely to be exchanged with other processors. As technology has developed, the bandwidth available between processor cores can match their computing capability, but two main problems remain: (1) Complex program control overhead. The communication structure introduces complex program control overhead, which reduces communication efficiency. (2) Difficulty of use. Communication code has to be specially optimized by the programmer, so the degree of manual involvement is high and use is cumbersome, which hinders porting single-core programs to multi-core programs and debugging them. How a software system can exploit the architecture to send and receive data quickly has become a research hotspot.
Invention patent application No. 201510210877.8 discloses an on-chip communication method and data communication device for multi-core DSPs. Each DSP core is equipped with an independent data communication device, through which it sends transactions to the destination DSP core and receives and sends data. This frees the processor to some extent, but it requires a complex global transaction management unit, and both reception and transmission by the data communication device are passive, so the processor has to perform complex management and control, which reduces communication efficiency to some extent.
Invention patent application No. 201110164421.4 discloses an inter-core communication method for FPGA-based multi-core systems. It provides two communication structures: adjacent processor cores communicate point to point, while non-adjacent cores communicate through a shared buffer, a mailbox, and a mutex. Although this structure can provide a suitable communication structure on demand, it also needs software cooperation and therefore carries some program control overhead.
Another way to address program control overhead and ease of use is to start from the hardware structure. The data-driven computer (data-flow computer) was proposed in 1972 by Jack Dennis of the Massachusetts Institute of Technology. It departs from the traditional von Neumann computer model, and its ultimate aim is parallel processing to increase computing speed; after some 40 years of practice, the model has proven effective for parallelism. Unlike a traditional von Neumann computer, a data-flow computer does not use instructions to drive operations. Instead, data (operands) drive execution: an operation starts executing if and only if all the operands it needs have arrived. If many operations satisfy this condition, they can execute concurrently without being constrained by instruction order, which fully exposes parallelism. Data flow is a computation model based on "asynchrony" and "functionality". "Asynchrony" means that a data-flow operation starts executing as soon as it has received its operands; "functionality" means that each data-flow operation consumes one set of inputs and produces one set of outputs with no side effects. Clearly, asynchrony is the basis for exposing parallelism in data flow, and functionality guarantees that any two concurrent operations can execute in any order without interfering with each other.
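The firing rule described above can be illustrated with a small software sketch. This is only an illustration of the general data-flow idea, not the hardware of the invention; the `Node` class, the `run` function, and the three-node graph are hypothetical names chosen for the example.

```python
from collections import deque


class Node:
    """One data-flow operation: it fires only when every operand has arrived."""

    def __init__(self, arity, fn):
        self.arity = arity      # number of operands the operation needs
        self.fn = fn            # side-effect-free function to apply
        self.operands = {}      # operand index -> value received so far

    def receive(self, index, value):
        """Store one operand; return True once the operation is ready to fire."""
        self.operands[index] = value
        return len(self.operands) == self.arity

    def fire(self):
        """Consume the operands and produce the result."""
        args = [self.operands[i] for i in range(self.arity)]
        self.operands.clear()
        return self.fn(*args)


def run(nodes, edges, tokens):
    """Deliver tokens in arrival order and fire any node whose operands are complete."""
    pending = deque(tokens)     # each token is (node name, operand index, value)
    results = {}
    while pending:
        name, index, value = pending.popleft()
        node = nodes[name]
        if node.receive(index, value):
            out = node.fire()
            results[name] = out
            # Forward the result to every successor operand slot.
            for succ, succ_index in edges.get(name, []):
                pending.append((succ, succ_index, out))
    return results


nodes = {
    "add": Node(2, lambda a, b: a + b),
    "mul": Node(2, lambda a, b: a * b),
    "sub": Node(2, lambda a, b: a - b),
}
edges = {"add": [("sub", 0)], "mul": [("sub", 1)]}
# Operands arrive interleaved; execution order follows data arrival, not program order.
tokens = [("mul", 0, 3), ("add", 0, 1), ("mul", 1, 4), ("add", 1, 2)]
results = run(nodes, edges, tokens)
```

Here `mul` fires before `add` simply because its operands are complete first, and `sub` fires only after both predecessors have delivered their results.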
However, existing commercial multi-core processors generally have no such mechanism, and their communication systems and memory systems are designed independently of each other. The data-driven processors studied in the past carry only an instruction-level data completeness detection module, whose granularity is too fine to interact with the on-chip communication system.
Summary of the invention
The technical problem to be solved by the invention is to overcome the defects of the prior art. Building on existing research results in data-driven computer software and hardware, and taking into account the development trend of multi-core microcontrollers, the invention proposes a fully hardware data-driven device that works together with the on-chip communication network. Hardware with this structure is organized as a hybrid data-flow computer with classical processor cores as its processing units, so that users can program in the style of a functional programming language. Inter-core communication latency is hidden behind the execution time of functions, and external memory access latency is hidden by exploiting the spatial and temporal distribution of functions, so that performance is improved at reasonable power consumption.
To solve the above technical problem, the invention provides a fully hardware data-driven device for on-chip multi-core communication, characterized in that it is arranged between the system bus and the processor and comprises a packet receiving module, a packet sending module, a data completeness check module, a relation generator, and a register group;
The packet receiving module receives the data addressed to it from the system bus; at the same time, the relation generator in the packet receiving module uses the function label and the parameter sequence number to calculate the address of the data in the parameter memory of the data completeness check module, and the valid data are stored at that address;
The flag generator in the data completeness check module uses the write-operation signal to set the valid flag of the data; if all the data required by a function are marked valid, the function label enters the function queue in the data completeness check module, and when the function queue is not empty, a request signal is generated through an interrupt request to ask the processor to execute;
After receiving the request signal, the processor takes the function label out of the function queue, calculates the entry address of the corresponding program, and carries out the function computation;
During computation the relevant data are read from the parameter memory, and the flag generator uses the read-operation signal to clear the valid bit of the data, which then waits for the next trigger;
After the function computation completes, the result is sent to the successor functions.
The steps executed by the processor are:
The result is written to the result register of the packet sending module, and the successor function information is then fetched from the relation memory of the data completeness check module through the corresponding address space; the bypass control module uses this information to decide whether the data are local or external; local data are sent to the packet receiving module over the data-sending bypass, while external data are packetized and sent on for processing by the subsequent processor.
In the data completeness check module, the function queue is a FIFO, the parameter memory and the relation memory are ordinary SRAMs, and the flag memory is a 1-bit memory; at initialization the flag memory sets to 1 the flag bits of unused memory cells and clears the flag bits that are in use.
The relation generator divides the parameter memory and the relation memory into different slots according to function label and parameter label; each slot stores parameters, and the number of parameters is controlled by the relation generator.
After the processor finishes executing a function, it uses the function label to find the successor function information in the relation memory; the relation memory uses an "end" marker to indicate the end of the relations.
The function label is the unique number of the corresponding code segment inside the processor core and is used to associate the code segment, the function's parameters, and the successor functions.
The parameter sequence number binds specific data to a function label.
Beneficial effects of the invention:
The core of data-driven execution is to establish the link between data and the operations that use them. This is normally done in software, which reduces computing performance. The invention uses an independent, fully hardware data-driven device and introduces three concepts: the function label, the parameter sequence number, and the "relation". The function label is the unique number of the corresponding code segment inside the local processor core and is used to associate the code segment, the parameters, and the successor functions. The parameter sequence number binds specific data to a function label. The "relation" characterizes the data association between a function and its successor functions. The corresponding hardware modules are the data completeness detection module and the relation memory. The advantages of the invention are:
(1) The data-driven device drives processor execution, which avoids the timing overhead of a software implementation.
(2) The data-driven device is implemented entirely in hardware, which accelerates the data-driven mechanism and improves driving efficiency.
(3) The device is independent of the processor architecture, and in particular of the instruction set. The processor can be a superscalar processor or the simplest non-pipelined processor.
(4) With this device the system can support functional-language-style programming, which makes it easier to exploit the efficiency of multiple processors and to write more complex control algorithms; functions with no data dependence can run fully in parallel, so function-level concurrency and out-of-order execution are supported.
Brief description of the drawings
Fig. 1 shows the position of the data-driven device in the system;
Fig. 2 is a structural diagram of the data-driven device;
Fig. 3 shows the data completeness detection module;
Fig. 4 shows the pipeline stages of the receive operation;
Fig. 5 shows the pipeline stages of the send operation.
Detailed description of the embodiments
The invention is further described below with reference to the drawings. The following embodiments serve only to illustrate the technical solution of the invention more clearly and do not limit its scope of protection.
<One>, device structure
As shown in Fig. 1 and Fig. 2, the fully hardware data-driven device forms a layer between the bus and the processor in a system application. It comprises a packet receiving module, a packet sending module, a data completeness check module, a relation generator, and a register group, and it implements packet reception, packet transmission, data completeness detection, and driving the CPU to execute functions. The receive process unpacks packets, checks data completeness, and drives CPU execution; the send process packetizes and sends the data produced by the CPU. A bypass function can be realized in the send process.
The workflow is as follows:
The packet receiving module first receives the data addressed to it from the system bus. At the same time, the relation generator in the packet receiving module uses the function label and the parameter sequence number to calculate the address of the data in the parameter memory of the data completeness check module, and the valid data are stored at that address. The function label is the unique number of the corresponding code segment inside the local processor core (the processor connected to the corresponding interface of the data-driven device) and is used to associate the code segment, the function's parameters, and the successor functions. The parameter sequence number binds specific data to a function label. Because this step involves one write to the parameter memory, the flag generator in the data completeness check module can use the write-operation signal to set the valid flag of the data. If all the data required by a function are marked valid, the function label enters the function queue in the data completeness check module. When the function queue is not empty, a request signal is generated through an interrupt request to ask the processor (the one connected to the corresponding interface of the data-driven device) to execute. After receiving the request signal, the processor takes the function label out of the function queue, calculates the entry address of the corresponding program, and carries out the function computation. During computation the relevant data have to be read from the parameter memory; the flag generator uses the read-operation signal to clear the valid bit of the data, which then waits for the next trigger.
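The receive path can be modelled in software to make the flag handling concrete. The following is a behavioural sketch only, not the circuit of the invention: the class name `CompletenessChecker`, the slot size of 4, and the use of Python lists for the parameter and flag memories are assumptions made for the example. Flag bits of unused cells are preset to 1 and used cells start at 0, following the initialization rule stated in the text.

```python
from collections import deque


class CompletenessChecker:
    """Behavioural model of the parameter memory, flag memory and function queue."""

    def __init__(self, arity_by_function, slot_size=4):
        self.param_mem = {}     # function label -> list of parameter values
        self.flags = {}         # function label -> list of 1-bit valid flags
        self.queue = deque()    # FIFO of function labels that are ready to run
        for func, arity in arity_by_function.items():
            self.param_mem[func] = [0] * slot_size
            # Used cells start cleared (0); unused cells are preset to 1.
            self.flags[func] = [0] * arity + [1] * (slot_size - arity)

    def write(self, func, param, value):
        """A write stores the datum and sets its valid flag."""
        flags = self.flags[func]
        was_complete = all(flags)
        self.param_mem[func][param] = value
        flags[param] = 1
        # Enqueue the function label once, at the moment the slot becomes complete.
        if all(flags) and not was_complete:
            self.queue.append(func)

    def interrupt_pending(self):
        """The request to the processor is raised while the queue is not empty."""
        return len(self.queue) > 0

    def read(self, func, param):
        """A read by the processor returns the datum and clears its valid flag."""
        self.flags[func][param] = 0
        return self.param_mem[func][param]


checker = CompletenessChecker({0: 3})   # function label 0 takes three parameters
checker.write(0, 0, 10)
checker.write(0, 2, 30)
pending_before = checker.interrupt_pending()
checker.write(0, 1, 20)                 # last parameter arrives
pending_after = checker.interrupt_pending()
func = checker.queue.popleft()          # processor takes the function label
args = [checker.read(func, i) for i in range(3)]
```

Parameters may arrive in any order; the function label is queued only when the last missing one has been written, and reading the parameters re-arms the slot for the next trigger.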
After the function computation completes, the result has to be sent to the successor functions. The processor first writes the result to the result register of the packet sending module, then fetches the successor function information from the relation memory of the data completeness check module through the corresponding address space. The bypass control module uses this information to decide whether the data are local or external. Local data are sent to the packet receiving module over the data-sending bypass; external data are packetized and sent on for processing by the subsequent processor.
<Two>, data completeness detection module
The basic principle of the data completeness check is shown in Fig. 3. It needs three kinds of physical storage element: the function queue is a FIFO, the parameter memory and the relation memory are ordinary SRAMs, and the flag memory is a 1-bit memory. The relation generator flexibly divides the parameter memory and the relation memory into different slots according to function label and parameter label. Each slot stores parameters with a bit width of 32. The number of parameters each function supports is variable and is controlled by the relation generator, up to a maximum of 128 32-bit parameters.
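The text does not give the address formula used by the relation generator. As a minimal sketch, assuming every function owns a contiguous slot of equal size and parameters are 32-bit words, the address could be derived as below; `param_address`, `slot_size`, and the byte-addressed layout are assumptions for illustration, while the limits of 128 parameters and 32 bits come from the text.

```python
MAX_PARAMS = 128    # upper bound on parameters per function stated in the text
WORD_BYTES = 4      # each parameter is 32 bits wide


def param_address(func_label, param_index, slot_size=MAX_PARAMS):
    """Byte address of one parameter, assuming equal-sized contiguous slots."""
    if not 0 < slot_size <= MAX_PARAMS:
        raise ValueError("slot_size must be between 1 and 128")
    if func_label < 0:
        raise ValueError("func_label must be non-negative")
    if not 0 <= param_index < slot_size:
        raise ValueError("param_index is outside the slot")
    word_index = func_label * slot_size + param_index
    return word_index * WORD_BYTES


addr_first = param_address(0, 0)                 # first word of the first slot
addr_full = param_address(2, 5)                  # full-size slots of 128 words
addr_small = param_address(1, 3, slot_size=8)    # smaller configured slots
```

With a power-of-two slot size the multiplication reduces to concatenating the function label and the parameter index, which is one reason such a layout is convenient in hardware.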
After the processor finishes executing a function, it uses the function label to find the successor function information in the relation memory. The relation memory uses an "END" marker to indicate the end of the relations. As shown in Fig. 3, function f has two successors: parameter 1 of function g and parameter 2 of function h. The packet sending module uses the slot address given by the function label to read the first entry, that is, the first "relation". If it is not "END", it is a valid successor and is passed to the packet sending module; if it is "END", all the data required by the successor functions have been sent.
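The traversal of one relation slot, together with the local/external decision made by the bypass control module, can be sketched as follows. The entry layout `(core_id, func_label, param_index)`, the `END` sentinel value, and the core numbers are assumptions for the example; the text only states that each relation names a successor and that an "END" marker closes the list.

```python
END = None  # stands in for the "END" marker of the relation memory


def dispatch(result, relations, local_core):
    """Walk one relation slot and route the result to each successor.

    Returns two lists of packets: those taking the bypass to the local
    receiving module, and those to be packetized for the system bus.
    """
    local, external = [], []
    for entry in relations:
        if entry is END:                # no more successors
            break
        core_id, func_label, param_index = entry
        packet = (core_id, func_label, param_index, result)
        if core_id == local_core:
            local.append(packet)        # bypass path, stays on this core
        else:
            external.append(packet)     # goes out on the system bus
    return local, external


# Function f has two successors: parameter 1 of g and parameter 2 of h.
relations_f = [(0, "g", 1), (1, "h", 2), END]
local, external = dispatch(42, relations_f, local_core=0)
```

In this example g is assumed to run on the same core as f, so its packet takes the bypass, while the packet for h is sent out over the bus.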
At initialization, the flag memory sets to 1 the flag bits corresponding to unused cells of the parameter memory and the relation memory, and clears the flag bits that are in use. Function f in Fig. 3 has three parameters; as the three parameters arrive, the flag memory sets the corresponding bits to 1 one by one, and when all the flags of the slot are 1, the function label enters the function queue and waits for the processor to execute it.
<Three>, pipeline division
The pipeline with which the fully hardware data-driven device receives one datum is shown in Fig. 4 and has three stages. The first stage receives the function label and the parameter sequence number and generates the parameter memory address. The second stage receives the data, stores them in the parameter register, and updates the corresponding parameter flag to check whether any function's data are complete. The third stage, on detecting that a function's data are complete, pushes its label into the queue, generates the queue-not-empty signal, updates the status register flag, and raises the interrupt signal; a new function label and parameter sequence number can be received at this point.
The pipeline of the send operation is shown in Fig. 5 and also has three stages. The first stage reads the destination processor label from the relation memory, determines whether the first relation is "END", and writes the result to a register. The second stage, if the first relation is not "END", writes its ID, function label, and data sequence number to the tag register and hands them to the bypass controller for a decision; it also reads the next entry of the relation memory and determines whether it is "END". In the third stage the bypass controller forwards local-core data to the receiving module over the bypass and sends non-local data to the bus; the ID, function label, and data sequence number of the second relation are written to the tag register and handed to the bypass controller, and the third entry of the relation memory is read and checked for "END". When multiple successor data are sent, each additional datum adds one cycle.
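The cost of sending to several successors can be written down as simple arithmetic. This sketch assumes that each pipeline stage takes one cycle, which the text does not state explicitly; `send_cycles` is a hypothetical helper name.

```python
PIPELINE_STAGES = 3  # the send pipeline has three stages


def send_cycles(successors):
    """Cycles to send one result to `successors` destinations,
    assuming one cycle per stage and one extra cycle per additional datum."""
    if successors < 1:
        raise ValueError("at least one successor is required")
    return PIPELINE_STAGES + (successors - 1)


one_successor = send_cycles(1)
two_successors = send_cycles(2)
```

Under this assumption, function f of Fig. 3 with its two successors would need one cycle more than a function with a single successor.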
The above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art can make improvements and modifications without departing from the technical principle of the invention, and such improvements and modifications shall also be regarded as falling within the scope of protection of the invention.
Claims (7)
1. A fully hardware data-driven device for on-chip multi-core communication, characterized in that it is arranged between the system bus and the processor and comprises a packet receiving module, a packet sending module, a data completeness check module, a relation generator, and a register group;
the packet receiving module receives the data addressed to it from the system bus; at the same time, the relation generator in the packet receiving module uses the function label and the parameter sequence number to calculate the address of the data in the parameter memory of the data completeness check module, and the valid data are stored at that address;
the flag generator in the data completeness check module uses the write-operation signal to set the valid flag of the data; if all the data required by a function are marked valid, the function label enters the function queue in the data completeness check module, and when the function queue is not empty, a request signal is generated through an interrupt request to ask the processor to execute;
after receiving the request signal, the processor takes the function label out of the function queue, calculates the entry address of the corresponding program, and carries out the function computation;
during computation the relevant data are read from the parameter memory, and the flag generator uses the read-operation signal to clear the valid bit of the data, which then waits for the next trigger;
after the function computation completes, the result is sent to the successor functions.
2. The fully hardware data-driven device for on-chip multi-core communication according to claim 1, characterized in that the steps executed by the processor are:
the result is written to the result register of the packet sending module, and the successor function information is then fetched from the relation memory of the data completeness check module through the corresponding address space; the bypass control module uses this information to decide whether the data are local or external; local data are sent to the packet receiving module over the data-sending bypass, while external data are packetized and sent on for processing by the subsequent processor.
3. The fully hardware data-driven device for on-chip multi-core communication according to claim 1, characterized in that, in the data completeness check module, the function queue is a FIFO, the parameter memory and the relation memory are ordinary SRAMs, and the flag memory is a 1-bit memory; at initialization the flag memory sets to 1 the flag bits of unused memory cells and clears the flag bits that are in use.
4. The fully hardware data-driven device for on-chip multi-core communication according to claim 1 or 3, characterized in that the relation generator divides the parameter memory and the relation memory into different slots according to function label and parameter label; each slot stores parameters, and the number of parameters is controlled by the relation generator.
5. The fully hardware data-driven device for on-chip multi-core communication according to claim 1 or 3, characterized in that, after the processor finishes executing a function, it uses the function label to find the successor function information in the relation memory, and the relation memory uses an "end" marker to indicate the end of the relations.
6. The fully hardware data-driven device for on-chip multi-core communication according to claim 1, characterized in that the function label is the unique number of the corresponding code segment inside the processor core and is used to associate the code segment, the function's parameters, and the successor functions.
7. The fully hardware data-driven device for on-chip multi-core communication according to claim 1, characterized in that the parameter sequence number binds specific data to a function label.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710803853.2A CN107562688A (en) | 2017-09-08 | 2017-09-08 | A fully hardware data-driven device for on-chip multi-core communication |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710803853.2A CN107562688A (en) | 2017-09-08 | 2017-09-08 | A fully hardware data-driven device for on-chip multi-core communication |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107562688A true CN107562688A (en) | 2018-01-09 |
Family
ID=60979818
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710803853.2A Pending CN107562688A (en) | 2017-09-08 | 2017-09-08 | A fully hardware data-driven device for on-chip multi-core communication |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107562688A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102521201A (en) * | 2011-11-16 | 2012-06-27 | 刘大可 | Multi-core DSP (digital signal processor) system-on-chip and data transmission method |
CN103218344A (en) * | 2013-04-28 | 2013-07-24 | 上海大学 | Data communication circuit arranged among a plurality of processors and adopting data driving mechanism |
Non-Patent Citations (2)
Title |
---|
徐云川: ""基于交叉点缓存路由的数据驱动多核处理器研究"", 《中国优秀硕士学位论文全文数据库》 * |
王镇: ""数据驱动交点队列型片上路由器研究"", 《中国优秀硕士学位论文全文数据库》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108701040A (en) | Method, apparatus, and instructions for user-level thread suspension | |
CN106648554B (en) | For improving system, the method and apparatus of the handling capacity in continuous transactional memory area | |
CN104603748B (en) | The processor of instruction is utilized with multiple cores, shared core extension logic and the extension of shared core | |
CN104050023B (en) | System and method for realizing transaction memory | |
TWI294573B (en) | Apparatus and method for controlling establishing command order in an out of order dma command queue, and computer readable medium recording with related instructions | |
CN105760140B (en) | The instruction and logic of state are executed for testing transactional | |
CN101529377B (en) | The methods, devices and systems of communication between multithreading in processor | |
CN105453041B (en) | The method and apparatus for determining and instructing scheduling are occupied for cache | |
CN105612502B (en) | Virtually retry queue | |
CN108388528A (en) | Hardware based virtual machine communication | |
CN108268386A (en) | Memory order in accelerating hardware | |
CN106575218A (en) | Persistent store fence processors, methods, systems, and instructions | |
CN107667358A (en) | For the coherent structure interconnection used in multiple topological structures | |
CN108268282A (en) | Be used to check and store to storage address whether processor, method, system and the instruction of the instruction in long-time memory | |
CN108292239A (en) | Multi-core communication acceleration using hardware queue devices | |
CN104813279B (en) | For reducing the instruction of the element in the vector registor with stride formula access module | |
CN102135949B (en) | Computing network system, method and device based on graphic processing unit | |
CN102073543B (en) | General processor and graphics processor fusion system and method | |
CN107092573A (en) | Work in heterogeneous computing system is stolen | |
CN104025067B (en) | With the processor for being instructed by vector conflict and being replaced the shared full connection interconnection of instruction | |
CN105190538B (en) | System and method for the mobile mark tracking eliminated in operation | |
CN106708753A (en) | Acceleration operation device and acceleration operation method for processors with shared virtual memories | |
CN101454753A (en) | Handling address translations and exceptions for heterogeneous resources | |
CN105426160A (en) | Instruction classified multi-emitting method based on SPRAC V8 instruction set | |
CN103309786A (en) | Methods and apparatus for interactive debugging on a non-pre-emptible graphics processing unit |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180109 |