CN107562688A - All-hardware data-driven device for on-chip multi-core communication - Google Patents

Info

Publication number
CN107562688A
Authority
CN
China
Prior art keywords
data
function
module
driven unit
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710803853.2A
Other languages
Chinese (zh)
Inventor
王镇
张磊
汪健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North Electronic Research Institute Anhui Co., Ltd.
Original Assignee
North Electronic Research Institute Anhui Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an all-hardware data-driven device for on-chip multi-core communication. The device is arranged between the system bus and a processor and comprises a packet receiving module, a packet sending module, a data completeness check module, a relation generator and a register group. The data-driven device drives the execution of the processor, which avoids the timing overhead of a software implementation. Because the device is implemented entirely in hardware, data-driven operation is accelerated and drive efficiency improves. The device is independent of the processor architecture, and in particular of the instruction set. With this device, the system can support functional-language-style programming, which makes it easier to exploit the efficiency of multiple processors and to write more complex control-algorithm programs. Functions with no data dependence on each other can run fully in parallel, so function-level concurrency and out-of-order execution are supported.

Description

All-hardware data-driven device for on-chip multi-core communication
Technical field
The invention belongs to the technical field of on-chip multi-core communication, and in particular relates to an all-hardware data-driven device for on-chip multi-core communication.
Background
In the multi-core era, the data exchange mechanism between the processor cores of a system has become a key factor affecting performance. On the one hand, a high-performance system may contain processor cores that must perform large amounts of computation, in order to provide powerful computing capability. On the other hand, typical applications have very large data communication demands, and the result computed by each processor is likely to be exchanged with other processors. As technology has developed, the bandwidth available between processor cores can match their computing capability, but two main problems remain: (1) Complex program control overhead. The communication structure has to introduce complex program control overhead, which reduces communication efficiency. (2) Difficult to use. Communication programs must be specially optimized by programmers; the degree of manual involvement is high and usage is cumbersome, which hinders porting single-core programs to multi-core programs and debugging them. How a software system can exploit the advantages of its architecture to send and receive data quickly and efficiently has become a research hotspot.
The invention patent with application number 201510210877.8 discloses an on-chip communication method and data communication device for multi-core DSPs. Each DSP core is configured with an independent data communication device; each DSP core sends transactions to the destination DSP core through the data communication device, and data are received and sent through the data communication device, which frees the processor to some extent. However, it requires a complex global transaction management unit, and the data communication device receives and sends passively, so the processor must perform complex management and control, which reduces communication efficiency to some extent.
The invention patent with application number 201110164421.4 discloses an inter-core communication method for FPGA-based multi-core systems. It provides two communication structures: adjacent processor cores communicate point to point, while non-adjacent cores communicate through a shared buffer, a mailbox and a mutex. Although this structure can provide a suitable communication structure on demand, it likewise needs software cooperation and incurs a certain program control overhead.
Another way to address program control overhead and ease of use is to start from the hardware structure. The data-driven computer (data-flow computer) was proposed in 1972 by Jack Dennis of the Massachusetts Institute of Technology. It departs from the traditional von Neumann computer concept, with the ultimate aim of achieving parallel processing to increase computing speed. After 40 years of practical testing, the model has proven effective for parallelism. Unlike a traditional von Neumann computer, a data-flow computer is not driven by instructions; instead, execution is driven by data (operands): an operation starts executing immediately if and only if all the operands it requires have arrived. Thus, if many operations satisfy this condition, they can execute concurrently without being constrained by instruction order, which fully exploits parallelism. Data flow is a computation model based on "asynchrony" and "functionality". "Asynchrony" means that a data-flow operation starts executing as soon as it has received its operands. "Functionality" means that each data-flow operation consumes one set of inputs and produces one set of outputs, with no side effects. Clearly, asynchrony is the basis for exploiting concurrency in data flow, while functionality guarantees that any two concurrent operations can execute in any order without interfering with each other.
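The data-driven firing rule described above can be illustrated in software. The following is only a minimal sketch, not part of the patent: the node table, the port numbering and the name run_dataflow are assumptions made for illustration. An operation is placed on the ready list only once every one of its operands has arrived, and it has no side effects beyond forwarding its result.

```python
from collections import deque

def run_dataflow(nodes, initial_tokens):
    """Fire each node once all of its operands have arrived.

    nodes: name -> (arity, fn, successors), where successors is a list
    of (target node, input port). All names here are hypothetical.
    """
    operands = {name: {} for name in nodes}
    ready = deque()
    results = {}

    def deliver(target, port, value):
        operands[target][port] = value
        # Firing rule: the node becomes ready only when every port is filled.
        if len(operands[target]) == nodes[target][0]:
            ready.append(target)

    for target, port, value in initial_tokens:
        deliver(target, port, value)

    while ready:
        name = ready.popleft()
        arity, fn, successors = nodes[name]
        args = [operands[name][p] for p in range(arity)]
        operands[name] = {}          # inputs are consumed by the firing
        value = fn(*args)            # no side effects: only an output value
        results[name] = value
        for target, port in successors:
            deliver(target, port, value)
    return results

nodes = {
    "add": (2, lambda a, b: a + b, [("mul", 0)]),
    "sub": (2, lambda a, b: a - b, [("mul", 1)]),
    "mul": (2, lambda a, b: a * b, []),
}
tokens = [("add", 0, 3), ("add", 1, 4), ("sub", 0, 10), ("sub", 1, 6)]
results = run_dataflow(nodes, tokens)
```

In this example "add" and "sub" have no data dependence on each other, so either could fire first; "mul" cannot fire until both have delivered their results.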
However, existing commercial multi-core processors generally have no such feature; their communication systems and storage systems are each designed independently. Data-driven processors studied in the past carry only an instruction-level data completeness detection module, whose granularity is too fine to interact with the on-chip communication system.
Summary of the invention
The technical problem to be solved by the invention is to overcome the defects of the prior art. Building on existing research results on the software and hardware of data-driven computers, and in view of the development trend of multi-core microcontrollers, the invention proposes an all-hardware data-driven device that works together with the on-chip communication network. The hardware of this structure is organized as a hybrid data-flow computer, with classical processor cores as the processing units, so that users can program in the style of a functional programming language. Inter-core communication latency is hidden behind the execution time of functions, and the spatial and temporal distribution of functions is used to hide external memory access latency, so that performance is improved at reasonable power consumption.
To solve the above technical problem, the invention provides an all-hardware data-driven device for on-chip multi-core communication, characterized in that it is arranged between the system bus and a processor and comprises a packet receiving module, a packet sending module, a data completeness check module, a relation generator and a register group;
The packet receiving module receives the data addressed to it from the system bus, while the relation generator in the packet receiving module uses the function number and the parameter number to compute the address of the data in the parameter memory of the data completeness check module, and the valid data is stored at that address;
The flag generator in the data completeness check module uses the write signal to set the valid flag of the data; if all the data required by a function is marked valid, the function number enters the function queue in the data completeness check module, and when the function queue is not empty, a request signal is generated through an interrupt request to ask the processor to execute;
After receiving the request signal, the processor takes the function number from the function queue and computes the entry address of the corresponding program to carry out the function computation;
During the computation the relevant data is read from the parameter memory, and the flag generator uses this read signal to clear the valid bit of the data, which then waits for the next trigger;
After the function computation completes, the result is sent to the successor functions.
The steps performed by the processor are:
The result is written to the result register of the packet sending module, and the successor function information is then fetched from the corresponding address space in the relation memory of the data completeness check module; the bypass control module determines from this information whether the data is local data or external data; local data is sent to the packet receiving module through the data sending bypass, while external data is packetized and sent to the subsequent processor for handling.
In the data completeness check module, the function queue is a FIFO, the parameter memory and the relation memory are ordinary SRAM memories, and the flag memory is a 1-bit memory; at initialization the flag memory sets the flag bits of unused storage cells to 1 and clears the flag bits of the cells in use.
The parameter memory and the relation memory are divided into different slots by the relation generator according to the function number and the parameter number; the storage space of each slot is used to store parameters, and the number of parameters is controlled by the relation generator.
After the processor has finished executing a function, the function number is used to look up the successor function information in the relation memory; the relation memory uses an "end" marker to indicate the end of the relations.
The function number is the unique number of the relevant code segment inside the processor core and is used to associate the code segment, the parameters of the function and the successor functions.
The parameter number binds a specific data item to a function number.
Beneficial effects of the invention:
The core of data-driven operation is establishing the association between data and the operations on it. Normally this can be realized by software programming, but doing so reduces computing performance. The invention uses an independent all-hardware data-driven device and introduces three concepts: the function number, the parameter number and the "relation". The function number is the unique number of the relevant code segment inside the local processor core and is used to associate the code segment, the parameters and the successor functions. The parameter number binds a specific data item to a function number. The "relation" characterizes the data association between a function and its successor functions. The corresponding hardware modules are the data completeness detection module and the relation memory. The advantages of the invention are:
(1) The data-driven device drives the execution of the processor, which avoids the timing overhead of a software implementation.
(2) The data-driven device is implemented entirely in hardware, which accelerates data-driven operation and improves drive efficiency.
(3) The device is independent of the processor architecture, and in particular of the instruction set. The processor can be a superscalar processor or the simplest non-pipelined processor.
(4) With this device, the system can support functional-language-style programming, which makes it easier to exploit the efficiency of multiple processors and to write more complex control-algorithm programs; functions with no data dependence can run fully in parallel, so function-level concurrency and out-of-order execution are supported.
Brief description of the drawings
Fig. 1 shows the position of the data-driven device in the system;
Fig. 2 is a structural diagram of the data-driven device;
Fig. 3 shows the data completeness detection module;
Fig. 4 shows the pipeline stages of the receive operation;
Fig. 5 shows the pipeline stages of the send operation.
Embodiments
The invention is further described below with reference to the accompanying drawings. The following embodiments serve only to illustrate the technical solution of the invention more clearly and do not limit its scope of protection.
<One> Device structure
As shown in Fig. 1 and Fig. 2, in a system application the all-hardware data-driven device forms a layer between the bus and the processor. It comprises a packet receiving module, a packet sending module, a data completeness check module, a relation generator and a register group, and implements packet reception, packet transmission, data completeness detection and driving the CPU to execute functions. The receive process unpacks packets, detects data completeness and drives CPU execution; the send process packetizes and transmits the data sent by the CPU. A bypass function can be realized in the send process.
The workflow is as follows:
First, the packet receiving module receives the data addressed to it from the system bus, while the relation generator in the packet receiving module uses the function number and the parameter number to compute the address of the data in the parameter memory of the data completeness check module, and the valid data is stored at that address. The function number is the unique number of the relevant code segment inside the local processor core (the processor connected to the corresponding interface of the data-driven device) and is used to associate the code segment, the parameters of the function and the successor functions. The parameter number binds a specific data item to a function number. Because this process involves one write to the parameter memory, the flag generator in the data completeness check module can use the write signal to set the valid flag of the data. If all the data required by a function is marked valid, the function number enters the function queue in the data completeness check module. When the function queue is not empty, a request signal is generated through an interrupt request to ask the processor (that is, the processor connected to the corresponding interface of the data-driven device) to execute. After receiving the request signal, the processor takes the function number from the function queue and computes the entry address of the corresponding program to carry out the function computation. During the computation the relevant data must be read from the parameter memory; the flag generator uses this read signal to clear the valid bit of the data, which then waits for the next trigger.
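The receive workflow can be modelled in a few lines of software. This is a behavioural sketch under stated assumptions, not the hardware design: the class name ReceivePath, the arity table and the dictionary-based memories are hypothetical stand-ins for the parameter memory, flag memory and function queue. A write sets a valid flag, a complete function is queued, the non-empty queue plays the role of the interrupt request, and each read by the processor clears the flag again.

```python
from collections import deque

class ReceivePath:
    """Behavioural model of the receive side (all names are assumptions)."""

    def __init__(self, arity):
        self.arity = dict(arity)  # function number -> parameters it needs
        self.params = {}          # (function number, parameter number) -> value
        self.valid = set()        # parameter cells currently flagged valid
        self.queue = deque()      # function queue (FIFO) of ready functions

    def receive(self, func, param, value):
        self.params[(func, param)] = value
        self.valid.add((func, param))        # the write sets the valid flag
        if all((func, p) in self.valid for p in range(self.arity[func])):
            self.queue.append(func)          # all data present: queue it

    @property
    def interrupt(self):
        # The request to the processor is raised while the queue is not empty.
        return bool(self.queue)

    def dispatch(self):
        # The processor takes a function number and reads its parameters;
        # each read clears the valid flag so the slot can be triggered again.
        func = self.queue.popleft()
        args = []
        for p in range(self.arity[func]):
            args.append(self.params[(func, p)])
            self.valid.discard((func, p))
        return func, args

unit = ReceivePath({7: 3})        # hypothetical function 7 with 3 parameters
unit.receive(7, 0, 10)
unit.receive(7, 2, 30)
early = unit.interrupt            # still waiting for parameter 1
unit.receive(7, 1, 20)
ready = unit.interrupt
func, args = unit.dispatch()
```

Parameters may arrive in any order; the function is only offered to the processor after the last one is written.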
After the function computation completes, the result must be sent to the successor functions. The processor first writes the result to the result register of the packet sending module and then fetches the successor function information from the corresponding address space in the relation memory of the data completeness check module. The bypass control module determines from this information whether the data is local data or external data. Local data is sent to the packet receiving module through the data sending bypass, while external data is packetized and sent to the subsequent processor for handling.
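The bypass decision on the send side can be sketched as follows. This is an illustrative model only: the function name route_result, the tuple layout of a successor entry and the packet field names are assumptions, since the text does not specify a packet format. Results whose successor lives on the local core go straight back to the receive side; all others are packetized for the bus.

```python
def route_result(value, successors, local_core):
    """Split one result into local deliveries and bus packets.

    successors: list of (core id, function number, parameter number).
    The field names of the packet dictionary are hypothetical.
    """
    to_receive_module = []   # delivered through the send bypass
    to_bus = []              # packetized and sent to another processor
    for core, func, param in successors:
        if core == local_core:
            to_receive_module.append((func, param, value))
        else:
            to_bus.append(
                {"dest": core, "func": func, "param": param, "data": value}
            )
    return to_receive_module, to_bus

local, remote = route_result(42, [(0, 5, 1), (2, 9, 0)], local_core=0)
```

The point of the bypass is that a local successor never occupies the system bus, which is one way the device keeps communication overhead away from the processor.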
<Two> Data completeness detection module
The basic principle of data completeness checking is shown in Fig. 3. Three physical storage elements are required: the function queue is a FIFO, the parameter memory and the relation memory are ordinary SRAM memories, and the flag memory is a 1-bit memory. The parameter memory and the relation memory are flexibly divided into different slots by the relation generator according to the function number and the parameter number. The storage space of each slot is used to store parameters and is 32 bits wide. The number of parameters each function supports is therefore variable; it is controlled by the relation generator, and at most 128 32-bit parameters are supported.
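One possible way to derive a parameter memory address from a function number and a parameter number is a prefix sum over the per-function parameter counts. The patent does not give the actual address formula, so the layout below (slots packed back to back, word addressing, function numbers used as list indices) is purely an assumption for illustration; only the limit of 128 parameters per function comes from the text.

```python
from itertools import accumulate

MAX_PARAMS = 128  # upper bound on 32-bit parameters per function (from the text)

def build_slot_bases(param_counts):
    """Return the word address at which each function's slot starts.

    param_counts[i] is the number of parameters of function number i.
    Packing slots back to back is an assumed layout, not the patent's.
    """
    for count in param_counts:
        if not 0 < count <= MAX_PARAMS:
            raise ValueError("parameter count out of range")
    return [0] + list(accumulate(param_counts))[:-1]

def param_address(bases, param_counts, func, param):
    """Word address of one parameter inside its function's slot."""
    if not 0 <= param < param_counts[func]:
        raise IndexError("parameter number outside the slot")
    return bases[func] + param

counts = [3, 1, 2]                       # three hypothetical functions
bases = build_slot_bases(counts)
addr = param_address(bases, counts, func=2, param=1)
```

Variable slot sizes avoid reserving 128 words for every function, at the cost of a small base-address table in the relation generator.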
After the processor has finished executing a function, the function number is used to look up the successor function information in the relation memory. The relation memory uses an "end" marker to indicate the end of the relations. As shown in Fig. 3, function f has two successors: parameter 1 of function g and parameter 2 of function h. The packet sending module uses the slot address specified by the function number to read the first data item, that is, the first "relation". If it is not "end", it is a valid successor function and is passed to the packet sending module; if it is "end", all the data required by the successor functions has been sent.
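The walk over a relation slot can be sketched as a loop that stops at the "end" marker. The encoding below is an assumption: the marker is modelled as a Python sentinel and each relation as a (successor function, parameter number) pair, using the function f example from Fig. 3.

```python
END = object()  # hypothetical stand-in for the "end" marker in the relation memory

# Relation slot of function f: parameter 1 of g, then parameter 2 of h.
relation_memory = {
    "f": [("g", 1), ("h", 2), END],
}

def successors(relation_slot):
    """Yield each relation in the slot until the end marker is reached."""
    for entry in relation_slot:
        if entry is END:
            break          # nothing after the marker is a valid relation
        yield entry

targets = list(successors(relation_memory["f"]))
```

A terminator-based layout lets each function have a different number of successors without storing a separate length field.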
At initialization, the flag memory sets to 1 the flag bits corresponding to unused cells of the parameter memory and the relation memory, and clears the flag bits of the cells in use. Function f in Fig. 3 has three parameters; as these three parameters arrive, the corresponding bits of the flag memory are set to 1 in turn, and when all the flags of the slot are 1 the function number enters the function queue and waits to be executed by the processor.
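Presetting the flags of unused cells to 1 means that completeness can be checked by testing whether every flag in the slot is 1, regardless of how many parameters the function really has. The sketch below illustrates this with a bitmask; the slot width of 8 flags and the helper names are assumptions chosen to keep the example small.

```python
SLOT_FLAGS = 8                    # assumed number of flag bits per slot
FULL = (1 << SLOT_FLAGS) - 1      # every flag in the slot set to 1

def init_flags(n_params):
    """Flags of used cells start at 0, flags of unused cells start at 1."""
    used = (1 << n_params) - 1
    return FULL & ~used

def mark_arrival(flags, param):
    """A write to the parameter memory sets the matching flag bit."""
    return flags | (1 << param)

def is_complete(flags):
    """The slot is complete when all of its flag bits are 1."""
    return flags == FULL

flags = init_flags(3)             # function f with three parameters
history = [is_complete(flags)]
for param in (0, 1, 2):
    flags = mark_arrival(flags, param)
    history.append(is_complete(flags))
```

With this convention the completeness logic is a single wide AND over the slot, with no per-function parameter count needed in the comparison.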
<Three> Pipeline stages
The pipeline with which the all-hardware data-driven device receives one data item is divided into 3 stages, as shown in Fig. 4. The first stage receives the function number and the parameter number and generates the parameter memory address. The second stage receives the data, stores it in the parameter register, updates the corresponding parameter flag and checks whether any function has complete data. The third stage, when a function with complete data is detected, pushes its identifier into the queue, generates the queue-not-empty signal, updates the flag in the status register and raises the interrupt signal; a new function identifier and parameter number can then be received. The pipeline of the send operation is also divided into 3 stages, as shown in Fig. 5. The first stage reads the destination processor number from the relation memory, determines whether the first relation is "END" and writes the result to a register. The second stage writes the ID number, function number and data number of the first relation that is not "END" to the tag register and hands them to the bypass controller for a decision; it also reads the first entry of the relation memory and determines whether it is "END". In the third stage the bypass controller forwards data for the local core to the receiving module through the bypass and sends data for other cores to the bus; the ID number, function number and data number of the second relation are written to the tag register and handed to the bypass controller, and the third entry of the relation memory is read and checked for "END". When multiple successor data items are sent, each additional item introduces one more cycle.
The above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art can make improvements and variations without departing from the technical principles of the invention, and such improvements and variations shall also be regarded as falling within the scope of protection of the invention.

Claims (7)

1. An all-hardware data-driven device for on-chip multi-core communication, characterized in that it is arranged between the system bus and a processor and comprises a packet receiving module, a packet sending module, a data completeness check module, a relation generator and a register group;
the packet receiving module receives the data addressed to it from the system bus, while the relation generator in the packet receiving module uses the function number and the parameter number to compute the address of the data in the parameter memory of the data completeness check module, and the valid data is stored at that address;
the flag generator in the data completeness check module uses the write signal to set the valid flag of the data; if all the data required by a function is marked valid, the function number enters the function queue in the data completeness check module, and when the function queue is not empty, a request signal is generated through an interrupt request to ask the processor to execute;
after receiving the request signal, the processor takes the function number from the function queue and computes the entry address of the corresponding program to carry out the function computation;
during the computation the relevant data is read from the parameter memory, and the flag generator uses this read signal to clear the valid bit of the data, which then waits for the next trigger;
after the function computation completes, the result is sent to the successor functions.
2. The all-hardware data-driven device for on-chip multi-core communication according to claim 1, characterized in that
the steps performed by the processor are:
the result is written to the result register of the packet sending module, and the successor function information is then fetched from the corresponding address space in the relation memory of the data completeness check module; the bypass control module determines from this information whether the data is local data or external data; local data is sent to the packet receiving module through the data sending bypass, while external data is packetized and sent to the subsequent processor for handling.
3. The all-hardware data-driven device for on-chip multi-core communication according to claim 1, characterized in that, in the data completeness check module, the function queue is a FIFO, the parameter memory and the relation memory are ordinary SRAM memories, and the flag memory is a 1-bit memory; at initialization the flag memory sets the flag bits of unused storage cells to 1 and clears the flag bits of the cells in use.
4. The all-hardware data-driven device for on-chip multi-core communication according to claim 1 or 3, characterized in that the parameter memory and the relation memory are divided into different slots by the relation generator according to the function number and the parameter number, the storage space of each slot is used to store parameters, and the number of parameters is controlled by the relation generator.
5. The all-hardware data-driven device for on-chip multi-core communication according to claim 1 or 3, characterized in that, after the processor has finished executing a function, the function number is used to look up the successor function information in the relation memory, and the relation memory uses an "end" marker to indicate the end of the relations.
6. The all-hardware data-driven device for on-chip multi-core communication according to claim 1, characterized in that the function number is the unique number of the relevant code segment inside the processor core and is used to associate the code segment, the parameters of the function and the successor functions.
7. The all-hardware data-driven device for on-chip multi-core communication according to claim 1, characterized in that the parameter number binds a specific data item to a function number.
CN201710803853.2A 2017-09-08 2017-09-08 All-hardware data-driven device for on-chip multi-core communication Pending CN107562688A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710803853.2A CN107562688A (en) 2017-09-08 2017-09-08 All-hardware data-driven device for on-chip multi-core communication

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710803853.2A CN107562688A (en) 2017-09-08 2017-09-08 All-hardware data-driven device for on-chip multi-core communication

Publications (1)

Publication Number Publication Date
CN107562688A true CN107562688A (en) 2018-01-09

Family

ID=60979818

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710803853.2A Pending CN107562688A (en) 2017-09-08 2017-09-08 All-hardware data-driven device for on-chip multi-core communication

Country Status (1)

Country Link
CN (1) CN107562688A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521201A (en) * 2011-11-16 2012-06-27 刘大可 Multi-core DSP (digital signal processor) system-on-chip and data transmission method
CN103218344A (en) * 2013-04-28 2013-07-24 上海大学 Data communication circuit arranged among a plurality of processors and adopting data driving mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
徐云川: ""基于交叉点缓存路由的数据驱动多核处理器研究"", 《中国优秀硕士学位论文全文数据库》 *
王镇: ""数据驱动交点队列型片上路由器研究"", 《中国优秀硕士学位论文全文数据库》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20180109)