CN101634979A - Multi-core processor satisfying SystemC syntax - Google Patents

Publication number
CN101634979A
CN101634979A CN200810117019A CN200810117019A CN101634979A CN 101634979 A CN101634979 A CN 101634979A CN 200810117019 A CN200810117019 A CN 200810117019A CN 200810117019 A CN200810117019 A CN 200810117019A CN 101634979 A CN101634979 A CN 101634979A
Authority
CN
China
Prior art keywords
fifo
event
unit
processor
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200810117019A
Other languages
Chinese (zh)
Other versions
CN101634979B (en
Inventor
陈曦
黄毅
刘祥
张金龙
刘玉东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN2008101170199A
Publication of CN101634979A
Application granted
Publication of CN101634979B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a multi-core processor satisfying SystemC syntax. It comprises an array of interconnected switching units for exchanging data, and a plurality of processing units for data processing, each connected to a switching unit and containing a local resource unit. Each processing unit is connected through its local resource unit, via a switching unit adapter and the switching unit array, to the other processing units. The local resource unit comprises one or more of a SystemC event processing unit group, a SystemC first-in-first-out (FIFO) queue unit group, and a SystemC mutex and semaphore unit group. The multi-core processor maps SystemC syntax units onto on-chip resources, so that local resources can be used for resource sharing and synchronization among any number of processing units and between processing units and peripherals.

Description

A multi-core processor satisfying SystemC syntax
Technical field
The present invention relates to multi-core processors, and more specifically to a multi-core processor that satisfies the requirements of SystemC syntax.
Background art
In recent years, multi-core processor technology has developed rapidly, for three main reasons. First, design requirements: silicon is becoming cheaper and integration density higher, and the regular structure of a multi-core design uses additional silicon area simply and effectively. Second, business requirements: multimedia services are becoming more complex and demand ever greater platform flexibility and processing performance, while power consumption must stay within permitted limits; a multi-core processor can potentially offer the best performance-to-power ratio and deliver higher computing performance and flexibility. Third, time-to-market requirements: time to market keeps shrinking, and the parallel development that multi-core processors allow is better able to meet it.
Multi-core processors generally follow one of two development models.
(1) The first kind of multi-core processor leaves the existing sequential-execution design flow and programming model unchanged and merely adopts more advanced compilation techniques to fit the multi-core architecture.
Here the multiple cores merely replace a single core and provide more computing capacity. At present most processors, whether single-core or multi-core, use the sequential-execution programming model. Under this model, multitasking operating systems were introduced to support multiple tasks. An operating system provides a way to write multitasking programs and to execute code in parallel. However, with an operating system and several tasks running in parallel, the whole embedded system becomes considerably more complex, and debugging is much harder than for a single task on a single core. One debugging approach is the breakpoint: while the processor is halted at a breakpoint, external input conditions may still change, so the conditions under which the error occurred may not be reproducible. Another approach is to print output at the places where errors may occur; because the printed results can vary widely, the error is hard to locate. Moreover, once the processor malfunctions, the printing itself may stop working before the error occurs. A further problem introduced by the operating system is the power wasted while the processor idles. Because there are multiple tasks, peripherals can be stopped as needed, but it is hard to determine when the processor should enter power-saving mode and when it should leave it, so power is wasted. According to statistics, these problems cause roughly half of embedded system projects to fail.
(2) The second kind of multi-core processor adopts a parallel language and programming model, and the physical structure of the processor is designed around the needs of that language and model. A multi-core processor designed this way can work closely with the parallel language and is expected to overcome the debugging difficulty and power waste of the first kind.
Current multi-core processors all belong to the first kind; the second kind is still at an early stage of development, and no mature design is in use.
Summary of the invention
To overcome the defects that existing multi-core processors are difficult to debug and do not support parallel languages, the present invention proposes a multi-core processor that satisfies the requirements of SystemC syntax.
According to one aspect of the invention, a multi-core processor satisfying SystemC is provided, comprising: an array formed by interconnecting a plurality of switching units for exchanging data; and a plurality of processing units for data processing, each connected to a switching unit and containing a local resource unit. Each processing unit is connected through its local resource unit and a switching unit adapter to a switching unit, and thereby to the other processing units. The local resource unit comprises one or more of a SystemC event processing unit group, a SystemC FIFO queue unit group, and a SystemC mutex and semaphore unit group.
The processing unit further comprises a processor core and a processor-core-to-switching-unit-adapter bridge. The processor core communicates with the switching unit adapter through this bridge, and the switching unit adapter connects the switching unit to the SystemC event processing unit group, the SystemC FIFO queue unit group, and the SystemC mutex and semaphore unit group of the local resource unit.
The multi-core processor further comprises a timer group and an on-chip memory. The timer group comprises a plurality of timers, each of which outputs a timeout notification signal to the corresponding unit of the event processing unit group.
The multi-core processor further comprises a processor core run-time controller. The processor core local resource unit, the on-chip memory, and the timer group are each connected to the run-time controller and send it processor pause and continue-execution notifications; the run-time controller outputs a run/stop signal.
The bridge is connected to the processor core through a local bus and is also connected to the switching unit adapter. It allows the processor core to directly access other processing units and on-chip peripheral units, and carries communication between the processor core's instruction and data bus interfaces and the switching unit adapter.
The event processing unit group comprises logic circuits that implement, according to SystemC syntax, the public member functions of sc_event and sc_event_queue, the various forms of the wait(...) function, and the sensitivity lists of SC_THREAD and SC_METHOD. The logic circuits send signals that activate and pause the processor, and allow the mapping algorithm of the software development tool to combine multiple event processing unit groups located in different processing units to implement the functions defined by multiple sc_event or sc_event_queue syntax units, the wait(...) function, and the sensitivity lists of SC_THREAD and SC_METHOD.
The mutex and semaphore unit group comprises a plurality of mutex and semaphore units. Each unit comprises logic circuits that implement, according to SystemC syntax, the functions sc_mutex(name), sc_mutex.lock(), sc_mutex.trylock(), sc_mutex.unlock(), sc_semaphore.wait(), sc_semaphore.trywait(), sc_semaphore.post(), sc_semaphore(init_value), and sc_semaphore(name, init_value). The logic circuits send signals that activate and pause the processor to the processor core run-time controller, and allow the mapping algorithm of the software development tool to recombine multiple mutex and semaphore units located in different processing units to implement the functions defined by the sc_mutex or sc_semaphore syntax units.
The FIFO queue unit group comprises a plurality of FIFO queue units. Each FIFO queue unit outputs a data-written event signal and a data-read event signal to the corresponding event processing unit group, and comprises logic circuits that implement, according to SystemC syntax, the functions sc_fifo(name, size), sc_fifo(size), sc_fifo.read(), sc_fifo.nb_read(), sc_fifo.write(), sc_fifo.nb_write(), sc_fifo.num_available(), sc_fifo.num_free(), sc_fifo.data_written_event(), and sc_fifo.data_read_event(). The logic circuits send signals that activate and pause the processor to the processor core run-time controller, and allow the mapping algorithm of the software development tool to recombine multiple mutex and semaphore units located in different processing units to implement the functions defined by the sc_fifo syntax unit.
The event processing unit comprises:
an event sending engine, connected to the switching unit adapter and the local bus, for sending event notification packets;
an event receiving engine, connected to the switching unit adapter and the local bus, for receiving event notification packets;
a processor core execution controller, connected to the event receiving engine and the processor run-time controller, for handling event notifications internal to the processing unit and the event notification packets;
a resume-execution condition register group, connected to the local bus and the processor core execution controller, for holding event control and event cancellation settings;
a send control register group, connected to the event sending engine and the local bus, comprising a send-event address register and a send-event control register.
The event processing unit translates sc_event.notify() and sc_event_queue.notify() in user code as follows: the processor core writes the destination address of the receiver of the event to the send-event address register and writes a trigger value to the send-event control register; the event sending engine then sends an event notification packet to the switching unit adapter, the destination address of the packet being the value of the send-event address register.
The FIFO queue unit comprises:
a local FIFO;
a data sending engine, connected to the switching unit adapter, for sending packets;
a data receiving engine, connected to the switching unit adapter, for receiving packets;
a FIFO access controller, connected to the data sending engine, the data receiving engine, and the local FIFO, which controls the sending and receiving of packets according to the state of the local FIFO, outputs a processor pause or run signal to the processor run-time controller, and outputs write-operation and read-operation event notifications to the event processing unit group;
an SC_FIFO register group, connected to the local bus interface and the FIFO access controller, comprising a non-blocking read register and a non-blocking read port status register.
An sc_fifo.nb_read() operation in user code is translated in the FIFO queue unit as follows: the processor core reads the non-blocking read register and the non-blocking read port status register, and the value of the status register indicates whether the data in the non-blocking read register is valid.
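The nb_read() translation above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name FifoUnitModel, its member functions, and the rule that the read port is refreshed on each access are hypothetical and are not taken from the patent; only the pairing of a data register with a validity status register comes from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

// Hypothetical software model of one FIFO queue unit (names are illustrative).
class FifoUnitModel {
public:
    explicit FifoUnitModel(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);  // a zero-depth FIFO could never hold data
    }

    // Models sc_fifo.nb_write(): store the value unless the local FIFO is full.
    bool nb_write(std::uint32_t value) {
        if (fifo_.size() == capacity_) return false;
        fifo_.push_back(value);
        return true;
    }

    // Hardware side (assumed behavior): refresh the non-blocking read register
    // and its status register from the head of the local FIFO.
    void refresh_read_port() {
        read_valid_ = !fifo_.empty();
        if (read_valid_) {
            read_data_ = fifo_.front();
            fifo_.pop_front();
        }
    }

    std::uint32_t read_data_register() const { return read_data_; }
    bool read_status_register() const { return read_valid_; }

private:
    std::size_t capacity_;
    std::deque<std::uint32_t> fifo_;
    std::uint32_t read_data_ = 0;  // non-blocking read register
    bool read_valid_ = false;      // non-blocking read port status register
};

// Models the translated sc_fifo.nb_read(): read both registers and use the
// data only if the status register marks it valid.
std::optional<std::uint32_t> nb_read(FifoUnitModel& unit) {
    unit.refresh_read_port();
    const std::uint32_t data = unit.read_data_register();
    if (!unit.read_status_register()) return std::nullopt;
    return data;
}
```

In this model an empty FIFO yields std::nullopt instead of stalling the caller, which mirrors the non-blocking contract: the core never has to be paused for nb_read().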
The mutex and semaphore unit comprises:
a resource counter, connected to the data sending engine and the SC_MU_SEM register group, for counting resources;
a data sending engine, connected to the switching unit adapter, for sending packets;
a data receiving engine, connected to the switching unit adapter, for receiving packets;
an SC_MU_SEM register group, connected to the data sending engine, the data receiving engine, the resource counter, and the local bus, which controls the sending and receiving of packets according to the state of the resource counter and updates the resource count; the SC_MU_SEM register group comprises an initialization register.
An sc_semaphore(init_value) operation in user code is translated in the SC_MU_SEM unit as follows: the processor core writes init_value to the initialization register, which initializes the resource count to init_value.
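A minimal sketch of the initialization step, assuming a software model in which writing the initialization register seeds the resource counter. The struct MuSemUnitModel and its members are hypothetical names; trywait() returns bool here for brevity, whereas SystemC's sc_semaphore::trywait() returns an int status.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical software model of one SC_MU_SEM unit (names are illustrative).
struct MuSemUnitModel {
    std::uint32_t init_register = 0;   // initialization register
    std::uint32_t resource_count = 0;  // resource counter

    // sc_semaphore(init_value): the core writes init_value to the
    // initialization register, which sets the initial resource count.
    void write_init_register(std::uint32_t init_value) {
        init_register = init_value;
        resource_count = init_value;
    }

    // Non-blocking acquire: take one resource if any is available.
    bool trywait() {
        if (resource_count == 0) return false;
        --resource_count;
        return true;
    }

    // Release one resource.
    void post() {
        assert(resource_count < std::numeric_limits<std::uint32_t>::max());
        ++resource_count;
    }
};
```

Under the same assumption, a mutex could be modeled as this unit initialized with a count of 1, with trylock() and unlock() mapping to trywait() and post().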
The invention maps SystemC syntax units onto on-chip resources, so that local resources can be used for resource sharing and synchronization among any number of processing units and between processing units and peripherals.
Brief description of the drawings
Fig. 1 is a schematic diagram of an existing 8-core processor based on a two-dimensional array;
Fig. 2 is a schematic diagram of an improved multi-core processor according to an embodiment of the invention;
Fig. 3 is a schematic diagram of a processing unit of the SystemC-optimized multi-core processor according to an embodiment of the invention;
Fig. 4 is a structural block diagram of an event processing unit according to an embodiment of the invention;
Fig. 5 is a structural block diagram of a FIFO queue unit according to an embodiment of the invention;
Fig. 6 is a structural block diagram of a mutex and semaphore unit according to an embodiment of the invention;
Fig. 7 is a schematic diagram of the composition of the event processing unit group of the multi-core processor according to the invention;
Fig. 8 is a schematic diagram of the use of the event processing unit group of the multi-core processor according to the invention;
Fig. 9 is a schematic diagram of the processing unit resource allocation for the MP3 decoding process sct_mp3;
Fig. 10 is a schematic diagram of an external event trigger unit according to an embodiment of the invention;
Fig. 11 is a schematic diagram of the composition and internal connections of an SC_MU_SEM unit according to an embodiment of the invention;
Fig. 12 is a schematic diagram of an MP3 + Motion JPEG decoding player written for the multi-core processor according to the invention;
Fig. 13 is a schematic diagram of a wireless mesh network access point written for the multi-core processor according to the invention;
Fig. 14 is a schematic diagram of the data processing of the wireless mesh network access point of Fig. 13;
Fig. 15 is a schematic diagram of the composition and internal connections of a FIFO queue unit according to an embodiment of the invention;
Fig. 16 is a schematic diagram of the process of sharing resources through SC_FIFO according to an embodiment of the invention.
Detailed description of the embodiments
The multi-core processor satisfying the requirements of SystemC syntax provided by the invention is described in detail below with reference to the drawings and specific embodiments.
A multi-core processor optimized for SystemC is intended to use SystemC as its multi-core software development language. SystemC is an extension library of C++. In 1999, EDA-industry companies including Cadence, Synopsys, and ARM jointly set out to develop a C++-based EDA language, which gave rise to SystemC. In 2006, SystemC formally became an IEEE standard, making it the third language, after VHDL and Verilog, supported by all EDA software.
SystemC offers higher design efficiency and a more effective design flow, and so helps address the problems facing the IC industry, such as exploding complexity, time-to-market pressure, and rising cost.
Because the SystemC language is itself an extension of C++, it can also be used to write multi-core embedded software, even though at present it is mainly used for system verification in integrated circuit design. If a multi-core processor supports the specific syntax of the SystemC language, multi-core embedded development based on SystemC will substantially reduce the difficulty of embedded system development, improve development efficiency, and allow more effective power management of the processing units.
Fig. 1 is a schematic diagram of the SystemC-satisfying 8-core processor based on a two-dimensional array from patent application No. 200710308574.5, filed by the same applicant as the present application. That multi-core processor contains a network-on-chip made up of a plurality of switching units B and their connecting lines C. Processing units F and on-chip peripheral units A1-A12 are connected to the switching units through switching unit adapters E and communicate over the network-on-chip. The peripherals A1-A12 are the interfaces through which the processor communicates with the outside world. They may include a universal serial bus and universal asynchronous serial interfaces (USB & 2XUARTS) A1; synchronous dynamic random access memory interfaces (SDRAMC) A2, A5, A8, and A12; a digital television broadcasting asynchronous serial interface (DVB ASI) A3; a digital television broadcasting synchronous serial interface (DVB SPI) A9; an LCD interface (LCDC) A4; a flash interface (Nand flashC) A6; a peripheral interconnect interface (PCIH) A7; an external event trigger unit A11; and a high-definition video output interface (YCbCr) A10. These peripherals form the periphery of the network-on-chip and communicate with the outside world. In the multi-core processor of Fig. 1, eight processing units F, labeled PE (0,0), PE (0,1), PE (0,2), PE (1,0), PE (1,2), PE (2,0), PE (2,1), and PE (2,2), communicate over the network-on-chip. D is a global resource (GR) unit, used for synchronization and data sharing between processor cores that are far apart. A local resource unit G also sits between every two adjacent processing units F.
Compared with an ordinary multi-core processor, a SystemC-optimized multi-core processor differs in having local resource units and global resource units. Both are designed according to the requirements of SystemC syntax, which ensures that the transaction-level code of SystemC corresponds one-to-one with hardware resources.
The SystemC-optimized multi-core processor disclosed in that application has the usual two-dimensional array structure: a network-on-chip made up of a plurality of switching units B and their connecting lines C. A processing unit F is connected through a switching unit adapter E to a switching unit S in order to communicate with other processing units or peripherals A1-A12. In addition, that processor includes local resource units G, each connected between adjacent processing units F for synchronization and data sharing between them, and at least one global resource unit D, connected to the switching units B, for synchronization and data sharing among all processing units. Because a local resource unit G sits between adjacent processing units F, it can be used only by the two processing units connected to it. In effect, syntax units in user code that are mapped to a local resource, such as sc_mutex, sc_event, sc_semaphore, sc_event_queue, and sc_fifo, can be used by only two processes at a time. The utilization of the local resource units G is therefore low.
Fig. 2 shows an improved SystemC-optimized multi-core processor according to an embodiment of the invention. The processor comprises a network-on-chip made up of a plurality of switching units B and their connecting lines C. SystemC-optimized processing units H are connected to switching units S through switching unit adapters E and communicate with other processing units or with the peripherals A1-A12.
A processing unit is not connected directly to other processing units through its local resource unit. Instead, the local resource unit is connected to the bus through the switching unit adapter, and communication control is handled by the other control components shown in Fig. 3, so that one local resource unit can be used by multiple connected processing units.
Fig. 3 is a schematic diagram of a processing unit of the SystemC-optimized multi-core processor according to an embodiment of the invention.
The processing unit of the multi-core processor comprises a processor core H1 and, connected to the processor core through a local bus H2, an optional timer group H3, an optional on-chip memory H4, and an optional processor-core-to-switching-unit-adapter bridge H5. The bridge H5 is connected to the switching unit adapter E and the local bus H2. It is the channel through which the processor core H1 communicates directly with the outside: it lets the processor core directly access other processing units and on-chip peripheral units, and carries communication between the processor core's instruction and data bus interfaces and the switching unit adapter. The switching unit adapter E is also connected to the on-chip memory H4 in the processing unit, allowing the outside to access that memory.
According to an embodiment of the invention, the SystemC-optimized processing unit of the multi-core processor also includes a processor core local resource unit H6 made up of a SystemC-optimized event processing unit group H61, a SystemC-optimized FIFO queue unit group H62, and a SystemC-optimized mutex and semaphore unit group H63. The groups H61, H62, and H63 are each connected to the switching unit adapter E for communication with the outside. The processor core local resource unit H6, the on-chip memory H4, and the timer group H3 are also each connected to the processor core run-time controller H8, and send it processor pause and continue-execution notifications through signals H71, H72, H731, H74, and H75. The run-time controller H8 merges these notifications and outputs a run/stop signal H9 to the processor core.
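The text does not specify how the run-time controller merges the notifications into the run/stop signal. The sketch below assumes one plausible rule, namely that the core runs only while no source has an outstanding pause request. The class RunTimeControllerModel, the source indexing, and the merge rule itself are assumptions made for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Assumed: one notification source per signal named in the text
// (H71, H72, H731, H74, H75).
constexpr std::size_t kSources = 5;

// Hypothetical model of the processor core run-time controller H8.
class RunTimeControllerModel {
public:
    // A source asks for the processor to be paused.
    void notify_pause(std::size_t source) {
        assert(source < kSources);
        paused_.set(source);
    }

    // A source allows execution to continue.
    void notify_resume(std::size_t source) {
        assert(source < kSources);
        paused_.reset(source);
    }

    // Run/stop signal H9: true means run. Assumed merge rule: run only when
    // no source still requests a pause.
    bool run_signal() const { return paused_.none(); }

private:
    std::bitset<kSources> paused_;
};
```

With this rule, a core blocked on both a FIFO read and an event stays stopped until both conditions clear; a different merge policy would change only run_signal().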
The SystemC-optimized event processing unit group H61, FIFO queue unit group H62, and mutex and semaphore unit group H63 of the local resource unit H6 do not depend on one another. That is, the local resource unit H6 may contain any one or more of the event processing unit group H61, the FIFO queue unit group H62, and the mutex and semaphore unit group H63.
The timer group H3 comprises a plurality of timers. Each timer outputs a timeout notification signal to the corresponding unit of the event processing unit group, and these notification signals together form signal group M1.
The event processing unit group implements in logic circuits, according to SystemC syntax, the public member functions of sc_event and sc_event_queue, the various forms of the wait(...) function, and the sensitivity lists of SC_THREAD and SC_METHOD. The logic circuits send signals that activate and pause the processor, and allow the mapping algorithm of the software development tool to combine multiple event processing unit groups located in different processing units to implement the functions defined by multiple sc_event or sc_event_queue syntax units, the wait(...) function, and the sensitivity lists of SC_THREAD and SC_METHOD.
One implementation of the event processing unit group is shown in Fig. 4. A typical event processing unit comprises: an event sending engine H6110 connected to the switching unit adapter E and the send control register group H6111; a local processor bus interface unit H612 connected to the send control register group H6111 and the receive-event address register H613; a resume-execution condition register group H615 connected to the processor core execution controller H616 and the local bus interface unit H612; an event receiving engine H610 connected to the switching unit adapter E and the receive-event address register H613; and a processor core execution controller H616, which is connected to the remote-event reception notification signal H614, the local-event reception notification signal, and the resume-execution condition register group H615, and which outputs a processor pause/run signal H619 to the processor run-time controller H8.
The send control register group H6111 comprises a send-event address register H61111 and a send-event control register H61112. sc_event.notify() and sc_event_queue.notify() in user code are translated as follows: the processor core H1 writes the destination address of the receiver of the event to the send-event address register and writes a trigger value to the send-event control register H61112. Once the send-event control register H61112 has been written, the event sending engine H6110 sends an event notification packet whose destination address is the value of the send-event address register H61111. When several receivers may receive the event notification, the processor core H1 repeats the two writes, once for each destination address.
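The two-write sequence described above can be sketched as a software model. The names EventSendUnitModel, EventPacket, and notify, and the choice that any nonzero value acts as the trigger, are hypothetical; the patent only states that a write to the control register causes a packet to be sent to the address held in the address register.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical event notification packet: only the destination is modeled.
struct EventPacket {
    std::uint32_t dest_address;
};

// Hypothetical model of the send side of an event processing unit.
struct EventSendUnitModel {
    std::uint32_t send_event_address = 0;  // send-event address register
    std::vector<EventPacket> sent;         // packets handed to the adapter

    void write_address_register(std::uint32_t address) {
        send_event_address = address;
    }

    // Writing the send-event control register makes the sending engine emit
    // one packet addressed to the current address register value.
    void write_control_register(std::uint32_t trigger) {
        assert(trigger != 0);  // assumed: any nonzero value is a trigger
        sent.push_back(EventPacket{send_event_address});
    }
};

// What notify() might be translated into when several receivers exist:
// repeat the address write and the trigger write once per receiver.
void notify(EventSendUnitModel& unit,
            const std::vector<std::uint32_t>& receivers) {
    for (std::uint32_t address : receivers) {
        unit.write_address_register(address);
        unit.write_control_register(1);
    }
}
```

Calling notify(unit, {0x10, 0x20}) in this model produces two packets, one per destination, in the order the receivers are listed.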
wait(sc_event) and wait(sc_event_queue) in user code are translated into writing the receive address to the receive-event address register H613 and writing a receive command to the event controller register H6151. A typical usage of the event controller register H6151 is as follows: the lowest bit is 0 for sc_event_queue mode and 1 for sc_event mode; the second-lowest bit is 1 when the processor core has started processing the current event and 0 when it has finished; the third-lowest bit is 1 to enable event reception and 0 to turn off the event receiving engine.
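The bit layout of the event controller register can be written down as masks. In this sketch the constant and function names are hypothetical, and treating all higher bits as unused is an assumption; the meaning of the three lowest bits follows the description above.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions as described for the event controller register.
constexpr std::uint32_t kScEventMode   = 1u << 0;  // 1: sc_event, 0: sc_event_queue
constexpr std::uint32_t kProcessing    = 1u << 1;  // 1: core started processing, 0: finished
constexpr std::uint32_t kReceiveEnable = 1u << 2;  // 1: reception enabled, 0: engine off

// Decoded view of the register (hypothetical helper type).
struct EventControl {
    bool sc_event_mode;
    bool processing;
    bool receive_enable;
};

// Pack the three flags into a register value.
inline std::uint32_t encode(const EventControl& c) {
    return (c.sc_event_mode ? kScEventMode : 0u) |
           (c.processing ? kProcessing : 0u) |
           (c.receive_enable ? kReceiveEnable : 0u);
}

// Unpack a register value into its three flags.
inline EventControl decode(std::uint32_t value) {
    // Assumed: bits above the third are unused and must be zero.
    assert((value & ~(kScEventMode | kProcessing | kReceiveEnable)) == 0u);
    return EventControl{(value & kScEventMode) != 0u,
                        (value & kProcessing) != 0u,
                        (value & kReceiveEnable) != 0u};
}
```

Under this layout, a receive command for wait(sc_event) would set the mode and enable bits (value 5), while the same command for wait(sc_event_queue) would set only the enable bit (value 4).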
When the event receiving engine H610 receives an event notification packet and finds that its destination address equals the value of the receive-event address register H613, it outputs a pulse on H614 to notify the processor core execution controller H616 that a remote event has been received.
The processor core execution controller receives event notifications internal to the processing unit from H611 (such as timeout notifications from the timer group) and remote events from H614, and, according to the value of the event controller register H6152, outputs the control signal H619 to tell the processor to pause or execute.
The processor core execution controller accumulates received sc_event_queue events according to the SystemC sc_event_queue syntax. When the event cancellation register H6151 is written, the accumulated count is cleared, which corresponds to sc_event_queue.cancelall().
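The receive path (address match, accumulation in sc_event_queue mode, and clearing on cancellation) can be sketched as follows. The class EventReceiveUnitModel and its members are hypothetical, and the rule that repeated notifications collapse into one in sc_event mode is an assumption rather than something the text states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the receive side of an event processing unit.
class EventReceiveUnitModel {
public:
    EventReceiveUnitModel(std::uint32_t receive_address, bool queue_mode)
        : receive_address_(receive_address), queue_mode_(queue_mode) {}

    // Receiving engine: returns true (the notification pulse) only when the
    // packet's destination equals the receive-event address register.
    bool on_packet(std::uint32_t dest_address) {
        if (dest_address != receive_address_) return false;
        // sc_event_queue mode accumulates every notification. Assumed: in
        // sc_event mode repeated notifications collapse into a single one.
        pending_ = queue_mode_ ? pending_ + 1 : 1;
        return true;
    }

    bool has_pending() const { return pending_ > 0; }

    // The core consumes one pending notification when its wait completes.
    void consume() {
        assert(pending_ > 0);
        --pending_;
    }

    // Writing the event cancellation register clears the accumulated count.
    void write_cancel_register() { pending_ = 0; }

private:
    std::uint32_t receive_address_;  // receive-event address register
    bool queue_mode_;                // true: sc_event_queue, false: sc_event
    std::uint32_t pending_ = 0;      // accumulated notifications
};
```

The counter is what distinguishes the two modes: a queue keeps every notification until it is consumed or cancelled, while a plain event in this model keeps at most one.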
The event processing unit shown in Fig. 4 is a simple implementation. Several such units can form an event processing unit group. The SystemC syntax it supports comprises wait(sc_event), wait(sc_event_queue), sc_event.notify(), sc_event.cancel(), sc_event_queue.notify(), sc_event_queue.cancelall(), SC_THREAD<<sc_event, and SC_METHOD<<sc_event, which meets the needs of only fairly simple applications. Take the wait(...) function as an example; its forms include:
1. wait(): wait for an event in the sensitivity list to occur.
2. wait(const sc_event& e): wait for event e to occur, as in the following example:
sc_event e1;
....
wait(e1);
After event e1 occurs, the process is activated and the statements following wait(e1) are executed.
3. wait(sc_event_or_list&): wait for any one of the events to occur, as in the following example:
sc_event e1, e2, e3;
....
wait(e1 | e2 | e3);
The process is activated after any one of e1, e2, or e3 occurs, and the statements following wait(e1 | e2 | e3) are executed.
4. wait(sc_event_and_list&): wait for all of the events to occur, as in the following example:
sc_event e1, e2, e3;
wait(e1 & e2 & e3);
After e1, e2, and e3 have all occurred, the process is activated and the statements following wait(e1 & e2 & e3) are executed.
5, wait (double v, sc_time_unit tu)---wait for a period of time by v and tu decision.As following Example
......
wait(100,SC_NS);
The process is suspended for 100 nanoseconds and then activated, and the statements following wait(100, SC_NS) are executed.
6. wait(double v, sc_time_unit tu, const sc_event& e) --- waits for event e to occur, but stops waiting if the event has not occurred within the period determined by v and tu, as in the following example:
sc_event e1;
.....
wait(100,SC_NS,e1);
The process is activated if event e1 occurs within 100 nanoseconds, or once 100 nanoseconds have elapsed.
7. wait(double v, sc_time_unit tu, sc_event_and_list& e1) and wait(double v, sc_time_unit tu, sc_event_or_list& e1) --- these behave like the forms described above, as in the following example:
wait(100, SC_NS, e1|e2|e3); // wait for e1, e2 or e3; time out after 100 nanoseconds
wait(100, SC_NS, e1&e2&e3); // wait for e1, e2 and e3 all to occur; time out after 100 nanoseconds
In all of the forms above, the two leading parameters (double v, sc_time_unit tu) can be replaced by a single parameter of type sc_time, as in the following example:
sc_time t(100, SC_NS);
wait(t); // equivalent to wait(100, SC_NS)
wait(t, e1); // equivalent to wait(100, SC_NS, e1)
wait(t, e1|e2|e3); // equivalent to wait(100, SC_NS, e1|e2|e3)
wait(t, e1&e2&e3); // equivalent to wait(100, SC_NS, e1&e2&e3)
Fig. 5 shows a more complex but more efficient implementation example of the event handling unit group of the multi-core processor according to an embodiment of the invention. It supports not only all of the wait(...) syntax above but also the complex sensitivity lists of SC_THREAD and SC_METHOD. As shown in Figure 7, the event handling unit group H61 according to the invention comprises the following components.
(1) A transmit control register group H6111, a receive event address list H6112, a resume-execution condition register group H615 and an event description table H6113. These registers and tables are connected to a local bus interface unit H612; processor core H1 operates on them through the local bus interface unit H612 to control the operation of the whole event handling unit group.
(2) An event sending engine H6110. The event sending engine H6110 is connected to the transmit control register group H6111 and the event description table H6113. The transmit control register group H6111 consists of a plurality of transmit control registers H61113. The content of a transmit control register H61113 is the description table start address and description table length of the event to be sent; the corresponding location in the event description table H6113 stores the optional ID of the event, the unique address identifier of the event, and the addresses of the event handling unit groups that receive the event. A write by processor core H1 to a transmit control register H61113 indicates that the event corresponding to that register has been triggered. The transmit control register group H6111 passes the description table start address and description table length of the event to the event sending engine H6110. The event sending engine searches the event description table H6113, packs the optional ID and the unique address identifier of the event into a packet in the format required by the crosspoint, and sends the packet to each destination event handling unit group according to the addresses of the event handling unit groups that receive the event.
(3) An event receiving engine H610, connected to the receive event list H613 and the receive event ID register H6114, which outputs event notifications on signal H614. Each register of the receive event list H6112 contains the unique address identifier of an event awaiting reception. The event receiving engine H610 receives packets from crosspoint adapter E and searches the receive event list H6112; if the unique address identifier of the received event carried in a packet is contained in the receive event list, it outputs an event notification on signal H614.
(4) A processor core execution controller H616 and a timeout timer H6115. The processor core execution controller H616 is connected to the event receiving engine H610 and receives the event occurrence signal H614 that the event receiving engine H610 sends. It is also connected to a resume-execution condition register group H615 and to the timeout timer H6115. The resume-execution condition register group H615 comprises an AND event register H6151, an OR event register H6152, an event type register H6154, an event cancellation register H6155 and a timeout condition register H6153. Each bit of the OR event register H6152 and of the AND event register H6151 corresponds to the event whose unique address identifier is held in one entry of the receive event address list. The corresponding bit of the event type register H6154 marks whether the event is an SC_EVENT or an SC_EVENT_QUEUE of the SystemC syntax; for example, 1 denotes SC_EVENT_QUEUE and 0 denotes SC_EVENT. The processor core execution controller H616 keeps a count of the events of each SC_EVENT_QUEUE type, and when a valid value is written to the corresponding bit of the event cancellation register H6155, that count is cleared. When the timeout condition register H6153 is written, the timeout period information H6116 is sent to the timeout timer H6115 and the timer is started; when the timeout period expires, a timeout event is sent to the processor core execution controller H616.
When the AND event register H6151 is written, its new value is sent to the processor core execution controller H616, which outputs a processor "pause" on signal H6117 to the processor runtime controller; once every event indicated by the AND event register H6151 has been received by the event receiving engine H610, the processor core execution controller outputs "execute" on signal H6117 to the processor runtime controller. When the OR event register H6152 is written, its new value is sent to the processor core execution controller H616, which outputs a processor "pause" on signal H6117 to the processor runtime controller; once any one of the events indicated by the OR event register H6152 has been received by the event receiving engine, the processor core execution controller H616 outputs "execute" on signal H6117 to the processor runtime controller. Whenever the count of an event of SC_EVENT_QUEUE type is not 0 at the moment the OR event register H6152 or the AND event register H6151 is written, the event is treated as having just been received, and the count is decremented by 1.
LEvent[...] H611 is the local event input of the processor, for example timeouts produced by the timer group and read/write events of the FIFO queue group. These events are fed directly into the processor core execution controller H616, which handles them in the same way as the events H614 from the event receiving engine.
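The pause/resume behavior described for the AND and OR event registers can be sketched as a small software model. This is only an illustration under stated assumptions: the names ExecController, wait_all, wait_any and on_event are hypothetical, events that arrive while the processor is running are simply ignored, and the SC_EVENT_QUEUE counting and the timeout timer are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software model of the resume-execution logic. The struct and
// member names are illustrative and do not come from the patent text.
struct ExecController {
    uint32_t and_mask = 0;   // models the AND event register
    uint32_t or_mask = 0;    // models the OR event register
    uint32_t received = 0;   // events seen since the last wait was armed
    bool running = true;     // true = "execute", false = "pause"

    // Writing the AND register: pause until every selected event has arrived.
    void wait_all(uint32_t mask) { and_mask = mask; or_mask = 0; arm(); }
    // Writing the OR register: pause until any selected event has arrived.
    void wait_any(uint32_t mask) { or_mask = mask; and_mask = 0; arm(); }

    // Called once per event pulse; bit is the event's bit index (0..31).
    void on_event(unsigned bit) {
        if (running) return;  // simplification: not waiting, so ignore
        received |= (1u << bit);
        bool all_ok = and_mask != 0 && (received & and_mask) == and_mask;
        bool any_ok = (received & or_mask) != 0;
        if (all_ok || any_ok) running = true;
    }

private:
    void arm() { received = 0; running = false; }
};
```

In this model, calling wait_any(0x10000001u) pauses the core until either bit 0 or bit 28 is signaled, while wait_all(0x7u) resumes only after bits 0, 1 and 2 have all been seen.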
An embodiment is used below to illustrate the use of the event handling unit group of the multi-core processor optimized for the SystemC syntax according to the invention. The user writes an MP3 player based on the multi-core processor of Fig. 2 and can switch tracks with the keyboard. As shown in Figure 6, this implementation requires two processes. One is the SC_THREAD process sct_mp3 M1, which decodes MP3: each time the local timer raises an interrupt it reads one frame of music and decodes it, and if it receives the track switch event ie_switch N3 sent from the SC_METHOD process scm_menu, it reads the relevant information and switches to the next track. The other is the SC_METHOD process scm_menu N2, which handles the keyboard interrupt N5: according to the user's keyboard input it changes the menu display or notifies sct_mp3 to switch tracks, stop playback, and so on.
The pseudocode of the SC_THREAD process sct_mp3 is as follows:
wait(ie_switch);
if (condition to start playback is satisfied)
{
    T1.start_periodic(FRAME_TIME); // T1 is the local timer
    while (true)
    {
        wait(ie_switch | t1.tout);
        if (t1_tout_triggered) { decode a new frame; }
        else if (ie_switch_triggered) { switch track or stop playback; }
    }
}
The pseudocode of the SC_METHOD process scm_menu is as follows:
The process is sensitive to the rising edge of the keyboard interrupt: sensitive << kb_int.pos();
The main code of the process is as follows:
while (true)
{
    wait();
    read keyboard information;
    if (switch track or stop playback) ie_switch.notify();
    else update the menu display;
}
Fig. 7 shows the processing unit resource allocation for the MP3 decoding process sct_mp3. As shown, processor core H1 initializes T1, located in H3, which corresponds to the code T1.start_periodic(FRAME_TIME); T1 then starts running. When the processor core executes wait(ie_switch|t1.tout), it writes the OR event register H6152 of event handling unit group H61. Suppose that in a 32-bit AND/OR event register the low 28 bits correspond to Revent[...] H614 and the high 4 bits to LEvent[...] H611; the OR register should then be written with 0x10000001, telling the event handling unit group to accept the event from TimerEvent[0] and the ie_switch event from the crosspoint adapter (described in detail below). The StopRun signal is then driven low, so that the processor core stops executing. When the timeout occurs, that is, when the t1.tout event occurs, the event is delivered to event handling unit group H61 as a single-cycle pulse on the TimerEvent[0] signal H611A. H61 finds that the processor's resume-execution condition is satisfied and drives the StopRun signal H6117 high, indicating "execute", and the processor continues running.
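The register value 0x10000001 follows directly from the bit layout assumed in the text (low 28 bits for Revent, high 4 bits for LEvent). The short sketch below works the arithmetic; the helper names remote_event_bit and local_event_bit are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bit layout assumed in the text: bits 0..27 = Revent[0..27] (remote events),
// bits 28..31 = LEvent[0..3] (local events). Helper names are illustrative.
constexpr uint32_t remote_event_bit(unsigned i) { return 1u << i; }         // i in 0..27
constexpr uint32_t local_event_bit(unsigned i)  { return 1u << (28 + i); }  // i in 0..3

// wait(ie_switch | t1.tout): ie_switch arrives as Revent[0] and the timer
// timeout arrives as LEvent[0] (TimerEvent[0]).
constexpr uint32_t kWaitSwitchOrTimeout =
    remote_event_bit(0) | local_event_bit(0);

static_assert(kWaitSwitchOrTimeout == 0x10000001u,
              "matches the OR register value given in the text");
```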
The external event Kb_int N5 is sent to a peripheral of Fig. 2, namely an external event trigger unit such as A11. Kb_int N5 is actually a signal from outside and must first be converted into an internal signal by the external event trigger unit. It has a rising edge and a falling edge, which correspond to a rising edge event and a falling edge event respectively. The event sending part of the external event trigger unit therefore differs from that of the event handling unit inside a processing unit in that its event description table comprises two description tables, one for the falling edge event and one for the rising edge event (such as F8 and F9 of Fig. 5). The detailed structure of external event trigger unit A11 is shown in Fig. 8. During system initialization, the processing unit that receives the external event initializes the falling edge and rising edge event description tables (such as H6113 of Figure 10) and the transmit control register group (H6110 of Fig. 5) of external event trigger unit A11. When an event occurs, external event trigger unit A11 sends the corresponding event to the event handling unit group of the corresponding processing unit. In this example the rising edge event of Kb_int is received, and Kb_int is connected to EEvent[0] P1. The low 16 bits of the rising edge event description table start address and length register hold the event description table start address, and the high 16 bits hold the length. Processor core P(0,1) sets the rising edge event description table start address and length register H61114 of event 0 of the external event trigger unit to 0x00020000. In this example it is assumed that the rising edge description table stores only the unique address identifier of the event and the address of the event handling unit group that receives the event. The unique address identifier of the rising edge of EEvent[0] can then be formed as the global address of the external event trigger unit plus offset 0, and this address is written to address 0 of the rising edge event description table. The event handling unit group that receives the event belongs to the processing unit containing processor core P(0,1), so the base address of that event handling unit group is written to address 1 of the rising edge event description table. When kb_int changes from 0 to 1, the external event trigger unit detects the change and, according to the information in the rising edge event description table, sends a packet to the event handling unit group of the processing unit containing processor core P(0,1). The payload of this packet comprises the unique address identifier of EEvent[0] and the address of the event handling unit group of the processing unit containing processor core P(0,1), which receives the event.
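The value 0x00020000 written to the start address and length register can be decoded with the field layout stated in the text (low 16 bits = start address, high 16 bits = length): it denotes a description table that starts at address 0 and has 2 entries. A minimal sketch follows; the names DescriptorRef, pack_descriptor and unpack_descriptor are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative view of the "start address and length" register.
// Field layout from the text: low 16 bits = start address, high 16 bits = length.
struct DescriptorRef {
    uint16_t start;
    uint16_t length;
};

constexpr uint32_t pack_descriptor(uint16_t start, uint16_t length) {
    return (static_cast<uint32_t>(length) << 16) | start;
}

inline DescriptorRef unpack_descriptor(uint32_t reg) {
    return DescriptorRef{ static_cast<uint16_t>(reg & 0xFFFFu),
                          static_cast<uint16_t>(reg >> 16) };
}
```

With this layout, the two entries are the event's unique address identifier at address 0 and the receiver's base address at address 1, which agrees with the table contents described above.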
During initialization, processor core P(0,1) also writes the unique address identifier of EEvent[0] to the receive event 0 address register of the event handling unit group of the processing unit on which process N2 (shown in Figure 6) runs. When the packet issued by the external event trigger unit is received by the event receiving engine H610 of that event handling unit group H61, the engine checks that the receive event 0 address register matches the unique address identifier in the packet, and then delivers the event reception information to H6117 as a pulse on Revent[0] H614. When process scm_menu N2 executes wait(), it writes 0x1 to H6151, indicating that it is waiting for the event Revent[0] H614 identified by the receive event 0 address register; this causes the processor core execution controller H616 to drive RunStop H6117 low, and the processor pauses. When the processor core execution controller H616 receives Revent[0] H614, it finds that the execution condition of the OR register is satisfied and drives RunStop H6117 high, and the processor continues executing.
In this example ie_switch is an internal event triggered by processor core P(0,1) and received by processor core P(0,0). Again suppose that the low 16 bits of the event description table start address and length register hold the event description table start address and the high 16 bits hold the length. The description table start address and description table length register H61113 of event 0 of processor core P(0,1) is set to 0x00020000, and the event description table is assumed to store only the unique address identifier of the event and the address of the event handling unit group that receives the event. The unique address identifier of ie_switch can then be formed as the global address of the transmit control register group H6111 of the event handling unit group of the processing unit containing processor core P(0,1), plus offset 0, and this address is written to address 0 of the event description table H6112. The event handling unit group that receives the event belongs to the processing unit containing processor core P(0,0), so the base address of that event handling unit group is written to address 1 of the event description table of the event handling unit group of the processing unit containing processor core P(0,1). When processor core P(0,1) calls ie_switch.notify(), the description table start address and description table length register C9 of event 0 is written with 0x00020000 (the same as its initial value). On detecting this operation, the event sending engine C7 reads address 0 and address 1 of the event description table, packs them into the payload of a packet, and sends the packet to the event handling unit group of the processing unit containing processor core P(0,0). The receive event 0 address in the receive event address list H6112 of that event handling unit group was set during initialization to the global address of the transmit control register group H6111 of the event handling unit group of the processing unit containing processor core P(0,1), plus offset 0. As described above, when processor core P(0,0) reaches wait(ie_switch|t1.tout), the OR register of its event handling unit group is written with 0x10000001 and RunStop C21 is driven low, so that the processor core stops running. When the packet sent by the event sending engine of the event handling unit group of the processing unit containing processor core P(0,1) is received by the event receiving engine and passed to the processor core execution controller H616 as a single-cycle pulse on Revent[0], RunStop H6117 is driven high and the processor core continues running.
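The send and receive path for ie_switch can be sketched in software as follows. This is a simplified model, not the hardware: the packet layout, the function names build_packets and match_event, and the convention that entry 0 of a description table holds the event identifier while the remaining entries hold receiver addresses are assumptions made for this illustration. Any concrete addresses passed in are made up by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packet payload: destination unit address plus event identifier.
struct EventPacket {
    uint32_t dest;
    uint32_t event_id;
};

// Sending side. Assumes table[start] is the event's unique address identifier
// and table[start+1 .. start+length-1] are receiver addresses; the caller must
// ensure that start + length does not exceed table.size().
inline std::vector<EventPacket> build_packets(const std::vector<uint32_t>& table,
                                              uint16_t start, uint16_t length) {
    std::vector<EventPacket> out;
    for (uint16_t i = 1; i < length; ++i)
        out.push_back(EventPacket{ table[start + i], table[start] });
    return out;
}

// Receiving side. Returns the Revent index whose address register matches the
// packet's event identifier, or -1 if the event is not in the receive list.
inline int match_event(const std::vector<uint32_t>& receive_list,
                       const EventPacket& p) {
    for (std::size_t i = 0; i < receive_list.size(); ++i)
        if (receive_list[i] == p.event_id) return static_cast<int>(i);
    return -1;
}
```

A description table of length 2 therefore yields exactly one packet, and the receiver raises Revent[0] only when its receive event 0 address register holds the same identifier.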
According to IEEE Std P1666, sc_fifo<T> is a FIFO channel implemented in the SystemC library, where T is the data type stored in the FIFO. A FIFO is a first-in-first-out queue, commonly used in both software and hardware design. sc_fifo<T>.write(T&) is the method for writing the FIFO, and sc_fifo<T>.read() is the method for reading the FIFO, which returns the data of the head element. sc_fifo<T>.num_free() queries how many free cells the FIFO still has, and sc_fifo<T>.num_available() queries how many data items remain to be read.
sc_fifo<T>.read(T&) and sc_fifo<T>.read() are blocking read methods: if the FIFO is empty when read, they do not return until data has been written to the FIFO, so their read operation always succeeds. sc_fifo<T>.nb_read(T&) is a non-blocking read operation that always returns immediately: if the FIFO is not empty the read succeeds, otherwise it fails. sc_fifo<T>.num_available() returns how many data cells are currently available to read. sc_fifo<T>.data_written_event() returns the data written event.
sc_fifo<T>.write(T&) and sc_fifo<T>.write() are blocking write methods: if the FIFO is full when written, they wait until the FIFO has a free cell before writing the data and returning, so their write operation always succeeds. sc_fifo<T>.nb_write(T&) is a non-blocking write operation that always returns immediately: if the FIFO is not full the write succeeds, otherwise it fails. sc_fifo<T>.num_free() returns how many free cells currently remain. sc_fifo<T>.data_read_event() returns the data read event.
sc_fifo<T> has two constructors, as follows:
sc_fifo(int size_=16)
sc_fifo(const char* name_, int size_=16)
Here size_ defines the depth of the FIFO and name_ defines the channel name of the FIFO; both have default values.
Compared with an ordinary FIFO, SC_FIFO adds a try (non-blocking) read, a try (non-blocking) write and a query of the number of remaining free cells, and also adds data_written_event() and data_read_event(). Besides supporting the whole SC_FIFO syntax, a multi-core design must also address event efficiency and on-chip resource cost when, in effect, several processor cores operate on the same FIFO simultaneously.
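The non-blocking part of this interface can be illustrated with a plain C++ model. This is a sketch only: FifoModel is a hypothetical class that borrows the sc_fifo method names, and it deliberately ignores the SystemC kernel (no events, no blocking calls, no delta-cycle update of the counters).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Illustrative bounded FIFO with the non-blocking operations described above.
// Not the SystemC implementation; it only mirrors the method names.
template <typename T>
class FifoModel {
public:
    explicit FifoModel(int size = 16) : capacity_(size) {}

    // Non-blocking write: fails (returns false) when the FIFO is full.
    bool nb_write(const T& v) {
        if (num_free() == 0) return false;
        q_.push_back(v);
        return true;
    }

    // Non-blocking read: fails (returns false) when the FIFO is empty.
    bool nb_read(T& v) {
        if (q_.empty()) return false;
        v = q_.front();
        q_.pop_front();
        return true;
    }

    int num_available() const { return static_cast<int>(q_.size()); }
    int num_free() const { return capacity_ - num_available(); }

private:
    int capacity_;
    std::deque<T> q_;
};
```

A blocking read or write can be thought of as retrying the corresponding non-blocking call until it succeeds, which is the role the pause/execute signal plays in the hardware described below.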
As shown in Fig. 3, the FIFO queue unit group H62 comprises a plurality of FIFO queue units. Each unit outputs a data written event signal and a data read event signal to its corresponding event handling unit group, and these signals together form signal group M2. Each FIFO queue unit is a logic circuit that implements the function code of sc_fifo.read(), sc_fifo.nb_read(), sc_fifo.write(), sc_fifo.nb_write(), sc_fifo.num_available() and sc_fifo.num_free() according to the SystemC syntax. It sends signals that activate and pause the processor to the processor core runtime controller, and it allows the mapping algorithm of the software development kit to combine a plurality of mutex and semaphore units located in different processing units so that they jointly perform the function defined by the sc_fifo syntax unit.
A typical implementation of the FIFO queue unit H627 according to the SystemC syntax is shown in Fig. 9. It comprises: a local FIFO H626 connected to FIFO access controller H628; a local bus interface H629 connecting local bus H2 and the SC_FIFO register group H628; a FIFO access controller that is connected to the local FIFO H626, data sending engine H621, data receiving engine H622 and the SC_FIFO register group, outputs the processor pause/run signal H625 to the processor runtime controller H8, and outputs the data read event signal H624 and the data written event signal H623; a data sending engine H621 connected to FIFO access controller H628 and crosspoint adapter E; and a data receiving engine H622 connected to FIFO access controller H628 and crosspoint adapter E. FIFO access controller H628 controls access to the local FIFO H626. The functions sc_fifo.read(), sc_fifo.nb_read(), sc_fifo.write(), sc_fifo.nb_write(), sc_fifo.num_available() and sc_fifo.num_free() in the user code running on local processor core H1 correspond to operations by the processor core on the SC_FIFO register group. When processor core H1 reads data (sc_fifo.read()) while FIFO H626 is empty, or writes data (sc_fifo.write()) while FIFO H626 is full, FIFO access controller H628 outputs "pause" on signal H625 to pause execution of the processor. When a write to the local FIFO H626 occurs, FIFO access controller H628 issues a write notification on H623; when a read of the local FIFO H626 occurs, it issues a read notification on H624. Accesses to the FIFO by non-local processor cores go through data sending engine H621 and data receiving engine H622. When a non-local processor core reads data (sc_fifo.read()) while FIFO H626 is empty, or writes data (sc_fifo.write()) while FIFO H626 is full, the FIFO access controller immediately sends a retry response packet to the remote processor core.
A more specific embodiment is shown in Figure 10. The SystemC-optimized FIFO queue unit shown in Figure 10 has several operating modes:
(1) Mode 0: local read and write. The SC_FIFO unit is controlled entirely by the local processing unit, and in this mode it becomes a basic FIFO.
(2) Mode 1: local read. The data is stored locally, and the local process only reads the FIFO.
(3) Mode 2: remote write. The data is stored remotely; the local process only writes the FIFO, and the written data is sent to the remote SC_FIFO unit.
(4) Mode 3: local write. The data is stored locally, and SC_FIFO units located in other processing units perform remote read operations.
(5) Mode 4: remote read. The data is stored in a remote SC_FIFO located in another processing unit.
As shown in Figure 10, the internal composition and connections of the SystemC-optimized FIFO queue unit H627 are as follows.
The data receiving engine H622 is connected to crosspoint adapter E, data sending engine H621 and local FIFO H6210. Its functions and operation are as follows:
It receives, through crosspoint adapter E, read and write FIFO requests from remote SC_FIFO units. When a remote SC_FIFO unit requests to read data from the local SC_FIFO unit but the local FIFO is empty, data receiving engine H622 notifies data sending engine H621 to send, through crosspoint adapter E, an RTY reply carrying at least wait duration information to the requesting remote SC_FIFO unit; the wait duration is normally the value of SC_FIFO_CWT H62013 described below. When a remote SC_FIFO unit requests to read data from the local SC_FIFO and the local FIFO is not empty, data receiving engine H622 notifies data sending engine H621 to send, through crosspoint adapter E, an ACK reply to the remote SC_FIFO unit that carries at least one unit of data of the width of local FIFO H6210 and the usage count information H62010 of the local FIFO. When a remote SC_FIFO unit requests to write the local SC_FIFO H6210 but the local FIFO is full, data receiving engine H622 notifies the data sending engine to send the requesting remote SC_FIFO unit an RTY reply carrying at least wait duration information; the wait duration is normally the value of SC_FIFO_CWT H62013 described below. When a remote SC_FIFO unit requests to write the local SC_FIFO and local FIFO H6210 is not full, data receiving engine H622 notifies data sending engine H621 to send the requesting remote SC_FIFO unit an RTY reply carrying at least the remaining space count information H62011 of the local FIFO.
It receives, through crosspoint adapter E, ACK and RTY replies to read and write FIFO requests from other SC_FIFO units. When the local SC_FIFO unit operates in mode 2 or 4 and receives an RTY reply, it waits for the wait duration carried in the RTY reply and, once that time has elapsed, notifies data sending engine H621 to resend the request. When data receiving engine H622 receives an ACK reply, it notifies data sending engine H621 that the ACK reply has been received. If the ACK reply contains data, it notifies local FIFO H6210 to store the data.
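The handling of a remote read request on the storage side can be sketched as below. This models only the read path; the type names Reply, ReplyKind and StorageSideFifo are hypothetical, the reply is reduced to three fields, and reporting the usage count after the element has been removed is an assumption, since the text does not say when the count is sampled.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

enum class ReplyKind { ACK, RTY };

// Simplified reply: an ACK carries one data unit and the usage count,
// an RTY carries the wait duration. Field names are illustrative.
struct Reply {
    ReplyKind kind;
    uint32_t data;           // valid for ACK only
    uint32_t count_or_wait;  // usage count (ACK) or wait duration (RTY)
};

// Hypothetical model of the unit that holds the data (for example mode 3).
struct StorageSideFifo {
    std::deque<uint32_t> q;
    uint32_t cwt = 2;  // models SC_FIFO_CWT; 2 is the typical value in the text

    // Remote read request: RTY with the wait duration when empty,
    // otherwise ACK with one data unit and the remaining usage count.
    Reply on_remote_read() {
        if (q.empty()) return Reply{ ReplyKind::RTY, 0, cwt };
        uint32_t v = q.front();
        q.pop_front();
        return Reply{ ReplyKind::ACK, v, static_cast<uint32_t>(q.size()) };
    }
};
```

A requester in mode 4 would wait for count_or_wait time units after an RTY and then resend the request, and would hand the data of an ACK to its own register group.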
The data sending engine H621 is connected to crosspoint adapter E, data receiving engine H622 and the local FIFO. Its functions are as follows:
According to the {ACK notification, RTY notification, retransmission notification} from data receiving engine H622, it sends ACK and NAK replies to the remote SC_FIFO through the crosspoint adapter and resends the most recently sent data request frame.
When the SC_FIFO unit operates in mode 2 and the local FIFO K14 is not empty, it packs the stored data into write request packets and sends them to the remote SC_FIFO unit.
When the SC_FIFO unit operates in mode 4 and receives a data read notification from SC_FIFO register group K24, it sends a data read request to the remote SC_FIFO unit.
An interface module, connected to the SC_FIFO register group and the local bus, handles communication with the processor core of the containing processing unit.
The SC_FIFO register group H620 is connected to all the other units and handles reads and writes of the registers of the SC_FIFO unit. It consists of the following registers:
SC_FIFO identification register SC_FIFO_ID H6201, used to store the name of the SC_FIFO.
Control register SC_FIFO_CTRL H62012, used to select the operating mode of this SC_FIFO unit, as described above.
SC_FIFO blocking write register SC_FIFO_BW H6207. When the processor core writes this register, the SC_FIFO register group writes the data into the local FIFO and returns only when the write operation has completed. During this process the pause/execute signal K15 outputs {pause}; after the operation returns it outputs {execute}.
SC_FIFO non-blocking write register SC_FIFO_NBW H6208. When the processor core writes this register, the SC_FIFO register group writes the data into the local FIFO; if the local SC_FIFO is full at that moment, SC_FIFO_NBWS H6209 is updated to {write operation failed}.
SC_FIFO non-blocking write port status register SC_FIFO_NBWS H6209. Stores the result of the last non-blocking write register operation: {write operation failed, write operation succeeded}.
SC_FIFO remaining space count register SC_FIFO_NF H62011. The remaining space count of the local FIFO represents the remaining space count of the whole SC_FIFO.
SC_FIFO occupied element count register SC_FIFO_NA H62012. The occupied element count of the local FIFO represents the occupied element count of the whole SC_FIFO.
SC_FIFO blocking read register SC_FIFO_BR. The processor core of the containing processing unit performs a blocking read of the SC_FIFO through this register. In mode 1, the operation continues until data is present in the local FIFO and has been read and returned to the processor core. In mode 4, SC_FIFO register group H620 notifies data sending engine H621 to send a data request packet, and the operation continues until the data receiving engine receives an ACK reply; the data carried in the ACK reply is returned to the processor core through SC_FIFO register group H620. During this process the pause/execute signal H625 outputs {pause}; after the operation returns it outputs {execute}.
SC_FIFO non-blocking read register SC_FIFO_NBR H6205. The processor core of the containing processing unit performs a non-blocking read of the SC_FIFO through this register. In mode 1, if the local FIFO holds data, SC_FIFO register group H620 reads the data and returns it to the processor core; otherwise it updates SC_FIFO_NBRS to {read operation failed}. In mode 4, SC_FIFO register group H620 notifies data sending engine H621 to send a data request packet. If data receiving engine H622 receives an ACK reply, the data carried in the ACK reply is returned to processor core H1 through SC_FIFO register group H620; if data receiving engine H622 receives an RTY reply, SC_FIFO register group H620 updates SC_FIFO_NBRS to {read operation failed}.
SC_FIFO non-blocking read port status register SC_FIFO_NBRS H6206. Holds the state of the last non-blocking read operation, which can be {read operation failed, read operation succeeded}.
Local SC_FIFO address register SC_FIFO_ADDR H6203. Holds the global address of this SC_FIFO unit; data sending engine K2 uses this address as the source address of packets.
Remote SC_FIFO address register SC_FIFO_RMT_ADDR H6203. Holds the address of the remote SC_FIFO unit paired with this SC_FIFO unit; data sending engine K2 uses this address as the destination address of packets.
Wait time register SC_FIFO_CWT H62013. Holds the wait time length carried in an RTY reply; a typical value is 2.
The local FIFO H6210 is connected to all the other units and contains an ordinary FIFO read-write controller that controls the read and write operations of the FIFO. Data read and write requests may come from data sending engine H621, data receiving engine H622 and SC_FIFO register group H620. When the FIFO is written, a data written event is issued on signal Data_written_event H623; when it is read, a data read event is issued on signal Data_read_event H624.
The following example illustrates resource sharing based on the SC_FIFO of the invention.
Suppose that the process sct_producer in the user program writes data to a SC_FIFO_sc_fifo0, and two other process sct_comsumer1 and sct_comsumer2 are from the sc_fifo0 reading of data.As shown in figure 11, three processes are mapped to PE (0,0), PE (1,0) and PE (1 respectively, 1), and sc_fifo0 is mapped to PE (0,0), PE (1,0) and first SC_FIFO unit M1, M2, the M3 of the SC_FIFO unit group of PE (1,1), constitutes the function of sc_fifo0 jointly.M1 works in mode 3, and data storage is in this locality; M2 and M3 work in pattern 4.Sct_producer writes data to M1, and is stored in M1.When process sct_comsumer1 and sct_comsumer2 need data, just can be by M2 and M3 to the M1 reading of data, in the process of reading of data, the processor core at process sct_comsumer1 or sct_comsumer2 place is in halted state.When the Data Receiving engine H622 of M1 receives data read request from M2 or M3, it is to local FIFO H6210 reading of data.When data are read out, the FIFO read-write controller of local FIFO H6210 represents that to pulse of signal H624 output " data are read " incident takes place.Data Receiving engine H622 notification data sends M2 or the M3 that engine H621 sends to the data that read requests data reading.Further suppose the process sct_producer reading of data from SC_FIFO sc_fifo1 in the user program, and two other process sct_comsumer1 and sct_comsumer2 write data to sc_fifo0.As shown in figure 16, sc_fifo1 is mapped to second SC_FIFO unit M4, M5, the M6 of the SC_FIFO unit group of PE (0,0), PE (1,0) and PE (1,1), and they constitute the function of sc_fifo1 jointly.M4 works in pattern 1, and M5, M6 work in pattern 2.When process sct_comsumer1 to M5 or and sct_comsumer2 write data to M6, data at first are stored among M5 or the M6, by the data sending engine H621 of M5 and M6 data are delivered to M4.When the Data Receiving engine H622 of M4 receives data, then data are write local FIFO H6210, the FIFO read-write controller of H6210 inside is 
represented " generation of data writing events " to pulse of signal H623 output.
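The event pulses raised by the local FIFO can be illustrated with a small software model. The following is only a sketch in plain C++, not the circuit of the invention: the class LocalFifoModel and its members are hypothetical names, and the two counters merely stand in for the pulses on signals H623 and H624.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical model of a local FIFO that counts the event pulses it raises.
class LocalFifoModel {
public:
    explicit LocalFifoModel(std::size_t depth) : depth_(depth) {}

    // Non-blocking write; a successful write raises one "data written" pulse.
    bool nb_write(int value) {
        if (data_.size() >= depth_) return false;   // FIFO full, nothing stored
        data_.push_back(value);
        ++written_pulses_;
        return true;
    }

    // Non-blocking read; a successful read raises one "data read" pulse.
    bool nb_read(int& value) {
        if (data_.empty()) return false;            // FIFO empty, nothing read
        value = data_.front();
        data_.pop_front();
        ++read_pulses_;
        return true;
    }

    unsigned written_pulses() const { return written_pulses_; }
    unsigned read_pulses() const { return read_pulses_; }

private:
    std::size_t depth_;
    std::deque<int> data_;
    unsigned written_pulses_ = 0;
    unsigned read_pulses_ = 0;
};
```

In this model a rejected write or read raises no pulse, which is an assumption of the sketch rather than a statement about the hardware.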
sc_mutex is a primitive channel defined by the SystemC syntax. In an operating system, a mutex (mutual exclusion) protects a shared resource so that several processes cannot read and write it at the same time and make the behavior of the system nondeterministic. A mutex has two states, locked and unlocked. If a process needs a resource protected by a mutex and the mutex is not locked at that moment, the process can lock the mutex; it then holds the protected resource exclusively and may perform any legal operation on it. When the mutex is locked by another process, a process that applies for the mutex is blocked until the process holding the lock unlocks the mutex. A process locks a mutex by calling sc_mutex.lock(); if the mutex is already locked, the calling process is blocked until the mutex is unlocked. A process can call sc_mutex.trylock() to find out whether the mutex is locked and thereby decide whether to call sc_mutex.lock(), so that it avoids being blocked. A process unlocks a mutex by calling sc_mutex.unlock().
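The two states of a mutex and the non-blocking attempt can be sketched as a toy, single-threaded model in plain C++. This is an illustration only and not the SystemC library: MutexModel and its members are hypothetical names, and the rule that only the locking process may unlock is an assumption of the sketch.

```cpp
// Hypothetical single-threaded model of the locked/unlocked states of a mutex.
class MutexModel {
public:
    // Non-blocking attempt: takes the lock and returns true if it was free.
    bool trylock(int process_id) {
        if (locked_) return false;
        locked_ = true;
        owner_ = process_id;
        return true;
    }

    // Assumption of this sketch: only the process that locked may unlock.
    bool unlock(int process_id) {
        if (!locked_ || owner_ != process_id) return false;
        locked_ = false;
        return true;
    }

    bool is_locked() const { return locked_; }

private:
    bool locked_ = false;
    int owner_ = -1;   // meaningful only while locked_ is true
};
```

A blocking lock is not modeled here; in the processor described in this document, blocking corresponds to suspending the processor core until the resource becomes available.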
The mutex SC_MUTEX is used to protect exclusive resources. Because SOPA adopts SystemC as its program development language, almost all member variables are exclusive resources that need to be protected with sc_mutex.
A semaphore is used to convey synchronization information about a shared resource of which several instances exist.
sc_semaphore is another important primitive channel defined by SystemC; Chinese textbooks on operating system principles usually translate the term as 信号量 (semaphore). Semaphores and mutexes are both used to protect shared resources, but they differ. A semaphore is an effective means, provided by the operating system, of managing common resources. A semaphore represents the number of available resource instances, so it can be regarded as a resource counter: it limits the number of processes that use a given shared resource (also called a critical resource) at the same time. The value of the semaphore count is the number of shared resources that are still available.
The sc_semaphore.wait() method acquires the semaphore: the caller obtains the right to use one resource, and the semaphore count is decremented by one. It is a blocking function: when the semaphore count is already 0 (no resource is available for allocation), the caller is blocked. sc_semaphore.trywait() is the corresponding non-blocking function; sc_semaphore.post() releases a resource; sc_semaphore.get_value() returns the current semaphore count.
sc_semaphore has two constructors:
explicit sc_semaphore(int init_value_);
sc_semaphore(const char* name_, int init_value_);
Here init_value_ is the initial count of the semaphore; it must be greater than 0, has no default value, and does not take part in implicit type conversion. name_ is the channel name.
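The counting behavior of a semaphore can be sketched in plain C++ as follows. This is a toy model under stated assumptions, not the SystemC class: SemaphoreModel is a hypothetical name, the blocking wait() is omitted because the model is single-threaded, and the constructor is marked explicit only to mirror the first constructor form above.

```cpp
// Hypothetical model of the semaphore count; the blocking wait() is not modeled.
class SemaphoreModel {
public:
    // explicit: an int is not silently converted into a semaphore.
    explicit SemaphoreModel(int init_value) : count_(init_value) {}

    // Non-blocking acquire: succeeds only while a resource is available.
    bool trywait() {
        if (count_ == 0) return false;
        --count_;
        return true;
    }

    // Releases one resource.
    void post() { ++count_; }

    // Returns the number of resources that are still available.
    int get_value() const { return count_; }

private:
    int count_;
};
```

With an initial count of 2, two calls to trywait() succeed and a third fails until post() is called.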
Because mutexes and semaphores are similar, they can be realized with the same hardware resources; the mutex and semaphore unit group H63 is therefore one of the important components of the local resource H6.
In existing processors, mutexes and semaphores are generally emulated by the operating system: by disabling processor interrupts while the resource counter is being maintained, simple semaphores and mutexes can be realized.
Because of the special requirements of the SystemC syntax, and in order to save power and ease debugging in a multi-core processor optimized for SystemC, a mutex and semaphore unit group is designed into the processing unit. As shown in Figure 3, the mutex and semaphore unit group H63 comprises a plurality of independent mutex and semaphore units. Each mutex and semaphore unit is a logic circuit that realizes the functions sc_mutex.lock(), sc_mutex.trylock(), sc_mutex.unlock(), sc_semaphore.wait(), sc_semaphore.trywait(), sc_semaphore.post() and sc_semaphore.get_value() of the SystemC syntax. It sends signals that activate and suspend the processor to the processor core run-time controller, and it allows the mapping algorithm of the software development kit to combine several mutex and semaphore units located in different processing units so that they jointly realize the function defined by an sc_mutex or sc_semaphore syntax element.
A typical implementation of the mutex and semaphore unit is shown in Figure 12. It comprises: the SC_MU_SEM register group H631, connected to the resource counter H634, the data sending engine H630, the data receiving engine H632 and the local bus interface H637; the local bus interface H637, which connects the local bus H2 and the SC_MU_SEM register group H631; the resource counter H634, connected to the SC_MU_SEM register group H631 and the data receiving engine H632, which also outputs the processor pause/run signal H638 to the processor core run-time controller H8; the data sending engine H630, connected to the SC_MU_SEM register group H631 and the crosspoint adapter E; and the data receiving engine H632, connected to the SC_MU_SEM register group H631, the resource counter H634 and the crosspoint adapter E. The functions sc_mutex.lock(), sc_mutex.trylock(), sc_mutex.unlock(), sc_semaphore.wait(), sc_semaphore.trywait(), sc_semaphore.post() and sc_semaphore.get_value() in the user code running on the local processor core H1 correspond to operations by the processor core on the corresponding registers of the SC_MU_SEM register group H631. When the processor core H1 calls sc_mutex.lock() or sc_semaphore.wait() while the resource count H635 is zero, the resource counter H634 outputs "pause" on signal H636 to suspend execution of the processor. Accesses to the FIFO by non-local processor cores are carried out through the data sending engine H632 and the data receiving engine H633. When a non-local processor core calls sc_mutex.lock() or sc_semaphore.wait() while the resource count H635 is zero, the resource counter H634 immediately sends a retry response packet to the remote processor core.
A specific implementation of an improved mutex and semaphore unit is shown in Figure 13. The composition and internal connections of this embodiment are as follows: the local bus interface H637, which connects the processor local bus H2 and the SC_MU_SEM register group H631, and through which the processor core H1 accesses the SC_MU_SEM register group H631 to control the operation of the whole SC_MU_SEM unit; the SC_MU_SEM register group H631, which is connected to all other units except the remote waiting unit address FIFO H633 and provides them with status and control information; the data sending engine H630, which connects the SC_MU_SEM register group H631 and the crosspoint adapter E; the data receiving engine H632, connected to the crosspoint adapter E; the resource counter H634, connected to the SC_MU_SEM register group H631, the processor core run-time controller H8 and the data receiving engine H632; and the remote waiting unit address FIFO H633, connected to the data receiving engine H632.
The SC_MU_SEM initialization register H6316 holds the working mode of the SC_MU_SEM unit and the initial value of the resource counter. The working mode occupies at least 2 bits and may occupy more; for example, the upper 8 bits of the SC_MU_SEM initialization register hold the working mode of the SC_MU_SEM unit and the lower 24 bits hold the initial value of the resource counter.
The working modes of SC_MU_SEM are:
Working mode 0: SC_MUTEX, resource count maintained locally;
Working mode 1: SC_MUTEX, resource count maintained remotely;
Working mode 2: SC_SEMAPHORE, resource count maintained locally;
Working mode 3: SC_SEMAPHORE, resource count maintained remotely.
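The example layout of the initialization register, with the working mode in the upper 8 bits and the initial resource count in the lower 24 bits of a 32-bit word, can be sketched in plain C++ as follows. The names SemMode, pack_init, unpack_mode and unpack_count are hypothetical, and the layout is only the example given in the text, so an actual register map may differ.

```cpp
#include <cstdint>

// The four working modes listed above.
enum class SemMode : std::uint8_t {
    MutexLocal = 0,       // SC_MUTEX, count maintained locally
    MutexRemote = 1,      // SC_MUTEX, count maintained remotely
    SemaphoreLocal = 2,   // SC_SEMAPHORE, count maintained locally
    SemaphoreRemote = 3   // SC_SEMAPHORE, count maintained remotely
};

constexpr std::uint32_t kCountMask = 0x00FFFFFFu;   // lower 24 bits

// Packs the mode into bits 31..24 and the initial count into bits 23..0.
constexpr std::uint32_t pack_init(SemMode mode, std::uint32_t initial_count) {
    return (static_cast<std::uint32_t>(mode) << 24) | (initial_count & kCountMask);
}

constexpr SemMode unpack_mode(std::uint32_t reg) {
    return static_cast<SemMode>(reg >> 24);
}

constexpr std::uint32_t unpack_count(std::uint32_t reg) {
    return reg & kCountMask;
}
```

For instance, under this assumed layout a locally maintained semaphore with an initial count of 1 would be written as 0x02000001.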
The SC_MU_SEM register group module maintains all registers of the SC_MU_SEM unit. As shown in the figure, it comprises the following registers.
The semaphore and mutex identification register SC_MU_SEM_ID H6310 stores the name name_ of the SC_MUTEX or SC_SEMAPHORE, for example the name_ of sc_semaphore(const char* name_, int init_value_). This register is optional, and its width may exceed 32 bits.
The SC_MU_SEM initialization register SC_MU_SEM_INIT H6316 determines the working mode of the SC_MU_SEM unit and the initial value of the resource counter. The initial value of the resource counter corresponds to the init_value_ of sc_semaphore(int init_value_).
The resource release/blocking acquire/blocking lock/unlock register SC_MU_SEM_LW H6313 is used to release a semaphore or acquire it with blocking, and to lock a mutex with blocking or unlock it. For example, writing 0 to this register represents a blocking acquire or blocking lock, and writing 1 represents a release or unlock.
The resource try-acquire/try-lock register SC_MU_SEM_TLW H6314 is used to try to acquire a semaphore or to try to lock a mutex. A read of this register by the processor core represents an attempt to acquire the semaphore or to lock the mutex.
The local SC_MU_SEM address register SC_MU_SEM_ADDR H6311 holds the address that distinguishes this SC_MU_SEM unit from all other units in the chip.
The remote SC_MU_SEM address register SC_MU_SEM_RMT_ADDR H6311 holds the address that distinguishes the remote SC_MU_SEM unit from all other units in the chip. This address is needed when a reply is sent to the remote SC_MU_SEM unit.
The optional consumer waiting time register SC_MU_SEM_CWT H6317 determines the waiting time information carried when the local SC_MU_SEM unit sends an RTY to a remote SC_MU_SEM unit. A typical way to produce the waiting time is to supply a fixed value, for example 2 microseconds.
The mutex owner/semaphore producer register SC_MU_SEM_OWNER H6318. A mutex responds only to unlock requests from its current owner, and a semaphore responds only to requests from its producer to increase the semaphore. When this register is 0, all unlock requests and all requests to increase the semaphore are answered.
The semaphore current value register SC_MU_SEM_VALUE H6315 stores the current value of the semaphore counter.
The function of each part of the SC_MU_SEM unit and the signal flow are as follows:
In working modes 1 and 3, when the processor core writes the lock/wait register SC_MU_SEM_LW H6313 or the try-lock/wait register SC_MU_SEM_TLW H6314, the data sending engine H630 packs SC_MU_SEM_ADDR H6311, SC_MU_SEM_RMT_ADDR H6311 and a request identifier, one of {lock/wait request (1), try lock/wait request (2), release/unlock request (3)}, into a request packet in the packet format required by the crosspoint and sends it to the crosspoint adapter E.
In working modes 1 and 3 the available resource count is held in the remote SC_MU_SEM unit. If the remote SC_MU_SEM unit has no available resource and its remote waiting unit address FIFO H633 is full, the data receiving engine H632 of the remote SC_MU_SEM unit returns a retry (RTY) reply to the request packet. The waiting time carried in the RTY reply is the value of SC_MU_SEM_CWT H6317 of the remote SC_MU_SEM unit. After receiving the RTY reply and waiting for the SC_MU_SEM_CWT time, the data receiving engine has the data sending engine resend the last request packet that was sent.
In working modes 0 and 2, when the data receiving engine H632 receives a request packet, it compares the SC_MU_SEM_RMT_ADDR carried in the request with its own SC_MU_SEM_ADDR. If they are the same, it accepts the packet and carries out the subsequent processing; otherwise it discards the packet.
The subsequent processing comprises:
When the request packet is a {lock/wait request (1)} or a {try lock/wait request (2)}, the resource counter is notified to decrease the count by 1. If the resource counter reports success, the data receiving engine H632 immediately sends an ACK reply to the SC_MU_SEM unit that sent the request packet; the ACK reply packet carries at least {lock/wait request succeeded (4)} or {try lock/wait request succeeded (5)}. If the resource counter reports that the resource could not be acquired, then for a {try lock/wait request (2)} the data receiving engine H632 immediately sends an RTY reply to the SC_MU_SEM unit that sent the request packet, and the RTY reply packet carries at least {try lock/wait request failed (6)} and the value of its own SC_MU_SEM_CWT; for a {lock/wait request (1)}, the data receiving engine H632 stores the SC_MU_SEM_ADDR of the received request packet in the remote waiting unit address FIFO H633. If, however, the remote waiting unit address FIFO H633 is already full, the data receiving engine H632 immediately sends an RTY reply to the SC_MU_SEM unit that sent the request packet, and the RTY reply packet carries at least {lock/wait request failed (7)} and the value of its own SC_MU_SEM_CWT. Whenever a resource becomes available again, the data receiving engine H632 takes the SC_MU_SEM_ADDR at the head of the remote waiting unit address FIFO H633 and sends an ACK reply, carrying at least {lock/wait request succeeded (4)}, to the SC_MU_SEM unit identified by that SC_MU_SEM_ADDR.
When the request packet is a {release/unlock request}, the resource counter is notified to increase the count by 1. A mutex responds only to unlock requests from its current owner, and the user may also configure a semaphore to respond only to requests from its producer to increase the semaphore. The mutex owner/semaphore producer is identified by the register SC_MU_SEM_OWNER. Only when SC_MU_SEM_OWNER matches the SC_MU_SEM_ADDR of the received request packet does the resource counter actually increase by 1; otherwise the resource counter ignores the increase notification from the data receiving engine.
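The request handling described above can be summarized in a small software model. The following plain C++ sketch is an illustration under stated assumptions, not the circuit of the invention: SemUnitModel, Req, Reply and Response are hypothetical names, the owner is a fixed value supplied at construction instead of a register updated at run time, and a released resource is assumed to pass directly to the oldest queued requester.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Request and reply identifiers, numbered (1) to (7) as in the description.
enum class Req { Lock = 1, TryLock = 2, Release = 3 };
enum class Reply { LockOk = 4, TryLockOk = 5, TryLockFail = 6, LockFail = 7 };

struct Response {
    Reply reply;        // the Ok codes travel in an ACK, the Fail codes in an RTY
    std::uint32_t to;   // address of the unit that receives the reply
};

// Hypothetical model of a unit that maintains the resource count locally.
class SemUnitModel {
public:
    SemUnitModel(std::uint32_t count, std::size_t fifo_depth, std::uint32_t owner)
        : count_(count), fifo_depth_(fifo_depth), owner_(owner) {}

    // Handles one request from address `from`; returns the replies it triggers
    // (none when a lock request is queued or a release is ignored).
    std::vector<Response> handle(Req req, std::uint32_t from) {
        std::vector<Response> out;
        if (req == Req::Release) {
            // owner_ == 0 means that every release request is honored.
            if (owner_ != 0 && owner_ != from) return out;
            if (!waiting_.empty()) {
                // Assumption: the freed resource goes to the oldest waiter.
                out.push_back({Reply::LockOk, waiting_.front()});
                waiting_.pop_front();
            } else {
                ++count_;
            }
            return out;
        }
        if (count_ > 0) {
            --count_;
            out.push_back({req == Req::Lock ? Reply::LockOk : Reply::TryLockOk, from});
        } else if (req == Req::TryLock) {
            out.push_back({Reply::TryLockFail, from});   // RTY at once
        } else if (waiting_.size() >= fifo_depth_) {
            out.push_back({Reply::LockFail, from});      // waiting FIFO is full
        } else {
            waiting_.push_back(from);                    // queued, no reply yet
        }
        return out;
    }

    std::uint32_t count() const { return count_; }

private:
    std::uint32_t count_;
    std::size_t fifo_depth_;
    std::uint32_t owner_;
    std::deque<std::uint32_t> waiting_;
};
```

The waiting time carried in an RTY reply is left out of the model; it would simply accompany the two Fail replies.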
Two application examples below illustrate the sharing of resources based on the mutex and semaphore of the present invention.
The first example illustrates the basic use of the SC_MU_SEM unit, including how each register of the SC_MU_SEM unit is initialized in actual use.
In this example, the user writes an MP3 and Motion JPEG decoder and player based on the multi-core processor of Figure 2, as shown in Figure 14. The system needs two processes. One is the SC_METHOD process scm_mp3, which decodes MP3: each time the local timer Timer1 raises an interrupt, it reads one frame of music and decodes it. The other is the SC_THREAD process sct_jpeg: for every 10 music frames decoded, it decodes one JPEG picture and displays it on the screen. The pseudocode of the top-level module is as follows:
SC_MODULE(mp3_mp4)
{
    void scm_mp3();
    void sct_jpeg();
    sc_semaphore s1("s1");
    sc_timer t1("t1");
    SC_CTOR(mp3_mp4)
    {
        SC_METHOD(scm_mp3);
        sensitive << t1.event;
        t1.start();
        SC_THREAD(sct_jpeg);
    }
};
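The pseudocode of the SC_METHOD process scm_mp3 and of the SC_THREAD process sct_jpeg is as follows:
void mp3_mp4::scm_mp3()
{
    decode_music();
    // one JPEG picture for every 10 decoded music frames
    if (++count == 10) { count = 0; s1.post(); }
}
void mp3_mp4::sct_jpeg()
{
    while (true)
    {
        s1.wait();
        decode_jpeg();
    }
}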
Suppose that the mapping process maps the two processes above to processing unit PE(0,0) and processing unit PE(1,1) respectively.
Any two (or more) SC_MU_SEM units located in different PEs can form an SC_MU_SEM combination that realizes the function of one sc_semaphore or sc_mutex in the user code. In this example, semaphore s1 is realized by the SC_MU_SEM pair formed by the SC_MU_SEM0 modules Q1 and Q2 of PE(0,0) and PE(1,1).
The development tool automatically translates the user's SystemC into C. The translation produces two results: the C code of process scm_mp3 for PE(0,0), and the C code of process sct_jpeg for PE(1,1).
The C code of scm_mp3 is as follows:
void scm_mp3()    // an interrupt handler of the processor
{
    decode_music();
    // Provide one resource: s1.post();
    // REG32(x) denotes the 32-bit register at address x
    REG32(PE_0_0_SC_MU_SEM_0_BASE + 0x18) = 1;
}
The peripheral initialization code of PE(0,0) is as follows:
int main()
{
    WriteID(PE_0_0_SC_MU_SEM_0_BASE) = "PE1,1,S1";
    REG32(PE_0_0_SC_MU_SEM_0_BASE + 0x10) = 0x001;   // local SEM
    REG32(PE_0_0_SC_MU_SEM_0_BASE + 0x14) = 0;       // resource count is 0, not used
    REG32(PE_0_0_SC_MU_SEM_0_BASE + 0x20) = Address of PE(0,0).SC_MU_SEM0 local address register;
    REG32(PE_0_0_SC_MU_SEM_0_BASE + 0x24) = Address of PE(1,1).SC_MU_SEM0 local address register;
    Start_timer(timer1, periodic, Interval);
    Enble_interrupt_and_stop();
}
The code of PE(1,1) is as follows:
int main()    // sct_jpeg
{
    WriteID(PE_1_1_SC_MU_SEM_0_BASE) = "PE1,1,S1";
    REG32(PE_1_1_SC_MU_SEM_0_BASE + 0x10) = 0x101;   // local SEM
    REG32(PE_1_1_SC_MU_SEM_0_BASE + 0x14) = 0;       // resource count is 0, not used
    REG32(PE_1_1_SC_MU_SEM_0_BASE + 0x20) = Address of PE(0,0).SC_MU_SEM0 local address register;
    REG32(PE_1_1_SC_MU_SEM_0_BASE + 0x24) = Address of PE(1,1).SC_MU_SEM0 local address register;
    while (true)
    {
        REG32(PE_1_1_SC_MU_SEM_0_BASE + 0x1) = 0;    // s1.wait
        decode_jpeg();
    }
}
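The listings above use REG32(x) to denote the 32-bit register at address x. On real hardware such a macro is commonly written as a volatile pointer dereference; the sketch below instead maps REG32 onto a small simulated register array so that it can be compiled and tried on an ordinary host. The array g_regs, the function reg32 and the constant SIM_SC_MU_SEM_BASE are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Simulated register file standing in for the memory-mapped SC_MU_SEM unit.
static std::uint32_t g_regs[16];

// Hypothetical base "address": byte offset 0 into the simulated register file.
constexpr std::size_t SIM_SC_MU_SEM_BASE = 0;

// Returns the simulated 32-bit register at the given byte address.
inline volatile std::uint32_t& reg32(std::size_t byte_addr) {
    return g_regs[byte_addr / 4];
}

// Same spelling as in the listings; here it resolves to the simulation.
#define REG32(x) (reg32(x))
```

With these definitions, the statement REG32(SIM_SC_MU_SEM_BASE + 0x18) = 1; stores 1 in the simulated register at offset 0x18, which mirrors the s1.post() translation shown for PE(0,0).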
The second example uses the implementation of a 4-antenna wireless mesh network access point to show how the SC_MU_SEM unit is used as a semaphore and how resources are shared on the basis of the semaphore of the present invention. It describes the case in which a semaphore has more than one consumer and several consumer processes request the semaphore at the same time. The request messages of the consumer processes are stored in the remote waiting unit address FIFO H633. When this FIFO is full, a request to acquire the semaphore from a consumer that arrives later is answered immediately with an RTY reply that contains a waiting duration. The waiting duration is determined by the SC_MU_SEM unit that sends the reply. This example replies with a fixed waiting duration; a more complex design may also reply with a variable duration computed by an algorithm.
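The behavior of a consumer that is answered with RTY replies and waits a fixed duration before resending can be sketched as follows. This plain C++ model is an assumption-laden illustration: acquire_with_retry and RetryResult are hypothetical names, the waiting time is only accumulated in a counter instead of being spent, and the limit on the number of attempts is added so that the sketch always terminates.

```cpp
#include <cstdint>
#include <functional>

struct RetryResult {
    bool acquired;            // true once an ACK was received
    unsigned attempts;        // number of requests that were sent
    std::uint64_t waited_ns;  // total simulated waiting time
};

// try_request models sending one request: it returns true for an ACK reply
// and false for an RTY reply. wait_ns is the fixed waiting duration carried
// in every RTY reply.
inline RetryResult acquire_with_retry(const std::function<bool()>& try_request,
                                      std::uint64_t wait_ns,
                                      unsigned max_attempts) {
    RetryResult result{false, 0, 0};
    while (result.attempts < max_attempts) {
        ++result.attempts;
        if (try_request()) {
            result.acquired = true;
            break;
        }
        result.waited_ns += wait_ns;   // simulated wait before the resend
    }
    return result;
}
```

A design with a variable waiting duration could pass the duration back from try_request instead of using a single fixed value.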
As shown in Figure 15, process sct_eth0 is located in PE(0,0) L5 and manages communication with WiFi interfaces L1 and L10; process sct_eth1 is located in PE(2,0) L13 and manages communication with WiFi interfaces L3 and L9. Processes sct_eth0 and sct_eth1 place the payload of each received packet in the off-chip memory L2 and store the packet header in the queue pkt_q located in the global resource D. Processes sct_route1 L15 and sct_route2 L16 each run the routing algorithm and forward the stored packets to the corresponding WiFi interface. Because sct_route1 L15 and sct_route2 L16 access the head of pkt_q concurrently, a semaphore sem2 is needed to protect the queue head; because sct_eth0 and sct_eth1 access the tail of pkt_q concurrently, a semaphore sem1 is needed to protect the queue tail. The global resource D also contains a variable rx_cnt, protected by the mutex mu1, which records the total number of packets received from the four WiFi interfaces.
As shown in Figure 16, sem1 is formed jointly by the SC_MU_SEM unit J1 in PE(0,0), the SC_MU_SEM unit J3 in GR, and the SC_MU_SEM unit J6 in PE(2,0). J3 maintains the value of the semaphore, which is initialized to 1. When sct_eth0 (or sct_eth1) calls sem1.wait(), a semaphore lock request is sent from J1 (or J6) to J3. If J3 finds that the value of the semaphore is not 0, it decreases the value by 1 and sends an ACK reply to J1 (or J6); process sct_eth0 (or sct_eth1) has then acquired the semaphore and appends the header information of the received packet to the tail of pkt_q. If J3 finds that the value of the semaphore is 0, it adds the local address carried in the lock request from J1 (or J6) to its remote waiting unit address FIFO G16. When the value of the semaphore becomes 1 again, the request address is taken out of the remote waiting unit address FIFO G16, and J3 sends an ACK to the SC_MU_SEM unit J1 (or J6) corresponding to that address.
Similarly, processes sct_route1 and sct_route2 access the head of pkt_q through sem2.
When sct_eth0 (or sct_eth1) receives a packet, it also needs to modify rx_cnt, which is protected by mu1. mu1 is formed jointly by the SC_MU_SEM unit J2 in PE(0,0), the SC_MU_SEM unit J4 in GR, and the SC_MU_SEM unit J7 in PE(2,0). When process sct_eth0 (or sct_eth1) requests to lock mu1 through J2 (or J7), a lock request is sent to J4. If J4 finds that the mutex is unlocked, it locks the mutex and sends an ACK reply to J2 (or J7); process sct_eth0 (or sct_eth1) has then obtained access to rx_cnt and increments rx_cnt by 1. If J4 finds that the mutex is locked, it adds the local address carried in the lock request from J2 (or J7) to its remote waiting unit address FIFO G16. When the mutex is unlocked, the request address is taken out of the remote waiting unit address FIFO G16, and J4 sends an ACK to the SC_MU_SEM unit J2 (or J7) corresponding to that address, which completes the lock. When process sct_eth0 (or sct_eth1) requests to unlock mu1 through J2 (or J7), an unlock request is sent to J4. If J4 finds that the current owner of the mutex is J2 (or J7), it unlocks the mutex and sends an ACK to J2 (or J7). If the current owner of the mutex is not J2 (or J7), it sends an RTY to notify J2 (or J7) that the unlock has failed.
The processor core run-time controller H8 can be implemented in many ways. The most common implementation is a multi-input AND gate: when all of the input signals H71, H72, H73, H74 and H75 are "run", it outputs the "run" signal to the processor core H1 through B9.
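The AND-gate implementation can be expressed as a one-line software model. The sketch below is plain C++ and only illustrative: processor_run and kRunInputs are hypothetical names, and the five array elements stand for the input signals H71 to H75.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

constexpr std::size_t kRunInputs = 5;   // one entry per input signal H71..H75

// Returns true ("run") only when every input requests "run"; a single
// "pause" input is enough to suspend the processor core.
inline bool processor_run(const std::array<bool, kRunInputs>& run_inputs) {
    return std::all_of(run_inputs.begin(), run_inputs.end(),
                       [](bool requests_run) { return requests_run; });
}
```

The model makes the consequence of the AND gate explicit: any local resource unit that withdraws its "run" input holds the core suspended until it asserts "run" again.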
Finally, it should be noted that the above embodiments only describe the technical solution of the present invention and do not limit it. The present invention can be extended in application to other modifications, variations, applications and embodiments, and all such modifications, variations, applications and embodiments are therefore considered to fall within the spirit and teaching of the present invention.

Claims (14)

1. A multi-core processor satisfying SystemC, comprising: an array formed by connecting a plurality of crosspoints for exchanging data; and a plurality of processing units for data processing that are connected to the crosspoints and that comprise local resource units, each processing unit being connected through its local resource unit and a crosspoint adapter to a crosspoint, and thereby to the other processing units; wherein the local resource unit comprises one or more of a SystemC event handling unit group, a SystemC FIFO queue unit group, and a SystemC mutex and semaphore unit group.
2. The multi-core processor of claim 1, wherein the processing unit further comprises a processor core and a processor core and crosspoint adapter bridge, wherein the processor core communicates with the crosspoint adapter through the processor core and crosspoint adapter bridge, and the crosspoint adapter connects the crosspoint with the SystemC event handling unit group, the SystemC FIFO queue unit group and the SystemC mutex and semaphore unit group in the local resource.
3. The multi-core processor of claim 2, further comprising a timer group and an on-chip memory, the timer group comprising a plurality of timers, each timer outputting a timeout notification signal to the corresponding unit of the event handling unit group.
4. The multi-core processor of claim 3, further comprising a processor core run-time controller, wherein the local resource unit, the on-chip memory and the timer group are each connected to the processor core run-time controller and send processor pause and resume notifications to it, and the processor core run-time controller outputs a run/stop signal.
5. The multi-core processor of claim 2, wherein the processor core and crosspoint adapter bridge is connected to the processor core by a local bus and is connected to the crosspoint adapter, and the processor core and crosspoint adapter bridge is used for the processor core to access other processing units and on-chip peripheral units directly, and for communication between the instruction and data bus interfaces of the processor core and the crosspoint adapter.
6. The multi-core processor of claim 1, wherein the event handling unit group comprises a logic circuit that realizes, according to the SystemC syntax, the public functions of sc_event and sc_event_queue, the various forms of the wait(...) function, and the sensitivity lists of SC_THREAD and SC_METHOD; the logic circuit is used to send signals that activate and suspend the processor, and enables the mapping algorithm of the software development kit to combine a plurality of event handling unit groups located in different processing units so as to realize the functions defined by a plurality of sc_event or sc_event_queue syntax elements, the function of the wait(...) function, and the sensitivity list function of SC_THREAD and SC_METHOD.
7. The multi-core processor of claim 1, wherein the mutex and semaphore unit group comprises a plurality of mutex and semaphore units, each mutex and semaphore unit comprising a logic circuit that realizes the functions sc_mutex(name), sc_mutex.lock(), sc_mutex.trylock(), sc_mutex.unlock(), sc_semaphore.wait(), sc_semaphore.trywait(), sc_semaphore.post(), sc_semaphore(init_value) and sc_semaphore(name, init_value) of the SystemC syntax; the logic circuit is used to send signals that activate and suspend the processor to the processor core run-time controller, and a plurality of mutex and semaphore units located in different processing units are combined by the mapping algorithm of the software development kit to realize the function defined by an sc_mutex or sc_semaphore syntax element.
8. The multi-core processor of claim 1, wherein the FIFO queue unit group comprises a plurality of FIFO queue units, each FIFO queue unit outputting a data written event signal and a data read event signal to the corresponding event handling unit group, and comprising a logic circuit that realizes the functions sc_fifo(name, size), sc_fifo(size), sc_fifo.read(), sc_fifo.nb_read(), sc_fifo.write(), sc_fifo.nb_write(), sc_fifo.num_available(), sc_fifo.num_free(), sc_fifo.data_written_event() and sc_fifo.data_read_event() of the SystemC syntax; the logic circuit is used to send signals that activate and suspend the processor to the processor core run-time controller, and enables the mapping algorithm of the software development kit to combine a plurality of mutex and semaphore units located in different processing units so as to realize the function defined by an sc_fifo syntax element.
9. The multi-core processor of claim 6, wherein the event handling unit comprises:
an event sending engine, connected to the crosspoint adapter and to the local bus, for sending event notification packets;
an event receiving engine, connected to the crosspoint adapter and to the local bus, for receiving event notification packets;
a processor core execution controller, connected to the event receiving engine and to the processor run-time controller, for processing event notifications internal to the processing unit and the event notification packets;
a resume-execution condition register group, connected to the processor core execution controller and to the local bus, for temporarily storing event control and event cancellation;
a transmit control register group, connected to the event sending engine and to the local bus, comprising a send event address register and a send event control register.
10. The multi-core processor of claim 9, wherein sc_event.notify() and sc_event_queue.notify() in the user code are translated in the event handling unit into: the processor core writes the destination address that is to receive the event into the send event address register and writes a trigger value into the send event control register, and the event sending engine sends an event notification packet to the crosspoint adapter, wherein the destination address of the packet is the value of the send event address register.
11. The multi-core processor of claim 8, wherein the FIFO queue unit comprises:
a local FIFO;
a data sending engine, connected to the crosspoint adapter, for sending packets;
a data receiving engine, connected to the crosspoint adapter, for receiving packets;
a FIFO access controller, connected to the data sending engine, the data receiving engine and the local FIFO, for controlling the sending and receiving of packets according to the state of the local FIFO, outputting a processor pause or run signal to the processor run-time controller, and outputting write operation and read operation event notifications to the event handling unit group;
an SC_FIFO register group, connected to the local bus interface and the FIFO access controller, comprising a non-blocking read register and a non-blocking read port status register.
12. The multi-core processor of claim 11, wherein the sc_fifo.nb_read() operation in the user code is translated in the FIFO queue unit into: the processor core reads the non-blocking read register and the non-blocking read port status register, and the value of the non-blocking read port status register indicates whether the data of the non-blocking read register is valid.
13. The multi-core processor of claim 7, wherein the mutual exclusion and semaphore unit comprises:
A resource counter, connected respectively to the data sending engine and the SC_MU_SEM register group, for counting resources;
A data sending engine, connected to the exchanging unit adapter, for sending data packets;
A data receiving engine, connected to the exchanging unit adapter, for receiving data packets;
An SC_MU_SEM register group, connected respectively to the data sending engine, the data receiving engine, the resource counter and the local bus, for controlling the sending and receiving of data packets according to the state of the resource counter and for updating the resource count, the SC_MU_SEM register group comprising an initialization register.
14. The multi-core processor of claim 13, wherein the sc_semaphore(init_value) operation in user code is translated in the SC_MU_SEM unit as follows: the processor core writes init_value to the initialization register, which initializes the resource count to init_value.
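The initialization step in claim 14 is a single register write, sketched below. The names `SemUnitModel` and `semaphore_init_translated` are hypothetical, and the check that `init_value` is non-negative is an assumption of this sketch rather than something the claim states.

```cpp
#include <cassert>
#include <cstdint>

// Software model of the mutual exclusion and semaphore unit's counter.
class SemUnitModel {
public:
    // Write to the initialization register: the resource counter is loaded
    // with the written value.
    void write_init_reg(std::int32_t init_value) {
        init_reg_ = init_value;
        resource_count_ = init_value;
    }

    // Observation hook for this sketch; not a register named in the claims.
    std::int32_t resource_count() const { return resource_count_; }

private:
    std::int32_t init_reg_ = 0;
    std::int32_t resource_count_ = 0;
};

// What constructing a semaphore with init_value would reduce to under these
// assumptions.
inline void semaphore_init_translated(SemUnitModel& unit,
                                      std::int32_t init_value) {
    assert(init_value >= 0);  // assumed precondition of this sketch
    unit.write_init_reg(init_value);
}
```

Under this model a mutex is simply the case where the initial count is 1, which may be why the claims treat mutual exclusion and semaphores in one unit.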
CN2008101170199A 2008-07-22 2008-07-22 Multi-core processor satisfying SystemC syntax Active CN101634979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101170199A CN101634979B (en) 2008-07-22 2008-07-22 Multi-core processor satisfying SystemC syntax

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008101170199A CN101634979B (en) 2008-07-22 2008-07-22 Multi-core processor satisfying SystemC syntax

Publications (2)

Publication Number Publication Date
CN101634979A true CN101634979A (en) 2010-01-27
CN101634979B CN101634979B (en) 2011-09-07

Family

ID=41594171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101170199A Active CN101634979B (en) 2008-07-22 2008-07-22 Multi-core processor satisfying SystemC syntax

Country Status (1)

Country Link
CN (1) CN101634979B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012110445A1 (en) 2011-02-15 2012-08-23 Commissariat A L'energie Atomique Et Aux Energies Alternatives Device for accelerating the execution of a c system simulation
WO2014124852A2 (en) 2013-02-15 2014-08-21 Commissariat A L'energie Atomique Et Aux Energies Alternatives Device and method for accelerating the update phase of a simulation kernel
CN104346317A (en) * 2013-07-23 2015-02-11 中兴通讯股份有限公司 Shared resource access method and device
CN104823164A (en) * 2012-12-06 2015-08-05 相干逻辑公司 Processing system with synchronization instruction
CN103176926B (en) * 2011-08-03 2017-07-28 Arm有限公司 Integrated circuit and method for debugging barrier transaction
CN108459901A (en) * 2018-01-24 2018-08-28 深圳市普威技术有限公司 A kind of processing method of process lock, apparatus and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100524286C (en) * 2007-10-29 2009-08-05 中国科学院计算技术研究所 Multiple core processing system and its management method
CN100580630C (en) * 2007-12-29 2010-01-13 中国科学院计算技术研究所 Multi-core processor meeting SystemC grammar request and method for acquiring performing code

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012110445A1 (en) 2011-02-15 2012-08-23 Commissariat A L'energie Atomique Et Aux Energies Alternatives Device for accelerating the execution of a c system simulation
US9612863B2 (en) 2011-02-15 2017-04-04 Commissariat A L'energie Atomique Et Aux Energies Alternatives Hardware device for accelerating the execution of a systemC simulation in a dynamic manner during the simulation
CN103176926B (en) * 2011-08-03 2017-07-28 Arm有限公司 Integrated circuit and method for debugging barrier transaction
CN104823164A (en) * 2012-12-06 2015-08-05 相干逻辑公司 Processing system with synchronization instruction
CN104823164B (en) * 2012-12-06 2019-07-16 相干逻辑公司 The synchronous method of multicomputer system and maintenance with isochronous controller
WO2014124852A2 (en) 2013-02-15 2014-08-21 Commissariat A L'energie Atomique Et Aux Energies Alternatives Device and method for accelerating the update phase of a simulation kernel
CN104346317A (en) * 2013-07-23 2015-02-11 中兴通讯股份有限公司 Shared resource access method and device
CN108459901A (en) * 2018-01-24 2018-08-28 深圳市普威技术有限公司 A kind of processing method of process lock, apparatus and system

Also Published As

Publication number Publication date
CN101634979B (en) 2011-09-07

Similar Documents

Publication Publication Date Title
CN101634979B (en) Multi-core processor satisfying SystemC syntax
Agarwal et al. The MIT Alewife machine: Architecture and performance
CN100565472C (en) Debugging method applicable to a multi-processor-core system chip
US7849441B2 (en) Method for specifying stateful, transaction-oriented systems for flexible mapping to structurally configurable, in-memory processing semiconductor device
CN101635006B (en) Mutual exclusion and semaphore cell block of multi-core processor satisfying SystemC syntax
CN103150279B (en) Method allowing host and baseboard management controller to share device
US7360068B2 (en) Reconfigurable signal processing IC with an embedded flash memory device
CN100568247C (en) Event processing unit group of a multi-core processor satisfying SystemC syntax
CN103221937A (en) Load/store circuitry for a processing cluster
US8365111B2 (en) Data driven logic simulation
CN101630305A (en) Flexible management method for reconfigurable components in high-efficiency computer
CN101329702A (en) First-in first-out queue unit set of multi-core processor satisfying SystemC grammar
CN101770362B (en) Distributed dynamic process generating unit meeting System C processor
CN100580630C (en) Multi-core processor meeting SystemC grammar request and method for acquiring performing code
Barba et al. A comprehensive integration infrastructure for embedded system design
Wang et al. Synthesizing operating system based device drivers in embedded systems
Bertolotti et al. Modular design of an open-source, networked embedded system
US20020183997A1 (en) Apparatus and method for specifying the configuration of mixed-language simulation models
CN105893036A (en) Compatible accelerator extension method for embedded system
Yu et al. Transaction level platform modeling in systemc for multi-processor designs
CN110109849B (en) CAN equipment driving device and method based on PCI bus
Virtanen et al. A processor architecture for the TACO protocol processor development framework
CN104025026A (en) Accessing Configuration and Status Registers for a Configuration Space
Nunes et al. A profiler for a heterogeneous multi-core multi-FPGA system
Fischer et al. Towards interprocess communication and interface synthesis for a heterogeneous real-time rapid prototyping environment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant