CN101196826A - Multi-core processor meeting SystemC grammar request and method for acquiring performing code - Google Patents

Multi-core processor meeting SystemC grammar request and method for acquiring performing code

Info

Publication number
CN101196826A
Authority
CN
China
Prior art keywords
unit
processor
event
resource unit
systemc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2007103085745A
Other languages
Chinese (zh)
Other versions
CN100580630C (en)
Inventor
陈曦
范东睿
张�浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN200710308574A priority Critical patent/CN100580630C/en
Publication of CN101196826A publication Critical patent/CN101196826A/en
Application granted granted Critical
Publication of CN100580630C publication Critical patent/CN100580630C/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a multi-core processor that meets SystemC syntax requirements and a method for obtaining its execution code. The multi-core processor comprises: an array formed by interconnecting a plurality of switching units for exchanging data; a plurality of processing units for data processing, connected to the switching units; local resource units, each connected between adjacent processing units, for synchronization and data sharing between those adjacent processing units; and at least one global resource unit, connected to a switching unit, for synchronization and data sharing among all the processing units. The method comprises: step S1, translating a SystemC software model into code that the instruction set compiler of the processing unit can compile; and step S2, mapping the processes in the software model onto the processing units, and mapping the syntax elements of SystemC onto the local resource units and the global resource unit respectively. The invention can significantly reduce the difficulty of developing embedded systems.

Description

Multi-core processor meeting SystemC syntax requirements and method for obtaining its execution code
Technical field
The present invention relates to a multi-core processor, and in particular to a multi-core processor that meets the syntax requirements of the SystemC transaction level, and to a method for obtaining its execution code.
Background art
In existing computer systems, the parallel execution of multiple threads is realized on the basis of an operating system. A design method in which multiple threads execute simultaneously matches the basic way humans think, and also matches the basic principle that things in the objective world develop and proceed in parallel.
Before multi-core processor technology appeared, computer software was essentially executed serially. Before multi-core processors appeared, the field programmable gate array (FPGA) was the naturally parallel large-scale integrated circuit most popular among engineers. FPGAs have been in use for many years, mainly for rapid ASIC prototype verification and for cost-insensitive applications such as cellular base stations. Each processing element (PE) of such a fine-grained gate array generally consists of a few (for example 2, 4 or 8) bits of storage and a look-up table with a few (for example 4, 5 or 6) inputs.
With the progress of integrated circuit technology, silicon costs keep falling and integration density keeps rising, and the era of multi-core processors, with multiple processor cores on one chip, has arrived.
The emergence of the multi-core system on chip (SoC) has brought a new generation of parallel processing devices. The core of a multi-core SoC is a multi-core processor. From a development perspective, multi-core processors can be divided into two classes:
(1) The first class of multi-core processors does not change the existing design flow and programming model of sequential execution, and merely adopts more advanced compilation techniques to adapt to the multi-core architecture.
Here the role of the multiple cores is to replace a single core and provide more computing capability. At present, most processors, whether single-core or multi-core, adopt the sequential-execution programming model. Under this model, operating systems that support multitasking were introduced in order to support multiple tasks. The operating system gives developers a way to write multitasking programs and to execute code in parallel. However, with an operating system present and multiple tasks running in parallel, the whole embedded system becomes very complex, and debugging is much harder than for a single task on a single core. One debugging approach is to use breakpoints. When the processor pauses at a breakpoint, the external input conditions may still change; because the processor is paused, the conditions under which the error occurred may not be reproducible. Another debugging approach is printing: output is printed at the places where errors may occur, but the printed results can vary widely, so errors are hard to locate. Moreover, once the processor's execution goes wrong, printing itself may already have stopped working before the error shows up. Another problem introduced by the operating system is the energy wasted when the processor runs idle. Because there are multiple tasks, peripherals can be stopped as needed, but it is hard to determine when the processor should enter power-saving mode and when it should return from it, which wastes energy. According to statistics, for the above reasons roughly half of embedded system projects fail.
(2) The second class of processors adopts a parallel language and programming model, and designs the architecture according to the needs of that parallel language and programming model. A multi-core processor designed in this way can work closely with the parallel language, and is expected to overcome the debugging difficulty and energy waste of the first class of processors.
However, current multi-core processors all belong to the first class, and processors of the second class are still at an early stage of development. From the perspective of computer science, the main reason is that parallel languages such as OCAR, SISAL and PCN are still immature.
Summary of the invention
The object of the present invention is to address the deficiencies of prior-art multi-core processors by providing a multi-core processor that meets the syntax requirements of the SystemC transaction level (SystemC Native Array Processor, SNAP), and a method for obtaining its execution code.
To achieve the above object, the invention provides the following technical solutions:
A multi-core processor, comprising: an array formed by interconnecting a plurality of switching units for exchanging data, and a plurality of processing units for data processing connected to the switching units; local resource units, each connected between adjacent processing units, for synchronization and data sharing between those adjacent processing units; and at least one global resource unit, connected to a switching unit, for synchronization and data sharing among all the processing units.
Preferably, the processing unit comprises a processor core or processor, and a processor pause control unit and a switching unit adapter connected to that processor core or processor; the switching unit adapter is connected to the switching unit.
Preferably, the processor pause control unit comprises a processor pause and resume condition register; this register is connected to the local resource unit.
Preferably, the local resource unit comprises at least one transient event unit, at least one memorized event queue, at least one mutex unit, at least one semaphore unit and at least one bidirectional input/output queue.
Preferably, the transient event unit comprises a logic circuit implemented according to the sc_event.notify() and sc_event.cancel() functions of the SystemC syntax; this logic circuit sends a processor-activation signal to the adjacent processing units connected to the local resource unit, and receives an event-cancellation signal from those processing units.
Preferably, the memorized event queue comprises a logic circuit implemented according to the sc_event_queue.notify() and sc_event_queue.cancel() functions of the SystemC syntax; this logic circuit sends a processor-activation signal to the adjacent processing units connected to the local resource unit, and receives an event-cancellation signal from those processing units.
Preferably, the mutex unit comprises a logic circuit implemented according to the sc_mutex.lock(), sc_mutex.trylock() and sc_mutex.unlock() functions of the SystemC syntax; this logic circuit sends processor-activation and processor-pause signals to the adjacent processing units connected to the local resource unit, makes the state of the mutex available for the processing units to read, and receives mutex request signals from the processing units.
Preferably, the semaphore unit comprises a logic circuit implemented according to the sc_semaphore.wait(), sc_semaphore.trywait(), sc_semaphore.post() and sc_semaphore.get_value() functions of the SystemC syntax; this logic circuit sends processor-activation and processor-pause signals to the adjacent processing units connected to the local resource unit, makes the state of the semaphore available for the processing units to read, and updates the semaphore.
Preferably, the bidirectional input/output queue comprises a logic circuit implemented according to the sc_fifo.read(), sc_fifo.nb_read(), sc_fifo.write(), sc_fifo.nb_write(), sc_fifo.num_available() and sc_fifo.num_free() functions of the SystemC syntax; this logic circuit sends processor-activation and processor-pause signals to the adjacent processing units connected to the local resource unit, makes the state of the input/output queue available for the processing units to read, and performs read and write operations on the input/output queue.
Preferably, the global resource unit comprises a switching unit adapter and, connected to that adapter, at least one transient event unit, at least one memorized event queue, at least one mutex unit, at least one semaphore unit and at least one bidirectional input/output queue.
A method for obtaining the execution code of a multi-core processor that meets SystemC syntax requirements, comprising the following steps:
Step S1: translate the SystemC software model into code that the instruction set compiler of the processing unit can compile;
Step S2: map the processes in the software model onto the processing units, and map the SystemC syntax elements sc_event, sc_event_queue, sc_mutex, sc_semaphore and sc_fifo onto the transient event units, memorized event queues, mutex units, semaphore units and bidirectional input/output queues, respectively, of the local resource units and/or the global resource unit.
Preferably, the translation in step S1 comprises the following steps:
Step S11: translate the SystemC syntax elements sc_event, sc_event_queue, sc_mutex, sc_semaphore and sc_fifo into register operations on the local resource units and the global resource unit of the multi-core processor that meets SystemC syntax requirements;
Step S12: translate the code of each SC_THREAD process into a C-language main() function.
Preferably, the mapping in step S2 comprises the following steps:
Step S21: allocate one processor core or processor to each SC_THREAD process;
Step S22: for the sc_event, sc_event_queue, sc_mutex, sc_semaphore and sc_fifo elements used in each SC_THREAD process, allocate the corresponding units in a local resource unit or the global resource unit.
Preferably, in step S22, the allocation gives priority to the corresponding units in the local resource unit, and allocates the corresponding units in the global resource unit only when those in the local resource unit are insufficient.
Preferably, the method further comprises the following steps:
Step S3: use the instruction set compiler of the processing unit to compile the translation result into object code;
Step S4: merge the mapping result and the compilation result to generate a bitstream file that can be downloaded directly to the multi-core processor.
Preferably, the method further comprises a step S3' between step S3 and step S4: analyze the mapping result and the compilation result to obtain performance information such as the delay and power consumption of the system, and evaluate the performance of the software model.
Compared with the prior art, the beneficial technical effects of the present invention are:
The multi-core processor of the invention supports the SystemC syntax elements at the chip hardware level, and the method of the invention for obtaining execution code converts SystemC application source code into code executable by the processor. Embedded development based on SystemC thereby becomes practical, and the difficulty of embedded system development is significantly reduced.
Description of drawings
Fig. 1 is a schematic diagram of the architecture of the multi-core processor of the present invention.
Fig. 2 is a schematic diagram of the structure of the local resource unit of the present invention.
Fig. 3 is a schematic diagram of the structure of the transient event unit of the present invention.
Fig. 4 is a schematic diagram of the structure of the memorized event queue of the present invention.
Fig. 5 is a schematic diagram of the structure of the mutex unit of the present invention.
Fig. 6 is a schematic diagram of the structure of the semaphore unit of the present invention.
Fig. 7 is a schematic diagram of the structure of the bidirectional input/output queue of the present invention.
Fig. 8 is a schematic diagram of the structure of the global resource unit of the present invention.
Fig. 9 is a flow chart of obtaining the execution code of the multi-core processor of the present invention.
Embodiment
To make the objects, technical solutions and advantages of the present invention clearer, the multi-core processor meeting SystemC syntax requirements and the method for obtaining its execution code are described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only illustrate the present invention and do not limit it.
In the integrated circuit design industry, SystemC is already a mature parallel language. SystemC is used in integrated circuit design to model hardware and the software that interacts with it. SystemC itself is an extension of C++, so it can in fact be used for multi-core embedded software development. If a multi-core processor supports the specific syntax of SystemC, multi-core embedded software development based on SystemC will effectively reduce the difficulty of developing embedded systems and enable more effective power management of the processing units. To this end, the invention provides a multi-core processor that meets SystemC syntax requirements and a method for obtaining its execution code.
Embodiment 1
As shown in Fig. 1, the multi-core processor of the present invention that meets SystemC syntax requirements comprises: an array formed by interconnecting a plurality of switching units A14 for exchanging data, and a plurality of processing units A13 for data processing connected to the switching units A14. It further comprises: local resource units A16, each connected between adjacent processing units A13, for synchronization and data sharing between those adjacent processing units; and at least one global resource unit A17, connected to a switching unit A14, for synchronization and data sharing among all the processing units.
As one possible implementation, in the multi-core processor shown in Fig. 1, eight processing units P(0,0), P(0,1), P(0,2), P(1,0), P(1,2), P(2,0), P(2,1) and P(2,2) communicate over a two-dimensional network formed by the switching units A14. Not every component directly connected to a switching unit is a processor: A17 is a global resource unit. A16 is a local resource unit connected between adjacent processing units A13. All the peripherals form the periphery of the on-chip network and communicate with the outside world; they include, for example, the universal serial bus and universal asynchronous serial interfaces (USB & 2x UARTs) A1, the synchronous dynamic random access memory interfaces (SDRAMC) A2, A5, A8 and A12, the digital television broadcasting asynchronous serial interface (DVB ASI) A3, the digital television broadcasting synchronous serial interface (DVB SPI) A9, the LCD interface (LCDC) A4, the flash interface (Nand flashC) A6, the peripheral interconnect interface (PCIH) A7, the audio input/output interface (IIS Audio) A11 and the high-definition video output interface (YCbCr) A10. Besides communicating through the local resource units A16, the processing units A13 also communicate through the global resource unit A17. Preferably, however, adjacent processing units A13 use the local resource unit A16 first for synchronization and data sharing, and use the global resource unit A17 only when the local resource unit A16 is insufficient. This feature is described in detail in the method below.
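The grid described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: Fig. 1 is not reproduced here, so the placement of the global resource unit A17 at node (1,1) is inferred from the missing P(1,1) in the list of processing units, and "adjacent" is taken to mean horizontal or vertical neighbours. The names `Coord` and `localResourceLinks` are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Coord { int row; int col; };

bool sameNode(Coord a, Coord b) { return a.row == b.row && a.col == b.col; }

// Returns one entry per local resource unit: the pair of adjacent
// processing units that it joins. The node `globalUnit` is assumed to hold
// the global resource unit, so it takes part in no local link.
std::vector<std::pair<Coord, Coord>> localResourceLinks(int rows, int cols,
                                                        Coord globalUnit) {
    assert(globalUnit.row >= 0 && globalUnit.row < rows);
    assert(globalUnit.col >= 0 && globalUnit.col < cols);
    std::vector<std::pair<Coord, Coord>> links;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            Coord here{r, c};
            if (sameNode(here, globalUnit)) continue;  // not a processing unit
            Coord right{r, c + 1};
            Coord below{r + 1, c};
            if (c + 1 < cols && !sameNode(right, globalUnit)) links.push_back({here, right});
            if (r + 1 < rows && !sameNode(below, globalUnit)) links.push_back({here, below});
        }
    }
    return links;
}
```

Under these assumptions, a 3 x 3 grid with the global resource unit in the centre yields eight processing units joined by eight local resource units; pairs that are not direct neighbours have to go through the global resource unit.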
In Fig. 1, A18 is a part of the multi-core processor of the present invention; its details are shown in Fig. 2. Compared with existing multi-core processors, the multi-core processor of the present invention has local resource units A16 and a global resource unit A17, both designed according to the syntax requirements of SystemC, which ensures that SystemC transaction-level code can correspond one-to-one with hardware resources.
Preferably, as shown in Fig. 2, the processing unit A13 comprises a processor core or processor, and a processor pause control unit and a switching unit adapter connected to that processor core or processor. The switching unit adapter is connected to the switching unit A14.
Preferably, the processor pause control unit comprises a processor pause and resume condition register; this register is connected to the local resource unit A16.
Preferably, as shown in Fig. 2, the local resource unit A16 comprises a plurality of components that meet SystemC syntax requirements. As one possible implementation, it comprises: 32 transient event (sc_event) units, 32 memorized event queues (sc_event_queue), 32 mutex (sc_mutex) units, 32 semaphore (sc_semaphore) units and 16 bidirectional input/output queues (sc_fifo).
Preferably, as shown in Fig. 3, the transient event unit D1 comprises a logic circuit D6 implemented according to the sc_event.notify() and sc_event.cancel() functions of the SystemC syntax. The logic circuit D6 sends processor-activation signals D2 and D4 to the adjacent processing units connected to the local resource unit A16, and receives event-cancellation signals D3 and D5 from those processing units. Those skilled in the art can implement the logic circuit D6 using prior-art elements such as registers.
Preferably, as shown in Fig. 4, the memorized event queue J1 comprises a logic circuit J6 implemented according to the sc_event_queue.notify() and sc_event_queue.cancel() functions of the SystemC syntax. The logic circuit J6 sends processor-activation signals J2 and J4 to the adjacent processing units connected to the local resource unit A16, and receives event-cancellation signals J3 and J5 from those processing units. Those skilled in the art can implement the logic circuit J6 using prior-art elements such as registers.
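The difference between the two event units can be illustrated with a small behavioural model. It is a sketch, not the circuit of Figs. 3 and 4: it assumes only the well-known SystemC behaviour that an sc_event holds at most one pending notification while an sc_event_queue remembers every notification. The class names and the `deliver()` method, which stands in for raising the activation signal, are hypothetical.

```cpp
// Transient event: repeated notifications collapse into a single pending one.
class TransientEventUnit {
public:
    void notify() { pending_ = true; }
    void cancel() { pending_ = false; }
    // Returns true when an activation signal would be raised, and consumes it.
    bool deliver() {
        bool fire = pending_;
        pending_ = false;
        return fire;
    }
private:
    bool pending_ = false;
};

// Memorized event queue: every notification is remembered until delivered.
class MemorizedEventQueue {
public:
    void notify() { ++pending_; }
    void cancel() { pending_ = 0; }  // drops all remembered notifications
    bool deliver() {
        if (pending_ == 0) return false;
        --pending_;
        return true;
    }
private:
    unsigned pending_ = 0;
};
```

After two `notify()` calls, the transient unit activates the processor once, whereas the queue activates it twice.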
Preferably, as shown in Fig. 5, the mutex unit E1 comprises a logic circuit E2 implemented according to the sc_mutex.lock(), sc_mutex.trylock() and sc_mutex.unlock() functions of the SystemC syntax. The logic circuit E2 sends processor-activation and processor-pause signals to the adjacent processing units connected to the local resource unit A16, makes the state of the mutex available for the processing units to read, and receives mutex request signals from the processing units. Those skilled in the art can implement the logic circuit E2 using prior-art elements such as registers.
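A minimal software model of such a mutex unit might look as follows. The text does not specify the circuit, so this is a sketch under assumptions: processing units are identified by plain integers, a `false` return from `lock()` stands for the pause signal, and the value returned by `unlock()` stands for the activation signal. Handing the mutex straight to the longest-waiting unit is a choice made for this sketch only.

```cpp
#include <deque>

constexpr int kNoPe = -1;  // hypothetical marker for "no processing unit"

class MutexUnit {
public:
    // Models sc_mutex.trylock(): never pauses the caller.
    bool trylock(int pe) {
        if (owner_ != kNoPe) return false;
        owner_ = pe;
        return true;
    }
    // Models sc_mutex.lock(): true means acquired, false means the
    // requesting processing unit would be paused until it is activated.
    bool lock(int pe) {
        if (trylock(pe)) return true;
        paused_.push_back(pe);
        return false;
    }
    // Models sc_mutex.unlock(): returns the processing unit that would be
    // activated (and now owns the mutex), or kNoPe if there is none.
    int unlock(int pe) {
        if (owner_ != pe) return kNoPe;  // only the owner may unlock
        owner_ = kNoPe;
        if (paused_.empty()) return kNoPe;
        owner_ = paused_.front();
        paused_.pop_front();
        return owner_;
    }
    bool locked() const { return owner_ != kNoPe; }  // readable state
private:
    int owner_ = kNoPe;
    std::deque<int> paused_;
};
```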
Preferably, as shown in Fig. 6, the semaphore unit F1 comprises a logic circuit F2 implemented according to the sc_semaphore.wait(), sc_semaphore.trywait(), sc_semaphore.post() and sc_semaphore.get_value() functions of the SystemC syntax. The logic circuit F2 sends processor-activation and processor-pause signals to the adjacent processing units connected to the local resource unit A16, makes the state of the semaphore available for the processing units to read, and updates the semaphore. Those skilled in the art can implement the logic circuit F2 using prior-art elements such as registers.
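The semaphore unit can be modelled in the same spirit. Again this is a hedged sketch rather than the circuit of Fig. 6: a `false` return from `wait()` represents the pause signal, the value returned by `post()` represents the activation signal, and the integer processing-unit identifiers and the `kNoPe` marker are assumptions of the example.

```cpp
#include <deque>

constexpr int kNoPe = -1;  // hypothetical marker for "no processing unit"

class SemaphoreUnit {
public:
    explicit SemaphoreUnit(int initial) : value_(initial) {}

    // Models sc_semaphore.trywait(): takes a resource if one is free.
    bool trywait() {
        if (value_ <= 0) return false;
        --value_;
        return true;
    }
    // Models sc_semaphore.wait(): false means the caller would be paused.
    bool wait(int pe) {
        if (trywait()) return true;
        paused_.push_back(pe);
        return false;
    }
    // Models sc_semaphore.post(): if a unit is paused, the new resource goes
    // straight to it and its id is returned; otherwise the count grows.
    int post() {
        if (!paused_.empty()) {
            int pe = paused_.front();
            paused_.pop_front();
            return pe;
        }
        ++value_;
        return kNoPe;
    }
    // Models sc_semaphore.get_value(): the state readable by processing units.
    int get_value() const { return value_; }
private:
    int value_;
    std::deque<int> paused_;
};
```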
Preferably, as shown in Fig. 7, the bidirectional input/output queue G1 comprises a logic circuit G2 implemented according to the sc_fifo.read(), sc_fifo.nb_read(), sc_fifo.write(), sc_fifo.nb_write(), sc_fifo.num_available() and sc_fifo.num_free() functions of the SystemC syntax. The logic circuit G2 sends processor-activation and processor-pause signals to the adjacent processing units connected to the local resource unit A16, makes the state of the input/output queue available for the processing units to read, and performs read and write operations on the input/output queue (FIFO). Those skilled in the art can implement the logic circuit G2 using prior-art elements such as registers.
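The non-blocking half of this interface is easy to sketch as a bounded queue. The model below covers only nb_write, nb_read, num_available and num_free; in the hardware described above, the blocking read() and write() would pause the processing unit in exactly the cases where the non-blocking calls return false. The element type `int` and the fixed capacity are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

class FifoUnit {
public:
    explicit FifoUnit(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }
    // Models sc_fifo.nb_write(): fails instead of pausing when full.
    bool nb_write(int value) {
        if (data_.size() == capacity_) return false;
        data_.push_back(value);
        return true;
    }
    // Models sc_fifo.nb_read(): fails instead of pausing when empty.
    bool nb_read(int& out) {
        if (data_.empty()) return false;
        out = data_.front();
        data_.pop_front();
        return true;
    }
    // Models sc_fifo.num_available() and sc_fifo.num_free().
    std::size_t num_available() const { return data_.size(); }
    std::size_t num_free() const { return capacity_ - data_.size(); }
private:
    std::deque<int> data_;
    std::size_t capacity_;
};
```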
In this embodiment, processing units that are too far apart to synchronize and share data directly through a local resource unit A16 must synchronize and share data through the global resource unit A17.
Preferably, the global resource unit A17 comprises a plurality of components that meet SystemC syntax requirements. As one possible implementation, shown in Fig. 8, with N the number of processing units A13 in the multi-core processor, the global resource unit A17 comprises a switching unit adapter and, connected to that adapter, N*32 transient event units, N*32 memorized event queues, N*32 mutex units, N*32 semaphore units and N*16 bidirectional input/output queues. The global resource unit A17 is connected to the switching unit A14 through the switching unit adapter. Corresponding to the implementation described above, N = 8 here.
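As a quick check of the sizes implied by this implementation, the counts can be written out in code. The struct and function names are hypothetical, and treating the global counts as N times the per-local-unit counts is simply a reading of the figures 32 and 16 given above.

```cpp
struct ResourceCounts {
    int events;
    int eventQueues;
    int mutexes;
    int semaphores;
    int fifos;
};

// Counts stated for one local resource unit in Embodiment 1.
constexpr ResourceCounts kLocalUnit{32, 32, 32, 32, 16};

// Counts for the global resource unit with n processing units.
constexpr ResourceCounts globalUnitCounts(int n) {
    return ResourceCounts{n * kLocalUnit.events, n * kLocalUnit.eventQueues,
                          n * kLocalUnit.mutexes, n * kLocalUnit.semaphores,
                          n * kLocalUnit.fifos};
}
```

With N = 8 this gives 256 units of each of the first four kinds and 128 bidirectional input/output queues in the global resource unit.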
The method of the present invention for obtaining the execution code of a multi-core processor that meets SystemC syntax requirements comprises the following steps:
Step S1: translate the SystemC software model into code that the instruction set compiler of the processing unit can compile;
Preferably, the translation in step S1 comprises the following sub-steps:
Step S11: translate the specific SystemC syntax elements sc_event, sc_event_queue, sc_mutex, sc_semaphore and sc_fifo into register operations on the local resource units and the global resource unit of the multi-core processor that meets SystemC syntax requirements;
Step S12: translate the code of each SC_THREAD process into a C-language main() function, i.e. a unit that a standard C compiler can compile.
Step S2: map the processes in the software model onto different processing units, and map the specific SystemC syntax elements sc_event, sc_event_queue, sc_mutex, sc_semaphore and sc_fifo onto the transient event units, memorized event queues, mutex units, semaphore units and bidirectional input/output queues, respectively, of the local resource units and/or the global resource unit.
Preferably, the mapping in step S2 comprises the following sub-steps:
Step S21: allocate one processor core or processor to each SC_THREAD process;
Step S22: for the sc_event, sc_event_queue, sc_mutex, sc_semaphore and sc_fifo elements used in each SC_THREAD process, allocate the corresponding units in a local resource unit or the global resource unit.
Preferably, in step S22, the allocation gives priority to the corresponding units in the local resource unit, and allocates the corresponding units in the global resource unit only when those in the local resource unit are insufficient.
The above steps yield standard C code; the following steps then yield code that the multi-core processor of the present invention can execute. Preferably, therefore, the method of the present invention further comprises:
Step S3: use the instruction set compiler of the processing unit to compile the translation result into object code;
Step S4: merge the mapping result and the compilation result to generate a bitstream file that can be downloaded directly to the multi-core processor.
Preferably, the method of the present invention further comprises a step S3' between step S3 and step S4: analyze the mapping result and the compilation result to obtain performance information such as system delay and power consumption, and evaluate whether the software model meets the performance requirements.
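The allocation preference of step S22 (local resource unit first, global resource unit only when the local one is exhausted) can be sketched as follows. The `Pool` bookkeeping and the names are hypothetical; the text does not describe how a mapping tool actually tracks free units.

```cpp
enum class Placement { Local, Global, None };

// Free units of one kind (for example semaphores) that a given pair of
// adjacent processing units can still use.
struct Pool {
    int freeLocal;   // free units in the local resource unit between them
    int freeGlobal;  // free units in the global resource unit
};

// Allocates one unit, preferring the local resource unit.
Placement allocate(Pool& pool) {
    if (pool.freeLocal > 0) {
        --pool.freeLocal;
        return Placement::Local;
    }
    if (pool.freeGlobal > 0) {
        --pool.freeGlobal;
        return Placement::Global;
    }
    return Placement::None;  // the model does not fit on this processor
}
```

With one free local unit and one free global unit, three successive requests are placed locally, then globally, then rejected.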
As shown in Fig. 9, the multi-core processor of the present invention and the method for obtaining its execution code differ from existing multi-core processors and their methods in that the application software model H1 is developed with SystemC transaction-level syntax. After passing through the translation process H7, the mapping process H8, the compilation process H10 and the assembly process H11, the application software model is downloaded to the processor by the download process H12 and executed. The mapping result H3 and the translation result H2 can be used to analyze system delay and power consumption performance H5, and the analysis results serve as a reference for improving system performance.
The following example illustrates the process of embedded system development using the method of the present invention on the multi-core processor of the present invention. Based on the multi-core processor of Fig. 1, the user writes an application that plays MP3 music while decoding JPEG pictures and displaying them on the screen. MP3 decoding is paced by timed events produced by a timer: each time a timed event occurs, the processor decodes one frame. For every 10 frames of music decoded, one JPEG picture is decoded and displayed on the screen. MP3 decoding therefore needs one process, and JPEG decoding needs another.
The user first models the application in SystemC. The MP3 decoding process and the JPEG decoding process are each modeled as an SC_THREAD, and they communicate through a semaphore, sc_semaphore. Every 10 frames of music, the MP3 decoding process increments the value of the sc_semaphore by 1. The JPEG decoding process waits on the sc_semaphore; each time it obtains one sc_semaphore resource it performs one JPEG decode, and after decoding it waits for the sc_semaphore resource again. The user debugs the whole software algorithm on a computer to ensure that the application code is functionally correct.
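The interplay of the two processes can be traced with a small sequential simulation. It is not SystemC code and runs no real threads: the semaphore is a plain counter, the two "processes" are interleaved by hand within one loop, and the names `PlaybackStats` and `simulatePlayback` are hypothetical.

```cpp
struct PlaybackStats {
    int framesDecoded;    // MP3 frames decoded so far
    int picturesDecoded;  // JPEG pictures decoded so far
};

// Simulates the example for a given number of timer events.
PlaybackStats simulatePlayback(int timerEvents) {
    int semaphore = 0;  // stands in for the sc_semaphore value
    PlaybackStats stats{0, 0};
    for (int t = 0; t < timerEvents; ++t) {
        // MP3 process: one frame per timed event, post every 10th frame.
        ++stats.framesDecoded;
        if (stats.framesDecoded % 10 == 0) ++semaphore;
        // JPEG process: decodes one picture per resource it obtains, then
        // goes back to waiting.
        while (semaphore > 0) {
            --semaphore;
            ++stats.picturesDecoded;
        }
    }
    return stats;
}
```

For 25 timer events, the simulation decodes 25 frames and 2 pictures.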
The mapping process establishes the correspondence between SystemC processes and processors. In this example, the MP3 decoding process is the master process and is mapped to processor P(0,0); the JPEG decoding process is mapped to P(0,1); and the sc_semaphore they use is mapped to the local resource unit (LR) between P(0,0) and P(0,1). The other processors are not used and are placed in the paused state after system start-up.
The translation process converts the application software model developed with SystemC transaction-level syntax into code that the instruction set compiler of the processor core can compile. In this example, each process in the user's SystemC code is translated into C code, with an independent main() function for each processor. SystemC operations on the sc_semaphore are translated into processor operations on the pause and resume condition register: once P(0,0) increments the value of the sc_semaphore, P(0,1) obtains the resource and is thereby activated.
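What the translated semaphore operations might reduce to can be sketched with a hypothetical register block. The text names a pause and resume condition register but gives no layout, so the two fields, the bit masks and both function names below are assumptions for illustration. On real hardware the registers would be memory-mapped and the core would halt after arming the condition; here they are an ordinary struct so the logic can run on a host.

```cpp
#include <cstdint>

// Hypothetical register view of one semaphore unit in a local resource unit.
struct SemaphoreRegs {
    std::uint32_t value;         // current semaphore count
    std::uint32_t resumeOnPost;  // mask of processing units to wake on a post
};

constexpr std::uint32_t kP00 = 1u << 0;  // assumed bit for P(0,0)
constexpr std::uint32_t kP01 = 1u << 1;  // assumed bit for P(0,1)

// Possible translation of sc_semaphore.wait() for the unit `self`.
// Returns true if a resource was taken; false means the resume condition
// was armed and the core would now be paused by hardware.
bool translatedWait(SemaphoreRegs& regs, std::uint32_t self) {
    if (regs.value > 0) {
        --regs.value;
        return true;
    }
    regs.resumeOnPost |= self;
    return false;
}

// Possible translation of sc_semaphore.post(). Returns the mask of
// processing units that the hardware would re-activate.
std::uint32_t translatedPost(SemaphoreRegs& regs) {
    ++regs.value;
    std::uint32_t wake = regs.resumeOnPost;
    regs.resumeOnPost = 0;
    return wake;
}
```

In the example, P(0,1) calls `translatedWait` and pauses; when P(0,0) later calls `translatedPost`, the returned mask contains `kP01`, and the re-activated P(0,1) retries the wait and obtains the resource.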
The compilation process compiles each independent main() function into machine code for the target processor. The assembly process organizes the independent pieces of machine code and, combined with the architecture information of the processor, forms a bitstream file that the actual processor can load and execute.
The mapping result and the translation result are used to analyze system delay and power consumption performance, and the analysis results serve as a reference for improving system performance.
Embodiment 2
This enforcement is with the difference of embodiment 1: local resource unit A16 comprises the parts of a plurality of satisfying SystemC syntax requirements, as a kind of enforceable mode, specifically comprise: event queue (sc_event_queue), 1 mutex (sc_mutex) unit, 1 semaphore (sc_semaphore) unit and 1 two-way I/O queue (sc_fifo) can be remembered in 1 temporal event (sc_event) unit, 1; Global resource unit A17 comprises the parts of a plurality of satisfying SystemC syntax requirements, as a kind of enforceable mode, specifically comprise: event queue (sc_event_queue), 1 mutex (sc_mutex) unit, 1 semaphore (sc_semaphore) unit and 1 two-way I/O queue (sc_fifo) can be remembered in 1 temporal event (sc_event) unit, 1.
Other technical characterictic of present embodiment is identical with embodiment 1.Present embodiment still provides a kind of polycaryon processor of satisfying SystemC syntax requirement and has obtained the method for its run time version, has also reached purpose of the present invention, can obtain beneficial technical effects.
By above detailed description of the present invention, the invention has the advantages that as can be seen: the special feature of polycaryon processor of the present invention is at local, global resource unit on the sheet of SystemC design grammer design, and based on the run time version preparation method of the processor of SystemC, thereby come based on the processor difference of serial programming models such as C, C++, JAVA with tradition.In polycaryon processor of the present invention, static task of each processor core operation, tasks in parallel is carried out needed mutual exclusion, semaphore, incident etc. and is then adopted hardware directly to realize.In the method for the invention, the user adopts SystemC to programme, and is used for programming in message-level, and code length effectively reduces.But the code of whole embedded system becomes an execution model, and exploitation and debugging are all very convenient.In case processor core enters waiting status because of mutual exclusion, semaphore, incident etc., then processor core directly enters halted state, and the self-braking mode of this hardware makes the user needn't be concerned about that the power consumption of processor reduces problem.Polycaryon processor of the present invention has realized that high-level parallel model directly to the automatic mapping of polycaryon processor framework, effectively reduces user's development difficulty, has reduced the power consumption of processor.
The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.

Claims (16)

1. A multi-core processor, comprising: an array formed by connecting a plurality of switching units for exchanging data, and a plurality of processing units for data processing connected to the switching units; characterized in that it further comprises: a local resource unit connected between adjacent processing units and used for synchronization and data sharing between the adjacent processing units, and at least one global resource unit connected to the switching units and used for synchronization and data sharing among all the processing units.
2. The multi-core processor according to claim 1, characterized in that the processing unit comprises a processor core or processor, and a processor suspend control module and a switching unit adapter that are connected to the processor core or processor; the switching unit adapter is connected to the switching unit.
3. The multi-core processor according to claim 2, characterized in that the processor suspend control module comprises a processor suspend and resume condition register; the processor suspend and resume condition register is connected to the local resource unit.
4. The multi-core processor according to claim 1, characterized in that the local resource unit comprises at least one temporal event unit, at least one memorable event queue, at least one mutex unit, at least one semaphore unit, and at least one bidirectional input/output queue.
5. The multi-core processor according to claim 4, characterized in that the temporal event unit comprises: a logic circuit implemented according to the sc_event.notify() and sc_event.cancel() function code of the SystemC syntax, the logic circuit being used to send a processor activation signal to the adjacent processing units connected to the local resource unit and to receive an event cancellation signal from those processing units.
6. The multi-core processor according to claim 4, characterized in that the memorable event queue comprises: a logic circuit implemented according to the sc_event_queue.notify() and sc_event_queue.cancel() function code of the SystemC syntax, the logic circuit being used to send a processor activation signal to the adjacent processing units connected to the local resource unit and to receive an event cancellation signal from those processing units.
7. The multi-core processor according to claim 4, characterized in that the mutex unit comprises: a logic circuit implemented according to the sc_mutex.lock(), sc_mutex.trylock(), and sc_mutex.unlock() function code of the SystemC syntax, the logic circuit being used to send a processor activation signal and a processor suspend signal to the adjacent processing units connected to the local resource unit, to make the state of the mutex available for the processing units to read, and to receive mutex request signals from the processing units.
8. The multi-core processor according to claim 4, characterized in that the semaphore unit comprises: a logic circuit implemented according to the sc_semaphore.wait(), sc_semaphore.trywait(), sc_semaphore.post(), and sc_semaphore.get_value() function code of the SystemC syntax, the logic circuit being used to send a processor activation signal and a processor suspend signal to the adjacent processing units connected to the local resource unit, to make the state of the semaphore available for the processing units to read, and to update the semaphore.
9. The multi-core processor according to claim 4, characterized in that the bidirectional input/output queue comprises: a logic circuit implemented according to the sc_fifo.read(), sc_fifo.nb_read(), sc_fifo.write(), sc_fifo.nb_write(), sc_fifo.num_available(), and sc_fifo.num_free() function code of the SystemC syntax, the logic circuit being used to send a processor activation signal and a processor suspend signal to the adjacent processing units connected to the local resource unit, to make the state of the input/output queue available for the processing units to read, and to perform read and write operations on the input/output queue.
10. The multi-core processor according to claim 1, characterized in that the global resource unit comprises a switching unit adapter and, connected to the switching unit adapter, at least one temporal event unit, at least one memorable event queue, at least one mutex unit, at least one semaphore unit, and at least one bidirectional input/output queue.
11. A method for obtaining execution code for a multi-core processor, comprising the following steps:
Step S1: translating a SystemC software model into code that the instruction set compiler of the processing unit can compile;
Step S2: mapping the processes in the software model onto the processing units, and mapping the SystemC syntax elements sc_event, sc_event_queue, sc_mutex, sc_semaphore, and sc_fifo respectively onto the temporal event unit, memorable event queue, mutex unit, semaphore unit, and bidirectional input/output queue of the local resource unit and/or the global resource unit.
12. The method for obtaining execution code for a multi-core processor according to claim 11, characterized in that the translation in step S1 comprises the following steps:
Step S11: translating the SystemC syntax elements sc_event, sc_event_queue, sc_mutex, sc_semaphore, and sc_fifo into register operations on the local resource unit and the global resource unit of the multi-core processor that satisfies the SystemC syntax requirements;
Step S12: translating the code of each SC_THREAD process into a C-language main() function.
13. The method for obtaining execution code for a multi-core processor according to claim 11, characterized in that the mapping in step S2 comprises the following steps:
Step S21: allocating one processor core or processor to each SC_THREAD process;
Step S22: allocating, for the sc_event, sc_event_queue, sc_mutex, sc_semaphore, and sc_fifo functions used in each SC_THREAD process, the corresponding unit in the local resource unit or the global resource unit.
14. The method for obtaining execution code for a multi-core processor according to claim 13, characterized in that, in step S22, the allocation means that the corresponding unit in the local resource unit is allocated preferentially, and the corresponding unit in the global resource unit is allocated when the corresponding units in the local resource unit are insufficient.
15. The method for obtaining execution code for a multi-core processor according to claim 11, characterized in that it further comprises the following steps:
Step S3: compiling the translation result into object code using the instruction set compiler of the processing unit;
Step S4: merging the mapping result and the compilation result to generate a bitstream file that can be downloaded directly to the multi-core processor.
16. The method for obtaining execution code for a multi-core processor according to claim 15, characterized in that it further comprises a step S3' between step S3 and step S4: analyzing the mapping result and the compilation result to obtain performance information of the system, such as delay and power consumption, and evaluating the performance of the software model.
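The allocation order recited in steps S21 and S22 (local resource unit first, global resource unit when the local one is exhausted) can be sketched as follows. This is a minimal illustration under stated assumptions: the ResourcePool structure, the string keys naming the SystemC element kinds, and the Placement result type are hypothetical bookkeeping invented for this sketch, and a single local pool stands in for the local resource unit between one pair of adjacent processing units.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical bookkeeping: number of still-unallocated units per SystemC
// element kind ("sc_mutex", "sc_fifo", ...) in one resource unit.
struct ResourcePool {
    std::map<std::string, int> free_units;

    // Reserve one unit of the given kind; returns false if none is left.
    bool take(const std::string& kind) {
        auto it = free_units.find(kind);
        if (it == free_units.end()) return false;
        assert(it->second >= 0);  // counts never go negative
        if (it->second == 0) return false;
        --it->second;
        return true;
    }
};

enum class Placement { Local, Global, None };

// Prefer the local resource unit; fall back to the global resource unit
// only when the local one has no free unit of the requested kind.
Placement allocate(const std::string& kind,
                   ResourcePool& local,
                   ResourcePool& global) {
    if (local.take(kind)) return Placement::Local;
    if (global.take(kind)) return Placement::Global;
    return Placement::None;
}
```

Returning Placement::None makes the out-of-resources case explicit; the claims do not say what the method does in that situation, so a real tool would have to define its own handling.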
CN200710308574A 2007-12-29 2007-12-29 Multi-core processor meeting SystemC grammar request and method for acquiring performing code Active CN100580630C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200710308574A CN100580630C (en) 2007-12-29 2007-12-29 Multi-core processor meeting SystemC grammar request and method for acquiring performing code

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200710308574A CN100580630C (en) 2007-12-29 2007-12-29 Multi-core processor meeting SystemC grammar request and method for acquiring performing code

Publications (2)

Publication Number Publication Date
CN101196826A true CN101196826A (en) 2008-06-11
CN100580630C CN100580630C (en) 2010-01-13

Family

ID=39547259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200710308574A Active CN100580630C (en) 2007-12-29 2007-12-29 Multi-core processor meeting SystemC grammar request and method for acquiring performing code

Country Status (1)

Country Link
CN (1) CN100580630C (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100568247C (en) * 2008-07-22 2009-12-09 中国科学院计算技术研究所 A kind of event handling unit group that satisfies the polycaryon processor of systemC grammer
CN101634979B (en) * 2008-07-22 2011-09-07 中国科学院计算技术研究所 Multi-core processor satisfying SystemC syntax
CN101635006B (en) * 2008-07-22 2012-02-29 中国科学院计算技术研究所 Mutual exclusion and semaphore cell block of multi-core processor satisfying SystemC syntax
WO2012110445A1 (en) 2011-02-15 2012-08-23 Commissariat A L'energie Atomique Et Aux Energies Alternatives Device for accelerating the execution of a c system simulation
US9612863B2 (en) 2011-02-15 2017-04-04 Commissariat A L'energie Atomique Et Aux Energies Alternatives Hardware device for accelerating the execution of a systemC simulation in a dynamic manner during the simulation
WO2014124852A2 (en) 2013-02-15 2014-08-21 Commissariat A L'energie Atomique Et Aux Energies Alternatives Device and method for accelerating the update phase of a simulation kernel
CN104657145A (en) * 2015-03-09 2015-05-27 上海兆芯集成电路有限公司 System and method for parking re-issued instruction of microprocessor
CN104657145B (en) * 2015-03-09 2017-12-15 上海兆芯集成电路有限公司 The system and method that repeating transmission for microprocessor is stopped
CN110209509A (en) * 2019-05-28 2019-09-06 北京星网锐捷网络技术有限公司 Method of data synchronization and device between multi-core processor

Also Published As

Publication number Publication date
CN100580630C (en) 2010-01-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant