WO2006100630A1 - Processing module, communication protocol for streaming and/or for synchronizing such processing module as well as method of streaming and/or of synchronizing such processing module - Google Patents


Info

Publication number
WO2006100630A1
Authority
WO
WIPO (PCT)
Prior art keywords
primitive
data
communication channel
processing module
task
Application number
PCT/IB2006/050843
Other languages
French (fr)
Inventor
Andrei Radulescu
Original Assignee
Koninklijke Philips Electronics N. V.
Application filed by Koninklijke Philips Electronics N. V.
Publication of WO2006100630A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/03 Shaping networks in transmitter or receiver, e.g. adaptive shaping networks

Definitions

  • the second primitive P[ut]D[ata] moves the second pointer B ahead in order to indicate that full space f (data) has been released to the channel 10.
  • the third primitive G[et]D[ata] moves the third pointer C in order to indicate that more full space f has been claimed for reading.
  • the third pointer C is moved forward relative to its current position with the indicated number of items.
  • the fourth primitive P[ut]S[pace] moves the fourth pointer D ahead in order to indicate that more empty space e is available in the channel 10. On task switching, the state of the task being swapped out must be saved.
  • Task state saving is known to be an expensive operation.
  • the first primitive G[et]S[pace] does not move the first pointer A in the channel 10 as this would build up channel state. Instead, the first primitive G[et]S[pace] only checks whether the first pointer A can be moved by the specified number of items relative to the second pointer B (i. e. it only checks whether space can be claimed, which is an implicit claim).
  • the second primitive P[ut]D[ata] moves both the first pointer A and the second pointer B ahead.
  • only one of the first pointer A and of the second pointer B needs to be maintained.
  • the third primitive G[et]D[ata] only checks if the third pointer C can be moved ahead relative to the fourth pointer D without actually moving it.
  • the fourth primitive P[ut]S[pace] moves both the third pointer C and the fourth pointer D (with the option of optimizing away one of the third pointer C or of the fourth pointer D).
  • G[et]T[ask] does not need to delete any channel state, i. e. there is no moving back of the first pointer A and/or of the third pointer C.
  • on two consecutive calls to G[et]S[pace] / G[et]D[ata], the same two empty e / full (data) f regions in the channel 10 are returned, unless data / space is released to the channel 10 with P[ut]D[ata] / P[ut]S[pace].
  • after the U[ndo] primitive has been applied to all channels 10 of a stateless task, a switch to another task can be safely performed.
  • the undo U semantics are to move the first pointer A back to the second pointer B, and the third pointer C back to the fourth pointer D.
  • Undo() restores the state of all opened channels 10 of the current task; that is, for each of the task's channels, it moves the first pointer A and the third pointer C back to their original place (second pointer B and fourth pointer D, respectively; cf. Figs 7, 8).
  • Undo(c) restores the state of channel c.
  • UndoSpace(c) restores the state related to the empty space in channel c; that is, it moves the first pointer A back to the location of the second pointer B.
  • UndoData(c) restores the state related to the full space in channel c; that is, it moves the third pointer C back to the location of the fourth pointer D.
  • a further primitive for a processing module 100 in particular for an IP core communication interface, is provided; thereby, the class of communication services the present invention refers to is streaming and synchronization.
  • I[ntellectual]P[roperty] core, for example A[pplication]P[rogramming]I[nterface]
  • memory space 42: part or region of the memory space 40 being allocated for sending the data to the communication channel 10, in particular memory space being claimed for writing
  • memory space 44: part or region of the memory space 40 being allocated for transferring the data, in particular memory space being claimed for reading
  • A first pointer indicating the beginning of part or region 42 being allocated for sending the data to the communication channel 10
  • B second pointer indicating the end of part or region 42 being allocated for sending the data to the communication channel 10
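
The Undo variants listed above can be illustrated with a small sketch. It is not taken from the TTL definition: the class `Channel`, the method names, the `filled` constructor argument and the representation of the pointers A, B, C, D as running item counts are all assumptions made purely for illustration. `undo_all` plays the role of Undo(), `Channel.undo` of Undo(c), `undo_space` of UndoSpace(c) and `undo_data` of UndoData(c).

```python
from typing import Iterable


class Channel:
    """Hypothetical channel model; a, b, c, d stand for pointers A, B, C, D."""

    def __init__(self, capacity: int, filled: int = 0) -> None:
        self.capacity = capacity
        # `filled` items are assumed to have been released to the channel already.
        self.a = self.b = filled
        self.c = self.d = 0

    def get_space(self, n: int) -> bool:
        """Claim n empty items by moving pointer A ahead (builds up state)."""
        if self.capacity - (self.a - self.d) < n:
            return False
        self.a += n
        return True

    def get_data(self, n: int) -> bool:
        """Claim n full items by moving pointer C ahead (builds up state)."""
        if self.b - self.c < n:
            return False
        self.c += n
        return True

    def undo_space(self) -> None:
        """Move pointer A back to the location of pointer B."""
        self.a = self.b

    def undo_data(self) -> None:
        """Move pointer C back to the location of pointer D."""
        self.c = self.d

    def undo(self) -> None:
        """Restore both the empty-space and the full-space state."""
        self.undo_space()
        self.undo_data()


def undo_all(channels: Iterable[Channel]) -> None:
    """Restore every channel opened by the current task before a task switch."""
    for channel in channels:
        channel.undo()
```

After `undo_all`, no claimed-but-unreleased region remains in any of the task's channels in this model, so nothing channel-related would have to be saved on the task switch.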

Abstract

In order to provide a processing module (100), in particular an I[ntellectual]P[roperty] core, for example an A[pplication]P[rogramming]I[nterface], for processing data via at least one communication channel (10), wherein at least one state of the communication channel (10) is alternated by executing at least one task (t1, t2, t3, t4), and wherein the processing module (100) is streamed and/or synchronized by at least one primitive (GD, GS, GT, PD, PS), in particular by at least one group of primitives (GD, GS, GT, PD, PS), wherein a higher level of abstraction is obtained, at least one further primitive (U) for restoring the state of the communication channel (10) to the state before alternation, in particular for deleting the state built up in the communication channel (10) by the task (t1, t2, t3, t4) during its execution, is proposed.

Description

PROCESSING MODULE, COMMUNICATION PROTOCOL FOR STREAMING AND/OR FOR SYNCHRONIZING SUCH PROCESSING MODULE AS WELL AS METHOD OF STREAMING AND/OR OF SYNCHRONIZING SUCH PROCESSING MODULE
The present invention relates to a processing module, in particular to an I[ntellectual]P[roperty] core, for example to an A[pplication]P[rogramming]I[nterface], for processing data via at least one communication channel, wherein at least one state of the communication channel is alternated by executing at least one task, and wherein the processing module is streamed and/or synchronized by at least one primitive, in particular by at least one group of primitives.
The present invention further relates to a communication protocol for streaming and/or for synchronizing such processing module, in particular such I[ntellectual]P[roperty] core, for example such A[pplication]P[rogramming]I[nterface]. The present invention further relates to a method of streaming and/or of synchronizing such processing module, in particular such I[ntellectual]P[roperty] core, for example such A[pplication]P[rogramming]I[nterface].
Modern embedded systems comprise a large number of processing modules, also called I[ntellectual]P[roperty] cores. On these processing modules, there are one or more tasks communicating with each other either directly or via a memory. These tasks and the communication between these tasks constitute one configuration, also called a user mode, of the system. In complex systems, for example in high-end T[ele]V[ision] chips, such as Philips' Nexperia Home S[ystem]o[n]C[hip] (also internally called Philips' Base Cat), there can be hundreds of configurations, each comprising more than a hundred tasks.
To simplify the task creation, high-level communication interfaces have been defined, addressing specific task characteristics, and different flexibility and efficiency requirements. For software tasks, the T[riMedia]S[oftware]S[treaming]A[rchitecture], the T[riMedia]S[oftware]S[treaming]A[rchitecture]-2, the C[PU]-[controlled]HE[terogeneous]A[rchitectures for signal]P[rocessing] (cf. prior art article "C-HEAP: A Heterogeneous Multiprocessor Architecture Template and Scalable and Flexible Protocol for the Design of Embedded Signal Processing Systems" by Andre Nieuwland, Jeffrey Kang, Om Prakash Gangwal, Ramanathan Sethuraman, Natalino Busa, Kees G. W. Goossens, Rafael Peset Llopis, and Paul Lippens, Journal of Design Automation for Embedded Systems, 7(3), 2002), the communication protocol Arachne (cf. prior art article "A protocol and memory manager for on-chip communication" by Kees G. W. Goossens, International Symposium on Circuits and Systems, pages 225 to 228, 2001; http://www.homepages.inf.ed.ac.uk/kgoossen/2001-iscas.pdf), and the Y[-Chart]A[pplication]P[rogramming]I[nterface] (cf. prior art article "YAPI: Application Modeling for Signal Processing Systems" by Erwin A. de Kock, Gerben Essink, W. J. M. Smits, Pieter van der Wolf, J.-Y. Brunel, Wido M. Kruijtzer, P. Lieverse, and K. A. Vissers, Design Automation Conference, 2000, cf. http://www.sigda.org/Archives/ProceedingArchives/Dac/Dac2000/papers/2000/dac00/pdffiles/23_3.pdf) have been defined and used. For hardware tasks, Eclipse has been defined and used. In this context, Eclipse is an architecture template for the design of versatile media-processing S[ystem]o[n]C[hip] subsystems.
Currently, there are efforts to reduce this diversity of protocols and to define one single protocol, namely T[ask]T[ransaction]L[ayer], covering all cases and requirements (cf. prior art article "Design and Programming of Embedded Multiprocessors: An Interface-Centric Approach" by Pieter van der Wolf, Erwin A. de Kock, Tomas Henriksson, Wido M. Kruijtzer, and Gerben Essink, IEEE Proceedings of Hardware/Software Codesign and System Synthesis, 2004).
In the following, prior art systems relating to channel restoring operations in IP core communication, i. e. in A[pplication]S[pecific]I[ntegrated]C[ircuit]s, F[ield]P[rogrammable]G[ate]A[rray]s or similar circuits, are described.
In prior art document EP 1 154 601 A1 a routing system with a routing A[pplication]P[rogram]I[nterface] is disclosed. In this context, an operation for "cleaning up" a channel, which is basically a reset of the channel, is described. Cleaning up is called as a result of an internal error preventing further communication.
In the prior art article "Communication Services for Networks on Chip" by Andrei Radulescu and Kees G. W. Goossens (cf. Domain-Specific Processors: Systems, Architectures, Modeling, and Simulation (SAMOS), Series Volume 20, 2002), N[etwork]o[n]C[hip] communication services, such as throughput/latency guarantees, ordering or guaranteed completion, are described. However, the present invention relates to a different class of communication services being at a higher level of abstraction than NoC services, and being able to use NoC services to be implemented.
Moreover, in the prior art article "Core Communication Interface for FPGAs" by Jose Carlos Palma, Aline Vieira de Mello, Leandro Möller, Fernando Moraes, and Ney Calazans (cf. Proceedings of the 15th Symposium on Integrated Circuits and Systems Design (SBCCI'02), 2002, pages 183 to 188), a communication interface to be used in F[ield]P[rogrammable]G[ate]A[rray]s is described. Again, low-level communication operations are described, which do not overlap with the class of operations the present invention belongs to.
Starting from the disadvantages and shortcomings as described above and taking the prior art as discussed into account, an object of the present invention is to further develop a processing module of the kind as described in the technical field, a communication protocol of the kind as described in the technical field as well as a method of the kind as described in the technical field, in such a way that an interface comprising a higher level of abstraction is provided, thus placing the present invention in the same context as T[ask]T[ransaction]L[ayer].
The object of the present invention is achieved by a processing module comprising the features of claim 1, by a communication protocol comprising the features of claim 2 as well as by a method comprising the features of claim 7. Advantageous embodiments and expedient improvements of the present invention are disclosed in the respective dependent claims.
The present invention, in particular the processing module according to the present invention as well as the method according to the present invention, is principally based on the idea that the state of the communication channel can be restored to the state before alternation, in particular that the state built up in the communication channel by the task during its execution can be deleted, by the at least one further primitive; in particular, the present invention is based on the idea of deleting the state built up in a channel during a task execution and of restoring the channel to its previous state by at least one further primitive, for example by minimizing a task's state before task switch.
Thus, the scope of the present invention refers to streaming and/or to synchronizing the processing module or processing device. By this technical measure, the time to market is decreased by increasing the reuse and development efficiency.
According to a preferred embodiment of the present invention the group of primitives comprises at least one first primitive for allocating at least part of memory space in the communication channel. This memory can be used by the data producer or producer task to prepare data to be sent to the channel.
Moreover, advantageously the group of primitives comprises at least one second primitive being used by the data producer for sending the data into the communication channel.
Besides that, advantageously the group of primitives comprises at least one third primitive being used by the data consumer or consumer task to obtain access to the data in the channel.
Independently thereof or in combination therewith, the group of primitives may comprise at least one fourth primitive being used by the consumer to release memory space, wherein the memory space is used for transferring the data.
The further primitive (Undo) helps in minimizing the task state by removing any state built in the channel. In this way, task switching is made faster. If the channel builds up state, which is not undone, task switching is still possible but takes longer as there is more state to be saved.
Moreover, in an advantageous embodiment of the present invention, the memory space being allocated for sending the data to the communication channel and/or the memory space being allocated for transferring the data is freed by the further primitive.
Using the present invention allows the same low-cost implementation as in the case of Eclipse. The channel state is adjusted back to the original state, allowing quick and low-cost task switching. In addition to this, there is no need to have special semantics for the group of primitives, allowing seamless unification in standardization efforts, such as T[ask]T[ransaction]L[ayer].
The present invention further relates to a primitive for at least one processing module, in particular for at least one I[ntellectual]P[roperty] core, for example for at least one A[pplication]P[rogramming]I[nterface], the processing module being designed for processing data via at least one communication channel, wherein at least one state of the communication channel is alternated by executing at least one task, and wherein the primitive is designed for restoring the state of the communication channel to the state before alternation, in particular for deleting the state built up in the communication channel by the task during its execution.
Said further primitive advantageously relates to at least one T[ask]T[ransaction]L[ayer] transaction.
By the present invention, the further primitive U[ndo] is proposed for a communication interface such as TTL. The Undo primitive restores the communication channel state to a previous one, which helps in minimizing the task state to be saved on task switch. The main advantage of such an additional primitive is that the other communication primitives (G[et]S[pace], P[ut]D[ata], G[et]D[ata], P[ut]S[pace]) are allowed to have common semantics, which eases the standardization efforts of TTL.
The present invention finally relates to the use of at least one processing module as described above and/or of at least one communication protocol as described above and/or of at least one primitive as described above and/or of the method as described above for at least one embedded system, in particular for at least one E[mbedded]S[ystem]A[rchitecture] on silicon, for example for at least one network-on-silicon, assigned to buses and/or to chip data transfer of a digital semiconductor audio/video platform. In addition to S[ystems]o[n]C[hip], the present invention also applies to multi-board systems, to multi-chip systems, and/or to multi-computer systems.
As already discussed above, there are several options to embody as well as to improve the teaching of the present invention in an advantageous manner. To this aim, reference is made to the claims respectively dependent on claim 2 and on claim 7; further improvements, features and advantages of the present invention are explained below in more detail with reference to preferred embodiments by way of example and to the accompanying drawings where
Fig. 1 schematically shows an embodiment of a processing module according to the prior art, namely of an A[pplication]P[rogramming]I[nterface], comprising four primitives for streaming and/or for synchronizing;
Fig. 2 schematically shows an embodiment of buffer management in the processing module of Fig. 1;
Fig. 3 schematically shows a first embodiment of task management according to communication protocols T[riMedia]S[oftware]S[treaming]A[rchitecture], C[PU]-[controlled]HE[terogeneous]A[rchitectures for signal]P[rocessing], and Arachne in the view of a data producer;
Fig. 4 schematically shows a first embodiment of task management according to communication protocols T[riMedia]S[oftware]S[treaming]A[rchitecture], C[PU]-[controlled]HE[terogeneous]A[rchitectures for signal]P[rocessing], and Arachne in the view of a data consumer;
Fig. 5 schematically shows a second embodiment of task management according to communication protocol Eclipse in the view of a data producer;
Fig. 6 schematically shows a second embodiment of task management according to communication protocol Eclipse in the view of a data consumer;
Fig. 7 schematically shows a third embodiment of task management according to the present invention using the further primitive "undo" in the view of a data producer; and
Fig. 8 schematically shows a third embodiment of task management according to the present invention using the further primitive "undo" in the view of a data consumer.
The same reference numerals are used for corresponding parts in Fig. 1 to Fig. 8.
To simplify the development of computational tasks, high-level communication interfaces have been defined in the prior art, addressing specific task characteristics, and different flexibility and efficiency requirements. Such communication interfaces include T[riMedia]S[oftware]S[treaming]A[rchitecture], C[PU]-[controlled]HE[terogeneous]A[rchitectures for signal]P[rocessing], Y[-Chart]A[pplication]P[rogramming]I[nterface], Arachne, and Eclipse.
Such a variety of protocols, each with different semantics, can introduce incompatibilities between tasks, and therefore make system integration more difficult. As a result, standardization is under way to unify these protocols under a single communication interface called T[ask]T[ransaction]L[ayer].
In these communication protocols, an A[pplication]P[rogramming]I[nterface] 100 seen by a respective task t1, t2, t3, t4 consists of four synchronization primitives GS, PD, GD, PS. These primitives GS, PD, GD, PS may have different names in the above-mentioned protocols; in the following the TTL terminology is used. Fig. 1 depicts a communication channel 10, namely a TTL channel, with the related communication primitives GS, PD, GD, PS.
The first primitive GS, namely GetSpace, allocates memory 42 in the communication channel 10. This memory 42 is used by task t1 to prepare data to be sent to the channel 10, wherein task t1 is assigned to a producer 20.
The second primitive PD, namely PutData, is used by the producer 20 to send data to the communication channel 10.
The third primitive GD, namely GetData, is used by a consumer 30 to obtain access to the data in the channel 10.
The fourth primitive PS, namely PutSpace, is used by the consumer 30 to release memory space 44 used for transferring the data.
Fig. 2 gives an example of implementation of TTL buffer management. As shown in Fig. 2, memory management for a buffer or memory 40 of the channel 10 consists of maintaining four pointers (with reference numerals A, B, C, D) indicating the begin and the end of the regions claimed for writing 42 and for reading 44. When no empty space e is claimed, the first pointer A and the second pointer B point to the same location. When no full space f is claimed, the third pointer C and the fourth pointer D point to the same location.
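The four-pointer bookkeeping described above can be sketched as a small data structure. This is only an illustrative model, not the implementation of Fig. 2: the class name `ChannelPointers`, its field and method names, and the choice to store the pointers as monotonically increasing item counters (with the physical buffer slot obtained modulo the capacity) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ChannelPointers:
    """Hypothetical model of the four pointers A, B, C, D of a channel buffer.

    Assumption: each pointer is a monotonically increasing item counter;
    the physical slot in the circular buffer is `counter % capacity`.
    """

    capacity: int
    a: int = 0  # first pointer A, bounds the region claimed for writing
    b: int = 0  # second pointer B, bounds the region claimed for writing
    c: int = 0  # third pointer C, bounds the region claimed for reading
    d: int = 0  # fourth pointer D, bounds the region claimed for reading

    def claimed_for_writing(self) -> int:
        # Zero when A and B point to the same location (no empty space claimed).
        return self.a - self.b

    def claimed_for_reading(self) -> int:
        # Zero when C and D point to the same location (no full space claimed).
        return self.c - self.d

    def slot(self, counter: int) -> int:
        # Map a counter onto a physical position in the circular buffer.
        return counter % self.capacity


p = ChannelPointers(capacity=8)
print(p.claimed_for_writing(), p.claimed_for_reading())
```

In a freshly created channel nothing is claimed, so both regions have size zero, which corresponds to A coinciding with B and C coinciding with D.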
In most of the cases, for instance TSSA, C-HEAP, and Arachne, the above-mentioned communication protocols work as follows (cf. Figs 3, 4 including the effects of the primitives on the four pointers A, B, C, D):
The first primitive G[et]S[pace] moves the first pointer A ahead in order to indicate that empty space is reserved for writing.
The second primitive P[ut]D[ata] moves the second pointer B ahead in order to indicate that full space has been released to the channel 10, i. e. that the data has been transferred.
The third primitive G[et]D[ata] moves the third pointer C ahead in order to indicate that more full space f has been claimed for reading.
The fourth primitive P[ut]S[pace] moves the fourth pointer D ahead in order to indicate that more empty space e is available in the channel 10. Task swapped out is indicated by the reference numeral tso, and task swapped in is indicated by the reference numeral tsi.
For all functions of the API 100, the number of items by which the respective pointer A, B, C, D is moved is specified as a parameter.
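The pointer movements of the four primitives listed above can be sketched as follows. This is a minimal model under stated assumptions, not the API of TSSA, C-HEAP, Arachne, or TTL: the class `StatefulChannel`, the method names, the return values, and the bounds checks are hypothetical, and the pointers are assumed to be monotonically increasing item counters.

```python
from typing import Optional


class StatefulChannel:
    """Hypothetical channel in which GetSpace and GetData build up state."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.a = self.b = self.c = self.d = 0  # pointers A, B, C, D

    def get_space(self, n: int) -> Optional[int]:
        # Move A ahead by n items: empty space is reserved for writing.
        if self.a + n > self.d + self.capacity:
            return None  # not enough empty space
        start = self.a
        self.a += n
        return start  # start of the claimed empty region

    def put_data(self, n: int) -> bool:
        # Move B ahead by n items: written data is released to the channel.
        if self.b + n > self.a:
            return False  # more than was claimed for writing
        self.b += n
        return True

    def get_data(self, n: int) -> Optional[int]:
        # Move C ahead by n items: full space is claimed for reading.
        if self.c + n > self.b:
            return None  # not enough data available
        start = self.c
        self.c += n
        return start  # start of the claimed full region

    def put_space(self, n: int) -> bool:
        # Move D ahead by n items: empty space is returned to the channel.
        if self.d + n > self.c:
            return False  # more than was claimed for reading
        self.d += n
        return True


ch = StatefulChannel(capacity=4)
first = ch.get_space(1)
second = ch.get_space(1)
print(first, second)  # two consecutive calls claim two consecutive regions
```

Because `get_space` and `get_data` move A and C, the channel keeps track of what a task has claimed, and this built-up state is what has to be saved on a task switch.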
In protocols such as TSSA, C-HEAP, and Arachne, G[et]S[pace] and G[et]D[ata] change the state of the channel 10, i. e. modify the first pointer A and the third pointer C. This also means that on two consecutive calls to G[et]S[pace] / G[et]D[ata], two consecutive empty e / data f regions in the channel 10 are returned. In Fig. 3, TSSA, C-HEAP, and Arachne communication protocols are depicted in the view of the producer's 20 side, and in Fig. 4, TSSA, C-HEAP, and Arachne communication protocols are depicted in the view of the consumer's 30 side. As depicted in Figs 3 and 4 the channel builds up state. The first primitive G[et]S[pace] moves the first pointer A in order to indicate that empty space e is reserved for writing. The first pointer A is moved forward relative to its current position by the indicated number of items.
The second primitive P[ut]D[ata] moves the second pointer B ahead in order to indicate that full space f (data) has been released to the channel 10. The third primitive G[et]D[ata] moves the third pointer C in order to indicate that more full space f has been claimed for reading. The third pointer C is moved forward relative to its current position with the indicated number of items.
The fourth primitive P[ut]S[pace] moves the fourth pointer D ahead in order to indicate that more empty space e is available in the channel 10.
On task switching, the state of the task being swapped out must be saved (either explicitly or implicitly), including the built up channel state (changes to the first pointer A and to the third pointer C). Task state saving is known to be an expensive operation.
Eclipse is one exception from these semantics because it works as follows (cf. Figs 5, 6 including the effects of the primitives on the four pointers A, B, C, D):
In Eclipse, the first primitive G[et]S[pace] does not move the first pointer A in the channel 10 as this would build up channel state. Instead, the first primitive G[et]S[pace] only checks if the first pointer A can be moved by the specified number of items relative to the second pointer B (i. e. it only checks if space can be claimed, which is an implicit claim).
As a result, the second primitive P[ut]D[ata] moves both the first pointer A and the second pointer B ahead. Actually, in this implementation only one of the first pointer A and of the second pointer B needs to be maintained.
Similarly to the first primitive G[et]S[pace], the third primitive G[et]D[ata] only checks if the third pointer C can be moved ahead relative to the fourth pointer D without actually moving it. As a result, the fourth primitive P[ut]S[pace] moves both the third pointer C and the fourth pointer D (with the option of optimizing away one of the third pointer C or of the fourth pointer D).
As there is no channel state being built up, a further primitive G[et]T[ask] does not need to delete any channel state, i. e. there is no moving back of the first pointer A and/or of the third pointer C. In Eclipse, on two consecutive calls to G[et]S[pace] / G[et]D[ata], the same two empty e / full (data) f regions in the channel 10 are returned, unless data / space is released to the channel 10 with P[ut]D[ata] / P[ut]S[pace].
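The Eclipse-style semantics described above can be contrasted with the stateful variant in a short sketch. It is a simplified model under stated assumptions rather than the Eclipse implementation: the class `EclipseStyleChannel` and its method names are hypothetical, and the pointer pairs A/B and C/D are each collapsed into a single counter, following the remark that only one pointer of each pair needs to be maintained.

```python
from typing import Optional


class EclipseStyleChannel:
    """Hypothetical channel in which GetSpace and GetData only check."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.w = 0  # write position (pointers A and B collapsed)
        self.r = 0  # read position (pointers C and D collapsed)

    def get_space(self, n: int) -> Optional[int]:
        # Only check that n items of empty space exist; nothing is moved.
        return self.w if self.w + n <= self.r + self.capacity else None

    def put_data(self, n: int) -> bool:
        # Commit n written items: moves both A and B ahead at once.
        if self.get_space(n) is None:
            return False
        self.w += n
        return True

    def get_data(self, n: int) -> Optional[int]:
        # Only check that n items of data exist; nothing is moved.
        return self.r if self.r + n <= self.w else None

    def put_space(self, n: int) -> bool:
        # Release n read items: moves both C and D ahead at once.
        if self.get_data(n) is None:
            return False
        self.r += n
        return True


ch = EclipseStyleChannel(capacity=4)
print(ch.get_space(2), ch.get_space(2))  # the same region is returned twice
ch.put_data(2)
print(ch.get_space(2))                   # the region advances only after PutData
```

Since the checking primitives leave the counters untouched, a task that is swapped out before calling `put_data` or `put_space` leaves no trace in the channel.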
In Fig. 5, Eclipse communication protocols are depicted in the view of the producer's 20 side, and in Fig. 6, Eclipse communication protocols are depicted in the view of the consumer's 30 side. As depicted in Figs 5 and 6 the channel builds up no state.
The reason for changing the semantics is that in this case a stateless task is obtained, which leads to a very fast task switch. In such a system, if the task has not completed before switching to another task, the next run of the task will just reproduce its previous run. In such a case, the channel 10 has to return to its original state, and consequently the channel 10 will return the same empty e / data f regions as in the previous task run.
In the case of a unified protocol, such as TTL, such changed semantics depending on the mode can be confusing and misleading for a user of the API 100. Therefore by the present invention an improved solution is proposed, which allows stateless tasks without changing the semantics of the existing API primitives GS, PD, GD, PS (cf. Figs 7, 8 including the effects of the primitives on the four pointers A, B, C, D).
The proposed solution is to introduce a further or additional primitive called Undo (reference numeral U), which restores the state of the channel 10 to the previous state, in particular which deletes the state built up in the communication channel 10 by the task t1, t2, t3, t4 during its execution. Therefore, after an Undo primitive has been applied to all channels 10 of a stateless task, a switch to another task can be safely performed. The undo U semantics are to move the first pointer A back to the second pointer B, and the third pointer C back to the fourth pointer D.
Several variants of the Undo() primitive can be defined:
Undo() restores the state of all opened channels 10 of the current task; that is, for each of the task's channels, it moves back the first pointer A and the third pointer C to their original place (second pointer B and fourth pointer D, respectively; cf. Figs 7, 8).
Undo(c) restores the state of channel c.
UndoSpace(c) restores the state related to the empty space in channel c; that is, it moves back the first pointer A to the location of the second pointer B.
UndoData(c) restores the state related to the full space in channel c; that is, it moves back the third pointer C to the location of the fourth pointer D.
Thus, by the present invention a further primitive for a processing module 100, in particular for an IP core communication interface, is provided; thereby, the class of communication services the present invention refers to is streaming and synchronization.
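The Undo variants described above can be sketched on top of a stateful four-pointer channel. This is an illustrative model only: the class `Channel`, the method names `undo_space`, `undo_data` and `undo`, the helper `undo_all` (standing in for the argument-free Undo() applied to all open channels of a task), and the counter-based pointer representation are assumptions made for the example.

```python
from typing import Iterable


class Channel:
    """Hypothetical stateful channel extended with Undo-style primitives."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.a = self.b = self.c = self.d = 0  # pointers A, B, C, D

    def get_space(self, n: int) -> bool:
        ok = self.a + n <= self.d + self.capacity
        if ok:
            self.a += n  # claim empty space for writing
        return ok

    def put_data(self, n: int) -> bool:
        ok = self.b + n <= self.a
        if ok:
            self.b += n  # release written data to the channel
        return ok

    def get_data(self, n: int) -> bool:
        ok = self.c + n <= self.b
        if ok:
            self.c += n  # claim full space for reading
        return ok

    def put_space(self, n: int) -> bool:
        ok = self.d + n <= self.c
        if ok:
            self.d += n  # release read space as empty
        return ok

    def undo_space(self) -> None:
        self.a = self.b  # UndoSpace(c): move A back to B

    def undo_data(self) -> None:
        self.c = self.d  # UndoData(c): move C back to D

    def undo(self) -> None:
        # Undo(c): restore both the empty-space and the full-space state.
        self.undo_space()
        self.undo_data()


def undo_all(channels: Iterable[Channel]) -> None:
    # Undo(): restore every open channel of the current task.
    for ch in channels:
        ch.undo()


inp, out = Channel(4), Channel(4)
inp.get_space(2)
inp.put_data(2)        # a producer has filled the input channel
inp.get_data(2)        # the task claims input data ...
out.get_space(3)       # ... and output space, then is preempted
undo_all([inp, out])   # the claims are dropped before the task switch
print(inp.c == inp.d, out.a == out.b)
```

After `undo_all`, a rerun of the task obtains the same data and space regions as before, while data already committed with `put_data` or space already released with `put_space` is left untouched.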
LIST OF REFERENCE NUMERALS
100 processing module, in particular I[ntellectual]P[roperty] core, for example A[pplication]P[rogramming]I[nterface]
10 communication channel
20 data producer
30 data consumer
40 memory space
42 part or region of the memory space 40 being allocated for sending the data to the communication channel 10, in particular memory space being claimed for writing
44 part or region of the memory space 40 being allocated for transferring the data, in particular memory space being claimed for reading
A first pointer indicating begin of part or region 42 being allocated for sending the data to the communication channel 10
B second pointer indicating end of part or region 42 being allocated for sending the data to the communication channel 10
C third pointer indicating begin of part or region 44 being allocated for transferring the data
D fourth pointer indicating end of part or region 44 being allocated for transferring the data
e empty memory space and/or free memory space
f full space and/or memory space being used by the data
GD third primitive GetData
GS first primitive GetSpace
PD second primitive PutData
PS fourth primitive PutSpace
t1 first task
t2 second task
t3 third task
t4 fourth task
tsi task swapped in
tso task swapped out
U further primitive Undo

Claims

CLAIMS:
1. A processing module (100) for processing data via at least one communication channel (10), wherein at least one state of the communication channel (10) is alternated by executing at least one task (t1, t2, t3, t4), and wherein the processing module (100) is streamed and/or synchronized by at least one primitive (GD, GS, GT, PD, PS), characterized by at least one further primitive (U) for restoring the state of the communication channel (10) to the state before alternation.
2. A communication protocol for streaming and/or for synchronizing at least one processing module (100) according to claim 1 and for communicating the data between at least one data producer (20) and at least one data consumer (30) via the communication channel (10).
3. The communication protocol according to claim 2, characterized in that more than one task (t1, t2, t3, t4) is provided and that the tasks (t1, t2, t3, t4) communicate with each other either directly or via using memory space (40).
4. The communication protocol according to claim 2 or 3, characterized in that the group of primitives (GD, GS, GT, PD, PS) comprises at least one first primitive (GS) for allocating at least part or region (42) of the memory space (40), said part or region (42) being designed for sending the data to the communication channel (10), at least one second primitive (PD) being assigned to the data producer (20) and being designed for sending the data in the communication channel (10), at least one third primitive (GD) being assigned to the data consumer (30) and being designed for providing access to the data in the communication channel (10), and at least one fourth primitive (PS) being assigned to the data consumer (30) and being designed for releasing at least part or region (44) of the memory space (40), said part or region (44) being designed for transferring the data.
5. The communication protocol according to at least one of claims 2 to 4, characterized in that the memory space (42) being allocated for sending the data to the communication channel (10) and/or the memory space (44) being allocated for transferring the data is freed by the further primitive (U).
6. A primitive (U) for at least one processing module (100), the processing module (100) being designed for processing data via at least one communication channel (10), wherein at least one state of the communication channel (10) is alternated by executing at least one task (t1, t2, t3, t4), characterized by restoring the state of the communication channel (10) to the state before alternation.
7. A method of streaming and/or of synchronizing at least one processing module (100) by at least one primitive (GD, GS, GT, PD, PS), the processing module (100) being designed for processing data via at least one communication channel (10), wherein at least one state of the communication channel (10) is alternated by executing at least one task (t1, t2, t3, t4), characterized in that the state of the communication channel (10) can be restored to the state before alternation.
8. The method according to claim 7, characterized in that at least part or region (42) of memory space (40) in the communication channel (10) for sending the data to the communication channel (10) is allocated by at least one first primitive (GS), that the data is sent in the communication channel (10) by at least one second primitive (PD), that access to the data in the communication channel (10) is provided by at least one third primitive (GD), and that at least part or region (44) of the memory space (40) for transferring the data is released by at least one fourth primitive (PS).
9. The method according to claim 7 or 8, characterized in that the memory space being allocated for sending the data to the communication channel (10) and/or that the memory space being allocated for transferring the data is released by the further primitive (U).
10. Use of at least one processing module (100) according to claim 1 and/or of at least one communication protocol according to at least one of claims 2 to 5 and/or of at least one primitive (U) according to claim 6 and/or of the method according to at least one of claims 7 to 9 for at least one embedded system, for at least one system on chip, for at least one multi-board system, for at least one multi-chip system, for at least one multi-computer system.
PCT/IB2006/050843 2005-03-22 2006-03-20 Processing module, communication protocol for streaming and/or for synchronizing such processing module as well as method of streaming and/or of synchronizing such processing module WO2006100630A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05102261 2005-03-22
EP05102261.4 2005-03-22

Publications (1)

Publication Number Publication Date
WO2006100630A1 (en) 2006-09-28

Family

ID=36649819

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2006/050843 WO2006100630A1 (en) 2005-03-22 2006-03-20 Processing module, communication protocol for streaming and/or for synchronizing such processing module as well as method of streaming and/or of synchronizing such processing module

Country Status (1)

Country Link
WO (1) WO2006100630A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1154601A1 (en) * 2000-05-12 2001-11-14 Nortel Networks Limited Modular routing system with Routing Application Programm Interface (API)
US20040093453A1 (en) * 1996-02-02 2004-05-13 Lym Kevin K. Application programming interface for data transfer and bus management over a bus structure

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
DE KOCK E A ET AL: "YAPI: APPLICATION MODELING FOR SIGNAL PROCESSING SYSTEMS", PROCEEDINGS OF THE DESIGN AUTOMATION CONFERENCE, XX, XX, June 2000 (2000-06-01), pages 402 - 405, XP007900426 *
GOOSSENS K G W ED - INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS: "A protocol and memory manager for on-chip communication", ISCAS 2001. PROCEEDINGS OF THE 2001 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS. SYDNEY, AUSTRALIA, MAY 6 - 9, 2001, IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, NEW YORK, NY : IEEE, US, vol. VOL. 1 OF 5, 6 May 2001 (2001-05-06), pages 225 - 228, XP010540619, ISBN: 0-7803-6685-9 *
PALMA J C ET AL: "Core communication interface for FPGAs", INTEGRATED CIRCUITS AND SYSTEMS DESIGN, 2002. PROCEEDINGS. 15TH SYMPOSIUM ON 9-14, SEPT. 2002, PISCATAWAY, NJ, USA,IEEE, 9 September 2002 (2002-09-09), pages 183 - 188, XP010621787, ISBN: 0-7695-1807-9 *
RADULESCU A ET AL: "COMMUNICATION SERVICES FOR NETWORKS ON CHIP", PROCEEDINGS OF THE INTERNATIONAL WORKSHOP ON SYSTEMS, ARCHITECTURES, MODELING AND SIMULATION. SAMOS, XX, XX, vol. 2, 2002, pages 275 - 299, XP001206391 *
VAN DER WOLF P ET AL: "Design and programming of embedded multiprocessors: an interface-centric approach", HARDWARE/SOFTWARE CODESIGN AND SYSTEM SYNTHESIS, 2004. CODES + ISSS 2004. INTERNATIONAL CONFERENCE ON STOCKHOLM, SWEDEN SEPT. 8-10, 2004, PISCATAWAY, NJ, USA,IEEE, 8 September 2004 (2004-09-08), pages 206 - 217, XP010743631, ISBN: 1-58113-937-3 *
YANNING LUO: "TTL INTERFACES FOR MULTIPROCESSORS PLATFORMS", MSC THESIS, 2003 - 2003, The Netherlands, Delft, pages 1 - 19, XP002390781 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: RU

WWW Wipo information: withdrawn in national office

Country of ref document: RU

122 Ep: pct application non-entry in european phase

Ref document number: 06727678

Country of ref document: EP

Kind code of ref document: A1

WWW Wipo information: withdrawn in national office

Ref document number: 6727678

Country of ref document: EP