WO2010070222A1 - Physical manager of synchronization barrier between multiple processes - Google Patents
Physical manager of synchronization barrier between multiple processes
- Publication number
- WO2010070222A1 WO2010070222A1 PCT/FR2009/052322 FR2009052322W WO2010070222A1 WO 2010070222 A1 WO2010070222 A1 WO 2010070222A1 FR 2009052322 W FR2009052322 W FR 2009052322W WO 2010070222 A1 WO2010070222 A1 WO 2010070222A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- barrier
- processes
- blocks
- call
- synchronization
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
- G06F9/522—Barrier synchronisation
Definitions
- the present invention relates to the processing of processes executed in parallel.
- By "parallel computer" is meant a computer on which several processors, or at least one processor with several cores, or at least one processor supporting several execution threads, are mounted.
- a computer program divides its task (or main task) into several sub-tasks, whose calculations can be performed in parallel by different processes. Each process will therefore aim to execute and complete one of these subtasks. Once a process has completed its current subtask, it may be assigned a second subtask to perform, then possibly a third, and so on.
- multi-process processing implies a need for synchronization of these processes. In particular, this synchronization is intended to allow an orderly reassembly of the main task once the subtasks have been completed.
- Such synchronization is generally provided by a mechanism called an "inter-process synchronization mechanism". This mechanism must be fast so as not to nullify the temporal advantage derived from the use of processes executed in parallel.
- This synchronization mechanism is generally a "barrier mechanism", a mechanism of a software nature. This mechanism can be based on various algorithms that follow the same main scheme, described below.
- a computer program for performing a task is executed via n processes, themselves being able to execute a set of sub-tasks.
- Each subtask is divided into successive blocks intended to perform work steps, such as an intermediate calculation for example.
- the blocks or intermediate calculations of the different processes are executed in parallel.
- Each process that has completed one block waits at a barrier (synchronization barrier) until all the parallel blocks of the other processes are completed and have joined the barrier. It is only when all processes have reached the barrier that the following blocks are executed, during a subsequent work step. This principle is described below using a time diagram.
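- By way of illustration only (this code is not part of the patent), the scheme just described is what the barrier primitives of common threading libraries provide. The sketch below uses the POSIX barrier API, with threads standing in for the processes P and pthread_barrier_wait() playing the role of the synchronization barrier; compile with -pthread.

```c
/* Minimal sketch of the classic software barrier scheme: each worker executes
 * its block of work step i, then blocks in pthread_barrier_wait() until all
 * n participants have arrived, after which step i+1 begins. */
#include <pthread.h>
#include <stdio.h>

#define N_PROC 4   /* number of parallel workers (assumption for the example) */
#define N_STEP 3   /* number of successive work steps */

static pthread_barrier_t bs;

static void *worker(void *arg)
{
    long id = (long)arg;
    for (int step = 0; step < N_STEP; step++) {
        /* ... execute block B of work step W[step] for this worker ... */
        printf("worker %ld finished its block of step %d\n", id, step);
        pthread_barrier_wait(&bs);   /* wait until all blocks of the step are done */
    }
    return NULL;
}

int main(void)
{
    pthread_t t[N_PROC];
    pthread_barrier_init(&bs, NULL, N_PROC);
    for (long i = 0; i < N_PROC; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < N_PROC; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&bs);
    return 0;
}
```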
- Figure 1 shows a barrier mechanism and thus the general operation of a synchronization barrier.
- a process manager PM will first break down the task T into n sub-tasks ST. These n sub-tasks ST will be executed by n processes P.
- the complex main task T is thus decomposed into several simple sub-tasks ST, each of these sub-tasks being accomplished by a separate process.
- the process manager PM is not necessarily a separate element in its own right.
- the process manager can generally be seen as the capability of a computer program to implement a passive or active partitioning method allowing the processes to divide the subtasks between them.
- this capability can be implicit, be determined by one of the processes, or correspond to a division predefined by a user.
- the n processes are themselves divided into blocks B, which are executed successively in time.
- the subset of the blocks B that are executing at the same time (and belonging to different processes P) constitutes a work step W. Therefore, each set of blocks B of the same rank i constitutes a distinct work step Wi.
- the blocks Bi of the work step of rank i, denoted Wi, are executed in parallel.
- the execution time of the blocks B from the different processes P is variable.
- the blocks B are subjected to a synchronization barrier BS (100).
- This barrier BS (100) is called by each process P when it has finished executing its current block B. It is the synchronization barrier BS (100) which allows passage to the block Bi+1 of the following rank, and this only when all the blocks B in progress have "joined" the barrier, that is to say, informed it that their execution is complete.
- the first block B completed, that is to say the one with the shortest execution time, informs the synchronization barrier BS (100) by a request, on the one hand, that it has completed its work and, on the other hand, of the number of blocks (and thus of processes) participating in the current work step.
- the number of blocks in progress during the same work step is equal to the number n of processes P.
- Synchronization barriers are usually equipped with a counter.
- the counter is initialized when the first block B has reached the barrier. Subsequently, the counter is decremented each time another block B joins the barrier BS (100).
- the barrier BS (100) can follow the progression (or advancement) of a work step, and more specifically the termination of each block B in progress.
- Once the last block B, namely the one with the longest execution time, has reached the barrier BS (100), the latter informs each process P and allows them to pass to the next work step W.
- this next work step W consists of blocks B executed in parallel and originating from the various processes P.
- the mechanism of the barrier BS (100) is then the same as before. This is repeated for each work step, and continues until the completion of the processes P.
- the task T will then be accomplished by reconstituting the results of the processes P.
- Such algorithms require a certain number of interactions between the processes, the blocks and the barrier. These interactions will be described later in the detailed description and include, among others, the initialization of the barrier, the information given to the barrier when a block has completed its work, and the verification that all processes have completed their current block. These interactions, when managed by barriers of a software nature, are relatively slow and consume a lot of bandwidth.
- FIG. 2 relating to the prior art represents a known BS (100) synchronization barrier implementation.
- Known mechanisms are implemented in software.
- the data defining the synchronization barrier BS (100) is stored in the RAM (202) (RAM: Random Access Memory) of a computer (or other computing device), and the various processes P access (read/write R/W) this RAM (202) to interact with said barrier BS (100). This access is done using an address space and an address ADR (detailed below).
- This access includes, as described above, the initialization of the barrier BS (100) (with the initialization of the counter), informing the barrier BS (100) every time a block B has finished its work within the same work step W, checking whether all the processes P have finished their block B of the current work step W, etc.
- the program intended to perform these functions is also active in RAM, in particular by calling a library of functions.
- An address space can be segmented into independent segments.
- By "segment" is generally meant a segment of memory defined by two values: - the address at which this segment starts (base address), and - the size of the segment.
- a segment therefore constitutes a range of continuous addresses in a main memory (physical or virtual).
- FIG. 2 shows a computing device comprising several processors PZ1 to PZy (200), a memory access manager CACHE COHER MGR (206), and a RAM memory (202) containing a program area in which the software synchronization barrier BS (100) is located.
- the device according to FIG. 2 therefore comprises a processing unit capable of multi-process processing. The processes will then run on different processors, on different processor cores, and/or on different threads.
- the processing unit provides these processors with a so-called "address space", in particular to the RAM, where the code and the data defining the software synchronization barrier BS (100) are located, in an area associated with a specific address ADR, which may be the address of the beginning of the area.
- the device of FIG. 2 further comprises a process manager (208) of the type as defined above for decomposing a task T into n processes P, themselves divided into successive blocks B.
- the barriers of the prior art make it possible to implement synchronization between different processes P. But, as already mentioned, the software nature of a barrier makes it slow with respect to certain needs. Indeed, each time a process P interacts with it, a library of functions of the barrier BS (100) is used. In addition, within the library, many interactions with the memory are required to read and write the barrier update data until all processes have reached the rendezvous point (the "synchronization barrier"). Then, once a process P has informed the barrier BS (100), that process P must regularly query the barrier BS (100) to see if the other current blocks B have completed their work.
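- As a hedged illustration of this drawback (again, not taken from the patent), the sketch below shows a minimal polled software barrier built on a shared atomic counter: the spin loop at the end is precisely the repeated querying whose memory traffic the hardware manager is intended to remove. The barrier shown is single-use, valid for one work step; a reusable version would additionally need re-initialization or sense reversal.

```c
/* Polled software barrier: the barrier is a counter in shared memory; each
 * participant decrements it once, then repeatedly re-reads it until every
 * other participant has also arrived. Every re-read is another memory
 * transaction. */
#include <stdatomic.h>

struct sw_barrier {
    atomic_int remaining;   /* initialized to n, the number of participants */
};

static void sw_barrier_wait(struct sw_barrier *b)
{
    /* inform the barrier: this participant has finished its current block */
    atomic_fetch_sub_explicit(&b->remaining, 1, memory_order_acq_rel);

    /* then poll until all other current blocks have joined the barrier */
    while (atomic_load_explicit(&b->remaining, memory_order_acquire) > 0)
        ;   /* each iteration is one more read of shared memory */
}
```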
- the present invention improves the situation.
- the invention introduces a computer device with a synchronization barrier, comprising a memory; a processing unit capable of multi-process processing on different processors and allowing the parallel execution of blocks by processes, said blocks being associated by group into successive work steps; and a hardware circuit with an address space usable towards the memory, capable of receiving a call from each process indicating the completion of a current block, each call comprising data, said hardware circuit being arranged to allow the execution of the blocks of a subsequent work step when the set of blocks of the current work step has been executed, and being accessed from address-space segments derived from said data of each call.
- the hardware circuit of the device includes firmware for processing the data of at least one call.
- the processing may include, inter alia, suspending the responses to each call until an end condition is verified, namely that all processes have reported the completion of their block of the current work step.
- the hardware circuit can respond to each call with a data output and allow the processes to pass to the subsequent work step.
- the above-mentioned processing includes extracting the number of processes from a first call and then down-counting on this number upon the other calls, until the end condition is verified. Note that each call can indicate this number of processes.
- the present invention also introduces a method of computer processing at the process level, of the type comprising the following steps: a. breaking a task into subtasks executed as processes composed of successive blocks; b. providing a synchronization barrier provided with a counter related to the number of processes, in a physical barrier manager; c. in each process, setting a first block as the current block and executing it, while accessing said synchronization barrier to decrement said counter when the execution of that current block terminates; d. in each process where the execution of the current block is completed, waiting for a response from said synchronization barrier, the response being directly linked to the counter and sent when it indicates that all the current blocks have been executed; e. when all current blocks have been executed, defining the new current blocks as the next block of each process, and repeating steps c. and d. with these new current blocks.
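- The following sketch illustrates steps c. to e. from the point of view of one process, assuming a hypothetical memory-mapped interface to the physical barrier manager: a single access per work step carries the number of participants in its low-order bits, and that access is assumed to complete only once the manager's counter shows that every process has reported in. The register layout, the blocking-read behaviour and the helper signature are assumptions made for the example, not the patent's actual interface.

```c
#include <stdint.h>

/* 'barrier' is assumed to point into the memory-mapped page of the hardware
 * barrier manager (e.g. obtained via mmap() on the manager's PCI region). */
static void run_process(volatile uint64_t *barrier, unsigned n_procs,
                        unsigned n_steps, void (*block)(unsigned step))
{
    for (unsigned step = 0; step < n_steps; step++) {
        block(step);            /* c. execute the current block of this process */

        /* c./d. single access to the barrier: the index encodes the number n
         * of participants in the low-order address bits; the read is assumed
         * to complete only when all n current blocks have terminated. */
        (void)barrier[n_procs]; /* blocks until the response D is returned */

        /* e. all current blocks are executed: continue with the next work step */
    }
}
```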
- FIG. 1 is a timing diagram which illustrates the general operation of a barrier mechanism
- FIG. 2 is the block diagram of an implementation of a software synchronization barrier of the prior art
- FIG. 3 represents a computing device comprising a memory and a processing unit, capable of multi-process processing on different processors, with a hardware circuit forming a synchronization barrier manager;
- FIG. 4 represents a hardware circuit forming a synchronization barrier manager, a dedicated memory and a firmware,
- FIG. 5 represents a synchronization barrier automatism according to one embodiment of the invention
- FIG. 6 represents a flowchart of the main operations according to one embodiment of the invention.
- the computer device of FIG. 3 comprises a RAM memory (202), a processing unit capable of multi-process processing on different processors PZ1 to PZy (200), and a memory access manager COHER CACHE MGR (206) between said RAM (202) and the processors PZ (200).
- the device further comprises a hardware circuit forming a HBM synchronization barrier manager (400), comprising a Ded_MEM dedicated memory (404) and a micro-Prog microprogram (402) as shown in FIG. 4.
- the HBM manager (400) only needs a unidirectional data output D. In practice, it is an input/output, especially for reasons of compatibility (read/write R/W) with the connected bus.
- the address/data links to the hardware circuit bypass the memory access manager COHER CACHE MGR (206).
- the HBM synchronization barrier manager (400) is therefore not routed through the memory access manager COHER CACHE MGR (206).
- the synchronization barrier manager HBM (400) can for example reside in a processor, in a chipset, or, as shown in FIG. 3, within an additional component such as a hardware circuit.
- the HBM (400) must be accessible for any transaction originating from the P processes participating in the BS (100) barrier and targeting that manager.
- the HBM manager (400) can therefore be accessed or called by any request to its memory space.
- multiple addresses can target the same barrier BS (100) ("address aliasing").
- each process P issuing a request to the synchronization barrier BS (100) carries: in the higher weight of this request, the address of the barrier, and, in the lower weight, additional data.
- An example of additional data may be the number of processes P participating in the BS (100) barrier. Each of the processes P can thus target a single barrier BS (100) by communicating information necessary for synchronization. This information can be stored by the micro-Prog micro-program (402) in its Ded_MEM dedicated memory (404) and then processed by the micro-Prog micro-program (402) of the synchronization barrier manager HBM (400).
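- Purely as an illustration of this request layout (the field widths are assumptions, not values given in the patent), the higher-order bits can be seen as selecting the barrier while the lower-order bits carry the additional data, such as the number n of participating processes:

```c
#include <stdint.h>

#define DATA_BITS 16u   /* assumed width of the low-order data field */

/* address in the high-order bits, additional data (here n) in the low-order bits */
static inline uint64_t make_request(uint64_t barrier_addr, uint16_t n_procs)
{
    return (barrier_addr << DATA_BITS) | n_procs;
}

static inline uint64_t request_barrier(uint64_t req) { return req >> DATA_BITS; }
static inline uint16_t request_nprocs(uint64_t req)  { return (uint16_t)(req & 0xFFFFu); }
```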
- the HBM synchronization barrier manager (400) can handle a plurality of BS synchronization barriers (100) at a time. This possibility is important in some applications.
- the barrier BS (100) is in its initial state, and none of the n processes P has accessed it.
- the processes P are in a first work step W and each executes its first block B (see FIG. 1). Similarly to what is described above, the first process P having completed its block B informs the barrier BS (100) by means of a request.
- This request includes, in its lower weight, the number n of processes P participating in the barrier, which allows the initialization of a counter CNT (406) of the barrier BS (100) once the first request is received. It is upon receipt of this first request that the synchronization barrier BS (100) switches to the activated mode (or state).
- at each request received, the HBM synchronization barrier manager (400) will decrement (down-count) the counter CNT (406). The requests are only answered with data D when the HBM synchronization barrier manager (400) has received all the requests from the n processes P participating in the barrier BS (100). At this point, the synchronization is considered effective. The set of processes P is then allowed to pass to the next work step W.
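- The sketch below is a hedged illustration of this behaviour on the manager's side (the patent does not give the microprogram's code): the first request initializes and activates the barrier with CNT (406) set to n, every request decrements CNT, and the buffered responses D are released only when CNT reaches zero.

```c
#include <stdbool.h>
#include <stdint.h>

struct hw_barrier {
    bool     active;    /* ready (false) vs. activated (true)                */
    uint32_t cnt;       /* counter CNT (406)                                 */
    uint32_t pending;   /* requests held back, waiting for the last arrival  */
};

/* returns true when synchronization is effective and all responses D may be sent */
static bool hbm_on_request(struct hw_barrier *b, uint32_t n_procs_in_request)
{
    if (!b->active) {                  /* first request: initialize and activate */
        b->active  = true;
        b->cnt     = n_procs_in_request;
        b->pending = 0;
    }
    b->cnt--;                          /* one more current block has terminated */
    b->pending++;
    if (b->cnt == 0) {                 /* last request received */
        /* ... answer the b->pending held requests with data D ... */
        b->active  = false;            /* return to the ready state */
        b->pending = 0;
        return true;
    }
    return false;                      /* hold the response: synchronization not yet effective */
}
```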
- each process thus interrogates the barrier BS only once to determine the progress of the work step W. This is because the barrier BS is able to store, in its own dedicated memory space Ded_MEM (404), the number of requests already received. Each process will remain pending until the response from the barrier BS is received. There is therefore no need for multiple interrogations (regular or not) from the processes towards the barrier. In addition, each query is less expensive in terms of bandwidth. This is at the origin of the bandwidth gain achieved by the invention.
- the execution time t of a block B is not necessarily correlated with its arrival at the synchronization barrier BS. Indeed, for reasons of contention, non-ordering of communication channels, conflicts or arbitration, a second request sent later than a first request can reach the barrier BS before said first request. However, this does not change the operation of the barrier according to the invention. For the sake of simplicity, it is considered in the present description that a request issued by a first process having a run time t shorter than that of a second process will join the barrier BS before the request issued by the second process.
- the memory space of the BS synchronization barrier (100) is implemented in the memory space dedicated to the PCI bus of the computer.
- it may be advantageous for the synchronization barrier manager to manage these barriers in relation to memory segments, for example memory pages.
- This plurality of barriers can be connected to the same circuit or to separate circuits.
- the PCI memory provides enough space to assign a memory page of predetermined size to each barrier while providing protected access to each barrier.
- the HBM synchronization barrier manager (400) can therefore host M 64 KB pages, where M is the number of physical barriers BS (100) implemented in the HBM synchronization barrier manager (400). M can in particular be 512, which results in a total memory space of 32 MB (megabytes). These 32 MB obviously correspond to virtual memory: they are not to be considered as "real" MB but are simply seen as such by the application to be synchronized.
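- As a small worked example of this arithmetic (a sketch, not code from the patent), with one 64 KB page per barrier and M = 512 barriers the manager occupies 512 × 64 KB = 32 MB of address space, and the barrier targeted by a request can be recovered from the offset inside that window:

```c
#include <stdint.h>

#define PAGE_SIZE   (64u * 1024u)            /* 64 KB page per barrier       */
#define N_BARRIERS  512u                     /* M                            */
#define WINDOW_SIZE (N_BARRIERS * PAGE_SIZE) /* 512 * 64 KB = 32 MB          */

static inline uint32_t barrier_index(uint64_t offset_in_window)
{
    return (uint32_t)(offset_in_window / PAGE_SIZE);   /* i.e. offset >> 16 */
}
```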
- This request includes the address of the barrier BS (100), a current command (detailed below), an indication of whether it is a one-level or multi-level synchronization (detailed below), and the number of processes involved in the synchronization and therefore in the barrier.
- the values 0 or 1 correspond respectively to one-level synchronization and two-level synchronization.
- a higher synchronization level is detailed in the exemplary embodiment below.
- FIG. 5 relates to an exemplary embodiment of an HBM synchronization manager (400), which is capable of managing a higher level synchronization, and more precisely here at two levels.
- a two-level synchronization may for example be used when several distinct groups of processes P must be synchronized, with each group having its own hardware (or physical) barrier BS (100).
- the synchronization manager HBM (400) must then handle the case where each group must first be synchronized on its own, and then all the groups must be synchronized with one another.
- the first request received by the barrier BS (100) in the ready state PRE contains, in its lower weights, information indicating whether it is a one-level or two-level synchronization. If it is a one-level synchronization, it will be managed by a one-level barrier, or more precisely by an active state ACT of the barrier designed for one level (state ACT_1_N). If, on the contrary, it is a two-level synchronization, this same barrier will enter an active state ACT designed for two levels (state ACT_2_N), in which case its behavior will be as described below:
- When all the requests have been received by the barrier BS (100), it chooses one of the processes P as master M among all the processes P participating in the barrier BS (100). At first, only the request of the master M is answered, with a special data D indicating that it is the master of the group. From there, the master is free to perform the second level of synchronization.
- This second level of synchronization may for example be a BS (100) barrier of a software nature.
- When the master M has completed this second synchronization level, it transmits a last request to the barrier BS (100). In response to this latter request, the barrier responds to all the other pending requests from the processes P participating in the barrier BS (100) (including the master M), and returns to the ready state PRE.
- the master M is dynamic and can be redefined at each synchronization.
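- The sketch below illustrates this two-level flow from the point of view of one process in a group. The function names hbm_sync_2n(), is_master_response(), inter_group_sync() and hbm_master_done() are hypothetical stand-ins for the requests and responses described above, not an API defined by the patent; the second level is left as an opaque inter-group synchronization which, as noted above, may itself be a software barrier.

```c
#include <stdbool.h>
#include <stdint.h>

uint64_t hbm_sync_2n(int n_procs_in_group);  /* first request; blocks until the barrier answers   */
bool     is_master_response(uint64_t d);     /* special data D: "you are the master of the group" */
void     inter_group_sync(void);             /* second synchronization level (e.g. software)      */
uint64_t hbm_master_done(void);              /* master's last request; releases the whole group   */

static void two_level_step(int n_procs_in_group)
{
    uint64_t d = hbm_sync_2n(n_procs_in_group); /* first level: group-local hardware barrier        */
    if (is_master_response(d)) {                /* only the elected master gets this answer first   */
        inter_group_sync();                     /* second level: synchronize the groups together    */
        (void)hbm_master_done();                /* barrier then answers everyone and returns to PRE */
    }
    /* non-master processes simply stay blocked in hbm_sync_2n() until the
     * master's last request makes the barrier answer the remaining requests */
}
```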
- the different states of the barrier automatism shown in FIG. 5 are the following:
- Ready state PRE: the physical barrier is ready to receive requests from the processes participating in the barrier.
- three transitions can take place: T1, T2 or T13.
- the barrier will choose which transition to perform.
- Transition T1: the request includes, in its lower weights, the information that a single-level synchronization is needed (SYNC_1_N).
- the barrier is activated and is then in the active state with one-level synchronization ACT_1_N.
- Transition T2: similar to T1, this transition corresponds to the reception by the barrier of a request with a SAVE record command in order to initialize the barrier.
- the request includes, in its lower weight, the information that a two-level synchronization is needed (SYNC_2_N).
- the barrier is activated and goes into the active state with two-level synchronization ACT_2_N.
- Active state ACT_1_N: the barrier performs a single-level synchronization. Several transitions exist from this state.
- the internal counter CNT (406) is decremented at each transition T3 (CNT > threshold value).
- T3 corresponds essentially to each termination of a current block B during the same work step W.
- Transition T14: this transition corresponds to the reception by the barrier of a request with an OFF command.
- Transition T6: analogous to transition T3 (see above).
- the barrier BS (100) returns to ready state PRE.
- the barrier BS (100) is provided with a time counter, also called a chrono counter.
- the counter is configurable and defines a time limit. The counter starts a countdown (usually in units of µs) upon receipt of the first request. The time then begins to run. If the predetermined time limit is exceeded before the last request is received at the barrier BS (100), then the barrier passes to the cancellation state ANN.
- the time limit can vary according to the barriers, and more precisely according to the different states of a barrier, in particular: ACT_1_N, ACT_2_N, SYNC.
- this time limit is programmable. Its upper limit can be set according to the context, in particular to avoid interference with the time-out of the processor.
- From the cancellation state ANN, the barrier returns to the ready state PRE (see above). This, for example, invites the set of processes P to go back to the end of execution of a previous work step W.
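- The sketch below summarizes, in code form, the part of the automatism of FIG. 5 that the description names explicitly: the states PRE, ACT_1_N, ACT_2_N and the cancellation state ANN reached when the chrono counter expires. The event granularity and the omission of the master phase are simplifying assumptions made for the example.

```c
#include <stdint.h>

enum bs_state { PRE, ACT_1_N, ACT_2_N, ANN };

struct bs_automaton {
    enum bs_state state;
    uint32_t      cnt;        /* counter CNT (406): requests still expected    */
    uint32_t      deadline;   /* programmable time limit, here in microseconds */
    uint32_t      elapsed;    /* chrono counter since the first request        */
};

/* called on each request received; two_levels comes from the low-order bits
 * of the first request (SYNC_1_N / SYNC_2_N) */
static void bs_on_request(struct bs_automaton *b, uint32_t n_procs, int two_levels)
{
    if (b->state == PRE) {                 /* T1 / T2: activation on the first request */
        b->state   = two_levels ? ACT_2_N : ACT_1_N;
        b->cnt     = n_procs;
        b->elapsed = 0;
    }
    if (b->cnt > 0 && --b->cnt == 0) {     /* T3 / T6: last current block has joined   */
        if (b->state == ACT_1_N)
            b->state = PRE;                /* one level: answer everyone, back to ready */
        /* two levels: the master-election phase described above runs before PRE */
    }
}

/* called periodically (e.g. every microsecond) while the barrier is active */
static void bs_on_tick(struct bs_automaton *b)
{
    if ((b->state == ACT_1_N || b->state == ACT_2_N) && ++b->elapsed > b->deadline)
        b->state = ANN;                    /* time limit exceeded: cancellation state ANN */
}
```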
- the flow diagram of FIG. 6 shows the main operations of a synchronization barrier BS according to one embodiment of the invention.
- the flowchart shows the barrier BS (100) in its ready state PRE (operation 700).
- the first process P having completed its block B informs (by a call) the barrier BS (100) by means of a request to the HBM manager (400) (operation 702).
- the barrier stores the identifier SVE_ID_Req corresponding to the process P having last informed the barrier BS (100) (operation 708).
- the barrier BS (100) checks whether the counter CNT (406) has reached its threshold value (operations 710 and 712).
- the barrier BS (100) elects a master M among the current processes P (operation 732; CH M) and performs a second synchronization (operation 734; SYNC) before responding with data D (operation 740).
- a single barrier BS is used for the synchronization of processes. It may be useful to integrate several synchronization barriers BS in a computer system, in particular to allow several groups of processes to be synchronized, each group contributing to the execution of a different task. For example, in scientific computing on a 16-core machine, one can consider that 2 independent calculations are carried out, each using 8 cores; one then has 2 groups of 8 processes, each process executing on a different core. In this example, 2 barriers are needed.
- the device may comprise several hardware circuits, which are accessed from address-space segments derived from said data of each call.
- each of these barriers may be hosted either in the same hardware circuit or in separate circuits.
- the computing device described herein may therefore further include a software synchronization barrier, operating in combination with said hardware circuit.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multi Processors (AREA)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
BRPI0917747A BRPI0917747A2 (pt) | 2008-12-16 | 2009-11-27 | dispositivo de informática com barreira de sincronização e processo de tratamento da informática |
US13/139,989 US9218222B2 (en) | 2008-12-16 | 2009-11-27 | Physical manager of synchronization barrier between multiple processes |
EP09797109.7A EP2366147B1 (fr) | 2008-12-16 | 2009-11-27 | Gestionnaire physique de barriere de synchronisation entre processus multiples |
ES09797109.7T ES2689125T3 (es) | 2008-12-16 | 2009-11-27 | Gestor físico de barrera de sincronización entre procesos múltiples |
JP2011540156A JP5626690B2 (ja) | 2008-12-16 | 2009-11-27 | マルチプロセス間のバリアの物理マネージャ |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0807089A FR2939922B1 (fr) | 2008-12-16 | 2008-12-16 | Gestionnaire physique de barriere de synchronisation entre processus multiples |
FR0807089 | 2008-12-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010070222A1 true WO2010070222A1 (fr) | 2010-06-24 |
Family
ID=40785419
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FR2009/052322 WO2010070222A1 (fr) | 2008-12-16 | 2009-11-27 | Gestionnaire physique de barriere de synchronisation entre processus multiples |
Country Status (7)
Country | Link |
---|---|
US (1) | US9218222B2 (fr) |
EP (1) | EP2366147B1 (fr) |
JP (1) | JP5626690B2 (fr) |
BR (1) | BRPI0917747A2 (fr) |
ES (1) | ES2689125T3 (fr) |
FR (1) | FR2939922B1 (fr) |
WO (1) | WO2010070222A1 (fr) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5568048B2 (ja) * | 2011-04-04 | 2014-08-06 | 株式会社日立製作所 | 並列計算機システム、およびプログラム |
US9195516B2 (en) | 2011-12-01 | 2015-11-24 | International Business Machines Corporation | Determining collective barrier operation skew in a parallel computer |
US9092272B2 (en) * | 2011-12-08 | 2015-07-28 | International Business Machines Corporation | Preparing parallel tasks to use a synchronization register |
US8924763B2 (en) * | 2011-12-15 | 2014-12-30 | International Business Machines Corporation | Synchronizing compute node time bases in a parallel computer |
JP5994601B2 (ja) * | 2012-11-27 | 2016-09-21 | 富士通株式会社 | 並列計算機、並列計算機の制御プログラム及び並列計算機の制御方法 |
US20150033234A1 (en) * | 2013-07-23 | 2015-01-29 | Qualcomm Incorporated | Providing queue barriers when unsupported by an i/o protocol or target device |
US9501300B2 (en) * | 2013-09-16 | 2016-11-22 | General Electric Company | Control system simulation system and method |
US20170139756A1 (en) * | 2014-04-23 | 2017-05-18 | Sciensys | Program parallelization on procedure level in multiprocessor systems with logically shared memory |
US10318355B2 (en) * | 2017-01-24 | 2019-06-11 | Oracle International Corporation | Distributed graph processing system featuring interactive remote control mechanism including task cancellation |
US11353868B2 (en) | 2017-04-24 | 2022-06-07 | Intel Corporation | Barriers and synchronization for machine learning at autonomous machines |
US10678925B2 (en) * | 2017-06-26 | 2020-06-09 | Microsoft Technology Licensing, Llc | Data quarantine and recovery |
JP7159696B2 (ja) * | 2018-08-28 | 2022-10-25 | 富士通株式会社 | 情報処理装置,並列計算機システムおよび制御方法 |
US10824481B2 (en) * | 2018-11-13 | 2020-11-03 | International Business Machines Corporation | Partial synchronization between compute tasks based on threshold specification in a computing system |
US11409579B2 (en) * | 2020-02-24 | 2022-08-09 | Intel Corporation | Multiple independent synchonization named barrier within a thread group |
US11531565B2 (en) * | 2020-05-08 | 2022-12-20 | Intel Corporation | Techniques to generate execution schedules from neural network computation graphs |
US11461130B2 (en) | 2020-05-26 | 2022-10-04 | Oracle International Corporation | Methodology for fast and seamless task cancelation and error handling in distributed processing of large graph data |
US11720360B2 (en) | 2020-09-11 | 2023-08-08 | Apple Inc. | DSB operation with excluded region |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070113233A1 (en) * | 2005-11-10 | 2007-05-17 | Collard Jean-Francois C P | Program thread synchronization |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3482897B2 (ja) * | 1999-01-20 | 2004-01-06 | 日本電気株式会社 | クラスタ型並列計算機システムおよびプロセッサ間バリア同期方法 |
US6766437B1 (en) * | 2000-02-28 | 2004-07-20 | International Business Machines Corporation | Composite uniprocessor |
US7100021B1 (en) * | 2001-10-16 | 2006-08-29 | Cisco Technology, Inc. | Barrier synchronization mechanism for processors of a systolic array |
JP4448784B2 (ja) * | 2005-03-15 | 2010-04-14 | 株式会社日立製作所 | 並列計算機の同期方法及びプログラム |
US8645959B2 (en) * | 2005-03-30 | 2014-02-04 | Intel Corporaiton | Method and apparatus for communication between two or more processing elements |
US7865911B2 (en) * | 2005-11-08 | 2011-01-04 | Microsoft Corporation | Hybrid programming |
US7861060B1 (en) * | 2005-12-15 | 2010-12-28 | Nvidia Corporation | Parallel data processing systems and methods using cooperative thread arrays and thread identifier values to determine processing behavior |
TWI318750B (en) * | 2006-09-22 | 2009-12-21 | Nuvoton Technology Corp | Software development methods, systems, and storage media storing software developed thereby |
US20080109604A1 (en) * | 2006-11-08 | 2008-05-08 | Sicortex, Inc | Systems and methods for remote direct memory access to processor caches for RDMA reads and writes |
JP2008234074A (ja) * | 2007-03-16 | 2008-10-02 | Fujitsu Ltd | キャッシュ装置 |
-
2008
- 2008-12-16 FR FR0807089A patent/FR2939922B1/fr not_active Expired - Fee Related
-
2009
- 2009-11-27 ES ES09797109.7T patent/ES2689125T3/es active Active
- 2009-11-27 WO PCT/FR2009/052322 patent/WO2010070222A1/fr active Application Filing
- 2009-11-27 JP JP2011540156A patent/JP5626690B2/ja not_active Expired - Fee Related
- 2009-11-27 BR BRPI0917747A patent/BRPI0917747A2/pt active Search and Examination
- 2009-11-27 EP EP09797109.7A patent/EP2366147B1/fr active Active
- 2009-11-27 US US13/139,989 patent/US9218222B2/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070113233A1 (en) * | 2005-11-10 | 2007-05-17 | Collard Jean-Francois C P | Program thread synchronization |
Non-Patent Citations (3)
Title |
---|
Retrieved from the Internet <URL:http://portal.acm.org/citation.cfm?id=1105743> [retrieved on 20090625] * |
SAMPSON J ET AL: "Fast Synchronization for Chip Multiprocessors", INTERNET CITATION, vol. 33, no. 4, November 2005 (2005-11-01), ACM SIGARCH Computer Architecture News, pages 64 - 69, XP002417542, Retrieved from the Internet <URL:http://www.cse.ucsd.edu/~rakumar/dasCMP05/paper07.pdf> [retrieved on 20070123] * |
W. E. COHEN, H. G. DIETZ, J. B. SPONAUGLE: "Dynamic Barrier Architecture For Multi-Mode Fine-Grain Parallelism Using Conventional Processors", INTERNET ARTICLE, March 1994 (1994-03-01), pages 1 - 23, XP002533890, Retrieved from the Internet <URL:http://aggregate.org/TechPub/TREE94_10/tree94_10.ps> [retrieved on 20090625] * |
Also Published As
Publication number | Publication date |
---|---|
FR2939922A1 (fr) | 2010-06-18 |
EP2366147B1 (fr) | 2018-05-09 |
ES2689125T3 (es) | 2018-11-08 |
FR2939922B1 (fr) | 2011-03-04 |
EP2366147A1 (fr) | 2011-09-21 |
BRPI0917747A2 (pt) | 2016-02-16 |
US9218222B2 (en) | 2015-12-22 |
JP5626690B2 (ja) | 2014-11-19 |
US20110252264A1 (en) | 2011-10-13 |
JP2012512452A (ja) | 2012-05-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2366147B1 (fr) | Gestionnaire physique de barriere de synchronisation entre processus multiples | |
EP1805611B1 (fr) | Procede d'ordonnancement de traitement de tâches et dispositif pour mettre en oeuvre le procede | |
FR2632096A1 (fr) | Systeme de microcalculateur a bus multiple avec arbitrage d'acces aux bus | |
FR2931970A1 (fr) | Procede de generation de requetes de manipulation d'une base de donnees d'initialisation et d'administration d'une grappe de serveurs , support de donnees et grappe de serveurs correspondants | |
EP1158405A1 (fr) | Système et méthode de gestion d'une architecture multi-ressources | |
FR3103585A1 (fr) | Procédé de gestion de la configuration d’accès à des périphériques et à leurs ressources associées d’un système sur puce formant par exemple un microcontrôleur, et système sur puce correspondant | |
FR3007542A1 (fr) | File d'echange de donnees ayant une profondeur illimitee | |
FR3047821A1 (fr) | Procede et dispositif de gestion d'un appareil de commande | |
EP2802992B1 (fr) | Systeme et procede de gestion de correspondance entre une memoire cache et une memoire principale | |
EP2856323B1 (fr) | Procédé, dispositif et programme d'ordinateur de contrôle dynamique de distances d'accès mémoire dans un système de type numa | |
EP1594065A1 (fr) | Système sur une puce avec unité d'arbitrage, et clé de stockage l'incorporant | |
EP1341087B1 (fr) | Procédé et système de gestion d'un journal personnel d'évènements | |
FR2995424A1 (fr) | Procede et dispositif de decompte du temps deporte pour unite de traitement dans un systeme de traitement de l'information | |
EP1603049A1 (fr) | Interfacage de modules fonctionnels dans un systeme sur une puce | |
WO2013110816A2 (fr) | Procédé d'utilisation d'une mémoire partagée | |
EP0822495B1 (fr) | Distribution de tickets dans un système informatique multinodal | |
EP2756398B1 (fr) | Procede, dispositif et programme d'ordinateur pour allouer dynamiquement des ressources d'un cluster a l'execution de processus d'une application | |
FR2656707A1 (fr) | Procede d'exploitation d'un bus d'ordinateur. | |
EP2545449A1 (fr) | Procédé de configuration d'un système informatique, programme d'ordinateur et système informatique correspondants | |
EP2221730B1 (fr) | Procédé d'accès direct et concurrent de plusieurs unités de traitment virtuelles à une unité périphérique | |
WO2019129958A1 (fr) | Procede de stockage de donnees et procede d'execution d'application avec reduction du temps d'acces aux donnees stockees | |
EP1341093B1 (fr) | Accès à une ressource collective | |
EP3814893A1 (fr) | Accès mémoire de processeurs | |
EP1293909B1 (fr) | Controle d'accès dynamique d'une fonction à une ressource collective. | |
WO2011125001A1 (fr) | Mémoire cache segmentée |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 09797109 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2009797109 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2011540156 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 13139989 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: PI0917747 Country of ref document: BR Kind code of ref document: A2 Effective date: 20110615 |