CN104899008B - Shared storage organization in parallel processor and method - Google Patents
- Publication number
- CN104899008B (application number CN201510346816.4A)
- Authority
- CN
- China
- Prior art keywords
- counter
- thread
- line end
- multicast
- data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
- Multi Processors (AREA)
Abstract
The invention belongs to the field of computer architecture and relates in particular to information storage and transfer in parallel processors. Its technical solution is a shared storage structure in a parallel processor, comprising: several information channels, a dual-channel read/write memory, several multicast counter groups, and an arbiter. The invention provides an information-passing mechanism between the threads of parallel processors and implements multicast transfer of information. The mutual-exclusion check for a memory access usually completes in a single clock cycle, which greatly reduces communication overhead and power consumption. The structure and method of the invention can also implement parallel operations efficiently on a programmable processor array.
Description
Technical field
The invention belongs to the field of computer architecture and relates in particular to data exchange and storage that use an information-passing mechanism between the threads of a parallel processor.
Background art
Modern computer architectures raise their processing capacity through multiprocessor, multithreaded parallelism, which requires threads to exchange information with low overhead. At present, information passing between threads generally relies on a mutual-exclusion mechanism implemented in software, which is very inefficient and also complicated to implement.
Summary of the invention
The purpose of the invention is to provide a shared storage structure and method for data exchange, dedicated to an information-passing mechanism between the threads of multiple processors, so that information exchange between threads on different processors of a parallel machine is completed efficiently.
The technical solution of the invention is a shared storage structure in a parallel processor, characterized in that it comprises: several information channels, a dual-channel read/write memory, several multicast counter groups, and an arbiter.
Each information channel connects multiple parallel processors, and each parallel processor executes 2-16 threads. Each information channel comprises: an address signal line, a storage enable signal line, a read/write select signal line, a byte select signal line, an input data signal line, an output data signal line, and an output valid signal line. The input data signal line comprises: an input data line, a multicast count line, and a blocking/non-blocking mode line.
The dual-channel read/write memory stores and retrieves the data passed between threads. Its data width is 4 bytes, it can be accessed by byte, and the data are 32-bit single-precision words or 64-bit double-precision words.
Each multicast counter group contains n counters. A counter records multicast count information, is selected by the address signal, and is written, read, and decremented under the control of the corresponding signals.
The arbiter executes a round-robin arbitration algorithm and, each time, selects one thread to read or write the dual-channel read/write memory.
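The round-robin arbitration mentioned above can be illustrated with a small behavioral sketch. The text does not specify how priority rotates, so the rule used here (the requester after the last winner gets priority next) is an assumption, and the class name `RoundRobinArbiter` and its `grant` method are hypothetical names chosen for illustration.

```python
from typing import Optional, Sequence


class RoundRobinArbiter:
    """Grants one requester per cycle, rotating priority after each grant.

    Illustrative model only; the rotation rule is an assumption.
    """

    def __init__(self, num_requesters: int) -> None:
        if num_requesters <= 0:
            raise ValueError("num_requesters must be positive")
        self.num_requesters = num_requesters
        self._next = 0  # index that has highest priority in the next cycle

    def grant(self, requests: Sequence[bool]) -> Optional[int]:
        """Return the index of the granted requester, or None if nobody asked."""
        if len(requests) != self.num_requesters:
            raise ValueError("one request flag per requester is required")
        for offset in range(self.num_requesters):
            candidate = (self._next + offset) % self.num_requesters
            if requests[candidate]:
                # The requester after the winner gets priority next time.
                self._next = (candidate + 1) % self.num_requesters
                return candidate
        return None
```

With four requesters of which 0, 1, and 3 keep requesting, successive calls to `grant` would select 0, 1, 3, and then 0 again, so no requester is starved.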
A shared storage method in a parallel processor uses the shared storage structure in a parallel processor described above and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
A2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is 0;
A3. If it is 0, the multicast count provided by the thread is written to the counter, the output valid signal line is set to 1, and the data are written to the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal line is set to 0, and this write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
B4. The output valid signal line is set to 0, and this read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
C2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group;
C3. The multicast count provided by the thread is written to the counter, the output valid signal line is set to 1, and the data are written to the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
D4. The output valid signal line is set to 0, and this read operation fails.
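Operations A to D can be summarized in a short behavioral sketch. It is a software model written for illustration, not the hardware itself: the class name `SharedStore` and its methods are hypothetical, counters and memory are keyed directly by address instead of being split into groups, byte selection is omitted, and the boolean return value stands in for the output valid signal line.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class SharedStore:
    """Behavioral model: one multicast counter guards each storage address."""

    counters: Dict[int, int] = field(default_factory=dict)
    memory: Dict[int, int] = field(default_factory=dict)

    def write(self, addr: int, data: int, multicast_count: int,
              blocking: bool = True) -> bool:
        """Steps A2-A4 (blocking) or C2-C3 (non-blocking). Returns 'valid'."""
        if blocking and self.counters.get(addr, 0) != 0:
            return False  # A4: earlier data still has pending readers
        self.counters[addr] = multicast_count  # A3 / C3
        self.memory[addr] = data
        return True

    def read(self, addr: int) -> Tuple[bool, Optional[int]]:
        """Steps B2-B4 and D2-D4, which the text describes identically."""
        if self.counters.get(addr, 0) > 0:
            self.counters[addr] -= 1  # B3 / D3: one fewer reader outstanding
            return True, self.memory[addr]
        return False, None  # B4 / D4: nothing to read
```

In this model, a blocking write with a multicast count of 2 is followed by exactly two successful reads; a second blocking write to the same address fails until both reads have happened, whereas a non-blocking write overwrites the counter and the data unconditionally.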
The invention provides an information-passing mechanism between the threads of parallel processors and implements multicast transfer of information. The mutual-exclusion check for a memory access usually needs only one clock cycle, which greatly reduces communication overhead and power consumption. The structure and method of the invention can also implement parallel operations efficiently on a programmable processor array.
Description of the drawings
Fig. 1 is a schematic diagram of the structure of the invention;
Fig. 2 is a schematic diagram of the structure of a multicast counter group in the invention;
Fig. 3 is a flowchart of the write operation in blocking mode in the invention;
Fig. 4 is a flowchart of the read operation in blocking mode and non-blocking mode in the invention;
Fig. 5 is a flowchart of the write operation in non-blocking mode in the invention.
Detailed description of the embodiments
Embodiment 1: Referring to Figs. 1 and 2, a shared storage structure in a parallel processor, characterized in that it comprises: several information channels, a dual-channel read/write memory, several multicast counter groups, and an arbiter.
Each information channel connects multiple parallel processors, and each parallel processor executes 2-16 threads. Each information channel comprises: an address signal line, a storage enable signal line, a read/write select signal line, a byte select signal line, an input data signal line, an output data signal line, and an output valid signal line. The input data signal line comprises: an input data line, a multicast count line, and a blocking/non-blocking mode line.
The dual-channel read/write memory stores and retrieves the data passed between threads. Its data width is 4 bytes, it can be accessed by byte, and the data are 32-bit single-precision words or 64-bit double-precision words.
Each multicast counter group contains n counters. A counter records multicast count information, is selected by the address signal, and is written, read, and decremented under the control of the corresponding signals.
The arbiter executes a round-robin arbitration algorithm and, each time, selects one thread to read or write the dual-channel read/write memory.
Embodiment 2: Referring to Figs. 1 and 2, the shared storage structure in a parallel processor as described in Embodiment 1, characterized in that:
The number of information channels is 4, and each information channel connects 4 parallel processors;
The number of multicast counter groups is 16, each multicast counter group contains 16 counters, and each counter is 3-4 bits wide, supporting data multicast among 6-8 threads;
The information-passing shared storage structure occupies part of the address space of the parallel processors' data memory and is mapped to the highest-address portion of that data address space.
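As a rough illustration of how an address could select a counter in such a configuration, the sketch below splits an address into a group index and a counter index. The field layout is an assumption: the text only states that low address bits select the group and the other bits select the counter, so the field widths are left as explicit parameters, and the names `decode_counter`, `CounterSelect`, and `max_pending_readers` are hypothetical.

```python
from typing import NamedTuple


class CounterSelect(NamedTuple):
    group: int    # which multicast counter group
    counter: int  # which counter inside that group


def decode_counter(addr: int, group_bits: int, counter_bits: int) -> CounterSelect:
    """Split an address into (group, counter) indices.

    Assumption: the low `group_bits` bits pick the group and the next
    `counter_bits` bits pick the counter. The field widths are illustrative.
    """
    group = addr & ((1 << group_bits) - 1)
    counter = (addr >> group_bits) & ((1 << counter_bits) - 1)
    return CounterSelect(group, counter)


def max_pending_readers(counter_width_bits: int) -> int:
    """Largest multicast count an unsigned counter of this width can hold."""
    return (1 << counter_width_bits) - 1
```

For example, `decode_counter(0xA7, group_bits=4, counter_bits=4)` yields group 7 and counter 10, which would index 16 groups of 16 counters. A 3-bit counter can hold counts up to 7 and a 4-bit counter up to 15, which covers the 6-8 threads mentioned in this embodiment.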
Embodiment 3: Referring to Figs. 3, 4, and 5, a shared storage method in a parallel processor uses the shared storage structure in a parallel processor described in Embodiment 1 or 2 and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
A2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is 0;
A3. If it is 0, the multicast count provided by the thread is written to the counter, the output valid signal line is set to 1, and the data are written to the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal line is set to 0, and this write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
B4. The output valid signal line is set to 0, and this read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
C2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group;
C3. The multicast count provided by the thread is written to the counter, the output valid signal line is set to 1, and the data are written to the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
D4. The output valid signal line is set to 0, and this read operation fails.
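To show how these steps might play out when several threads contend, the following sketch simulates one access per cycle. It is a simplified assumption-laden model: the function name `simulate` and the request tuple layout are hypothetical, a rotating pointer over the pending list stands in for the arbiter, only blocking-mode writes are modeled, and a thread whose operation fails simply retries in a later cycle (the text does not say how software reacts to a failed operation).

```python
from typing import Dict, List, Tuple

# Each request: (thread name, "w" or "r", address, data, multicast count)
Request = Tuple[str, str, int, int, int]


def simulate(requests: List[Request], max_cycles: int = 100) -> List[str]:
    """Serve one request per cycle in rotating order; failed requests retry."""
    counters: Dict[int, int] = {}
    memory: Dict[int, int] = {}
    pending = list(requests)
    log: List[str] = []
    start = 0  # rotating priority pointer (stand-in for the arbiter)
    for cycle in range(max_cycles):
        if not pending:
            break
        index = start % len(pending)
        name, kind, addr, data, count = pending[index]
        if kind == "w":
            valid = counters.get(addr, 0) == 0  # blocking write check (A2)
            if valid:
                counters[addr] = count  # A3: record the multicast count
                memory[addr] = data
        else:
            valid = counters.get(addr, 0) > 0  # read check (B2 / D2)
            if valid:
                counters[addr] -= 1  # B3 / D3: consume one read
                data = memory[addr]
        if valid:
            log.append(f"cycle {cycle}: {name} {kind} ok data={data}")
            pending.pop(index)
            start = index  # the next pending entry slides into this slot
        else:
            log.append(f"cycle {cycle}: {name} {kind} retry")
            start = index + 1
    return log
```

With a reader `C1`, a writer `P` that multicasts the value 5 to two readers, and a second reader `C2`, all targeting the same address, the first read fails because nothing has been written yet, the write then succeeds, and both readers receive the value before the counter returns to 0.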
Claims (3)
1. A shared storage structure in a parallel processor, characterized in that it comprises: several information channels, a dual-channel read/write memory, several multicast counter groups, and an arbiter;
Each of the information channels connects multiple parallel processors, and each parallel processor executes 2-16 threads; each of the information channels comprises: an address signal line, a storage enable signal line, a read/write select signal line, a byte select signal line, an input data signal line, an output data signal line, and an output valid signal line; the input data signal line comprises: an input data line, a multicast count line, and a blocking or non-blocking mode line;
The dual-channel read/write memory is used to store and retrieve the data passed between the threads; the data width of the data passed between the threads is 4 bytes, the data can be accessed by byte, and the data passed between the threads are 32-bit single-precision words or 64-bit double-precision words;
Each of the multicast counter groups contains n counters; the counters are used to record multicast count information, are selected by the address signal, and are written, read, and decremented under the control of the corresponding signals;
The arbiter executes a round-robin arbitration algorithm and, each time, selects one thread to read or write the dual-channel read/write memory.
2. The shared storage structure in a parallel processor according to claim 1, characterized in that:
The number of the information channels is 4, and each of the information channels connects 4 parallel processors;
The number of the multicast counter groups is 16, each multicast counter group contains 16 counters, and each counter is 3-4 bits wide, supporting data multicast among 6-8 threads;
The shared storage structure occupies part of the address space of the parallel processors' data memory and is mapped to the highest-address portion of the address space of the data memory.
3. A shared storage method in a parallel processor, which uses the shared storage structure in a parallel processor according to claim 1 or 2 and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
A2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is 0;
A3. If it is 0, the multicast count provided by the thread is written to the counter, the output valid signal line is set to 1, and the data are written to the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal line is set to 0, and this write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
B4. The output valid signal line is set to 0, and this read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
C2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group;
C3. The multicast count provided by the thread is written to the counter, the output valid signal line is set to 1, and the data are written to the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1-2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter in that group; the value of the counter is checked to see whether it is greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
D4. The output valid signal line is set to 0, and this read operation fails.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510346816.4A CN104899008B (en) | 2015-06-23 | 2015-06-23 | Shared storage organization in parallel processor and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510346816.4A CN104899008B (en) | 2015-06-23 | 2015-06-23 | Shared storage organization in parallel processor and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104899008A (en) | 2015-09-09 |
CN104899008B (en) | 2018-10-12 |
Family
ID=54031687
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510346816.4A Expired - Fee Related CN104899008B (en) | 2015-06-23 | 2015-06-23 | Shared storage organization in parallel processor and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104899008B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109845199B * | 2016-09-12 | 2022-03-04 | Marvell Asia Pte Ltd | Merging read requests in a network device architecture |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1758229A * | 2005-10-28 | 2006-04-12 | National University of Defense Technology of the Chinese People's Liberation Army | Local space shared memory method of heterogeneous multi-kernel microprocessor |
CN1781079A * | 2003-06-11 | 2006-05-31 | Cisco Technology, Inc. | Maintaining entity order with gate managers |
US7680988B1 * | 2006-10-30 | 2010-03-16 | Nvidia Corporation | Single interconnect providing read and write access to a memory shared by concurrent threads |
CN102622192A * | 2012-02-27 | 2012-08-01 | Beijing Institute of Technology | Weak correlation multiport parallel store controller |
US8392891B2 (en) * | 2008-06-26 | 2013-03-05 | Microsoft Corporation | Technique for finding relaxed memory model vulnerabilities |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1781079A * | 2003-06-11 | 2006-05-31 | Cisco Technology, Inc. | Maintaining entity order with gate managers |
CN1758229A * | 2005-10-28 | 2006-04-12 | National University of Defense Technology of the Chinese People's Liberation Army | Local space shared memory method of heterogeneous multi-kernel microprocessor |
US7680988B1 (en) * | 2006-10-30 | 2010-03-16 | Nvidia Corporation | Single interconnect providing read and write access to a memory shared by concurrent threads |
US8392891B2 (en) * | 2008-06-26 | 2013-03-05 | Microsoft Corporation | Technique for finding relaxed memory model vulnerabilities |
CN102622192A * | 2012-02-27 | 2012-08-01 | Beijing Institute of Technology | Weak correlation multiport parallel store controller |
Non-Patent Citations (1)
Title |
---|
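Research on key technologies of cache coherence protocols for multi-core processors (多核处理器Cache一致性协议关键技术研究); Huang Anwen (黄安文); Computer Engineering & Science (《计算机工程与科学》); 2009-09-10; pp. 104-108 * |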
Also Published As
Publication number | Publication date |
---|---|
CN104899008A (en) | 2015-09-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI425512B (en) | Flash memory controller circuit and storage system and data transfer method thereof | |
US10789182B2 (en) | System and method for individual addressing | |
CN105051711A (en) | Methods and apparatuses for providing data received by a state machine engine | |
US20160246514A1 (en) | Memory system | |
CN103019810A (en) | Scheduling and management of compute tasks with different execution priority levels | |
CN103246625B (en) | A kind of method of data and address sharing pin self-adaptative adjustment memory access granularity | |
CN103988212A (en) | Methods and systems for routing in state machine | |
KR20110059712A (en) | Independently controlled virtual memory devices in memory modules | |
US11029746B2 (en) | Dynamic power management network for memory devices | |
EP3910488A1 (en) | Systems, methods, and devices for near data processing | |
CN103744644A (en) | Quad-core processor system built in quad-core structure and data switching method thereof | |
CN106951488A (en) | A kind of log recording method and device | |
CN103890857A (en) | Shiftable memory employing ring registers | |
CN108733580A (en) | Method for scheduling read commands | |
CN104317770A (en) | Data storage structure and data access method for multiple core processing system | |
CN101930407B (en) | Flash memory control circuit and memory system and data transmission method thereof | |
CN104239232A (en) | Ping-Pong cache operation structure based on DPRAM (Dual Port Random Access Memory) in FPGA (Field Programmable Gate Array) | |
CN101515221A (en) | Method, device and system for reading data | |
CN104051009A (en) | Gating circuit and gating method of resistive random access memory (RRAM) | |
CN101825997A (en) | Asynchronous first-in first-out storage | |
CN104899008B (en) | Shared storage organization in parallel processor and method | |
CN102789424B (en) | External extended DDR2 (Double Data Rate 2) read-write method on basis of FPGA (Field Programmable Gate Array) and external extended DDR2 particle storage on basis of FPGA | |
CN104035897A (en) | Storage controller | |
CN109614145A (en) | A kind of processor core core structure and data access method | |
CN103412848A (en) | Method for sharing single program memory by four-core processor system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20181012 |