CN104899008B - Shared storage organization in parallel processor and method - Google Patents

Info

Publication number
CN104899008B
Authority
CN
China
Prior art keywords
counter
thread
line end
multicast
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510346816.4A
Other languages
Chinese (zh)
Other versions
CN104899008A (en)
Inventor
李萍萍
杨新宪
张振龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yuhuacong Technology Co Ltd
Original Assignee
Beijing Yuhuacong Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yuhuacong Technology Co Ltd
Priority to CN201510346816.4A
Publication of CN104899008A
Application granted
Publication of CN104899008B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Multi Processors (AREA)

Abstract

The invention belongs to the field of computer architecture and relates in particular to information storage and transfer in parallel processors. The technical solution is a shared storage organization in a parallel processor, comprising: several information channels, a read/write dual-channel memory, several multicast counter groups, and an arbiter. The invention provides a mechanism for passing information between the threads of a parallel processor and implements multicast transfer of information. The mutual-exclusion check for a memory access usually completes in a single clock cycle, which greatly reduces communication overhead and power consumption. The structure and method of the invention can also be used to implement parallel operations efficiently on programmable processor arrays.

Description

Shared storage organization in parallel processor and method
Technical field
The invention belongs to the field of computer architecture and relates in particular to data exchange and storage in a parallel processor using an inter-thread information transfer mechanism.
Background
Modern computer architectures raise their processing capacity through multiprocessor, multithreaded parallelism, which requires threads to exchange information with low overhead. At present, information transfer between threads generally relies on mutual-exclusion mechanisms implemented in software, which are inefficient and complicated to implement.
Summary of the invention
The purpose of the invention is to provide a shared storage organization and method, dedicated to data exchange between the threads of multiple processors by means of an information transfer mechanism, that efficiently completes the exchange of information between threads across the processors of a parallel machine.
The technical solution of the invention is a shared storage organization in a parallel processor, characterized in that it comprises: several information channels, a read/write dual-channel memory, several multicast counter groups, and an arbiter;
Each of the information channels connects multiple parallel processors, and each parallel processor executes 2-16 threads; each information channel comprises: an address signal line end, a storage enable signal line end, a read/write select signal line end, a byte select signal line end, an input data signal line end, an output data signal line end, and an output valid signal line end; the input data signal line end comprises: an input data line end, a multicast count line end, and a blocking/non-blocking mode line end;
The read/write dual-channel memory stores and retrieves the data passed between threads; its data width is 4 bytes and it can be accessed by byte; the data are 32-bit single-precision words or 64-bit double-precision words;
Each of the multicast counter groups contains n counters; the counters record multicast count information, are selected by the address signal, and are read, written, and decremented under the control of the corresponding signals;
The arbiter executes a round-robin (polling) arbitration algorithm, selecting one thread at a time to read or write the read/write dual-channel memory.
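The round-robin policy of the arbiter (the unit that selects one requesting thread at a time) can be sketched as follows. This is a minimal software illustration, not the patented circuit: the class name RoundRobinArbiter, the method grant, and the choice to start priority at requester 0 are assumptions made for the example.

```python
class RoundRobinArbiter:
    """Grants one requester per call, rotating priority after each grant."""

    def __init__(self, num_requesters):
        self.num_requesters = num_requesters
        # Assumption: requester 0 has the highest priority on the first call.
        self.last_granted = num_requesters - 1

    def grant(self, requests):
        """requests: list of booleans, one per requester.

        Returns the index of the granted requester, or None if nobody asks.
        """
        # Scan starting just after the most recently granted requester.
        for offset in range(1, self.num_requesters + 1):
            candidate = (self.last_granted + offset) % self.num_requesters
            if requests[candidate]:
                self.last_granted = candidate
                return candidate
        return None
```

Because the scan always starts after the last winner, a thread that keeps requesting is served within one full rotation, which is the usual reason for choosing round-robin over fixed priority.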
A shared storage method in a parallel processor, which uses the shared storage organization in a parallel processor described above and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
A2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is 0;
A3. If it is 0, the multicast count provided by the thread is written to the counter, the output valid signal line end is set to 1, and the data are written to the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal line end is set to 0, and this write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; the address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line end is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
B4. The output valid signal line end is set to 0, and this read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
C2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group;
C3. The multicast count provided by the thread is written to the counter, the output valid signal line end is set to 1, and the data are written to the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; the address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line end is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
D4. The output valid signal line end is set to 0, and this read operation fails.
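Steps A to D above can be modeled in a few lines. The sketch below is illustrative Python, not the patented hardware: the class name SharedStore, the method names, and the assumption of one counter per memory word are hypothetical. A single read method serves both modes because steps B and D are identical in the text.

```python
class SharedStore:
    """Software model of the counter-guarded memory described in steps A-D."""

    def __init__(self, num_words):
        self.memory = [0] * num_words
        # Assumption: exactly one multicast counter guards each word.
        self.counters = [0] * num_words

    def write(self, addr, data, multicast_count, blocking=True):
        """Returns the value of the output valid signal (True = 1, False = 0)."""
        # Steps A2/A4: a blocking write fails while readers are still pending.
        if blocking and self.counters[addr] != 0:
            return False
        # Steps A3/C3: record how many reads are expected, then store the data.
        self.counters[addr] = multicast_count
        self.memory[addr] = data
        return True

    def read(self, addr):
        """Returns (valid, data); data is None when the read fails."""
        # Steps B2-B3 / D2-D3: a read succeeds only while the counter is > 0.
        if self.counters[addr] > 0:
            self.counters[addr] -= 1
            return True, self.memory[addr]
        # Steps B4 / D4: nothing to read.
        return False, None
```

With a multicast count of 2, exactly two reads succeed before the word becomes writable again in blocking mode; a non-blocking write overwrites the word and its counter unconditionally.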
The invention provides a mechanism for passing information between the threads of a parallel processor and implements multicast transfer of information. The mutual-exclusion check for a memory access usually completes in a single clock cycle, which greatly reduces communication overhead and power consumption. The structure and method of the invention can also be used to implement parallel operations efficiently on programmable processor arrays.
Description of the drawings
Figure 1 is a schematic structural diagram of the invention;
Figure 2 is a schematic structural diagram of a multicast counter group in the invention;
Figure 3 is a flow chart of the write operation in blocking mode in the invention;
Figure 4 is a flow chart of the read operation in blocking mode and non-blocking mode in the invention;
Figure 5 is a flow chart of the write operation in non-blocking mode in the invention.
Detailed description of the embodiments
Embodiment 1: Referring to Figures 1 and 2, a shared storage organization in a parallel processor, characterized in that it comprises: several information channels, a read/write dual-channel memory, several multicast counter groups, and an arbiter;
Each of the information channels connects multiple parallel processors, and each parallel processor executes 2-16 threads; each information channel comprises: an address signal line end, a storage enable signal line end, a read/write select signal line end, a byte select signal line end, an input data signal line end, an output data signal line end, and an output valid signal line end; the input data signal line end comprises: an input data line end, a multicast count line end, and a blocking/non-blocking mode line end;
The read/write dual-channel memory stores and retrieves the data passed between threads; its data width is 4 bytes and it can be accessed by byte; the data are 32-bit single-precision words or 64-bit double-precision words;
Each of the multicast counter groups contains n counters; the counters record multicast count information, are selected by the address signal, and are read, written, and decremented under the control of the corresponding signals;
The arbiter executes a round-robin (polling) arbitration algorithm, selecting one thread at a time to read or write the read/write dual-channel memory.
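One way to picture how the address signal selects a counter: according to the method steps of this patent, the lowest 1-2 address bits pick a multicast counter group and the remaining bits pick a counter inside that group. The helper below is a hedged sketch of that split; the function name select_counter, the parameter group_bits, and the assumption that the address passed in is already the index used for counter selection are not taken from the source.

```python
def select_counter(address, group_bits=2):
    """Split an address into (group_index, counter_index).

    group_bits is the number of low-order bits used to pick the group
    (1 or 2 according to the method steps); the remaining high-order
    bits pick the counter within that group.
    """
    group_mask = (1 << group_bits) - 1
    group_index = address & group_mask      # low bits -> counter group
    counter_index = address >> group_bits   # other bits -> counter in group
    return group_index, counter_index
```

For example, with group_bits=2 the address 0b1101 maps to group 1, counter 3.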
Embodiment 2: Referring to Figures 1 and 2, the shared storage organization in a parallel processor as described in Embodiment 1, characterized in that:
The number of information channels is 4, and each information channel connects 4 parallel processors;
The number of multicast counter groups is 16, each multicast counter group contains 16 counters, and each counter is 3-4 bits wide, supporting data multicast among 6-8 threads;
The shared storage organization for information transfer occupies part of the address space of the parallel processors' data memory and is mapped to the highest-address part of that data space.
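The figures in this embodiment can be checked with a little arithmetic. The sketch below derives the processor and counter totals from the stated numbers and shows one possible way to place a region at the top of a data address space. The data-space size (64 KiB) and shared-region size (1 KiB) used in the comments are purely hypothetical, as are all function names.

```python
NUM_CHANNELS = 4             # from the embodiment
PROCESSORS_PER_CHANNEL = 4   # from the embodiment
NUM_GROUPS = 16              # from the embodiment
COUNTERS_PER_GROUP = 16      # from the embodiment

TOTAL_PROCESSORS = NUM_CHANNELS * PROCESSORS_PER_CHANNEL
TOTAL_COUNTERS = NUM_GROUPS * COUNTERS_PER_GROUP


def counter_max(bits):
    """Largest value an unsigned counter of the given width can hold."""
    return (1 << bits) - 1


def shared_base(data_space_bytes, shared_bytes):
    """Base address of a region mapped at the very top of the data space."""
    return data_space_bytes - shared_bytes


def is_shared(addr, data_space_bytes, shared_bytes):
    """True if addr falls inside the top-of-space shared region."""
    return shared_base(data_space_bytes, shared_bytes) <= addr < data_space_bytes


# Hypothetical sizes: a 64 KiB data space with a 1 KiB shared region
# would place the region at 0xFC00-0xFFFF.
```

Mapping the region at the top of the data space lets ordinary load and store instructions reach it, with a single address comparison deciding whether an access goes to the shared storage.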
Embodiment 3: Referring to Figures 3, 4, and 5, a shared storage method in a parallel processor, which uses the shared storage organization in a parallel processor described in Embodiment 1 or 2 and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
A2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is 0;
A3. If it is 0, the multicast count provided by the thread is written to the counter, the output valid signal line end is set to 1, and the data are written to the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal line end is set to 0, and this write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; the address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line end is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
B4. The output valid signal line end is set to 0, and this read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
C2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group;
C3. The multicast count provided by the thread is written to the counter, the output valid signal line end is set to 1, and the data are written to the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; the address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line end is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
D4. The output valid signal line end is set to 0, and this read operation fails.
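To see how arbitration and the counter check interact over time, the following cycle-by-cycle simulation may help. It is a simplified sketch of the blocking-mode flow only, under several assumptions that are not in the source: one request is served per cycle, a failed thread simply retries, one counter guards each word, and the function name simulate and the request tuple format are invented for the example.

```python
def simulate(requests, num_words=4, max_cycles=20):
    """Run blocking-mode requests until every thread has succeeded.

    requests: one operation per thread, either ('w', addr, data, count)
    or ('r', addr). Returns (log, results), where log is a list of
    (cycle, thread, valid) and results maps reader thread -> value read.
    """
    memory = [0] * num_words
    counters = [0] * num_words          # assumption: one counter per word
    pending = dict(enumerate(requests))
    results, log = {}, []
    n = len(requests)
    last = n - 1                        # thread 0 gets the first turn
    for cycle in range(max_cycles):
        if not pending:
            break
        # Round-robin: first pending thread after the last one served.
        thread = next((last + k) % n for k in range(1, n + 1)
                      if (last + k) % n in pending)
        last = thread
        op = pending[thread]
        if op[0] == 'w':
            _, addr, data, count = op
            valid = counters[addr] == 0         # write needs counter == 0
            if valid:
                counters[addr] = count
                memory[addr] = data
        else:
            _, addr = op
            valid = counters[addr] > 0          # read needs counter > 0
            if valid:
                counters[addr] -= 1
                results[thread] = memory[addr]
        log.append((cycle, thread, valid))
        if valid:
            del pending[thread]                 # failed threads retry later
    return log, results
```

In a run with two readers and one writer that multicasts to two threads, a reader that arrives before the writer fails once, retries, and then receives the same value as the other reader.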

Claims (3)

1. A shared storage organization in a parallel processor, characterized in that it comprises: several information channels, a read/write dual-channel memory, several multicast counter groups, and an arbiter;
Each of the information channels connects multiple parallel processors, and each parallel processor executes 2-16 threads; each information channel comprises: an address signal line end, a storage enable signal line end, a read/write select signal line end, a byte select signal line end, an input data signal line end, an output data signal line end, and an output valid signal line end; the input data signal line end comprises: an input data line end, a multicast count line end, and a blocking or non-blocking mode line end;
The read/write dual-channel memory stores and retrieves the data passed between threads; the data width of the data passed between threads is 4 bytes, and the data can be accessed by byte; the data passed between threads are 32-bit single-precision words or 64-bit double-precision words;
Each of the multicast counter groups contains n counters; the counters record multicast count information, are selected by the address signal, and are read, written, and decremented under the control of the corresponding signals;
The arbiter executes a round-robin (polling) arbitration algorithm, selecting one thread at a time to read or write the read/write dual-channel memory.
2. The shared storage organization in a parallel processor according to claim 1, characterized in that:
The number of information channels is 4, and each information channel connects 4 parallel processors;
The number of multicast counter groups is 16, each multicast counter group contains 16 counters, and each counter is 3-4 bits wide, supporting data multicast among 6-8 threads;
The shared storage organization occupies part of the address space of the parallel processors' data memory and is mapped to the highest-address part of the address space of that data memory.
3. A shared storage method in a parallel processor, which uses the shared storage organization in a parallel processor according to claim 1 or 2 and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
A2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is 0;
A3. If it is 0, the multicast count provided by the thread is written to the counter, the output valid signal line end is set to 1, and the data are written to the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal line end is set to 0, and this write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; the address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line end is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
B4. The output valid signal line end is set to 0, and this read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread;
C2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group;
C3. The multicast count provided by the thread is written to the counter, the output valid signal line end is set to 1, and the data are written to the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of the threads and outputs the address signal provided by that thread; the address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1-2 bits of the address signal select a multicast counter group, and the remaining bits of the address signal select the corresponding counter in that multicast counter group; the value of the counter is checked to see whether it is greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal line end is set to 1, and the specified bytes of the dual-channel memory are read at the address given by the thread; otherwise,
D4. The output valid signal line end is set to 0, and this read operation fails.
CN201510346816.4A 2015-06-23 2015-06-23 Shared storage organization in parallel processor and method Expired - Fee Related CN104899008B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510346816.4A CN104899008B (en) 2015-06-23 2015-06-23 Shared storage organization in parallel processor and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510346816.4A CN104899008B (en) 2015-06-23 2015-06-23 Shared storage organization in parallel processor and method

Publications (2)

Publication Number Publication Date
CN104899008A CN104899008A (en) 2015-09-09
CN104899008B CN104899008B (en) 2018-10-12

Family

ID=54031687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510346816.4A Expired - Fee Related CN104899008B (en) 2015-06-23 2015-06-23 Shared storage organization in parallel processor and method

Country Status (1)

Country Link
CN (1) CN104899008B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109845199B (en) * 2016-09-12 2022-03-04 马维尔亚洲私人有限公司 Merging read requests in a network device architecture

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1758229A (en) * 2005-10-28 2006-04-12 中国人民解放军国防科学技术大学 Local space shared memory method of heterogeneous multi-kernel microprocessor
CN1781079A (en) * 2003-06-11 2006-05-31 思科技术公司 Maintaining entity order with gate managers
US7680988B1 (en) * 2006-10-30 2010-03-16 Nvidia Corporation Single interconnect providing read and write access to a memory shared by concurrent threads
CN102622192A (en) * 2012-02-27 2012-08-01 北京理工大学 Weak correlation multiport parallel store controller
US8392891B2 (en) * 2008-06-26 2013-03-05 Microsoft Corporation Technique for finding relaxed memory model vulnerabilities

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
多核处理器Cache一致性协议关键技术研究;黄安文;《计算机工程与科学》;20090910;第104-108页 *

Also Published As

Publication number Publication date
CN104899008A (en) 2015-09-09

Similar Documents

Publication Publication Date Title
TWI425512B (en) Flash memory controller circuit and storage system and data transfer method thereof
US10789182B2 (en) System and method for individual addressing
CN105051711A (en) Methods and apparatuses for providing data received by a state machine engine
US20160246514A1 (en) Memory system
CN103019810A (en) Scheduling and management of compute tasks with different execution priority levels
CN103246625B (en) A kind of method of data and address sharing pin self-adaptative adjustment memory access granularity
CN103988212A (en) Methods and systems for routing in state machine
KR20110059712A (en) Independently controlled virtual memory devices in memory modules
US11029746B2 (en) Dynamic power management network for memory devices
EP3910488A1 (en) Systems, methods, and devices for near data processing
CN103744644A (en) Quad-core processor system built in quad-core structure and data switching method thereof
CN106951488A (en) A kind of log recording method and device
CN103890857A (en) Shiftable memory employing ring registers
CN108733580A (en) Method for scheduling read commands
CN104317770A (en) Data storage structure and data access method for multiple core processing system
CN101930407B (en) Flash memory control circuit and memory system and data transmission method thereof
CN104239232A (en) Ping-Pong cache operation structure based on DPRAM (Dual Port Random Access Memory) in FPGA (Field Programmable Gate Array)
CN101515221A (en) Method, device and system for reading data
CN104051009A (en) Gating circuit and gating method of resistive random access memory (RRAM)
CN101825997A (en) Asynchronous first-in first-out storage
CN104899008B (en) Shared storage organization in parallel processor and method
CN102789424B (en) External extended DDR2 (Double Data Rate 2) read-write method on basis of FPGA (Field Programmable Gate Array) and external extended DDR2 particle storage on basis of FPGA
CN104035897A (en) Storage controller
CN109614145A (en) A kind of processor core core structure and data access method
CN103412848A (en) Method for sharing single program memory by four-core processor system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20181012