CN104899008A - Shared storage structure and method in parallel processor - Google Patents


Info

Publication number
CN104899008A
Authority
CN
China
Prior art keywords
thread
counter
line end
multicast
write
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510346816.4A
Other languages
Chinese (zh)
Other versions
CN104899008B (en)
Inventor
李萍萍
杨新宪
张振龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yuhuacong Technology Co Ltd
Original Assignee
Beijing Yuhuacong Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yuhuacong Technology Co Ltd
Priority to CN201510346816.4A
Publication of CN104899008A
Application granted
Publication of CN104899008B
Expired - Fee Related
Anticipated expiration

Links

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Multi Processors (AREA)

Abstract

The invention belongs to the field of computer architecture and in particular relates to information storage and transmission in parallel processors. In the technical solution, the structure comprises a plurality of information channels, a plurality of multicast counter groups, and an arbiter. The structure and method provide a message-passing mechanism among the multiple threads of parallel processors and support multicast transmission of information. The memory-access mutual-exclusion check can be completed in only one clock cycle, which greatly reduces communication overhead and power consumption. The structure and method also allow parallel computation to be performed efficiently on programmable processor arrays.

Description

Shared storage structure and method in a parallel processor
Technical field
The invention belongs to the field of computer architecture, and in particular relates to data exchange and storage that use a message-passing mechanism between threads in a parallel processor.
Background
Modern computer architectures raise their processing power through multiprocessor, multithreaded parallelism, which requires threads to exchange information with low overhead. At present, message passing between threads is usually implemented with software mutual-exclusion mechanisms, which are inefficient and complex to implement.
Summary of the invention
The object of the invention is to provide a shared storage structure and method dedicated to data exchange through a message-passing mechanism between the threads of multiple processors, so that message exchange between the threads of the processors in a parallel machine is completed efficiently.
The technical solution of the invention is a shared storage structure in a parallel processor, characterized in that it comprises: several information channels, a dual-channel read-write memory, several multicast counter groups, and an arbiter;
Each of the information channels connects to multiple parallel processors, and each parallel processor executes 2 to 16 threads; each information channel includes: an address signal terminal, a memory enable signal terminal, a read/write select signal terminal, a byte select signal terminal, an input data signal terminal, an output data signal terminal, and an output valid signal terminal; the input data signal terminal comprises: an input data terminal, a multicast count terminal, and a blocking/non-blocking mode terminal;
The dual-channel read-write memory stores and retrieves the data passed between the threads; its data width is 4 bytes, it can be accessed by byte, and the data is a 32-bit single-precision word or a 64-bit double-precision word;
Each of the multicast counter groups contains n counters; each counter records multicast count information, is selected by the address signal, and has its read, write, and decrement operations controlled by corresponding signals;
The arbiter executes a round-robin arbitration algorithm, each time selecting one of the threads to read or write the dual-channel read-write memory.
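The arbitration step can be illustrated with a small software model. The following is only a sketch of a generic round-robin (polling) arbiter, not the circuit of the invention; the class name `RoundRobinArbiter`, the `grant` method, and the choice to give requester 0 priority on the first cycle are assumptions made for illustration.

```python
class RoundRobinArbiter:
    """Software model of a round-robin arbiter (illustrative assumption).

    At most one requester is granted per call. After a grant, priority
    rotates so that the granted requester becomes the lowest priority.
    """

    def __init__(self, num_requesters):
        self.num_requesters = num_requesters
        # Start so that requester 0 has the highest priority first.
        self.last_granted = num_requesters - 1

    def grant(self, requests):
        """Return the index of the granted requester, or None if idle.

        `requests` is a sequence of booleans, one per requester.
        """
        for offset in range(1, self.num_requesters + 1):
            candidate = (self.last_granted + offset) % self.num_requesters
            if requests[candidate]:
                self.last_granted = candidate
                return candidate
        return None
```

With four requesters where 0 and 2 keep requesting, successive calls grant 0, then 2, then 0 again, so no pending request is starved.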
A shared storage method in a parallel processor, which uses the shared storage structure in a parallel processor described above and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread;
A2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for 0;
A3. If it is 0, the multicast count provided by the thread is written into this counter, the output valid signal terminal is set to 1, and the data is written into the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal terminal is set to 0, and the write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for being greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal terminal is set to 1, and the specified byte of the dual-channel memory is read at the address given by the thread; otherwise,
B4. The output valid signal terminal is set to 0, and the read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread;
C2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group;
C3. The multicast count provided by the thread is written into this counter, the output valid signal terminal is set to 1, and the data is written into the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for being greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal terminal is set to 1, and the specified byte of the dual-channel memory is read at the address given by the thread; otherwise,
D4. The output valid signal terminal is set to 0, and the read operation fails.
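Operations A to D can be modeled in software to show how the multicast counter gates access. The following is a minimal sketch under stated assumptions, not the hardware of the invention: the class `SharedStore` and its method names are hypothetical, the low 2 address bits are assumed to select the group (giving 4 groups), each address is treated as one storage slot with byte selection omitted, and arbitration is assumed to have already picked the requesting thread. The returned boolean stands for the output valid signal.

```python
class SharedStore:
    """Software model of the counter-gated shared memory (illustrative)."""

    def __init__(self, group_bits=2, counters_per_group=16):
        self.group_bits = group_bits                # assumed: low 2 bits pick the group
        self.num_groups = 1 << group_bits
        self.counters_per_group = counters_per_group
        self.counters = [[0] * counters_per_group for _ in range(self.num_groups)]
        self.memory = {}                            # address -> stored data

    def _select_counter(self, address):
        if not 0 <= address < self.num_groups * self.counters_per_group:
            raise ValueError("address outside the modeled shared window")
        group = address & (self.num_groups - 1)     # lowest bits: counter group
        index = address >> self.group_bits          # remaining bits: counter in group
        return group, index

    def write(self, address, data, multicast_count, blocking=True):
        """Steps A1-A4 (blocking) or C1-C3 (non-blocking). Returns output valid."""
        group, index = self._select_counter(address)
        if blocking and self.counters[group][index] != 0:
            return False                            # earlier message not fully read yet
        self.counters[group][index] = multicast_count
        self.memory[address] = data
        return True

    def read(self, address):
        """Steps B1-B4 / D1-D4. Returns (output valid, data)."""
        group, index = self._select_counter(address)
        if self.counters[group][index] <= 0:
            return False, None                      # nothing left to read
        self.counters[group][index] -= 1
        return True, self.memory.get(address)
```

In this model a blocking write with a multicast count of 2 succeeds once, a second blocking write to the same address fails until two reads have drained the counter, and a third read fails. A non-blocking write overwrites the counter and the data regardless of the counter value.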
The invention provides a message-passing mechanism between the multiple threads of parallel processors and supports multicast transmission of information. The memory-access mutual-exclusion check usually needs only one clock cycle, which greatly reduces communication overhead and power consumption. The structure and method of the invention can also perform parallel computation efficiently on programmable processor arrays.
Brief description of the drawings
Figure 1 is a schematic diagram of the structure of the invention;
Figure 2 is a schematic diagram of the structure of a multicast counter group in the invention;
Figure 3 is a flowchart of the write operation in blocking mode in the invention;
Figure 4 is a flowchart of the read operation in blocking mode and non-blocking mode in the invention;
Figure 5 is a flowchart of the write operation in non-blocking mode in the invention.
Embodiments
Embodiment 1: Referring to Figures 1 and 2, a shared storage structure in a parallel processor, characterized in that it comprises: several information channels, a dual-channel read-write memory, several multicast counter groups, and an arbiter;
Each of the information channels connects to multiple parallel processors, and each parallel processor executes 2 to 16 threads; each information channel includes: an address signal terminal, a memory enable signal terminal, a read/write select signal terminal, a byte select signal terminal, an input data signal terminal, an output data signal terminal, and an output valid signal terminal; the input data signal terminal comprises: an input data terminal, a multicast count terminal, and a blocking/non-blocking mode terminal;
The dual-channel read-write memory stores and retrieves the data passed between the threads; its data width is 4 bytes, it can be accessed by byte, and the data is a 32-bit single-precision word or a 64-bit double-precision word;
Each of the multicast counter groups contains n counters; each counter records multicast count information, is selected by the address signal, and has its read, write, and decrement operations controlled by corresponding signals;
The arbiter executes a round-robin arbitration algorithm, each time selecting one of the threads to read or write the dual-channel read-write memory.
Embodiment 2: Referring to Figures 1 and 2, the shared storage structure in a parallel processor as described in Embodiment 1, characterized in that:
The number of information channels is 4, and each information channel connects to 4 parallel processors;
The number of multicast counter groups is 16, each multicast counter group contains 16 counters, and each counter is 3 to 4 bits wide, supporting data multicast among 6 to 8 threads;
The message-passing shared storage structure occupies part of the data storage address space of the parallel processors and is mapped to the highest-address portion of that data space.
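The figures in Embodiment 2 can be checked with a little arithmetic. The following is a hedged sketch: the helper names are hypothetical, and the assumption that each counter guards one 4-byte word (so that the shared window is 16 x 16 x 4 bytes) is not stated in the source. The size of the data space passed to `shared_window` is likewise an illustrative parameter.

```python
NUM_GROUPS = 16          # multicast counter groups (Embodiment 2)
COUNTERS_PER_GROUP = 16  # counters per group (Embodiment 2)
WORD_BYTES = 4           # data width of the shared memory

def max_multicast_count(counter_bits):
    """Largest value an unsigned counter of the given width can hold."""
    return (1 << counter_bits) - 1

def shared_window(data_space_bytes):
    """Return (base, limit) of a window at the top of the data space.

    Assumption for illustration: one 4-byte word per counter.
    """
    window_bytes = NUM_GROUPS * COUNTERS_PER_GROUP * WORD_BYTES
    return data_space_bytes - window_bytes, data_space_bytes

def is_shared_address(address, data_space_bytes):
    """True if the address falls in the assumed shared window."""
    base, limit = shared_window(data_space_bytes)
    return base <= address < limit
```

A 3-bit counter can hold counts up to 7 and a 4-bit counter up to 15. Under the one-word-per-counter assumption, a 64 KiB data space would place the 1024-byte shared window at addresses 0xFC00 through 0xFFFF.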
Embodiment 3: Referring to Figures 3, 4, and 5, a shared storage method in a parallel processor, which uses the shared storage structure in a parallel processor as described in Embodiment 1 or 2 and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread;
A2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for 0;
A3. If it is 0, the multicast count provided by the thread is written into this counter, the output valid signal terminal is set to 1, and the data is written into the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal terminal is set to 0, and the write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for being greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal terminal is set to 1, and the specified byte of the dual-channel memory is read at the address given by the thread; otherwise,
B4. The output valid signal terminal is set to 0, and the read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread;
C2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group;
C3. The multicast count provided by the thread is written into this counter, the output valid signal terminal is set to 1, and the data is written into the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for being greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal terminal is set to 1, and the specified byte of the dual-channel memory is read at the address given by the thread; otherwise,
D4. The output valid signal terminal is set to 0, and the read operation fails.

Claims (3)

1. A shared storage structure in a parallel processor, characterized in that it comprises: several information channels, a dual-channel read-write memory, several multicast counter groups, and an arbiter;
Each of the information channels connects to multiple parallel processors, and each parallel processor executes 2 to 16 threads; each information channel includes: an address signal terminal, a memory enable signal terminal, a read/write select signal terminal, a byte select signal terminal, an input data signal terminal, an output data signal terminal, and an output valid signal terminal; the input data signal terminal comprises: an input data terminal, a multicast count terminal, and a blocking/non-blocking mode terminal;
The dual-channel read-write memory stores and retrieves the data passed between the threads; its data width is 4 bytes, it can be accessed by byte, and the data is a 32-bit single-precision word or a 64-bit double-precision word;
Each of the multicast counter groups contains n counters; each counter records multicast count information, is selected by the address signal, and has its read, write, and decrement operations controlled by corresponding signals;
The arbiter executes a round-robin arbitration algorithm, each time selecting one of the threads to read or write the dual-channel read-write memory.
2. The shared storage structure in a parallel processor according to claim 1, characterized in that:
The number of information channels is 4, and each information channel connects to 4 parallel processors;
The number of multicast counter groups is 16, each multicast counter group contains 16 counters, and each counter is 3 to 4 bits wide, supporting data multicast among 6 to 8 threads;
The message-passing shared storage structure occupies part of the data storage address space of the parallel processors and is mapped to the highest-address portion of that data space.
3. A shared storage method in a parallel processor, which uses the shared storage structure in a parallel processor according to claim 1 or 2 and performs the following operations:
A. Write operation in blocking mode
A1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread;
A2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for 0;
A3. If it is 0, the multicast count provided by the thread is written into this counter, the output valid signal terminal is set to 1, and the data is written into the dual-channel memory at the address given by the thread; otherwise,
A4. The output valid signal terminal is set to 0, and the write operation fails;
B. Read operation in blocking mode
B1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
B2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for being greater than 0;
B3. If it is greater than 0, the counter value is decremented by 1, the output valid signal terminal is set to 1, and the specified byte of the dual-channel memory is read at the address given by the thread; otherwise,
B4. The output valid signal terminal is set to 0, and the read operation fails;
C. Write operation in non-blocking mode
C1. When several threads from the information channels simultaneously request to write data to the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread;
C2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group;
C3. The multicast count provided by the thread is written into this counter, the output valid signal terminal is set to 1, and the data is written into the dual-channel memory at the address given by the thread;
D. Read operation in non-blocking mode
D1. When several threads from the information channels simultaneously request to read data from the dual-channel memory, the arbiter selects one of them and outputs the address signal provided by that thread; this address signal is also used to select a byte of the dual-channel memory;
D2. The lowest 1 to 2 bits of the address signal select one multicast counter group, and the remaining bits of the address signal select the corresponding counter within that group; the value of this counter is checked for being greater than 0;
D3. If it is greater than 0, the counter value is decremented by 1, the output valid signal terminal is set to 1, and the specified byte of the dual-channel memory is read at the address given by the thread; otherwise,
D4. The output valid signal terminal is set to 0, and the read operation fails.
CN201510346816.4A 2015-06-23 2015-06-23 Shared storage organization in parallel processor and method Expired - Fee Related CN104899008B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510346816.4A CN104899008B (en) 2015-06-23 2015-06-23 Shared storage organization in parallel processor and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510346816.4A CN104899008B (en) 2015-06-23 2015-06-23 Shared storage organization in parallel processor and method

Publications (2)

Publication Number Publication Date
CN104899008A true CN104899008A (en) 2015-09-09
CN104899008B CN104899008B (en) 2018-10-12

Family

ID=54031687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510346816.4A Expired - Fee Related CN104899008B (en) 2015-06-23 2015-06-23 Shared storage organization in parallel processor and method

Country Status (1)

Country Link
CN (1) CN104899008B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1781079A (en) * 2003-06-11 2006-05-31 思科技术公司 Maintaining entity order with gate managers
CN1758229A (en) * 2005-10-28 2006-04-12 中国人民解放军国防科学技术大学 Local space shared memory method of heterogeneous multi-kernel microprocessor
US7680988B1 (en) * 2006-10-30 2010-03-16 Nvidia Corporation Single interconnect providing read and write access to a memory shared by concurrent threads
US8392891B2 (en) * 2008-06-26 2013-03-05 Microsoft Corporation Technique for finding relaxed memory model vulnerabilities
CN102622192A (en) * 2012-02-27 2012-08-01 北京理工大学 Weak correlation multiport parallel store controller

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黄安文: "多核处理器Cache一致性协议关键技术研究", 《计算机工程与科学》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109845199A (en) * 2016-09-12 2019-06-04 马维尔国际贸易有限公司 Merge the read requests in network device architecture
CN109845199B (en) * 2016-09-12 2022-03-04 马维尔亚洲私人有限公司 Merging read requests in a network device architecture

Also Published As

Publication number Publication date
CN104899008B (en) 2018-10-12


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20181012