CN103701621A - Message passing interface broadcasting method and device - Google Patents

Info

Publication number
CN103701621A
Authority
CN
China
Prior art keywords
switch
forerunner
follow
data
numbering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310670374.XA
Other languages
Chinese (zh)
Other versions
CN103701621B (en)
Inventor
熊文
喻之斌
须成忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201310670374.XA priority Critical patent/CN103701621B/en
Publication of CN103701621A publication Critical patent/CN103701621A/en
Application granted granted Critical
Publication of CN103701621B publication Critical patent/CN103701621B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a message passing interface (MPI) broadcast method and device, belonging to the field of network communication. The method comprises the following steps: determining whether a predecessor process exists on every switch; if not every switch has a predecessor process, sending the data to a process on a switch that has no predecessor process; if every switch has a predecessor process, sending the data to a process within the same switch. The method and device improve the efficiency of the broadcast algorithm.

Description

A message passing interface broadcast method and device
Technical field
The present invention relates to the field of network communication, and in particular to a message passing interface broadcast method and device.
Background art
With the development of high-performance computing, the message-passing model has become widely used as a typical parallel programming model. The Message Passing Interface (MPI) is one of the most widely used message-passing models; it is suited to developing parallel applications on heterogeneous or homogeneous clusters and offers high communication performance.
Because of network communication latency, the topology of the interconnection network has a significant impact on the performance of a high-performance computer. Different topologies differ in functional characteristics, network latency, bandwidth, hardware complexity, scalability and reliability. There are three basic interconnection schemes: (a) 2D or 3D mesh networks: the interconnection is very simple, and it is very effective for applications in which a node frequently exchanges data with its adjacent nodes. The performance of such a network depends mainly on the performance of the routers in the network. (b) Hypercube networks: the main idea of this interconnection is to reduce the number of hops between any two communicating nodes. Its scalability is poor, because the number of nodes required grows exponentially with the dimension of the hypercube. (c) Switched networks: all nodes are directly connected to one or more high-speed switches; this is a dynamic interconnection and is very fast. In large computers the topology of the interconnection network may be more complex, so routing algorithms are often used when nodes communicate, much like packet routing on an IP network. Typical algorithms include store-and-forward, virtual cut-through, circuit switching and wormhole routing. In addition, as with IP routing, deadlock, contention and message congestion are often encountered when finding a path.
Broadcast is a common communication pattern in high-performance computing and parallel programming environments, and the efficiency, readability and scalability of its algorithm greatly affect the performance of high-performance computing applications. Complex network topologies make it challenging to implement an efficient broadcast algorithm.
Existing broadcast algorithms are tree-based and can use multiple network connections simultaneously within any single beat. As shown in Fig. 1, in the first beat of the broadcast, process 0 sends the data to process 1. In the second beat, processes 0 and 1 send the data to processes 2 and 3 respectively, so two network connections are used at the same time. In the third beat, processes 0, 2, 1 and 3 send the data to processes 4, 6, 5 and 7 respectively, so four network connections are used at the same time. Network utilization doubles with each successive beat: after receiving the data, each process keeps sending it to processes that have not yet received it, until all processes have obtained the data.
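The tree-based schedule described for Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration, not code from the patent: the function name `binomial_tree_schedule` and the choice of pairing each holder `src` with `src + (number of current holders)` are assumptions chosen so that the output reproduces the three beats listed above.

```python
def binomial_tree_schedule(num_procs):
    """Return one list of (sender, receiver) pairs per beat.

    Assumes process 0 is the root and that, in each beat, every process
    that already holds the data sends to the process whose number is its
    own number plus the current count of holders.
    """
    schedule = []
    have_data = 1  # processes 0 .. have_data - 1 hold the data
    while have_data < num_procs:
        beat = [(src, src + have_data)
                for src in range(have_data)
                if src + have_data < num_procs]
        schedule.append(beat)
        have_data *= 2  # the number of holders doubles every beat
    return schedule


for beat_no, pairs in enumerate(binomial_tree_schedule(8), start=1):
    print(beat_no, pairs)
```

For eight processes the schedule has three beats with one, two and four simultaneous transfers, which is the doubling of network utilization the text describes. Note that the schedule depends only on process numbers, not on which node or switch a process runs on.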
High-performance compute nodes are generally equipped with multi-core processors, and multiple compute processes run on one compute node at the same time; communication between processes on different nodes is clearly more expensive than communication between processes on the same node. Moreover, a high-performance computing cluster may contain multiple layers of cascaded switches, with multiple compute nodes working simultaneously under each switch; communication between processes across switches is clearly more expensive than communication between processes under the same switch. The prior art, however, does not take into account the distance between processes or the topological relationship between nodes, so the efficiency of the broadcast algorithm is low.
Summary of the invention
Embodiments of the present invention provide a message passing interface broadcast method and device that improve the efficiency of the broadcast algorithm.
In a first aspect, an embodiment of the present invention provides a message passing interface broadcast method, the method comprising:
determining whether a predecessor process exists on every switch;
if not every switch has a predecessor process, sending the data to a process on a switch that has no predecessor process;
if every switch has a predecessor process, sending the data to a process within the same switch.
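The three steps above amount to a two-way decision, which can be sketched as follows. This is an illustrative sketch only: the function name `choose_targets`, the dictionary layout of `switch_procs`, and the rule of picking the lowest-numbered process on each uncovered switch are assumptions; the text does not say which process on an uncovered switch should receive the data.

```python
def choose_targets(switch_procs, holders):
    """Decide between the inter-switch and intra-switch phases.

    switch_procs: dict mapping a switch name to the process numbers under it.
    holders: set of process numbers that already hold the data
             (the processes able to act as senders).
    """
    # A switch is "uncovered" when none of its processes holds the data yet.
    uncovered = [sw for sw, procs in switch_procs.items()
                 if procs and not any(p in holders for p in procs)]
    if uncovered:
        # Assumption: target the lowest-numbered process on each such switch.
        return "inter-switch", [min(switch_procs[sw]) for sw in uncovered]
    # Every switch already has a holder: continue inside each switch.
    return "intra-switch", []


switches = {"A": [0, 1], "B": [2, 3], "C": [4, 5]}
print(choose_targets(switches, {0, 2}))
print(choose_targets(switches, {0, 2, 4}))
```

With holders on switches A and B only, the sketch selects a process on switch C; once all three switches have a holder, it switches to the intra-switch phase.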
In a second aspect, an embodiment of the present invention provides a message passing interface broadcast device, the device comprising:
a determining module, configured to determine whether a predecessor process exists on every switch;
a first sending module, configured to send the data to a process on a switch that has no predecessor process if not every switch has a predecessor process;
a second sending module, configured to send the data to a process within the same switch if every switch has a predecessor process.
The beneficial effects of the technical solution provided by the present invention are as follows:
As can be seen from the above embodiments, the broadcast algorithm takes the topological relationship between nodes into account: the data is sent first to processes on switches that have no predecessor process, and afterwards is passed between processes within each switch. The efficiency of the broadcast algorithm is thereby improved.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow diagram of an existing broadcast algorithm;
Fig. 2 is a flow chart of a first embodiment of a message passing interface broadcast method of the present invention;
Fig. 3 is a flow chart of a second embodiment of a message passing interface broadcast method of the present invention;
Fig. 4 is a schematic structural diagram of an embodiment of a message passing interface broadcast device of the present invention;
Fig. 5 is a schematic structural diagram of an embodiment of the second sending module of a message passing interface broadcast device of the present invention;
Fig. 6 is a schematic diagram of process broadcast across multiple switches;
Fig. 7 is a schematic diagram of process broadcast within a single switch.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, embodiments of the present invention are described in further detail below with reference to the drawings.
The flow of a first embodiment of a message passing interface broadcast method of the present invention is shown in Fig. 2. The method comprises:
101: Determine whether a predecessor process exists on every switch. A predecessor process is a process that sends data, and a successor process is a process that receives data. As shown in Fig. 6, three switches are connected: two sub-switches are attached to a main switch, and a number of compute nodes are also interconnected under the main switch. Within the cluster under a single switch, the compute nodes form a broadcast ring. In a concrete implementation, it is necessary to determine whether a predecessor process exists on each of the three switches.
102a: If not every switch has a predecessor process, send the data to a process on a switch that has no predecessor process. As shown in Fig. 6, if predecessor processes exist on switch A and switch B but not on switch C, the data is sent to a process on switch C.
102b: If every switch has a predecessor process, send the data to a process within the same switch. In this case the current beat number is calculated from the predecessor process number, specifically: $a_i = a_{i-1} + 1$, where $a_i$ is the current beat number and $a_{i-1}$ is the beat number of the previous beat. The successor process number is then calculated from the current beat number, specifically: $b_{i+1} = b_i + L/2^{a}$, where $b_{i+1}$ is the successor process number, $b_i$ is the predecessor process number, $L$ is the number of processes in the switch, and $a$ is the current beat number. It is then determined whether the successor process number is less than the number of processes in the switch; if it is, the data is sent to the successor process. A table maps each process number to the IP address of that process; this table is looked up to obtain the IP address of the successor process, and the data is sent to the successor process.
A schematic diagram of process broadcast within a single switch is shown in Fig. 7. The cluster comprises N nodes and the process corresponding to each node; the process numbers are 0, 1, 2, ..., N-1, where L = N, and the processes are placed clockwise around the ring in ascending order of process number. Process 0 is denoted A. The first row of the following table gives, for each beat, the successor process of process A and how it is calculated.
[Table, supplied as image BDA0000434193210000041 and not reproduced here: successor process in each beat and its calculation]
Taking the first row of the table as an example, suppose the process number of A is 0 and the total number of processes in the system is 128, so L = 128. In the first beat, A sends the data to process 64; in the second beat, A sends the data to process 32; in the third beat, A sends the data to process 16.
By default each process knows its own process number. Suppose a process has number m and received the message in beat n. After receiving the message, it passes the message on to the next process according to the algorithm in the table.
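The intra-switch rule can be checked with a small simulation. The sketch below is an interpretation rather than the patent's own code: it reads the successor formula as "predecessor number plus L divided by 2 to the power of the beat number" (which matches the 64, 32, 16 example), assumes L is a power of two so that integer division is exact, and assumes that a process which received the data in beat n starts forwarding in beat n + 1. The names `successor` and `simulate_intra_switch` are illustrative.

```python
def successor(b, L, a):
    """Receiver number for sender b in beat a, with L processes in the switch."""
    return b + L // 2 ** a


def simulate_intra_switch(L):
    """Simulate the broadcast from process 0 inside one switch.

    Returns (received_at, sends): received_at maps a process number to the
    beat in which it received the data (0 for the root); sends is a list of
    (beat, sender, receiver) tuples.
    """
    received_at = {0: 0}
    sends = []
    a = 0
    while L // 2 ** (a + 1) >= 1:
        a += 1
        # Every process that got the data in an earlier beat sends in beat a.
        for b in sorted(p for p, n in received_at.items() if n < a):
            nxt = successor(b, L, a)
            if nxt < L and nxt not in received_at:
                received_at[nxt] = a
                sends.append((a, b, nxt))
    return received_at, sends


received, sends = simulate_intra_switch(128)
print([dst for beat, src, dst in sends if src == 0][:3])
print(len(received), max(received.values()))
```

Under these assumptions, process 0 sends to 64, 32 and 16 in the first three beats, and all 128 processes hold the data after seven beats.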
In this embodiment, the broadcast algorithm takes the topological relationship between nodes into account: the data is sent first to processes on switches that have no predecessor process, and afterwards is passed between processes within each switch. The efficiency of the broadcast algorithm is thereby improved.
The flow of a second embodiment of a message passing interface broadcast method of the present invention is shown in Fig. 3. The method comprises:
201: Determine whether a predecessor process exists on every switch. This is the same as step 101 and is not repeated here.
202a: If not every switch has a predecessor process, send the data to a process on a switch that has no predecessor process. This is the same as step 102a and is not repeated here.
202b: If every switch has a predecessor process, calculate the current beat number from the predecessor process number, specifically: $a_i = a_{i-1} + 1$, where $a_i$ is the current beat number and $a_{i-1}$ is the beat number of the previous beat.
203b: Calculate the successor process number from the current beat number, specifically: $b_{i+1} = b_i + L/2^{a}$, where $b_{i+1}$ is the successor process number, $b_i$ is the predecessor process number, $L$ is the number of processes in the switch, and $a$ is the current beat number.
A schematic diagram of process broadcast within a single switch is shown in Fig. 7. The cluster comprises N nodes and the process corresponding to each node; the process numbers are 0, 1, 2, ..., N-1, where L = N, and the processes are placed clockwise around the ring in ascending order of process number. Process 0 is denoted A. The first row of the following table gives, for each beat, the successor process of process A and how it is calculated.
[Table, supplied as image BDA0000434193210000051 and not reproduced here: successor process in each beat and its calculation]
Taking the first row of the table as an example, suppose the process number of A is 0 and the total number of processes in the system is 128, so L = 128. In the first beat, A sends the data to process 64; in the second beat, A sends the data to process 32; in the third beat, A sends the data to process 16.
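The three values in this example follow directly from the successor formula when it is read as the predecessor number plus L divided by 2 raised to the beat number. That reading is an interpretation of the flattened notation in the source, adopted because it is the one consistent with the numbers 64, 32 and 16. With predecessor number 0 and L = 128:

```latex
\begin{aligned}
a = 1:&\quad b_{i+1} = 0 + 128 / 2^{1} = 64 \\
a = 2:&\quad b_{i+1} = 0 + 128 / 2^{2} = 32 \\
a = 3:&\quad b_{i+1} = 0 + 128 / 2^{3} = 16
\end{aligned}
```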
By default each process knows its own process number. Suppose a process has number m and received the message in beat n. After receiving the message, it passes the message on to the next process according to the algorithm in the table.
204b: Determine whether the successor process number is less than the number of processes in the switch. If the successor process number is greater than the number of processes in the switch, sending ends.
205b: If the successor process number is less than the number of processes in the switch, send the data to the successor process. A table maps each process number to the IP address of that process; this table is looked up to obtain the IP address of the successor process, and the data is sent to the successor process.
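Steps 202b to 205b can be combined into a single helper that computes one forwarding decision. This is a hedged sketch under several assumptions that are not in the source: the function name `next_hop`, the use of a plain dictionary as the number-to-address table, the made-up addresses of the form 10.0.0.x, the treatment of a successor number equal to L as "stop" (the text only covers "less than" and "greater than"), and the extra guard that stops when the step L // 2**beat reaches zero.

```python
def next_hop(pred_id, prev_beat, L, address_table):
    """Return (beat, target) for one forwarding decision.

    target is (successor number, address), or None when sending should stop.
    """
    beat = prev_beat + 1                 # step 202b: current beat number
    step = L // 2 ** beat
    if step == 0:                        # assumption: nothing left to send
        return beat, None
    succ_id = pred_id + step             # step 203b: receiver number
    if succ_id >= L:                     # step 204b (equality is an assumption)
        return beat, None
    # step 205b: look up the receiver's address in the mapping table
    return beat, (succ_id, address_table[succ_id])


# Hypothetical mapping table for a switch with 8 processes.
table = {i: f"10.0.0.{i + 1}" for i in range(8)}
print(next_hop(0, 0, 8, table))
print(next_hop(4, 1, 8, table))
print(next_hop(6, 0, 8, table))
```

The helper only returns the destination; actually transmitting the data (for example over a socket or through an MPI point-to-point call) is left out because the text does not specify the transport.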
In this embodiment, the broadcast algorithm takes the topological relationship between nodes into account: the data is sent first to processes on switches that have no predecessor process, and afterwards is passed between processes within each switch. The efficiency of the broadcast algorithm is thereby improved.
Corresponding to the embodiments of the message passing interface broadcast method of the present invention, the present invention also provides an embodiment of a message passing interface broadcast device.
A schematic structural diagram of a first embodiment of a message passing interface broadcast device of the present invention is shown in Fig. 4. The device 30 comprises a determining module 310, a first sending module 320 and a second sending module 330.
The determining module 310 is configured to determine whether a predecessor process exists on every switch. A predecessor process is a process that sends data, and a successor process is a process that receives data. As shown in Fig. 6, three switches are connected: two sub-switches are attached to a main switch, and a number of compute nodes are also interconnected under the main switch. Within the cluster under a single switch, the compute nodes form a broadcast ring. In a concrete implementation, it is necessary to determine whether a predecessor process exists on each of the three switches.
The first sending module 320 is configured to send the data to a process on a switch that has no predecessor process if not every switch has a predecessor process. As shown in Fig. 6, if predecessor processes exist on switch A and switch B but not on switch C, the data is sent to a process on switch C.
The second sending module 330 is configured to send the data to a process within the same switch if every switch has a predecessor process. In this case the second sending module 330 calculates the current beat number from the predecessor process number, specifically: $a_i = a_{i-1} + 1$, where $a_i$ is the current beat number and $a_{i-1}$ is the beat number of the previous beat. It then calculates the successor process number from the current beat number, specifically: $b_{i+1} = b_i + L/2^{a}$, where $b_{i+1}$ is the successor process number, $b_i$ is the predecessor process number, $L$ is the number of processes in the switch, and $a$ is the current beat number. It then determines whether the successor process number is less than the number of processes in the switch; if it is, the data is sent to the successor process. A table maps each process number to the IP address of that process; this table is looked up to obtain the IP address of the successor process, and the data is sent to the successor process.
A schematic diagram of process broadcast within a single switch is shown in Fig. 7. The cluster comprises N nodes and the process corresponding to each node; the process numbers are 0, 1, 2, ..., N-1, where L = N, and the processes are placed clockwise around the ring in ascending order of process number. Process 0 is denoted A. The first row of the following table gives, for each beat, the successor process of process A and how it is calculated.
[Table, supplied as image BDA0000434193210000071 and not reproduced here: successor process in each beat and its calculation]
Taking the first row of the table as an example, suppose the process number of A is 0 and the total number of processes in the system is 128, so L = 128. In the first beat, A sends the data to process 64; in the second beat, A sends the data to process 32; in the third beat, A sends the data to process 16.
By default each process knows its own process number. Suppose a process has number m and received the message in beat n. After receiving the message, it passes the message on to the next process according to the algorithm in the table.
The second sending module 330 comprises a first computing unit 331, a second computing unit 332 and a sending unit 333.
The first computing unit 331 is configured to calculate the current beat number from the predecessor process number, specifically: $a_i = a_{i-1} + 1$, where $a_i$ is the current beat number and $a_{i-1}$ is the beat number of the previous beat.
The second computing unit 332 is configured to calculate the successor process number from the current beat number, specifically: $b_{i+1} = b_i + L/2^{a}$, where $b_{i+1}$ is the successor process number, $b_i$ is the predecessor process number, $L$ is the number of processes in the switch, and $a$ is the current beat number.
The sending unit 333 is configured to send the data to the successor process. A table maps each process number to the IP address of that process; this table is looked up to obtain the IP address of the successor process, and the data is sent to the successor process.
Preferably, the second sending module 330 of the message passing interface broadcast device further comprises a determining unit 334, configured to determine whether the successor process number is less than the number of processes in the switch. If the successor process number is less than the number of processes in the switch, the sending unit 333 is triggered. If the successor process number is greater than the number of processes in the switch, sending ends.
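The division of the second sending module into units 331 to 334 maps naturally onto a small class with one method per unit. The sketch below is illustrative only and rests on assumptions: the class and method names are invented for this example, the successor formula is read as the predecessor number plus L divided by 2 to the power of the beat number, and the sending unit merely records (address, data) pairs in a list instead of performing real network I/O.

```python
from dataclasses import dataclass, field


@dataclass
class SecondSendingModule:
    num_procs: int        # L, the number of processes in the switch
    address_table: dict   # process number -> address (hypothetical table)
    sent: list = field(default_factory=list)

    def first_computing_unit(self, prev_beat):
        """Unit 331: current beat number from the previous one."""
        return prev_beat + 1

    def second_computing_unit(self, pred_id, beat):
        """Unit 332: receiver number from sender number and beat."""
        return pred_id + self.num_procs // 2 ** beat

    def determining_unit(self, succ_id):
        """Unit 334: is the receiver number inside the switch?"""
        return succ_id < self.num_procs

    def sending_unit(self, succ_id, data):
        """Unit 333: look up the address; here the send is only recorded."""
        self.sent.append((self.address_table[succ_id], data))

    def run(self, pred_id, prev_beat, data):
        """Chain the units; return True if a send was triggered."""
        beat = self.first_computing_unit(prev_beat)
        succ_id = self.second_computing_unit(pred_id, beat)
        if not self.determining_unit(succ_id):
            return False
        self.sending_unit(succ_id, data)
        return True


module = SecondSendingModule(8, {i: f"node{i}" for i in range(8)})
print(module.run(0, 0, b"payload"), module.sent)
```

Keeping the determining unit as a separate method mirrors the text, where it is described as an optional addition that gates the sending unit.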
In this embodiment, the broadcast algorithm takes the topological relationship between nodes into account: the data is sent first to processes on switches that have no predecessor process, and afterwards is passed between processes within each switch. The efficiency of the broadcast algorithm is thereby improved.
The numbering of the above embodiments is for description only and does not indicate the relative merit of the embodiments.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented in hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A message passing interface broadcast method, characterized in that the method comprises:
determining whether a predecessor process exists on every switch;
if not every switch has a predecessor process, sending the data to a process on a switch that has no predecessor process;
if every switch has a predecessor process, sending the data to a process within the same switch.
2. The method according to claim 1, characterized in that sending the data to a process within the same switch comprises:
calculating the current beat number from the predecessor process number;
calculating the successor process number from the current beat number;
sending the data to the successor process.
3. The method according to claim 2, characterized in that it further comprises:
determining whether the successor process number is less than the number of processes in the switch;
if the successor process number is less than the number of processes in the switch, performing the step of sending the data to the successor process.
4. The method according to claim 2, characterized in that calculating the current beat number from the predecessor process number is specifically: $a_i = a_{i-1} + 1$, where $a_i$ is the current beat number and $a_{i-1}$ is the beat number of the previous beat.
5. The method according to claim 2, characterized in that calculating the successor process number from the current beat number is specifically: $b_{i+1} = b_i + L/2^{a}$, where $b_{i+1}$ is the successor process number, $b_i$ is the predecessor process number, $L$ is the number of processes in the switch, and $a$ is the current beat number.
6. A message passing interface broadcast device, characterized in that the device comprises:
a determining module, configured to determine whether a predecessor process exists on every switch;
a first sending module, configured to send the data to a process on a switch that has no predecessor process if not every switch has a predecessor process;
a second sending module, configured to send the data to a process within the same switch if every switch has a predecessor process.
7. The device according to claim 6, characterized in that the second sending module comprises:
a first computing unit, configured to calculate the current beat number from the predecessor process number;
a second computing unit, configured to calculate the successor process number from the current beat number;
a sending unit, configured to send the data to the successor process.
8. The device according to claim 7, characterized in that it further comprises:
a determining unit, configured to determine whether the successor process number is less than the number of processes in the switch;
if the successor process number is less than the number of processes in the switch, the sending unit is triggered.
9. The device according to claim 6, characterized in that the first computing unit calculating the current beat number from the predecessor process number is specifically: $a_i = a_{i-1} + 1$, where $a_i$ is the current beat number and $a_{i-1}$ is the beat number of the previous beat.
10. The device according to claim 6, characterized in that the second computing unit calculating the successor process number from the current beat number is specifically: $b_{i+1} = b_i + L/2^{a}$, where $b_{i+1}$ is the successor process number, $b_i$ is the predecessor process number, $L$ is the number of processes in the switch, and $a$ is the current beat number.
CN201310670374.XA 2013-12-10 2013-12-10 A kind of message passing interface broadcasting method and device Active CN103701621B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310670374.XA CN103701621B (en) 2013-12-10 2013-12-10 A kind of message passing interface broadcasting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310670374.XA CN103701621B (en) 2013-12-10 2013-12-10 A kind of message passing interface broadcasting method and device

Publications (2)

Publication Number Publication Date
CN103701621A true CN103701621A (en) 2014-04-02
CN103701621B CN103701621B (en) 2017-11-24

Family

ID=50363024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310670374.XA Active CN103701621B (en) 2013-12-10 2013-12-10 A kind of message passing interface broadcasting method and device

Country Status (1)

Country Link
CN (1) CN103701621B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105094998A (en) * 2015-09-22 2015-11-25 浪潮(北京)电子信息产业有限公司 MPI communication method and system of GTC software

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060282838A1 (en) * 2005-06-08 2006-12-14 Rinku Gupta MPI-aware networking infrastructure
CN1988463A (en) * 2005-12-21 2007-06-27 国际商业机器公司 Method and system for large message broadcast
US7539989B2 (en) * 2004-10-12 2009-05-26 International Business Machines Corporation Facilitating intra-node data transfer in collective communications
CN102982008A (en) * 2012-11-01 2013-03-20 山东大学 Complicated function maximum and minimum solving method by means of parallel artificial bee colony algorithm based on computer cluster

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王丽: "网格环境下对已知拓扑型集合通信的研究", 《中国优秀博硕士学位论文全文数据库 (硕士) 信息科技辑》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105094998A (en) * 2015-09-22 2015-11-25 浪潮(北京)电子信息产业有限公司 MPI communication method and system of GTC software
CN105094998B (en) * 2015-09-22 2019-05-10 浪潮(北京)电子信息产业有限公司 A kind of the MPI communication means and system of GTC software

Also Published As

Publication number Publication date
CN103701621B (en) 2017-11-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant