CN111367665B - Parallel communication route establishing method and system - Google Patents

Parallel communication route establishing method and system

Info

Publication number
CN111367665B
Authority
CN
China
Prior art keywords
receiving end
parallel
grid point
global
processes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010127096.3A
Other languages
Chinese (zh)
Other versions
CN111367665A (en)
Inventor
刘利
于灏
孙超
李锐喆
于馨竹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202010127096.3A
Publication of CN111367665A
Priority to PCT/CN2020/126790 (WO2021169393A1)
Application granted
Publication of CN111367665B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/545 Interprogram communication where tasks reside in different layers, e.g. user- and kernel-space
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mathematical Physics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a method and a system for establishing a parallel communication route. The method comprises the following steps: the sending end and the receiving end establish global indexes of all grid points for the same grid, and each process constructs a grid point local-global mapping table based on the grid point global indexes; the sending end and the receiving end each sort, in parallel among all of their own processes, all items in the grid point local-global mapping table; the processes of the sending end and the receiving end cooperate to establish a sending end-receiving end grid point mapping relation table in parallel; the sending end and the receiving end each sort, in parallel among all of their own processes, all items of the sending end-receiving end grid point mapping relation table; and each process of the sending end and the receiving end generates, according to its local sending end-receiving end grid point mapping relation table, the communication routing relation between itself and the corresponding processes of the receiving end or the sending end. The invention can quickly establish the parallel communication routing relation among numerical programs, and is efficient and scalable in parallel.

Description

Parallel communication route establishing method and system
Technical Field
The invention relates to the technical field of numerical programs, in particular to a parallel communication route establishing method and a parallel communication route establishing system.
Background
The earth system model is an indispensable scientific tool for research on the laws of climate evolution, future climate prediction and seamless numerical prediction. It is a complex, comprehensive numerical program formed by coupling, through couplers, component models that simulate the spheres of the earth system, such as the atmosphere, land, sea ice and the like. Each component model of an earth system model is a grid-based numerical program: the region to be computed (a two-dimensional surface or a three-dimensional space) is divided into a computational grid consisting of many non-overlapping grid points (atomic regions), and the computation is carried out cooperatively over all grid points. For a given computational region, the more grid points it is divided into, the higher the simulation resolution and the larger the amount of computation. High performance computers with ever more computing nodes and processor cores offer an opportunity to accelerate many applications, including numerical programs. To achieve good acceleration on such computers, a numerical program needs to be written as an efficient parallel version. MPI (Message Passing Interface) is a widely used parallel programming library that supports parallel computation both across computing nodes and across the processor cores within one computing node. At home and abroad, most component models of earth system models have MPI-based parallel versions.
When developing a parallel version of a numerical program, different grid points of the computational grid must first be assigned to different processes so that the computation can proceed in parallel (the assignment of grid points to processes is hereinafter referred to as parallel subdivision). For example, Table 1 and Table 2 show the parallel subdivision of the same 8 × 8 grid using 4 processes and using 8 processes, respectively; the number in each grid point is the number of the process that holds it.
Table 1 Parallel subdivision of an 8 × 8 grid using 4 processes
[Table 1 is an image in the original publication: Figures BDA0002394731720000011 and BDA0002394731720000021]
Table 2 Parallel subdivision of an 8 × 8 grid using 8 processes
0 1 2 3 4 5 6 7
0 1 2 3 4 5 6 7
0 1 2 3 4 5 6 7
0 1 2 3 4 5 6 7
0 1 2 3 4 5 6 7
0 1 2 3 4 5 6 7
0 1 2 3 4 5 6 7
0 1 2 3 4 5 6 7
Because the component models of an earth system model are coupled to one another, coupling variables are transferred from one component model to another. In essence this transfer is MPI communication of the coupling variables, and it is generally implemented in two steps: 1) a parallel communication routing relation between the component models is established according to the parallel subdivisions of the same grid in the two component models; for example, process No. 1 in Table 1 needs to transfer the coupling variable data on the grid point at the upper right corner of the grid to process No. 7 in Table 2; 2) during coupling, parallel communication of multiple variables between the component models is completed according to the established communication routing relation. The parallel communication route between two parallel subdivisions usually needs to be established only once, because it remains unchanged throughout the frequent coupling between the component models.
Currently, couplers responsible for coupling the component models of earth system models, such as C-Coupler, OASIS and CPL, all use the above two steps to implement parallel communication between component models, and they construct the parallel communication routing relation with a global algorithm: a process P of one component model first acquires the global parallel subdivision information of all processes R_ALL of the other component model, and then intersects its own local parallel subdivision information with the global subdivision information of R_ALL to determine the parallel communication routing relation. Given a grid with N grid points in total and a component model with M processes, the average computational complexity per process is O(N), the average storage complexity is O(N), and global collective communication is introduced whose complexity can be as high as O(NM). This approach is therefore time-consuming and does not scale in parallel; especially when the model grid resolution is high and the number of grid points is large, it significantly slows down model start-up.
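The global approach described above can be illustrated with a minimal single-machine sketch. It is not taken from any of the named couplers; the function name `global_route` and the list `peer_owner` (the owner process, in the other component, of every global grid point index, which in practice would have to be gathered through collective communication) are assumptions made for this example, as is the row-by-row numbering of global indexes.

```python
from collections import defaultdict

def global_route(local_global_indexes, peer_owner):
    """Map each peer process to the local indexes this process exchanges with it.

    local_global_indexes: global indexes of the grid points held by this process,
        listed in local-index order (hypothetical input).
    peer_owner: list of length N; peer_owner[g] is the peer process that holds
        global grid point g. Every process needs this full O(N) list.
    """
    routes = defaultdict(list)
    for local_index, g in enumerate(local_global_indexes):
        routes[peer_owner[g]].append(local_index)
    return dict(routes)

# 8 x 8 grid; the peer component gives column c to process c (as in Table 2).
peer_owner = [g % 8 for g in range(64)]
# A hypothetical process holding the first two grid rows (global indexes 0-15).
routes = global_route(list(range(16)), peer_owner)
```

The point of the sketch is the size of `peer_owner`: every process stores and receives information about all N grid points, which is the O(N) cost the text attributes to the global algorithm.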
Disclosure of Invention
The invention provides a parallel communication route establishing method and system that can quickly establish the parallel communication routing relation among numerical programs, are efficient and scale in parallel, and allow a numerical program to start quickly, especially when its grid resolution is high and its number of grid points is large.
In a first aspect, the present invention provides a parallel communication route establishing method, configured to establish a parallel communication route relationship between multi-process parallel numerical programs for variables on the same grid, where a numerical program for sending a variable is a sending end, a numerical program for receiving a variable is a receiving end, and each process of the sending end and the receiving end stores variable values on a part of grid points; the method comprises the following steps:
the sending end and the receiving end establish global indexes of all grid points for the same grid, and all processes of the sending end and the receiving end establish a grid point local-global mapping table based on the grid point global indexes;
the sending end carries out parallel sequencing on all items in the grid point local-global mapping table among all processes of the sending end;
the receiving end carries out parallel sequencing on all items in the grid point local-global mapping table among all processes of the receiving end;
the processes of the sending end and the receiving end cooperate to establish a sending end-receiving end grid point mapping relation table in parallel;
the sending end carries out parallel sequencing on all items of the grid point mapping relation table between the sending end and the receiving end among all processes of the sending end;
the receiving end carries out parallel sequencing on all items of the grid point mapping relation table between the sending end and the receiving end among all processes of the receiving end;
and each process of the sending end and the receiving end generates a communication routing relation between each process and the process of the corresponding receiving end or the sending end according to the local sending end-receiving end grid point mapping relation table of the processes.
Further, each item in the grid point local-global mapping table is a triplet < grid point global index, process number, grid point local index >.
Furthermore, when the sending end performs parallel ordering on all items in the grid point local-global mapping table among all processes of the sending end, and when the receiving end performs parallel ordering on all items in the grid point local-global mapping table among all processes of the receiving end, the grid point global index is used as the sort key.
Furthermore, the parallel ordering of the items in the grid point local-global mapping table by the sending end among all processes of the sending end, and the parallel ordering of the items in the grid point local-global mapping table by the receiving end among all processes of the receiving end, each include:
determining, according to the total number of grid points and the number of processes, the range of grid point global indexes that each process holds after sorting;
sorting the items in the grid point local-global mapping table within each process;
sorting the items in the grid point local-global mapping table among all processes.
Furthermore, each item in the sending end-receiving end grid point mapping relation table is a quintuple < grid point global index, sending end process number, sending end grid point local index, receiving end process number, receiving end grid point local index >.
Further, the establishing a sender-receiver grid point mapping relation table in parallel includes:
according to the mesh point global index range of each process of the sending end and the receiving end, completing the exchange of mesh point local-global mapping tables between each process of the sending end and the corresponding process of the receiving end;
in each process, a transmitting end-receiving end grid point mapping relation table is constructed according to a grid point local-global mapping table of a transmitting end and a receiving end.
Furthermore, the sending end performs parallel sequencing on all items of the sending end-receiving end grid point mapping relation table among all processes of the sending end, and takes the sending end process number as a keyword;
and the receiving end carries out parallel sequencing on all the items of the grid point mapping relation table between the sending end and the receiving end among all the processes of the receiving end, and the process number of the receiving end is used as a keyword.
In a second aspect, the present invention provides a parallel communication route establishing system, configured to establish a parallel communication routing relation between multi-process parallel numerical programs for variables on the same grid, where the numerical program that sends a variable is the sending end, the numerical program that receives a variable is the receiving end, and each process of the sending end and the receiving end stores variable values on a part of the grid points; the system comprises:
the initialization module is used for establishing global indexes of all grid points for the same grid, and all processes of the sending end and the receiving end construct a grid point local-global mapping table based on the grid point global indexes;
the first parallel ordering module is used for carrying out parallel ordering on all items in the local-global mapping table of the grid point among all processes of the sending end;
the second parallel ordering module is used for carrying out parallel ordering on all items in the local-global mapping table of the grid point among all processes of the receiving end;
the parallel establishing module is used for establishing a sending end-receiving end grid point mapping relation table in parallel through cooperation between the processes of the sending end and the receiving end;
a third parallel sorting module, configured to perform parallel sorting on items in the sender-receiver grid point mapping relation table among all processes of the sender;
a fourth parallel sorting module, configured to perform parallel sorting on the items of the sending end-receiving end grid point mapping relation table among all processes of the receiving end;
and the parallel generation module is used for generating a communication routing relation between each process of the sending end and the receiving end and the process corresponding to the receiving end or the sending end according to the local sending end-receiving end grid point mapping relation table of the processes.
Furthermore, the first parallel sorting module and the second parallel sorting module each include:
the determining module is used for determining the global index range of the grid points in each process after sequencing according to the total number of the grid points and the number of processes;
the first sequencing module is used for respectively sequencing all items in the grid point local-global mapping table of each process;
and the second ordering module is used for ordering the items in the grid point local-global mapping table among all the processes.
Still further, the parallel establishing module includes:
the exchange module is used for finishing the exchange of the grid point local-global mapping table between each process of the sending end and the corresponding process of the receiving end according to the grid point global index range of each process of the sending end and the receiving end;
and the building module is used for building a transmitting end-receiving end grid point mapping relation table in each process according to the local-global mapping table of the grid points of the transmitting end and the receiving end.
The invention provides a parallel communication route establishing method. The sending end and the receiving end establish global indexes of all grid points for the same grid, and all processes of the sending end and the receiving end construct a grid point local-global mapping table based on the grid point global indexes; the sending end and the receiving end each sort, in parallel among all of their own processes, all items in the grid point local-global mapping table; the processes of the sending end and the receiving end cooperate to establish a sending end-receiving end grid point mapping relation table in parallel; the sending end and the receiving end each sort, in parallel among all of their own processes, all items of the sending end-receiving end grid point mapping relation table; and each process of the sending end and the receiving end generates, according to its local sending end-receiving end grid point mapping relation table, the communication routing relation between itself and the corresponding processes of the receiving end or the sending end. The parallel communication routing relation among numerical programs can thus be established quickly without introducing global communication; the method is efficient and scales in parallel, and in particular allows a numerical program to start quickly when its grid resolution is high and its number of grid points is large.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a flowchart of a parallel communication route establishment method according to an embodiment of the present invention;
Fig. 2 is a detailed flowchart of step S2 according to an embodiment of the present invention;
Fig. 3 is a detailed flowchart of step S4 according to an embodiment of the present invention;
Fig. 4 is a block diagram of a parallel communication route establishment system according to a second embodiment of the present invention;
Fig. 5 is a block diagram of a first parallel sort module according to a second embodiment of the present invention;
Fig. 6 is a block diagram of a parallel establishing module according to a second embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
In the prior art, establishing parallel communication routes between numerical programs generally requires global communication, so the process is inefficient and does not scale in parallel; especially when the grid resolution of the numerical model is very high and the number of grid points is large, the start-up of the numerical program is severely slowed down. Therefore, an embodiment of the present invention provides a parallel communication route establishing method, used to establish a parallel communication routing relation between numerical programs according to the parallel subdivisions of the same grid in the different numerical programs, and specifically to establish a parallel communication routing relation between multi-process parallel numerical programs for variables on the same grid, where the numerical program that sends a variable is the sending end, the numerical program that receives a variable is the receiving end, and each process of the sending end and the receiving end stores the variable values on a part of the grid points. The method can quickly establish the parallel communication routing relation among numerical programs, is efficient and scales in parallel. The numerical program may include, but is not limited to, an earth system model.
Embodiment One
Fig. 1 shows a flowchart of a parallel communication route establishment method, and as shown in fig. 1, the method includes the following steps:
Step S1, initialization: the sending end and the receiving end establish global indexes of all grid points for the same grid, and all processes of the sending end and the receiving end construct a grid point local-global mapping table based on the grid point global indexes.
It should be noted that different grid points have different global indexes. In this embodiment the sending end and the receiving end establish the global indexes of all grid points for the same grid, so a given grid point has the same global index at the sending end and at the receiving end. If a process of the sending end and a process of the receiving end store variable values of grid points with the same global index, the parallel communication routing relation includes communication between these two processes. Based on the grid point global indexes, each process of the sending end and the receiving end constructs its own grid point local-global mapping table; preferably, each item in the table is a triplet < grid point global index, process number, grid point local index >. The parallel subdivisions of the same 8 × 8 grid using 4 processes and using 8 processes, shown in Table 1 and Table 2, are again taken as the example: the subdivision in Table 1 is that of the sending end, the subdivision in Table 2 is that of the receiving end, and Table 3 and Table 4 are the grid point local-global mapping tables of the subdivisions in Table 1 and Table 2, respectively.
Table 3 Grid point local-global mapping table of the parallel subdivision of the 8 × 8 grid using 4 processes
[Table 3 is an image in the original publication: Figure BDA0002394731720000071]
After initialization, the grid point local-global mapping table of the parallel subdivision in Table 1 (8 × 8 grid, 4 processes) is obtained. The grid point global indexes are 0-63 and the process numbers are 0-3. Within each process, the grid point local indexes increase one by one from left to right and from top to bottom according to grid point position, so in Table 3 each process has local indexes 0-15.
Table 4 Grid point local-global mapping table of the parallel subdivision of the 8 × 8 grid using 8 processes
[Table 4 is an image in the original publication: Figure BDA0002394731720000081]
After initialization, the grid point local-global mapping table of the parallel subdivision in Table 2 (8 × 8 grid, 8 processes) is obtained. The grid point global indexes are 0-63 and the process numbers are 0-7. Within each process, the grid point local indexes increase one by one from left to right and from top to bottom according to grid point position, so in Table 4 each process has local indexes 0-7.
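As an illustration of step S1, the following sketch builds the triplets of the grid point local-global mapping table from a table of owner process numbers. The function name `build_mapping_tables` is hypothetical, and the sketch assumes that global indexes are numbered row by row (left to right, top to bottom); the text states only that local indexes follow that order.

```python
def build_mapping_tables(owner):
    """Return {process: [(global_index, process, local_index), ...]}.

    owner: 2-D list; owner[r][c] is the process number that holds grid point (r, c).
    Global indexes are assumed to run row by row, and local indexes follow the
    same left-to-right, top-to-bottom order within each process.
    """
    ncols = len(owner[0])
    tables = {}
    for r, row in enumerate(owner):
        for c, proc in enumerate(row):
            entries = tables.setdefault(proc, [])
            # The next free local index is the number of points seen so far.
            entries.append((r * ncols + c, proc, len(entries)))
    return tables

# Table 2 layout: every row of the 8 x 8 grid reads 0 1 2 3 4 5 6 7.
receiver_owner = [list(range(8)) for _ in range(8)]
receiver_tables = build_mapping_tables(receiver_owner)
```

Under these assumptions, receiving end process 7 starts its table with the triplet for the upper right grid point, `(7, 7, 0)`, and every process holds 8 triplets with local indexes 0-7.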
Step S2, the sending end performs parallel sorting of all items in the grid point local-global mapping table among all processes of the sending end.
Step S3, the receiving end performs parallel sorting of all items in the grid point local-global mapping table among all processes of the receiving end.
It should be understood that the execution order of steps S2 and S3 is not limited to the order given here; the order may be adjusted as needed, or the two steps may be performed simultaneously.
Specifically, both the parallel sorting performed by the sending end among all of its processes and the parallel sorting performed by the receiving end among all of its processes use the grid point global index as the sort key. Fig. 2 shows a specific flowchart of step S2; as shown in Fig. 2, step S2 may further include the following sub-steps:
Step S21, determining, according to the total number of grid points and the number of processes (of the sending end or the receiving end), the range of grid point global indexes that each process holds after sorting, while keeping the number of grid points per process nearly equal to ensure load balance.
Step S22, sorting the items in the grid point local-global mapping table within each process.
Step S23, sorting the items in the grid point local-global mapping table among all processes, following the idea of merge sort.
The specific flow of step S3 may also include the above-mentioned step S21 to step S23.
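One possible way to choose the balanced global index ranges of step S21 is sketched below. The text does not prescribe a formula, so the block distribution used here and the function names `index_range` and `target_process` are assumptions for illustration.

```python
def index_range(total_points, num_procs, proc):
    """Half-open range [start, stop) of global indexes assigned to `proc` after sorting.

    The first (total_points % num_procs) processes receive one extra grid point,
    so the range sizes differ by at most one.
    """
    base, extra = divmod(total_points, num_procs)
    start = proc * base + min(proc, extra)
    stop = start + base + (1 if proc < extra else 0)
    return start, stop

def target_process(global_index, total_points, num_procs):
    """Process whose range contains `global_index` (inverse of index_range)."""
    base, extra = divmod(total_points, num_procs)
    boundary = extra * (base + 1)  # first index held by a process without an extra point
    if global_index < boundary:
        return global_index // (base + 1)
    return extra + (global_index - boundary) // base

# 64 grid points on the 4 sending end processes: 16 consecutive indexes each.
sender_ranges = [index_range(64, 4, p) for p in range(4)]
```

With 64 grid points, the 4 sending end processes each receive 16 consecutive global indexes and the 8 receiving end processes each receive 8.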
Specifically, given a grid with N grid points and a sending end (or receiving end) with M processes, each process takes part in log(M) one-to-one communications with other processes during sorting. The average time complexity of sorting per process is O((N/M) × (log(M) + log(N/M))), the average storage complexity is O(N/M), and the average communication complexity is O((N/M) × log(M)).
Taking again the parallel subdivisions of the same 8 × 8 grid using 4 processes and using 8 processes shown in Tables 1 and 2, the items in the grid point local-global mapping tables are sorted in parallel, giving the results shown in Table 5 and Table 6, respectively.
Table 5 Sorting result of the grid point local-global mapping table shown in Table 3
[Table 5 is an image in the original publication: Figures BDA0002394731720000091 and BDA0002394731720000101]
Table 6 Sorting result of the grid point local-global mapping table shown in Table 4
[Table 6 is an image in the original publication: Figure BDA0002394731720000102]
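The sorting of steps S22 and S23 can be simulated on one machine as follows. The text says only that the sort follows the merge-sort idea with log(M) one-to-one communications per process; the bit-wise partner exchange below is one possible realization and should be read as an assumption, as should the name `parallel_sort`. Python lists stand in for MPI messages, and the sketch further assumes that M is a power of two and divides N.

```python
from heapq import merge

def parallel_sort(tables, total_points):
    """Simulate a parallel sort of (global_index, process, local_index) items.

    tables: list of per-process item lists; its length M must be a power of two
        that divides total_points in this sketch. Afterwards process p holds, in
        sorted order, the items with global index in [p * N/M, (p + 1) * N/M).
    """
    num_procs = len(tables)
    assert num_procs & (num_procs - 1) == 0, "sketch assumes a power-of-two process count"
    block = total_points // num_procs
    # Step S22: each process sorts its own items (the global index is the first field).
    held = [sorted(items) for items in tables]
    # Step S23: log2(M) rounds; in each round a process talks to exactly one partner.
    bit = num_procs >> 1
    while bit:
        outgoing = []
        for p in range(num_procs):
            # Items whose target process differs from p in this bit go to the partner.
            keep = [it for it in held[p] if (((it[0] // block) ^ p) & bit) == 0]
            send = [it for it in held[p] if (((it[0] // block) ^ p) & bit) != 0]
            held[p] = keep
            outgoing.append(send)
        for p in range(num_procs):
            # Merge the partner's already sorted items into the local sorted list.
            held[p] = list(merge(held[p], outgoing[p ^ bit]))
        bit >>= 1
    return held

# Receiving end of the running example: process p holds column p of the 8 x 8 grid.
receiver = [[(8 * r + p, p, r) for r in range(8)] for p in range(8)]
sorted_receiver = parallel_sort(receiver, 64)
```

Each round moves and merges lists of about N/M items, which is consistent with the O((N/M) × log(M)) communication cost stated above; the local sort in step S22 contributes the (N/M) × log(N/M) term.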
Step S4, the processes of the sending end and the receiving end cooperate to establish a sending end-receiving end grid point mapping relation table in parallel.
A grid point mapping relation exists when a process of the sending end and a process of the receiving end hold the same grid point. Preferably, each item in the sending end-receiving end grid point mapping relation table is a quintuple < grid point global index, sending end process number, sending end grid point local index, receiving end process number, receiving end grid point local index >.
Specifically, Fig. 3 shows a specific flowchart of step S4; as shown in Fig. 3, establishing the sending end-receiving end grid point mapping relation table in parallel may further include the following sub-steps:
Step S41, according to the grid point global index ranges of the processes of the sending end and the receiving end, the grid point local-global mapping tables are exchanged between each process of the sending end and the corresponding processes of the receiving end through point-to-point local communication.
Step S42, in each process, a sending end-receiving end grid point mapping relation table is constructed from the grid point local-global mapping tables of the sending end and the receiving end. The average time complexity, average storage complexity and average communication complexity of each process are all O(N/M).
Taking again the parallel subdivisions of the same 8 × 8 grid using 4 processes and using 8 processes shown in Tables 1 and 2, the sending end-receiving end grid point mapping relation tables are established in parallel, giving the tables shown in Table 7 and Table 8, respectively.
Table 7 Sending end-receiving end grid point mapping relation table
[Table 7 is an image in the original publication: Figure BDA0002394731720000111]
Table 8 Sending end-receiving end grid point mapping relation table
[Table 8 is an image in the original publication: Figures BDA0002394731720000112 and BDA0002394731720000121]
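Steps S41 and S42 can be sketched as a range-overlap lookup followed by a merge join on the global index. The helper names `overlapping_peers` and `join_tables` are hypothetical, the peer ranges are assumed to be equal-sized blocks, and the sample triplets are made up for illustration rather than copied from the image-only tables.

```python
def overlapping_peers(start, stop, total_points, peer_procs):
    """Peer processes whose sorted index range intersects [start, stop).

    Assumes the peer side splits the indexes into equal blocks
    (peer_procs divides total_points).
    """
    block = total_points // peer_procs
    return list(range(start // block, (stop - 1) // block + 1))

def join_tables(sender_items, receiver_items):
    """Merge-join two lists sorted by global index into quintuples.

    Each input item is (global_index, process, local_index); each output item is
    (global_index, sender_process, sender_local, receiver_process, receiver_local).
    """
    joined = []
    i = j = 0
    while i < len(sender_items) and j < len(receiver_items):
        gs, gr = sender_items[i][0], receiver_items[j][0]
        if gs == gr:
            _, sp, sl = sender_items[i]
            _, rp, rl = receiver_items[j]
            joined.append((gs, sp, sl, rp, rl))
            i += 1
            j += 1
        elif gs < gr:
            i += 1
        else:
            j += 1
    return joined

# Sending end process 0 (indexes 0-15) must exchange with receiving end processes 0 and 1.
peers = overlapping_peers(0, 16, 64, 8)
# Made-up sorted triplets sharing global indexes 1 and 2.
joined = join_tables([(0, 0, 0), (1, 0, 1), (2, 0, 2)],
                     [(1, 1, 0), (2, 2, 0), (3, 3, 0)])
```

Because both inputs are already sorted by global index, the join is a single linear pass, in line with the O(N/M) cost stated for step S42.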
And step S5, the sending end carries out parallel sequencing on all items of the grid point mapping relation table between the sending end and the receiving end among all processes of the sending end.
And step S6, the receiving end carries out parallel sequencing on all items of the grid point mapping relation table between the sending end and the receiving end among all processes of the receiving end.
It is to be understood that the execution order of the steps S5 and S6 is not limited thereto, and the execution order may be adjusted as needed or performed simultaneously.
Specifically, among all processes of the sending end (or the receiving end), all items in the sending end-receiving end grid point mapping relation table are sorted in parallel according to the process numbers and the merging sequence, so that the sending end-receiving end grid point mapping relation table of each process of the sending end (or the receiving end) only contains items related to local grid points of the process. Preferably, the sending end performs parallel sequencing on all items in the grid point mapping relation table between the sending end and the receiving end among all processes of the sending end, and the process number of the sending end is used as a keyword; and the receiving end carries out parallel sequencing on all items of the grid point mapping relation table between the sending end and the receiving end among all processes of the receiving end, and the process number of the receiving end is taken as a keyword. The average time complexity of each process sequence is O ((N/M) × log (M)), the average storage complexity is O (N/M), and the average communication complexity is O ((N/M) × log (M)).
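The effect of steps S5 and S6 can be shown with a small single-machine sketch. It uses a direct scatter rather than the merge-based exchange described above, which is a simplification that yields the same final placement; the function name `regroup_by_process` and the sample quintuples are made up for illustration.

```python
def regroup_by_process(tables, key_field, num_procs):
    """Redistribute quintuples so that process p ends up with the items whose
    field `key_field` equals p (1 = sending end process, 3 = receiving end process).

    tables: list of per-process quintuple lists
        (global, sender_proc, sender_local, receiver_proc, receiver_local).
    """
    regrouped = [[] for _ in range(num_procs)]
    for items in tables:
        for item in items:
            regrouped[item[key_field]].append(item)
    for items in regrouped:
        # Secondary order: the local index on the same side as the key.
        items.sort(key=lambda it: (it[key_field], it[key_field + 1]))
    return regrouped

# Quintuples as they might be spread over two processes after step S4.
tables = [[(0, 0, 0, 0, 0), (1, 0, 1, 1, 0)],
          [(2, 1, 0, 0, 1), (3, 0, 2, 1, 1)]]
by_sender = regroup_by_process(tables, 1, 2)
by_receiver = regroup_by_process(tables, 3, 2)
```

After regrouping, each process sees only the quintuples that mention its own process number on the chosen side, which is the precondition for building its routes locally.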
Continuing with the example of the same 8 × 8 grid, decomposed in parallel over 4 processes and over 8 processes as shown in tables 1 and 2, after the items in the sending end-receiving end grid point mapping relation table are sorted in parallel by process number using merge sort, the sorting results shown in tables 9 and 10 are obtained, respectively.
Table 9. Sender-receiver grid point mapping relation table sorting results
(table image: Figure BDA0002394731720000131)
Table 10. Sender-receiver grid point mapping relation table sorting results
(table images: Figure BDA0002394731720000132, Figure BDA0002394731720000141)
Step S7: each process of the sending end and the receiving end generates, from its local sending end-receiving end grid point mapping relation table, the communication routing relation between itself and the corresponding processes of the receiving end or the sending end.
Specifically, each process of the sending end (or the receiving end) generates the communication routing relation between itself and the corresponding processes of the receiving end (or the sending end) according to its local sending end-receiving end grid point mapping relation table.
The parallel decomposition of the same 8 × 8 grid over 4 processes and 8 processes shown in tables 1 and 2 again serves as the example. A communication routing relation is established from process No. 0 of the sending end (table 1) to processes No. 0-3 of the receiving end (table 2); under it, process No. 0 of the sending end sends the variable data on the grid points with local indexes 1 and 5 to process No. 1 of the receiving end. Likewise, a communication routing relation is established from process No. 0 of the receiving end (table 2) to processes No. 0 and No. 2 of the sending end (table 1); under it, process No. 0 of the receiving end receives, from process No. 2 of the sending end, the variable data on the grid points whose sending end local indexes are 4-7. For each process, the average time complexity of this step is O(N/M), the average storage complexity is O(N/M), and no communication is required.
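Step S7 can be sketched as a single pass over a process's local table, which is consistent with the O(N/M) time bound and with the absence of communication. The Python sketch below is an illustration under assumptions: the helper `build_routes`, the field-position constants, and the sample items (including the global indexes) are hypothetical. The sample is only arranged so that local indexes 1 and 5 go to receiving process No. 1, echoing the example above.

```python
from collections import defaultdict

# Field positions in the quintuple <global index, sending end process number,
# sending end local index, receiving end process number, receiving end local index>.
SEND_PROC, SEND_LIDX, RECV_PROC, RECV_LIDX = 1, 2, 3, 4

def build_routes(local_table, remote_proc_field, local_index_field):
    """Group the local indexes of one process by the remote process number.

    One pass over the local table; no data is exchanged with other processes.
    """
    routes = defaultdict(list)
    for entry in local_table:
        routes[entry[remote_proc_field]].append(entry[local_index_field])
    return dict(routes)

# Hypothetical local table of sending end process No. 0 after step S5.
sender0_table = [
    (10, 0, 0, 0, 2),
    (11, 0, 1, 1, 0),
    (12, 0, 4, 0, 3),
    (13, 0, 5, 1, 1),
]

# Sending side: which local grid points go to which receiving process.
send_routes = build_routes(sender0_table, RECV_PROC, SEND_LIDX)
```

A receiving process would call the same helper with `SEND_PROC` and `RECV_LIDX` to learn where the data arriving from each sending process should be stored locally.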
Embodiment two
Corresponding to the first embodiment, this embodiment provides a parallel communication route establishing system. As shown in fig. 4, the system includes:
an initialization module 1, configured to establish global indexes of all grid points for the same grid, wherein each process of the sending end and the receiving end establishes a grid point local-global mapping table based on the grid point global indexes;
a first parallel sorting module 2, configured to sort the items in the grid point local-global mapping table in parallel among all processes of the sending end;
a second parallel sorting module 3, configured to sort the items in the grid point local-global mapping table in parallel among all processes of the receiving end;
a parallel establishing module 4, configured to establish a sending end-receiving end grid point mapping relation table in parallel through cooperation between the processes of the sending end and the receiving end;
a third parallel sorting module 5, configured to perform parallel sorting on items in the sending end-receiving end grid point mapping relation table among all processes of the sending end;
a fourth parallel sorting module 6, configured to perform parallel sorting on the items in the sending end-receiving end grid point mapping relation table among all processes of the receiving end;
and the parallel generation module 7 is configured to generate a communication routing relationship between each process of the sending end and the receiving end and a process of the corresponding receiving end or the sending end according to the local sending end-receiving end grid point mapping relationship table of the processes.
It is understood that the initialization module 1 may be configured to execute the step S1 in the first embodiment, the first parallel sorting module 2 may be configured to execute the step S2 in the first embodiment, the second parallel sorting module 3 may be configured to execute the step S3 in the first embodiment, the parallel establishing module 4 may be configured to execute the step S4 in the first embodiment, the third parallel sorting module 5 may be configured to execute the step S5 in the first embodiment, the fourth parallel sorting module 6 may be configured to execute the step S6 in the first embodiment, and the parallel generating module 7 may be configured to execute the step S7 in the first embodiment.
As shown in fig. 5, the first parallel sorting module 2 may further include:
a determining module 21, configured to determine, according to the total number of grid points and the number of processes, a global index range of the grid points in each process after the ordering;
a first sorting module 22, configured to sort the items in the grid point local-global mapping table of each process separately;
a second sorting module 23, configured to sort the items in the grid point local-global mapping table among all processes.
It is understood that the determining module 21 can be used to execute the step S21 in the first embodiment, the first sorting module 22 can be used to execute the step S22 in the first embodiment, and the second sorting module 23 can be used to execute the step S23 in the first embodiment. The second parallel sorting module 3 may also comprise a determining module 21, a first sorting module 22 and a second sorting module 23.
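The three sub-modules can be illustrated with a minimal serial sketch in Python. The block partition in `index_range` (ranges as equal as possible, with any remainder given to the lowest-numbered processes) is an assumption made for this sketch, since this passage does not state how the ranges are computed. The function names and the eight-point sample are likewise hypothetical, and the loop over processes only simulates the outcome of the inter-process sort.

```python
def index_range(num_points, num_procs, rank):
    """Half-open global index range [start, stop) owned by `rank` after sorting.

    Assumed block partition: equal ranges, remainder spread over the
    lowest-numbered processes.
    """
    base, extra = divmod(num_points, num_procs)
    start = rank * base + min(rank, extra)
    return start, start + base + (1 if rank < extra else 0)

def parallel_sort_by_global_index(tables, num_points):
    """Serially simulate the sort of <global index, process number, local index> triplets."""
    num_procs = len(tables)
    # Determining module 21: target range of every process.
    ranges = [index_range(num_points, num_procs, r) for r in range(num_procs)]
    # First sorting module 22: sort inside each process (tuples compare by global index first).
    locally_sorted = [sorted(t) for t in tables]
    # Second sorting module 23: items move to the process that owns their range.
    result = []
    for lo, hi in ranges:
        owned = [e for t in locally_sorted for e in t if lo <= e[0] < hi]
        owned.sort()
        result.append(owned)
    return ranges, result

# Hypothetical decomposition of 8 grid points over 2 processes.
tables = [
    [(g, 0, i) for i, g in enumerate([6, 0, 4, 2])],
    [(g, 1, i) for i, g in enumerate([1, 7, 3, 5])],
]
ranges, sorted_tables = parallel_sort_by_global_index(tables, 8)
```

Because every process can compute all ranges from the total number of grid points and the number of processes alone, each item's destination is known without any lookup communication.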
As shown in fig. 6, the parallel establishing module 4 may further include:
an exchange module 41, configured to complete the exchange of grid point local-global mapping tables between each process of the sending end and the corresponding processes of the receiving end according to the grid point global index ranges of the processes of the sending end and the receiving end;
and a construction module 42, configured to construct, in each process, a sending end-receiving end grid point mapping relation table according to the grid point local-global mapping tables of the sending end and the receiving end.
It is understood that the exchange module 41 can be used to execute the step S41 in the first embodiment, and the construction module 42 can be used to execute the step S42 in the first embodiment.
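The exchange and construction steps can be sketched in Python as follows. The sketch assumes that both sides hold triplets sorted by global index, that exchange partners are the processes whose global index ranges overlap, and that ranges are half-open pairs. The names `overlapping_partners` and `build_mapping_entries`, the equal 16-point and 8-point ranges for the 64-point grid, and the three-point sample are all hypothetical.

```python
def overlapping_partners(my_range, other_ranges):
    """Ranks on the other side whose half-open global index range overlaps mine."""
    lo, hi = my_range
    return [rank for rank, (olo, ohi) in enumerate(other_ranges)
            if olo < hi and lo < ohi]

def build_mapping_entries(sender_triplets, receiver_triplets):
    """Join two triplet lists, both sorted by global index, into quintuples.

    Triplet:   (global index, process number, local index)
    Quintuple: (global index, sending process, sending local index,
                receiving process, receiving local index)
    """
    entries = []
    i = j = 0
    while i < len(sender_triplets) and j < len(receiver_triplets):
        s, r = sender_triplets[i], receiver_triplets[j]
        if s[0] == r[0]:
            entries.append((s[0], s[1], s[2], r[1], r[2]))
            i += 1
            j += 1
        elif s[0] < r[0]:
            i += 1  # grid point known only to the sending side so far
        else:
            j += 1  # grid point known only to the receiving side so far
    return entries

# Assumed equal ranges for 64 grid points: 4 sending and 8 receiving processes.
sender_ranges = [(16 * r, 16 * (r + 1)) for r in range(4)]
receiver_ranges = [(8 * r, 8 * (r + 1)) for r in range(8)]
partners_of_sender0 = overlapping_partners(sender_ranges[0], receiver_ranges)

# Hypothetical sorted triplets for three grid points.
entries = build_mapping_entries(
    [(0, 0, 1), (1, 1, 0), (2, 0, 3)],
    [(0, 2, 0), (1, 0, 0), (2, 1, 1)],
)
```

Since both inputs are already ordered by global index, the join is a linear two-pointer merge rather than a search per grid point.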
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A parallel communication route establishing method is characterized in that the method is used for establishing a parallel communication route relation between multi-process parallel numerical programs for variables on the same grid, the numerical program for sending the variables is a sending end, the numerical program for receiving the variables is a receiving end, and each process of the sending end and the receiving end stores variable values on partial grid points; the method comprises the following steps:
the sending end and the receiving end establish global indexes of all grid points for the same grid, each process of the sending end establishes a first grid point local-global mapping table based on the grid point global indexes, and each process of the receiving end establishes a second grid point local-global mapping table based on the grid point global indexes;
the sending end sorts all items in the first grid point local-global mapping table in parallel among all processes of the sending end;
the receiving end sorts all items in the second grid point local-global mapping table in parallel among all processes of the receiving end;
the processes of the sending end and the receiving end cooperate to establish a sending end-receiving end grid point mapping relation table in parallel;
the sending end sorts all items of the sending end-receiving end grid point mapping relation table in parallel among all processes of the sending end;
the receiving end sorts all items of the sending end-receiving end grid point mapping relation table in parallel among all processes of the receiving end;
and each process of the sending end and the receiving end generates, according to its local sending end-receiving end grid point mapping relation table, a communication routing relation between itself and the corresponding processes of the receiving end or the sending end.
2. The parallel communication route establishing method according to claim 1, wherein each item in the first grid point local-global mapping table and in the second grid point local-global mapping table is a triplet <grid point global index, process number, grid point local index>.
3. The parallel communication route establishing method according to claim 1, wherein the parallel sorting, by the sending end, of the items in the first grid point local-global mapping table among all processes of the sending end, and the parallel sorting, by the receiving end, of the items in the second grid point local-global mapping table among all processes of the receiving end, both use the grid point global index as the key.
4. The parallel communication route establishing method according to any one of claims 1 to 3, wherein the parallel sorting, by the sending end, of the items in the first grid point local-global mapping table among all processes of the sending end, and the parallel sorting, by the receiving end, of the items in the second grid point local-global mapping table among all processes of the receiving end, each comprise:
determining, according to the total number of grid points and the number of processes, the global index range of the grid points held by each process after sorting;
sorting the items in the first grid point local-global mapping table/the second grid point local-global mapping table within each process separately;
sorting the items in the first grid point local-global mapping table/the second grid point local-global mapping table among all processes.
5. The parallel communication route establishing method according to claim 1, wherein each item in the sending end-receiving end grid point mapping relation table is a quintuple <grid point global index, sending end process number, sending end grid point local index, receiving end process number, receiving end grid point local index>.
6. The parallel communication route establishing method according to claim 1, wherein establishing the sending end-receiving end grid point mapping relation table in parallel comprises:
completing, according to the grid point global index ranges of the processes of the sending end and the receiving end, the exchange of grid point local-global mapping tables between each process of the sending end and the corresponding processes of the receiving end;
and constructing, in each process, a sending end-receiving end grid point mapping relation table according to the first grid point local-global mapping table of the sending end and the second grid point local-global mapping table of the receiving end.
7. The parallel communication route establishing method according to claim 1, wherein the sending end sorts the items of the sending end-receiving end grid point mapping relation table in parallel among all processes of the sending end, using the sending end process number as the key;
and the receiving end sorts the items of the sending end-receiving end grid point mapping relation table in parallel among all processes of the receiving end, using the receiving end process number as the key.
8. A parallel communication route establishing system is characterized in that the system is used for establishing a parallel communication route relation between multi-process parallel numerical programs for variables on the same grid, the numerical program for sending the variables is a sending end, the numerical program for receiving the variables is a receiving end, and each process of the sending end and the receiving end stores variable values on a part of grid points; the system comprises:
an initialization module, configured to establish global indexes of all grid points for the same grid, wherein each process of the sending end establishes a first grid point local-global mapping table based on the grid point global indexes, and each process of the receiving end establishes a second grid point local-global mapping table based on the grid point global indexes;
a first parallel sorting module, configured to sort all items in the first grid point local-global mapping table in parallel among all processes of the sending end;
a second parallel sorting module, configured to sort all items in the second grid point local-global mapping table in parallel among all processes of the receiving end;
a parallel establishing module, configured to establish a sending end-receiving end grid point mapping relation table in parallel through cooperation between the processes of the sending end and the receiving end;
a third parallel sorting module, configured to perform parallel sorting on items in the sender-receiver grid point mapping relation table among all processes of the sender;
a fourth parallel sorting module, configured to perform parallel sorting on the items of the sending end-receiving end grid point mapping relation table among all processes of the receiving end;
and the parallel generation module is used for generating a communication routing relation between each process of the sending end and the receiving end and the process corresponding to the receiving end or the sending end according to the local sending end-receiving end grid point mapping relation table of the processes.
9. The parallel communication route establishing system of claim 8, wherein the first parallel sorting module and the second parallel sorting module each comprise:
the determining module is used for determining the global index range of the grid points in each process after sequencing according to the total number of the grid points and the number of processes;
the first sequencing module is used for respectively sequencing each item in the first grid point local-global mapping table/the second grid point local-global mapping table of each process;
and the second ordering module is used for ordering the items in the first grid point local-global mapping table/the second grid point local-global mapping table among all the processes.
10. The parallel communication route establishing system of claim 8, wherein the parallel establishing module comprises:
the exchange module is used for finishing the exchange of the grid point local-global mapping table between each process of the sending end and the corresponding process of the receiving end according to the grid point global index range of each process of the sending end and the receiving end;
and the building module is used for building a transmitting end-receiving end grid point mapping relation table in each process according to the local-global mapping table of the grid points of the transmitting end and the receiving end.
CN202010127096.3A 2020-02-28 2020-02-28 Parallel communication route establishing method and system Active CN111367665B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010127096.3A CN111367665B (en) 2020-02-28 2020-02-28 Parallel communication route establishing method and system
PCT/CN2020/126790 WO2021169393A1 (en) 2020-02-28 2020-11-05 Parallel communication routing setup method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010127096.3A CN111367665B (en) 2020-02-28 2020-02-28 Parallel communication route establishing method and system

Publications (2)

Publication Number Publication Date
CN111367665A CN111367665A (en) 2020-07-03
CN111367665B true CN111367665B (en) 2020-12-18

Family

ID=71206251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010127096.3A Active CN111367665B (en) 2020-02-28 2020-02-28 Parallel communication route establishing method and system

Country Status (2)

Country Link
CN (1) CN111367665B (en)
WO (1) WO2021169393A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111367665B (en) * 2020-02-28 2020-12-18 清华大学 Parallel communication route establishing method and system
CN113157806B (en) * 2021-04-19 2022-05-24 清华大学 Grid data distributed storage service system, method, device, equipment and medium
CN112988907B (en) * 2021-04-28 2022-01-21 北京卡普拉科技有限公司 Information adjusting method, system, electronic equipment and storage medium
CN113900808B (en) * 2021-10-09 2024-09-20 合肥工业大学 MPI parallel data structure based on arbitrary polyhedral unstructured grid
CN116319364B (en) * 2023-02-10 2024-02-09 国家海洋环境预报中心 MPI virtual graph topology communication method and system suitable for wave numerical mode

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1435030A (en) * 1999-12-10 2003-08-06 睦塞德技术公司 Method and apparatus for longest match address lookup

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2440702A1 (en) * 2000-07-25 2002-01-31 Acuo Technologies, Llc Asset communication format within a computer network
US8156421B2 (en) * 2008-06-30 2012-04-10 Yahoo! Inc. Analysis of database performance reports for graphical presentation of summary results
CN102707932B (en) * 2012-05-16 2013-07-24 清华大学 Parallel coupling method for global system mode
US9996389B2 (en) * 2014-03-11 2018-06-12 International Business Machines Corporation Dynamic optimization of workload execution based on statistical data collection and updated job profiling
CN103970580B (en) * 2014-05-05 2017-09-15 华中科技大学 A kind of data flow towards multinuclear cluster compiles optimization method
CN110764934B (en) * 2019-10-24 2020-11-27 清华大学 Parallel communication method, device and system for numerical model and storage medium
CN111367665B (en) * 2020-02-28 2020-12-18 清华大学 Parallel communication route establishing method and system

Also Published As

Publication number Publication date
CN111367665A (en) 2020-07-03
WO2021169393A1 (en) 2021-09-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant