CN108228364B - Data storage packaging method, system and medium for high-performance parallel computing


Info

Publication number
CN108228364B
Authority
CN
China
Prior art keywords
module
data storage
data
modules
receiving
Prior art date
Legal status
Active
Application number
CN201711445533.0A
Other languages
Chinese (zh)
Other versions
CN108228364A (en)
Inventor
郭力
吕计男
Current Assignee
China Academy of Aerospace Aerodynamics CAAA
Original Assignee
China Academy of Aerospace Aerodynamics CAAA
Priority date
Filing date
Publication date
Application filed by China Academy of Aerospace Aerodynamics CAAA filed Critical China Academy of Aerospace Aerodynamics CAAA
Priority to CN201711445533.0A priority Critical patent/CN108228364B/en
Publication of CN108228364A publication Critical patent/CN108228364A/en
Application granted granted Critical
Publication of CN108228364B publication Critical patent/CN108228364B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/54: Interprogram communication
    • G06F 9/543: User-generated data transfer, e.g. clipboards, dynamic data exchange [DDE], object linking and embedding [OLE]

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

The invention discloses a data storage packaging method, system and medium for high-performance parallel computing. The method comprises the following steps: judging whether the identifier of the receiving module of a given process module is equal to the identifier of its data storage module; if the two identifiers are equal, computing the local data directly; if they are not equal, copying the data to be sent from the local data, storing it in the sending array module, and sending the copied data to the receiving modules of the remaining process modules; the receiving modules of the remaining process modules receive the copied data and copy and store it into the data storage modules of their respective process modules; the copied data is then computed, and the identifier of the receiving module in the originating process module is set equal to the identifier of its data storage module. The invention reduces the number of communications and improves the efficiency of high-performance parallel computing.

Description

Data storage packaging method, system and medium for high-performance parallel computing
Technical Field
The invention belongs to the field of high-performance parallel programs for fluid dynamics, and particularly relates to a data storage packaging method, system and medium for high-performance parallel computing.
Background
With the development of computational fluid dynamics (CFD) methods, the amount of computation grows rapidly. When such problems are computed on a conventional single-CPU computer, the computation time becomes prohibitive, and the memory of a single machine cannot hold the grids and unknowns required by the problem. Performing CFD calculations on a distributed-memory high-performance parallel computer cluster has therefore become the inevitable direction of development for computational fluid dynamics. In a distributed-memory high-performance parallel computer, the data are stored on different machines. During the calculation, if one CPU needs data held by other CPUs, that data must be sent to the local machine over a network switch, i.e., by communication. The common parallel communication modes are a strongly coupled mode and a loosely coupled mode between computation and communication. In the loosely coupled mode, communication is carried out during sub-time-step updates while the data are computed concurrently; the flow field obtained with this method exhibits discontinuities at the interfaces between different CPUs, and the size of the discontinuity depends on the convergence precision of the time iteration. In the strongly coupled mode, data are communicated at every computation; the final result is identical to that of the corresponding serial program, and the data are smooth and continuous across the blocks on different CPUs. To satisfy the strong-coupling requirement, the values needed from other CPUs must be the latest values on the local CPU at every computation. At present, calculation programs adopt an ordinary data storage scheme, so data must be communicated between different CPUs in every computation step. Each communication takes a long time, which degrades the computation efficiency; moreover, after one communication the data are already up to date on the local CPU, so repeating the communication is wasteful.
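To make the cost concrete, the conventional strongly coupled exchange described above can be pictured as the following minimal sketch. It assumes an MPI halo exchange between neighbouring ranks; the neighbour ranks, buffer size, tags and the omitted compute kernel are illustrative assumptions and are not taken from the patent.

    #include <mpi.h>
    #include <vector>

    // Conventional strongly coupled loop: boundary values are exchanged with the
    // left and right neighbour on EVERY step, even when they have not changed.
    void strongly_coupled_loop(int steps, int left, int right) {
        std::vector<double> field(1000, 0.0);   // local block of the flow field
        double halo_left = 0.0, halo_right = 0.0;

        for (int n = 0; n < steps; ++n) {
            MPI_Sendrecv(&field.front(), 1, MPI_DOUBLE, left, 0,
                         &halo_right,    1, MPI_DOUBLE, right, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&field.back(),  1, MPI_DOUBLE, right, 1,
                         &halo_left,     1, MPI_DOUBLE, left, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            // ... update `field` from halo_left / halo_right (compute kernel omitted) ...
        }
    }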
Disclosure of Invention
The technical problem solved by the invention is as follows: to overcome the defects of the prior art, a data storage packaging method, system and medium for high-performance parallel computing are provided, which reduce the number of communications and improve the efficiency of high-performance parallel computing.
The purpose of the invention is achieved by the following technical scheme. According to one aspect of the present invention, there is provided a data storage packaging method for high-performance parallel computing, the method comprising the following steps. Step one: judging whether the identifier of the receiving module of one process module among a plurality of process modules is equal to the identifier of its data storage module. Step two: if the identifier of the receiving module in the process module of step one is equal to the identifier of the data storage module, computing the local data of the data storage module. Step three: if the identifier of the receiving module in the process module of step one is not equal to the identifier of the data storage module, copying the data to be sent from the local data of the data storage module in that process module and storing it in the sending array module of that process module; sending the copied data from the sending array module to the receiving modules of the remaining process modules; and computing the local data of the data storage module in the process module of step one. Step four: the receiving modules of the remaining process modules receive the copied data sent from the sending array module of the process module of step one, and the copied data held in those receiving modules is copied and stored into the data storage modules of the corresponding process modules. Step five: the copied data received by the data storage modules of the remaining process modules is computed to obtain a calculation result, and the identifier of the receiving module in the process module of step one is set equal to the identifier of the data storage module in that process module.
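Purely as an illustrative assumption, the encapsulation described above can be pictured as a small C++ structure in which each process module bundles its data storage module, sending array module, receiving module and the two identifiers compared in step one; the patent does not prescribe a concrete data layout, so every name below is hypothetical.

    #include <vector>

    // Hypothetical layout of one encapsulated process module.
    struct ProcessModule {
        std::vector<double> data_storage;    // data storage module (holds the local data)
        std::vector<double> send_array;      // sending array module (staged copies of data to be sent)
        std::vector<double> receive_buffer;  // receiving module
        int receive_id = 0;                  // identifier of the receiving module
        int storage_id = 0;                  // identifier of the data storage module

        // Step one: communication is needed only when the identifiers differ,
        // i.e. when the copies held elsewhere are no longer the latest values.
        bool identifiers_equal() const { return receive_id == storage_id; }
    };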
In the above data storage packaging method for high-performance parallel computing, in step two, the computation of the local data of the data storage module comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
In the above data storage packaging method for high-performance parallel computing, in step three, the computation of the local data of the data storage module in the process module of step one comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
In the above data storage packaging method for high-performance parallel computing, in step five, the computation of the copied data received by the data storage modules of the remaining process modules comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
In the above data storage packaging method for high-performance parallel computing, each process module comprises a receiving module, a data storage module and a sending array module.
According to another aspect of the present invention, there is provided a data storage packaging system for high-performance parallel computing, comprising: a first module, configured to judge whether the identifier of the receiving module of one process module among a plurality of process modules is equal to the identifier of its data storage module; a second module, configured to compute the local data of the data storage module if the identifier of the receiving module in the process module of the first module is equal to the identifier of the data storage module; a third module, configured to, if the identifier of the receiving module in the process module of the first module is not equal to the identifier of the data storage module, copy the data to be sent from the local data of the data storage module in that process module into its sending array module, send the copied data from the sending array module to the receiving modules of the remaining process modules, and compute the local data of the data storage module in the process module of the first module; a fourth module, configured to receive, through the receiving modules of the remaining process modules, the copied data from the sending array module of the process module of the first module, and to copy and store the copied data held in those receiving modules into the data storage modules of the corresponding process modules; and a fifth module, configured to compute the copied data received by the data storage modules of the remaining process modules to obtain a calculation result, and at the same time to set the identifier of the receiving module in the process module of the first module equal to the identifier of the data storage module in that process module.
In the data storage packaging system for high-performance parallel computing, each process module comprises a receiving module, a data storage module and a sending array module.
In the above data storage packaging system for high-performance parallel computing, the computation comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
According to yet another aspect of the invention, there are also provided one or more machine-readable media having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the method according to one or more of the above aspects of the invention.
Compared with the prior art, the invention has the following beneficial effects:
(1) by comparing the identifier of the receiving module with the identifier of the data storage module, the invention reduces the number of communications and improves the efficiency of high-performance parallel computing;
(2) the invention encapsulates the receiving module, the data storage module, the sending array module, the identifier of the receiving module and the identifier of the data storage module in the same process module, which makes them convenient to use.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 is a schematic structural diagram of the process modules provided in an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
Method embodiment
This embodiment provides a data storage packaging method for high-performance parallel computing, which comprises the following steps (an illustrative code sketch of the whole flow is given after step five):
Step one: judging whether the identifier of the receiving module of one process module among a plurality of process modules is equal to the identifier of its data storage module; the data storage module stores the local data, and each process module comprises a receiving module, a data storage module and a sending array module.
Step two: if the identifier of the receiving module in the process module of step one is equal to the identifier of the data storage module, computing the local data of the data storage module.
Step three: if the identifier of the receiving module in the process module of step one is not equal to the identifier of the data storage module, copying the data to be sent from the local data of the data storage module in that process module and storing it in the sending array module of that process module; sending the copied data from the sending array module to the receiving modules of the remaining process modules; and computing the local data of the data storage module in the process module of step one.
Step four: the receiving modules of the remaining process modules receive the copied data sent from the sending array module of the process module of step one, and the copied data held in those receiving modules is copied and stored into the data storage modules of the corresponding process modules.
Step five: the copied data received by the data storage modules of the remaining process modules is computed to obtain a calculation result, and at the same time the identifier of the receiving module in the process module of step one is set equal to the identifier of the data storage module in that process module.
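The five steps can be collected into one routine that every process calls. The sketch below is an illustrative assumption only: it builds on the hypothetical ProcessModule structure shown earlier, uses MPI as the transport with the rank owner playing the role of the process module of step one, assumes every rank allocates a block of the same size, and substitutes placeholder arithmetic for the unspecified logical or arithmetic computations; none of these choices is prescribed by the patent.

    #include <mpi.h>
    #include <vector>

    // Placeholders for the logical / arithmetic computations left open by the method.
    static void compute_local(std::vector<double>& d)    { for (double& x : d) x *= 0.5; }
    static void compute_received(std::vector<double>& d) { for (double& x : d) x += 1.0; }

    // Steps one to five for a single exchange; every rank calls this routine.
    void exchange_and_compute(ProcessModule& pm, int owner, MPI_Comm comm) {
        int rank;
        MPI_Comm_rank(comm, &rank);

        // Step one: the owner compares the two identifiers; broadcasting the single
        // flag keeps the collective calls consistent across ranks in this sketch.
        int must_send = (rank == owner && pm.receive_id != pm.storage_id) ? 1 : 0;
        MPI_Bcast(&must_send, 1, MPI_INT, owner, comm);

        if (!must_send) {
            if (rank == owner) compute_local(pm.data_storage);        // step two
            return;
        }
        if (rank == owner) {
            pm.send_array = pm.data_storage;                          // step three: copy into the sending array module
            MPI_Bcast(pm.send_array.data(), (int)pm.send_array.size(),
                      MPI_DOUBLE, owner, comm);                       // send to the remaining process modules
            compute_local(pm.data_storage);                           // the local data is still computed
            pm.receive_id = pm.storage_id;                            // step five: identifiers set equal
        } else {
            pm.receive_buffer.resize(pm.data_storage.size());         // same block size assumed on every rank
            MPI_Bcast(pm.receive_buffer.data(), (int)pm.receive_buffer.size(),
                      MPI_DOUBLE, owner, comm);                       // step four: receive the copied data
            pm.data_storage = pm.receive_buffer;                      //            and store it in the data storage module
            compute_received(pm.data_storage);                        // step five: compute on the received data
        }
    }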
Specifically, as shown in Fig. 1, there are four process modules in total, namely process module 1, process module 2, process module 3 and process module 4, and the process module of step one may be taken to be process module 1 in the figure. Each process module comprises a receiving module, a data storage module and a sending array module, and the data storage module stores the local data.
In step one, it is judged whether the identifier of the receiving module in process module 1 is equal to the identifier of the data storage module in process module 1.
In step two, if the identifier of the receiving module in process module 1 is equal to the identifier of the data storage module in process module 1, the local data of the data storage module in process module 1 is computed; the computation may be a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
In step three, if the identifier of the receiving module in process module 1 is not equal to the identifier of the data storage module in process module 1, the data to be sent is copied from the local data of the data storage module in process module 1 and stored in the sending array module of process module 1; the copied data in the sending array module of process module 1 is sent to the receiving modules of the remaining process modules, namely process module 2, process module 3 and process module 4 in the figure; and the local data of the data storage module in process module 1 is computed. It should be understood that this local computation may proceed in parallel with the other operations of step three without affecting them, and it may be a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
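The remark above that the local computation may proceed in parallel with the sending is commonly realised with non-blocking sends. The fragment below only illustrates that pattern under the same MPI assumption; the message tag, the list of destination ranks and the placeholder kernel are not taken from the patent.

    #include <mpi.h>
    #include <vector>

    // Post the copied send array to every other process module without blocking,
    // compute the local data while the messages are in flight, then wait.
    void send_and_overlap(const std::vector<double>& send_array,
                          std::vector<double>& local_data,
                          const std::vector<int>& other_ranks, MPI_Comm comm) {
        std::vector<MPI_Request> reqs(other_ranks.size());

        for (std::size_t i = 0; i < other_ranks.size(); ++i)
            MPI_Isend(send_array.data(), (int)send_array.size(), MPI_DOUBLE,
                      other_ranks[i], /*tag=*/42, comm, &reqs[i]);

        for (double& x : local_data) x += 1.0;   // placeholder for the real compute kernel

        MPI_Waitall((int)reqs.size(), reqs.data(), MPI_STATUSES_IGNORE);
    }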
In step four, the receiving module in process module 2 receives the copied data from the sending array module in process module 1 and copies and stores it into the data storage module in process module 2; the receiving modules in process module 3 and process module 4 likewise receive the copied data from the sending array module in process module 1 and copy and store it into the data storage modules in process module 3 and process module 4, respectively.
In step five, the copied data received by the data storage module in process module 2 is computed to obtain a calculation result, and the same is done for the copied data received by the data storage modules in process module 3 and process module 4; at the same time, the identifier of the receiving module in process module 1 is set equal to the identifier of the data storage module in process module 1. Each of these computations may be a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
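Seen over many iterations, the identifier mechanism is what saves communication: once step five has made the two identifiers equal, later calls take the step-two branch and no data is exchanged until the local data changes again. The loop below is a hypothetical usage sketch of the exchange_and_compute() routine shown earlier; update_flow_field() is an assumed solver step that reports whether the stored data changed.

    // Hypothetical driver loop: communication happens only on the iterations
    // where the local data has changed and the identifiers therefore differ.
    void time_loop(ProcessModule& pm, int steps) {
        for (int it = 0; it < steps; ++it) {
            bool changed = update_flow_field(pm.data_storage);  // assumed solver step
            if (changed) ++pm.storage_id;                       // copies held elsewhere are now stale
            exchange_and_compute(pm, /*owner=*/0, MPI_COMM_WORLD);
        }
    }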
Device embodiment
This embodiment also provides a data storage packaging system for high-performance parallel computing, comprising a first module, a second module, a third module, a fourth module and a fifth module. The first module is configured to judge whether the identifier of the receiving module of one process module among a plurality of process modules is equal to the identifier of its data storage module. The second module is configured to compute the local data of the data storage module if the identifier of the receiving module in the process module of the first module is equal to the identifier of the data storage module. The third module is configured to, if the identifier of the receiving module in the process module of the first module is not equal to the identifier of the data storage module, copy the data to be sent from the local data of the data storage module in that process module into its sending array module, send the copied data from the sending array module to the receiving modules of the remaining process modules, and compute the local data of the data storage module in the process module of the first module. The fourth module is configured to receive, through the receiving modules of the remaining process modules, the copied data from the sending array module of the process module of the first module, and to copy and store the copied data held in those receiving modules into the data storage modules of the corresponding process modules. The fifth module is configured to compute the copied data received by the data storage modules of the remaining process modules to obtain a calculation result, and at the same time to set the identifier of the receiving module in the process module of the first module equal to the identifier of the data storage module in that process module.
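Read as software, the five functional modules of the system map naturally onto one interface per process. The class below is only one possible reading of this structure, made for readability; the method names and the use of MPI_Comm are assumptions, not a prescribed API.

    #include <mpi.h>

    // Illustrative interface only; the correspondence of methods to the five
    // modules of the system is an assumption made for readability.
    class DataStoragePackagingSystem {
    public:
        // First module: compare the identifier of the receiving module with that of the data storage module.
        bool identifiers_equal(const ProcessModule& pm) const { return pm.receive_id == pm.storage_id; }
        // Second module: compute the local data when the identifiers are equal.
        void compute_local(ProcessModule& pm);
        // Third module: copy the data to be sent into the sending array module and send it.
        void stage_and_send(ProcessModule& pm, MPI_Comm comm);
        // Fourth module: receive the copied data into the data storage modules of the remaining process modules.
        void receive_into_storage(ProcessModule& pm, MPI_Comm comm);
        // Fifth module: compute the received data and set the identifiers equal.
        void compute_and_sync(ProcessModule& pm);
    };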
In the above embodiment, each process module includes a receiving module, a data storage module, and a sending array module.
In the above embodiments, the computation comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
This embodiment also provides one or more machine-readable media having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the method described in the method embodiment above.
Through the identifier of the receiving module and the identifier of the data storage module, this embodiment reduces the number of communications and improves the efficiency of high-performance parallel computing; in addition, this embodiment encapsulates the receiving module, the data storage module, the sending array module, the identifier of the receiving module and the identifier of the data storage module in the same process module, which makes them convenient to use.
The above-described embodiments are merely preferred embodiments of the present invention, and general changes and substitutions by those skilled in the art within the technical scope of the present invention are included in the protection scope of the present invention.

Claims (9)

1. A data storage packaging method for high-performance parallel computing, the method comprising the following steps:
step one: judging whether the identifier of the receiving module of one process module among a plurality of process modules is equal to the identifier of its data storage module;
step two: if the identifier of the receiving module in the process module of step one is equal to the identifier of the data storage module, computing the local data of the data storage module;
step three: if the identifier of the receiving module in the process module of step one is not equal to the identifier of the data storage module, copying the data to be sent from the local data of the data storage module in that process module and storing it in the sending array module of that process module; sending the copied data from the sending array module to the receiving modules of the remaining process modules; and computing the local data of the data storage module in the process module of step one;
step four: the receiving modules of the remaining process modules receive the copied data sent from the sending array module in step three, and the copied data held in those receiving modules is copied and stored into the data storage modules of the corresponding process modules;
step five: the copied data received by the data storage modules of the remaining process modules is computed, and at the same time the identifier of the receiving module in the process module of step one is set equal to the identifier of the data storage module in that process module.
2. The data storage packaging method for high-performance parallel computing according to claim 1, wherein in step two, the computation of the local data of the data storage module comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
3. The data storage packaging method for high-performance parallel computing according to claim 1, wherein in step three, the computation of the local data of the data storage module in the process module of step one comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
4. The data storage packaging method for high-performance parallel computing according to claim 1, wherein in step five, the computation of the copied data received by the data storage modules of the remaining process modules comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
5. The data storage packaging method for high-performance parallel computing according to claim 1, wherein each process module comprises a receiving module, a data storage module and a sending array module.
6. A data storage packaging system for high-performance parallel computing, comprising:
a first module, configured to judge whether the identifier of the receiving module of one process module among a plurality of process modules is equal to the identifier of its data storage module;
a second module, configured to compute the local data of the data storage module if the identifier of the receiving module in the process module of the first module is equal to the identifier of the data storage module;
a third module, configured to, if the identifier of the receiving module in the process module of the first module is not equal to the identifier of the data storage module, copy the data to be sent from the local data of the data storage module in that process module into its sending array module, send the copied data from the sending array module to the receiving modules of the remaining process modules, and compute the local data of the data storage module in the process module of the first module;
a fourth module, configured to receive, through the receiving modules of the remaining process modules, the copied data from the sending array module of the process module referred to in the third module, and to copy and store the copied data held in those receiving modules into the data storage modules of the corresponding process modules; and
a fifth module, configured to compute the copied data received by the data storage modules of the remaining process modules to obtain a calculation result, and at the same time to set the identifier of the receiving module in the process module of the first module equal to the identifier of the data storage module in that process module.
7. The data storage packaging system for high-performance parallel computing according to claim 6, wherein each process module comprises a receiving module, a data storage module and a sending array module.
8. The data storage packaging system for high-performance parallel computing according to claim 6, wherein the computation comprises a logical operation or an arithmetic operation (addition, subtraction, multiplication or division).
9. One or more machine-readable media having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the method according to one or more of claims 1-5.
CN201711445533.0A 2017-12-27 2017-12-27 Data storage packaging method, system and medium for high-performance parallel computing Active CN108228364B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711445533.0A CN108228364B (en) 2017-12-27 2017-12-27 Data storage packaging method, system and medium for high-performance parallel computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711445533.0A CN108228364B (en) 2017-12-27 2017-12-27 Data storage packaging method, system and medium for high-performance parallel computing

Publications (2)

Publication Number Publication Date
CN108228364A CN108228364A (en) 2018-06-29
CN108228364B (en) 2020-12-18

Family

ID=62648122

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711445533.0A Active CN108228364B (en) 2017-12-27 2017-12-27 Data storage packaging method, system and medium for high-performance parallel computing

Country Status (1)

Country Link
CN (1) CN108228364B (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104375806A (en) * 2014-11-19 2015-02-25 北京应用物理与计算数学研究所 Parallel computing component and method and corresponding parallel software development method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A fast parallel large-scale grid generator for parallel computing in engineering simulation; Xiaoqing Wang et al.; 2016 IEEE International Conference of Online Analysis and Computing Science (ICOACS); 2016-05-29; pp. 96-99. *
Parallel computing of multi-block structured grid splicing in the JASMIN framework and its application; Guo Hong et al.; Computer Engineering and Science; August 2012; Vol. 34, No. 8; pp. 69-74. *

Also Published As

Publication number Publication date
CN108228364A (en) 2018-06-29

Similar Documents

Publication Publication Date Title
US20190361708A1 (en) Embedded scheduling of hardware resources for hardware acceleration
US10923463B2 (en) Method and machine-readable medium for configuring processors with base dies having landing slots
US9626316B2 (en) Managing shared resources between multiple processing devices
US20040015970A1 (en) Method and system for data flow control of execution nodes of an adaptive computing engine (ACE)
CN103186404B (en) System firmware update method and the server system using the method
CN103778013A (en) Multi-channel Nand Flash controller and control method for same
CN104126179A (en) Method, apparatus, and computer program product for inter-core communication in multi-core processors
CN104583944A (en) Fast deskew when exiting low-power partial-width high speed link state
US6789183B1 (en) Apparatus and method for activation of a digital signal processor in an idle mode for interprocessor transfer of signal groups in a digital signal processing unit
CA3127840A1 (en) Handling an input/output store instruction
EP3918467A1 (en) Handling an input/output store instruction
CN108924008A (en) A kind of dual controller data communications method, device, equipment and readable storage medium storing program for executing
CN104679689A (en) Multi-core DMA (direct memory access) subsection data transmission method used for GPDSP (general purpose digital signal processor) and adopting slave counting
CN108228364B (en) Data storage packaging method, system and medium for high-performance parallel computing
US9792212B2 (en) Virtual shared cache mechanism in a processing device
US9612938B2 (en) Providing status of a processing device with periodic synchronization point in instruction tracing system
CN114706813B (en) Multi-core heterogeneous system-on-chip, asymmetric synchronization method, computing device and medium
WO2022028223A1 (en) Method and system for controlling data transmission by data flow architecture neural network chip
CN113467833A (en) Method and system for realizing risv _ v vector instruction set vselti instruction
CN108234147B (en) DMA broadcast data transmission method based on host counting in GPDSP
CN112231018A (en) Method, computing device, and computer-readable storage medium for offloading data
CN110609845A (en) Big data redundancy disaster recovery method, big data service system and query method
EP4254305A1 (en) Multi-core draw splitting
CN112232498B (en) Data processing device, integrated circuit chip, electronic equipment, board card and method
US20170337084A1 (en) Compute unit including thread dispatcher and event register and method of operating same to enable communication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant