CN114003385A - Parallelization method for improving post-processing performance

Parallelization method for improving post-processing performance

Info

Publication number
CN114003385A
CN114003385A (application CN202111287291.3A)
Authority
CN
China
Prior art keywords
file
mpi
calling
function
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111287291.3A
Other languages
Chinese (zh)
Inventor
王普勇 (Wang Puyong)
李季 (Li Ji)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Suochen Information Technology Co ltd
Original Assignee
Shanghai Suochen Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Suochen Information Technology Co ltd filed Critical Shanghai Suochen Information Technology Co ltd
Priority to CN202111287291.3A
Publication of CN114003385A
Legal status: Withdrawn

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5055 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/17 - Details of further file system functions
    • G06F16/176 - Support for shared access to files; File sharing support
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 - Interfaces specially adapted for storage systems
    • G06F3/0602 - Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 - Improving I/O performance
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 - Interfaces specially adapted for storage systems
    • G06F3/0628 - Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 - Organizing or formatting or addressing of data
    • G06F3/0643 - Management of files
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/54 - Interprogram communication
    • G06F9/544 - Buffers; Shared memory; Pipes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/54 - Interprogram communication
    • G06F9/547 - Remote procedure calls [RPC]; Web services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a parallelization method for improving post-processing performance, comprising the following steps: each process opens the file by calling the MPI file operation function; each process moves the global shared file pointer to the corresponding position in the file according to the read displacement by calling the MPI file addressing function, all processes performing I/O through a single global shared file pointer; each process reads its portion of the data in the file into a memory buffer by calling the MPI file reading function; each process closes the file and ends the I/O operation by calling the MPI file pointer closing function; the data is divided into several mutually independent sub data sets by a data partitioning algorithm; and the sub data sets are written into visualization objects in parallel and collectively, and the visualization objects are transmitted to the main process, completing efficient parallel display of the results.

Description

Parallelization method for improving post-processing performance
Technical Field
The invention is applied to post-processing software in the field of simulation software, and in particular to the parallelization of large-scale data visualization and data I/O.
Background
Physical and engineering simulation software performs simulation and numerical calculation in many fields by methods such as the finite element method, covering mechanics, fluid dynamics, electromagnetics, optics, acoustics, electrochemistry, chemical engineering, semiconductors, and other domains. The data produced by simulation software is often on the order of gigabytes to hundreds of gigabytes, and when such large results undergo visual post-processing, reading, processing, and inspecting the data become inconvenient, and the software often makes the user wait for a long time.
Current simulation analysis software is slow to read and display results on the order of tens of gigabytes, consuming considerable manpower and time. To improve the efficiency of post-processing for simulation software users, a method is therefore needed that completes post-processing quickly, reading and displaying the results so as to save the user's time.
Disclosure of Invention
Aiming at the problem that post-processing large-scale simulation result data takes too long, the invention provides a method that parallelizes post-processing on the multi-core CPUs of a high-performance computer, thereby greatly improving the speed and efficiency of post-processing.
The invention solves the technical problems through the following technical scheme:
the invention provides a parallelization method for improving post-processing performance, which is characterized by comprising the following steps of:
s1, in the I/O of the parallel MPI (information transfer interface), each process opens the file by calling the file operation function of the MPI;
s2, each process moves the global shared file pointer to the corresponding position in the file according to the read address displacement by calling the file addressing function of MPI, and the processes simultaneously use one global shared file pointer to perform I/O operation;
s3, each process reads a plurality of data in the file from the file into a memory buffer area by calling a file reading function of the MPI;
s4, each process realizes file closing and I/O operation ending by calling a file pointer closing function of the MPI;
s5, dividing the data into a plurality of independent subdata sets by using a data segmentation algorithm;
and S6, writing the sub data sets into the visual object in a parallel and aggregated manner, and transmitting the visual object to the main process to finish parallel and efficient display results.
Preferably, in step S1 the first parameter of the file operation function is the communicator (communication domain), the second is the path and name of the file to open, the third is the file-open mode, the fourth passes information to the I/O implementation as key-value pairs attached to an MPI info object, and the fifth is the global shared file pointer, a handle returned by the open operation that is used by subsequent I/O operations. Through the info object of the fourth parameter, hints such as file striping and internal buffer sizes can be passed to optimize MPI I/O.
On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.
The positive progress effects of the invention are as follows:
the method provided by the invention can improve the performance of the simulation calculation software from the parallelization point of view aiming at a large number of calculation results of the simulation calculation software, accelerate the post-processing speed of the simulation data from the aspect of combining software and hardware, and avoid that a software user consumes too much time when facing huge data in the post-processing stage, thereby improving the post-processing efficiency while ensuring the accuracy.
Drawings
FIG. 1 is a flow chart of a parallelization method for improving post-processing performance according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in FIG. 1, the present embodiment provides a parallelization method for improving post-processing performance, comprising the following steps:
in step 101, in the I/O of the parallel MPI (information transfer interface), each process opens a file by calling a file operation function of the MPI.
In step 101, a first parameter of the file operation function represents a communication domain, a second parameter represents a file path and a file name that need to be opened, a third parameter represents a file opening manner, a fourth parameter transmits information to an I/O implementation by attaching a pair of key values to an information object that declares MPI, a fifth parameter represents a global shared file pointer, is a handle returned by the I/O operation, and can be used for a subsequent I/O operation, wherein in the object of the fourth parameter, information such as file fragments and an internal buffer size can be transmitted for optimizing the I/O operation of the MPI.
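By way of a non-limiting sketch (in C, and not part of the original disclosure), step 101 may be realized with MPI_File_open of the standard MPI I/O interface; the file path and the hint keys striping_factor and cb_buffer_size below are illustrative assumptions rather than requirements of the method:

    #include <mpi.h>

    /* Sketch of step 101: open a result file for parallel reading.
     * Assumes the "file operation function" is MPI_File_open; the
     * hint values below are illustrative, not prescribed. */
    MPI_File open_result_file(const char *path)
    {
        MPI_Info info;
        MPI_File fh; /* fifth parameter: the returned file handle */

        MPI_Info_create(&info);
        /* Fourth parameter: key-value hints passed to the I/O
         * implementation, e.g. file striping and buffer size. */
        MPI_Info_set(info, "striping_factor", "4");
        MPI_Info_set(info, "cb_buffer_size", "8388608");

        /* 1st: communicator; 2nd: file path and name; 3rd: open
         * mode; 4th: info hints; 5th: handle for subsequent I/O. */
        MPI_File_open(MPI_COMM_WORLD, path, MPI_MODE_RDONLY, info, &fh);
        MPI_Info_free(&info); /* safe: the open retains its own copy */
        return fh;
    }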
In step 102, each process moves the global shared file pointer to the corresponding position in the file according to the read displacement by calling the MPI file addressing function, all processes performing I/O through a single global shared file pointer.
In step 103, each process reads its portion of the data in the file into a memory buffer by calling the MPI file reading function.
In step 104, each process closes the file and ends the I/O operation by calling the MPI file pointer closing function.
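A corresponding sketch of steps 102 to 104 follows, assuming the addressing, reading, and closing functions are the shared-file-pointer routines MPI_File_seek_shared, MPI_File_read_shared, and MPI_File_close of standard MPI I/O; the displacement, element count, and MPI_DOUBLE data type are placeholders. Note that MPI_File_seek_shared is collective, so every process of the communicator supplies the same starting displacement, after which each MPI_File_read_shared call advances the single shared pointer atomically:

    #include <mpi.h>

    /* Sketch of steps 102-104: seek, read, and close through the
     * global shared file pointer. Offset/count are placeholders. */
    void read_through_shared_pointer(MPI_File fh, MPI_Offset displacement,
                                     double *buffer, int count)
    {
        MPI_Status status;

        /* Step 102: move the global shared file pointer to the read
         * displacement (collective across the opening communicator). */
        MPI_File_seek_shared(fh, displacement, MPI_SEEK_SET);

        /* Step 103: read `count` values into the memory buffer; the
         * shared pointer advances atomically past what was read. */
        MPI_File_read_shared(fh, buffer, count, MPI_DOUBLE, &status);

        /* Step 104: close the file, ending the I/O operation. */
        MPI_File_close(&fh);
    }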
In step 105, the data is divided into several mutually independent sub data sets by a data partitioning algorithm.
During partitioning, the load is kept balanced across processes so that delays caused by waiting are reduced. Each processor is responsible for a different subset of the data and runs its own processing pipeline; in the visualization stage, once each process has obtained its result, the results of the sub-processes are transmitted to the main process for display.
In step 106, the sub data sets are written into visualization objects in parallel and collectively, and the visualization objects are transmitted to the main process, completing efficient parallel display of the results, as sketched below.
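By way of a non-limiting sketch of steps 105 and 106, the following assumes a simple contiguous block partitioning as the data partitioning algorithm and uses MPI_Gatherv to transmit the per-process results to the main process (rank 0) for display; the element type and the trivial per-element processing are placeholders for the actual visualization pipeline:

    #include <mpi.h>
    #include <stdlib.h>

    /* Sketch of steps 105-106: balanced block partitioning of the data,
     * independent per-process work, then collection on the main process. */
    void partition_process_gather(const double *data, int total, MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        /* Step 105: near-equal, mutually independent sub data sets,
         * keeping the load balanced to reduce waiting. */
        int base = total / size, rem = total % size;
        int count  = base + (rank < rem ? 1 : 0);
        int offset = rank * base + (rank < rem ? rank : rem);

        /* Each process runs its own processing pipeline on its chunk
         * (placeholder: copy the values). */
        double *local = malloc(count * sizeof *local);
        for (int i = 0; i < count; i++)
            local[i] = data[offset + i];

        /* Step 106: the main process gathers every sub-result for
         * display; counts/displacements mirror the partitioning. */
        int *counts = NULL, *displs = NULL;
        double *gathered = NULL;
        if (rank == 0) {
            counts = malloc(size * sizeof *counts);
            displs = malloc(size * sizeof *displs);
            for (int r = 0; r < size; r++) {
                counts[r] = base + (r < rem ? 1 : 0);
                displs[r] = r * base + (r < rem ? r : rem);
            }
            gathered = malloc(total * sizeof *gathered);
        }
        MPI_Gatherv(local, count, MPI_DOUBLE,
                    gathered, counts, displs, MPI_DOUBLE, 0, comm);

        free(local); free(counts); free(displs); free(gathered);
    }

A collective gather keeps the communication pattern deterministic; for very large results, writing each sub data set directly into a shared visualization object with collective MPI I/O would avoid concentrating all data on the main process.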
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (2)

1. A parallelization method for improving post-processing performance, comprising the steps of:
s1, in the I/O of the parallel MPI, each process opens the file by calling the file operation function of the MPI;
s2, each process moves the global shared file pointer to the corresponding position in the file according to the read address displacement by calling the file addressing function of MPI, and the processes simultaneously use one global shared file pointer to perform I/O operation;
s3, each process reads a plurality of data in the file from the file into a memory buffer area by calling a file reading function of the MPI;
s4, each process realizes file closing and I/O operation ending by calling a file pointer closing function of the MPI;
s5, dividing the data into a plurality of independent subdata sets by using a data segmentation algorithm;
and S6, writing the sub data sets into the visual object in a parallel and aggregated manner, and transmitting the visual object to the main process to finish parallel and efficient display results.
2. The parallelization method for improving post-processing performance according to claim 1, wherein in step S1 the first parameter of the file operation function is the communicator (communication domain), the second is the path and name of the file to open, the third is the file-open mode, the fourth passes information to the I/O implementation as key-value pairs attached to an MPI info object, and the fifth is the global shared file pointer, a handle returned by the open operation that is used by subsequent I/O operations; through the info object of the fourth parameter, hints such as file striping and internal buffer sizes can be passed to optimize MPI I/O.
CN202111287291.3A (priority and filing date 2021-11-02): Parallelization method for improving post-processing performance, published as CN114003385A, status Withdrawn

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111287291.3A | 2021-11-02 | 2021-11-02 | Parallelization method for improving post-processing performance

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202111287291.3A | 2021-11-02 | 2021-11-02 | Parallelization method for improving post-processing performance

Publications (1)

Publication Number | Publication Date
CN114003385A | 2022-02-01

Family

ID=79926309

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202111287291.3A | Parallelization method for improving post-processing performance | 2021-11-02 | 2021-11-02

Country Status (1)

Country | Link
CN | CN114003385A

Cited By (2)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
CN116185662A * | 2023-02-14 | 2023-05-30 | 国家海洋环境预报中心 (National Marine Environmental Forecasting Center) | Asynchronous parallel I/O method based on NetCDF and non-blocking communication
CN116185662B * | 2023-02-14 | 2023-11-17 | 国家海洋环境预报中心 (National Marine Environmental Forecasting Center) | Asynchronous parallel I/O method based on NetCDF and non-blocking communication

Similar Documents

Publication Publication Date Title
US11023206B2 (en) Dot product calculators and methods of operating the same
EP3144805B1 (en) Method and processing apparatus for performing arithmetic operation
Wang et al. Workload analysis and efficient OpenCL-based implementation of SIFT algorithm on a smartphone
WO2017185393A1 (en) Apparatus and method for executing inner product operation of vectors
KR102371844B1 (en) Computing method applied to artificial intelligence chip, and artificial intelligence chip
WO2019019926A1 (en) System parameter optimization method, apparatus and device, and readable medium
CN115880132A (en) Graphics processor, matrix multiplication task processing method, device and storage medium
CN111651206A (en) Device and method for executing vector outer product operation
CN114003385A (en) Parallelization method for improving post-processing performance
CN114048816B (en) Method, device, equipment and storage medium for sampling data of graph neural network
Xiao et al. Image Sobel edge extraction algorithm accelerated by OpenCL
CN113538687B (en) Finite element visualization method, system, device and storage medium based on VTK
US9032405B2 (en) Systems and method for assigning executable functions to available processors in a multiprocessing environment
Tan et al. Parallel particle swarm optimization algorithm based on graphic processing units
Sun et al. Efficient knowledge graph embedding training framework with multiple gpus
CN106708499B (en) Analysis method and analysis system of drawing processing program
He et al. An optimal parallel implementation of Markov Clustering based on the coordination of CPU and GPU
CN116579914B (en) Execution method and device of graphic processor engine, electronic equipment and storage medium
Kłopotek et al. Solving systems of polynomial equations on a GPU
US11630667B2 (en) Dedicated vector sub-processor system
Koprawi Parallel Computation in Uncompressed Digital Images Using Computer Unified Device Architecture and Open Computing Language
CN112328960B (en) Optimization method and device for data operation, electronic equipment and storage medium
CN117349190A (en) Memory allocation method, accelerator, and storage medium
CN117556273A (en) Method and device for calculating contrast loss through multiple graphic processors
CN118051264A (en) Matrix processing method and device, electronic equipment and storage medium

Legal Events

Code | Title / Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
WW01 | Invention patent application withdrawn after publication (application publication date: 20220201)