CN113835869A - Load balancing method and device based on MPI, computer equipment and storage medium - Google Patents

Publication number
CN113835869A
Authority
CN
China
Prior art keywords
time
seismic data
node
mpi
load balancing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010580088.4A
Other languages
Chinese (zh)
Other versions
CN113835869B (en)
Inventor
杨尚琴
洪承煜
黄少华
许自龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Petroleum and Chemical Corp
Sinopec Geophysical Research Institute
Original Assignee
China Petroleum and Chemical Corp
Sinopec Geophysical Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Petroleum and Chemical Corp, Sinopec Geophysical Research Institute
Priority to CN202010580088.4A
Publication of CN113835869A
Application granted
Publication of CN113835869B
Legal status: Active
Anticipated expiration

Classifications

    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals, considering the load
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G06F2209/547 Messaging middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention provides an MPI-based load balancing method and device, computer equipment and a storage medium. The method comprises: obtaining a first time, a second time, a third time and a fourth time; summing the third time and the fourth time to obtain a fifth time; judging whether the first time is less than or equal to the fifth time; when the first time is less than or equal to the fifth time, judging whether the fifth time is greater than the second time; and when the fifth time is greater than the second time, adjusting the computing nodes or the number of bytes of seismic data transmitted each time, on the premise that the first time remains less than or equal to the fifth time, so that the difference between the fifth time and the second time reaches its minimum value. Adjusting the computing nodes or the number of bytes transmitted each time causes more seismic data files to be read from the storage device, which matches the seismic data processing capacity of the computing unit, achieves the optimal load balance state, and improves seismic data processing and transmission efficiency.

Description

Load balancing method and device based on MPI, computer equipment and storage medium
Technical Field
The invention relates to the technical field of petroleum seismic exploration, in particular to a load balancing method and device based on MPI, computer equipment and a storage medium.
Background
Currently, in data-intensive and computation-intensive high-performance computing on seismic data, MPI (Message Passing Interface) is a commonly used and efficient way to transmit seismic data between nodes. However, MPI itself provides no load balancing function, so load balancing in MPI programs is left to the user, which is difficult for the specialists who develop processing methods. Load balancing is an important technique in clustered parallel computing systems: it improves the performance of the cluster by balancing the system load among nodes connected by a high-speed network. Existing research shows that adopting load balancing in a cluster system can significantly improve system performance. In most current computing clusters, reading from and writing to the seismic data storage device is often the bottleneck of the whole computation. In particular, when multiple write requests to the storage device are not load balanced, I/O efficiency drops severalfold, that is, seismic data transmission efficiency is reduced.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an MPI-based load balancing method, apparatus, computer device, and storage medium.
An MPI-based load balancing method comprising:
acquiring a first time, wherein the first time is the total time required for n nodes to read seismic data with a specified size from a storage device, and n is greater than 1;
acquiring a second time, wherein the second time is the time required for transmitting the seismic data from node 0 or the storage device to node n;
acquiring a third time, wherein the third time is the time required for transmitting the seismic data from a node m to node 0, m is the node among the n nodes that first receives the seismic data and starts computing, and n > m > 1;
acquiring a fourth time, wherein the fourth time is the time required for the computing unit to complete one round of seismic data processing and computation;
summing the third time and the fourth time to obtain a fifth time;
judging whether the first time is less than or equal to the fifth time;
when the first time is less than or equal to the fifth time, judging whether the fifth time is greater than the second time;
when the fifth time is greater than the second time, adjusting the computing nodes or the number of bytes of seismic data transmitted each time, on the premise that the first time is less than or equal to the fifth time, so that the difference between the fifth time and the second time reaches its minimum value, thereby achieving load balance.
In one embodiment, after the step of determining whether the first time is less than or equal to the fifth time, the method further includes:
when the first time is greater than the fifth time, determining the optimal buffer size of the seismic data node by testing with different numbers of MPI processes;
and adjusting the buffer size of the seismic data node according to the optimal buffer size of the seismic data node.
In one embodiment, after the step of adjusting the buffer size of the seismic data node according to the optimal buffer size of the seismic data node, the method further comprises:
adjusting the small-granularity parallel efficiency to achieve load balance.
In one embodiment, after the step of determining whether the fifth time is greater than the second time, the method further includes:
when the fifth time is less than the second time, determining the optimal buffer size of the computing unit by testing with different numbers of MPI processes;
adjusting the buffer size of the computing node according to the optimal buffer size of the computing unit;
and adjusting the small-granularity parallel efficiency to achieve load balance.
In one embodiment, the step of acquiring a first time, the first time being the total time required for n nodes to read seismic data of a specified size from a storage device, where n > 1, comprises:
performing multiple read operations on the seismic data to obtain the seismic data reading speed;
applying a seismic data reading analysis algorithm to the seismic data reading speed to obtain the average time a node takes to read data from the storage device;
and obtaining the first time from the number of nodes n and the average time.
In one embodiment, the step of performing a plurality of reading operations on the seismic data to obtain a seismic data reading speed includes:
and performing multiple reading operations on the seismic data in the segy format to obtain the reading speed of the seismic data.
In one embodiment, the step of acquiring a second time, the second time being the time required for transmitting the seismic data from node 0 or the storage device to node n, comprises:
measuring, multiple times, the time required for transmitting the seismic data from node 0 or the storage device to node n, and taking a weighted average of the measured times to obtain the second time.
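The weighted average of repeated transmission measurements can be sketched as follows. This is a minimal illustration only: the function name `weighted_average` and the choice of weights are assumptions, since the text does not say how the measurements are weighted. With no weights supplied, the sketch falls back to a plain mean.

```python
def weighted_average(times, weights=None):
    """Weighted mean of repeated transmission-time measurements (seconds).

    `weights` is a hypothetical parameter: the source does not specify a
    weighting scheme, so equal weights are used when none are given.
    """
    if not times:
        raise ValueError("at least one measurement is required")
    if weights is None:
        weights = [1.0] * len(times)
    if len(weights) != len(times):
        raise ValueError("times and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(t * w for t, w in zip(times, weights)) / total_weight


# Three measured transfer times, with later measurements trusted more.
second_time = weighted_average([0.012, 0.010, 0.011], weights=[1.0, 2.0, 3.0])
```

Weighting later measurements more heavily is one possible way to discount warm-up effects; it is shown here only as an example.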
In one embodiment, an MPI-based load balancing apparatus includes:
a first time acquisition module, configured to acquire a first time, where the first time is a total time required for n nodes to read seismic data of a specified size from a storage device, where n > 1;
a second time obtaining module, configured to obtain a second time, where the second time is a time required for transmitting the seismic data from node 0 or the storage device to node n;
a third time obtaining module, configured to obtain a third time, where the third time is the time required for transmitting the seismic data from a node m to node 0, m is the node among the n nodes that first receives the seismic data and starts computing, and n > m > 1;
a fourth time obtaining module, configured to obtain a fourth time, where the fourth time is the time required for the computing unit to complete one round of seismic data processing and computation;
a fifth time obtaining module, configured to sum the third time and the fourth time to obtain a fifth time;
the first judging module is used for judging whether the first time is less than or equal to the fifth time;
the second judging module is used for judging whether the fifth time is greater than the second time or not when the first time is less than or equal to the fifth time;
and a load balancing module, configured to, when the fifth time is greater than the second time, adjust the computing nodes or the number of bytes of seismic data transmitted each time, on the premise that the first time is less than or equal to the fifth time, so that the difference between the fifth time and the second time reaches its minimum value, thereby achieving load balance.
In one embodiment, a computer device comprises a memory storing a computer program and a processor implementing the steps of the method of any of the above embodiments when the processor executes the computer program.
In one of the embodiments, a computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of any of the above embodiments.
According to the above MPI-based load balancing method and device, computer equipment and storage medium, it is judged whether the first time is less than the fifth time. If so, the capacity of the computing unit to receive and process seismic data exceeds the quantity of seismic data files read from the storage device; that is, the producer produces less seismic data than the consumer can consume. In this case, more seismic data files are read from the storage device by adjusting the computing nodes or the number of bytes of seismic data transmitted each time, which matches the processing capacity of the computing unit, achieves the optimal load balance state, and improves seismic data processing and transmission efficiency as well as system performance.
Drawings
FIG. 1 is a schematic flow chart of an MPI-based load balancing method according to an embodiment of the present invention;
FIG. 2 is a schematic flow diagram of the cluster processing time requirements for single-node seismic data read/write management and multi-node computation in the present invention;
FIG. 3 is a schematic flow chart of an MPI-based load balancing method according to another embodiment of the present invention;
FIG. 4 is an internal structural diagram of a computer device in one embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
For example, an MPI-based load balancing method is provided, including:
acquiring a first time, wherein the first time is the total time required for n nodes to read seismic data of a specified size from a storage device, and n is greater than 1;
acquiring a second time, wherein the second time is the time required for transmitting the seismic data from node 0 or the storage device to node n;
acquiring a third time, wherein the third time is the time required for transmitting the seismic data from a node m to node 0, m is the node among the n nodes that first receives the seismic data and starts computing, and n > m > 1;
acquiring a fourth time, wherein the fourth time is the time required for the computing unit to complete one round of seismic data processing and computation;
summing the third time and the fourth time to obtain a fifth time;
judging whether the first time is less than or equal to the fifth time;
when the first time is less than or equal to the fifth time, judging whether the fifth time is greater than the second time;
when the fifth time is greater than the second time, adjusting the computing nodes or the number of bytes of seismic data transmitted each time, on the premise that the first time is less than or equal to the fifth time, so that the difference between the fifth time and the second time reaches its minimum value, thereby achieving load balance.
Referring to fig. 1, in one embodiment, an MPI-based load balancing method is provided, including:
Step 110, acquiring a first time, wherein the first time is the total time required for n nodes to read seismic data of a specified size from a storage device, and n > 1.
Specifically, the seismic data, or seismic data files, are stored in the storage device. The master computing node reads seismic data of the specified size from the storage device, and the time I required for each node to read seismic data of the specified size from the storage device is measured. The total time required by the n nodes is therefore n × I, and this is taken as the first time. Note that the specified size can be understood as a preset size, that is, seismic data of a specified data volume. For a given seismic dataset, the data size is fixed.
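The calculation of the first time as n × I can be sketched as follows. This is a minimal, single-machine illustration, not the patented implementation: the helper names `average_read_time` and `first_time` are hypothetical, and a small temporary file stands in for a seismic data file on the storage device.

```python
import os
import tempfile
import time


def average_read_time(path, size_bytes, repeats=5):
    """Average wall-clock seconds needed to read `size_bytes` from `path`."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, "rb") as handle:
            handle.read(size_bytes)
        total += time.perf_counter() - start
    return total / repeats


def first_time(n_nodes, per_node_read_time):
    """First time = n * I, where I is the per-node read time."""
    if n_nodes <= 1:
        raise ValueError("the method assumes n > 1")
    return n_nodes * per_node_read_time


# A 4 KiB temporary file stands in for seismic data of the specified size.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 4096)
sample_path = tmp.name

read_time_i = average_read_time(sample_path, 4096)
t1 = first_time(4, read_time_i)
os.remove(sample_path)
```

Repeated reads of the same small file are likely to be served from the operating system cache, so real measurements would need files large enough, or caches cold enough, to reflect the storage device itself.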
Step 120, acquiring a second time, wherein the second time is the time required for transmitting the seismic data from node 0 or the storage device to node n.
Specifically, with the nodes numbered from node 0 to node n, the time required for transmitting the seismic data from node 0 or the storage device to node x is $II^{(1)}(x)$, where $1 \le x \le n$. The second time $II$ is therefore given by the expression in Figure BDA0002552050640000061 (formula image not reproduced in this text), where $1 \le i \le n$.
Step 130, acquiring a third time, wherein the third time is the time $II^{(2)}(m)$ required for transmitting the seismic data from node m to node 0, where m is the node among the n nodes that first receives the seismic data and starts computing, and n > m > 1. Specifically, because the seismic data sent from node 0 or the storage device and the seismic data received are not necessarily equal, the second time and the third time are calculated separately here to improve the accuracy of the calculation.
Step 140, acquiring a fourth time III, wherein the fourth time is the time required for the computing unit to complete one round of seismic data processing and computation.
Step 150, summing the third time and the fourth time to obtain a fifth time. That is, the fifth time is the sum of the third time and the fourth time.
Step 160, determining whether the first time is less than or equal to the fifth time. Specifically, this determines whether the total time required for the n nodes to read the seismic data of the specified size from the storage device is less than or equal to the sum of the time required to transmit the seismic data from node m to node 0 and the time required for the computing unit to complete one round of seismic data processing and computation. This is equivalent to determining whether the capability of the computing unit to receive and process seismic data is greater than the number of seismic data files read from the storage device, and can also be understood as determining whether the producer can produce, in real time, the seismic data required by the consumer.
Step 170, when the first time is less than or equal to the fifth time, determining whether the fifth time is greater than the second time. Specifically, when the first time is less than or equal to the fifth time, the capability of the computing unit to receive and process seismic data is less than the number of seismic data files read from the storage device; in other words, the producer can produce the seismic data required by the consumer in real time. At this point, task processing and computation should bring the fifth time as close as possible to the second time, so that the consumer can consume the produced seismic data in real time. The relationship between the fifth time and the second time is therefore determined.
Step 180, when the fifth time is greater than the second time, adjusting the computing nodes or the number of bytes of seismic data transmitted each time, on the premise that the first time is less than or equal to the fifth time, so that the difference between the fifth time and the second time reaches its minimum value, thereby achieving load balance.
Specifically, when the fifth time is greater than the second time, the number of bytes of seismic data transmitted each time can be increased, at most until the first time equals the fifth time; that is, the producer can produce more seismic data for the consumer to consume. The storage device and the network load of the nodes in MPI are adjusted in this way to reach, or come as close as possible to, a load-balanced state. Note that adjusting the number of bytes of seismic data changes the total time required for the n nodes to read the seismic data of the specified size from the storage device, so the first time obtained after the adjustment must be kept less than or equal to the fifth time.
Specifically, on the premise that the first time is less than or equal to the fifth time, the computing nodes or the number of bytes of seismic data transmitted each time is adjusted so that the difference between the fifth time and the second time reaches its minimum value. The number of bytes of seismic data transmitted each time can be increased, up to the limit set by the second time; that is, the producer can produce more seismic data for the consumer to consume.
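The constrained adjustment described above can be sketched as a search over candidate transfer sizes. This is only an illustration under stated assumptions: the function `pick_transfer_size` and the three timing callables are hypothetical, and the linear toy models below are invented for the example rather than taken from the source. In practice each callable would return a measured time for the given transfer size.

```python
def pick_transfer_size(candidates, t1_of, t2_of, t5_of):
    """Return the candidate size minimizing |t5 - t2| subject to t1 <= t5.

    t1_of, t2_of and t5_of are hypothetical callables mapping a transfer
    size to the first, second and fifth times. Returns None when no
    candidate satisfies the constraint.
    """
    best_size = None
    best_gap = None
    for size in candidates:
        t1, t2, t5 = t1_of(size), t2_of(size), t5_of(size)
        if t1 > t5:
            continue  # reading would become the bottleneck; not allowed
        gap = abs(t5 - t2)
        if best_gap is None or gap < best_gap:
            best_size, best_gap = size, gap
    return best_size


# Toy models (sizes in KiB, times in ms); purely illustrative numbers.
sizes = [1, 2, 4, 8, 16]
chosen = pick_transfer_size(
    sizes,
    t1_of=lambda s: 1.0 * s,         # read time grows with size
    t2_of=lambda s: 2.0 * s,         # transfer time grows with size
    t5_of=lambda s: 10.0 + 0.5 * s,  # return transfer plus compute
)
```

An exhaustive scan is used because the candidate list is short; with many candidates a bisection on the sign of t5 − t2 would be a natural alternative, provided the times vary monotonically with size.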
The load balancing described in this embodiment may be saturated load balancing, meaning that the resulting configuration lets the computing and processing hardware resources run at full speed from the beginning to the end of the whole task.
Specifically, running time is an important measure of the performance of a parallel program. A parallel program usually consists of multiple processes; in a cluster environment, one process usually runs on each node, and the running time of the parallel program is determined by the node that finishes its task last. Tasks therefore need to be distributed optimally by a scheduling and allocation algorithm, which reduces the average response time and the extra overhead during execution. Load balancing has thus long been an important consideration in parallel programming, for two reasons: it improves system performance and shortens the average response time of user tasks, and it makes full use of the resources of the whole system.
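The point that the slowest node determines the running time can be made concrete with a short sketch. The function names are hypothetical, and the imbalance ratio (maximum over mean) is a common metric assumed here for illustration; the source does not define one.

```python
def makespan(node_times):
    """Running time of the parallel program: the slowest node's time."""
    if not node_times:
        raise ValueError("node_times must not be empty")
    return max(node_times)


def imbalance_ratio(node_times):
    """Maximum node time divided by the mean; 1.0 means perfectly balanced."""
    mean = sum(node_times) / len(node_times)
    return makespan(node_times) / mean


# Same total work (12 s) distributed two different ways over three nodes.
unbalanced = [2.0, 3.0, 7.0]
balanced = [4.0, 4.0, 4.0]
```

Both distributions do the same total work, but the unbalanced one finishes later because one node carries most of the load.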
To make the technical solution of the invention, and in particular the relationship between the times, easier to understand, a detailed description follows. In the description of seismic data distribution and reception below, if the storage device is on the master node, distribution and reception refer to the transfer of seismic data between the master node and the computing nodes. If the storage device is an independent node connected to the master node and the computing nodes through the network, the master node tells each computing node the position and size of the seismic data to be read, together with other control information, and the computing node reads and writes the seismic data on the storage device through the network. The typical steps of seismic data processing and computation are therefore:
In the first step, the master node works out control information, such as the position and size of the seismic data to be computed by each computing unit, and transmits it to the computing units.
In the second step, the computing unit reads the seismic data from the specified storage device according to the control information and places it into the empty buffers opened for that computing unit. An empty buffer is one that holds no seismic data available for processing by the corresponding computing unit. This step runs as an independent seismic data reading thread that keeps reading the seismic data required by the current computing unit into empty buffers. If no empty buffer is available, the thread waits; once a buffer can accept data again, it resumes reading, until all the seismic data required for the computation have been read.
In the third step, the task allocation unit uses a communication technique to distribute the data in the full buffers to the buffer pool of each computing node for the computing units to process. The task allocation unit is generally the master node. A full buffer is one whose data has not yet been processed by the corresponding computing unit. The computing units that perform the computation are typically threads within a process on the node.
In the fourth step, the results obtained by each computing unit are either received or reduced to a result collection node, or stored in temporary files on each computing node and reduced under the unified control of the master node after the tasks are completed. If the computation task is finished, the fifth step is executed; otherwise, execution returns to the third step. The result collection node is the master node or a storage device. Receiving or reducing the results of each computing unit to the result collection node increases the number of MPI message transmissions, so temporary file storage on each computing node is used here.
In the fifth step, after all computations of the computing units are finished, the master node, under unified control, receives or reduces all the seismic data and stores it to obtain the final result of the task.
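The reader thread and buffer pool in the steps above follow a producer-consumer pattern, which can be sketched on a single machine as follows. This is an illustration under assumptions, not the MPI implementation: Python threads and a bounded `queue.Queue` stand in for the reading thread, the buffer pool and the computing units, and `run_pipeline` and its parameters are hypothetical names.

```python
import queue
import threading


def run_pipeline(chunks, n_buffers, n_workers, process):
    """Read chunks into a bounded buffer pool and process them in workers."""
    buffers = queue.Queue(maxsize=n_buffers)  # put() blocks when pool is full
    results = []
    results_lock = threading.Lock()
    done = object()  # sentinel telling a worker that no more data will come

    def reader():
        # Producer: keeps filling buffers, waiting whenever none is free.
        for chunk in chunks:
            buffers.put(chunk)
        for _ in range(n_workers):
            buffers.put(done)

    def worker():
        # Consumer: takes a full buffer, processes it, stores the result.
        while True:
            chunk = buffers.get()
            if chunk is done:
                break
            outcome = process(chunk)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=reader)]
    threads += [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Ten dummy chunks, a pool of two buffers, three computing units.
squares = sorted(run_pipeline(range(10), n_buffers=2, n_workers=3,
                              process=lambda x: x * x))
```

The bounded queue gives the waiting behavior for free: the reader blocks when every buffer is full, and the workers block when every buffer is empty.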
Based on the above typical steps of seismic data processing and computation, FIG. 2 shows a schematic flow diagram of the cluster processing time requirements for single-node seismic data read/write management and multi-node computation in MPI mode, which illustrates the states corresponding to the time relationships described above.
According to the above MPI-based load balancing method, it is judged whether the first time is less than the fifth time. If so, the capacity of the computing unit to receive and process seismic data exceeds the number of seismic data files read from the storage device; that is, the producer produces less seismic data than the consumer can consume. In this case, more seismic data files are read from the storage device by adjusting the computing nodes or the number of bytes of seismic data transmitted each time, which matches the processing capacity of the computing unit and achieves the optimal load balance state. This addresses the difficulty of managing load balance, improves seismic data processing and transmission efficiency, and improves system performance.
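The comparisons that drive the method, together with the alternative branches described for the other embodiments, can be collected into one small dispatch function. This is a sketch only: the `Action` names and the function `choose_action` are hypothetical labels for the branches, not terms from the source.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the branches of the method."""
    TUNE_DATA_NODE_BUFFER = "first time > fifth time: I/O throughput bottleneck"
    ADJUST_TRANSFER_SIZE = "fifth time > second time: adjust nodes or bytes"
    TUNE_COMPUTE_NODE_BUFFER = "fifth time < second time: network bottleneck"
    BALANCED = "fifth time equals second time"


def choose_action(t1, t2, t3, t4):
    """Pick a branch from the first to fourth times (all in the same unit)."""
    t5 = t3 + t4  # fifth time: return transfer plus one round of computation
    if t1 > t5:
        return Action.TUNE_DATA_NODE_BUFFER
    if t5 > t2:
        return Action.ADJUST_TRANSFER_SIZE
    if t5 < t2:
        return Action.TUNE_COMPUTE_NODE_BUFFER
    return Action.BALANCED


# Reading takes 1 s, forward transfer 2 s, return transfer 1 s, compute 2 s.
decision = choose_action(t1=1.0, t2=2.0, t3=1.0, t4=2.0)
```

Measured times are rarely exactly equal, so a real implementation would probably compare the fifth and second times within a tolerance rather than with strict equality.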
To better address the I/O throughput bottleneck and further improve load balancing, in one embodiment, after the step of determining whether the first time is less than or equal to the fifth time, the method further comprises:
when the first time is greater than the fifth time, determining the optimal buffer size of the seismic data node by testing with different numbers of MPI processes;
and adjusting the buffer size of the seismic data node according to the optimal buffer size of the seismic data node.
Specifically, the first time being greater than the fifth time means that the total time required for the n nodes to read the seismic data of the specified size from the storage device is greater than the sum of the time required to transmit the seismic data from node m to node 0 and the time required for the computing unit to complete one round of seismic data processing and computation. In this case, fewer seismic data files are read from the storage device than the computing unit is able to receive and process; in other words, the seismic data produced by the producer cannot meet the consumer's demand in real time, and an I/O (Input/Output) throughput bottleneck exists. For the data node, the optimal buffer size of the seismic data node is determined by testing with different numbers of MPI processes, and the buffer size of the seismic data node is adjusted accordingly. That is, for known cluster nodes and the corresponding algorithm implementation, different numbers of MPI processes are tried and the optimal buffer size of the seismic data node is determined through testing. The criterion of the test is that all nodes in the cluster come as close as possible to saturated load balance, which relieves the I/O throughput bottleneck of the storage device and improves load balancing, further addressing the difficulty of managing load balance under MPI.
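Testing different numbers of MPI processes to find a buffer size can be sketched as a small grid search. Everything here is an assumption made for illustration: `best_buffer_sizes` and the `measure` callback are hypothetical, the selection criterion (shortest measured run time) is one plausible reading of "as close as possible to saturated load balance", and `toy_measure` is an invented cost model standing in for real benchmark runs.

```python
def best_buffer_sizes(process_counts, buffer_sizes, measure):
    """For each process count, return (buffer size, time) with the lowest time.

    `measure(n_procs, buffer_size)` is a hypothetical callback that runs one
    test and returns the elapsed time.
    """
    best = {}
    for n_procs in process_counts:
        timed = [(measure(n_procs, size), size) for size in buffer_sizes]
        elapsed, size = min(timed)
        best[n_procs] = (size, elapsed)
    return best


def toy_measure(n_procs, buffer_kib):
    """Invented cost model: an ideal buffer of 64 KiB per process."""
    return abs(buffer_kib - 64 * n_procs) + 100 / n_procs


table = best_buffer_sizes([2, 4], [64, 128, 256], toy_measure)
```

In a real test, `measure` would launch the job with the given number of MPI processes and buffer size and time it, ideally averaging several runs to reduce noise.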
In one embodiment, the step of achieving load balance by adjusting the computing nodes or the number of bytes of seismic data transmitted each time, so that the difference between the fifth time and the second time reaches its minimum value, further comprises:
adjusting the small-granularity parallel efficiency to achieve load balance.
Specifically, adjusting the small-granularity parallel efficiency means improving it. After the optimal buffer size has been determined by testing with different numbers of MPI processes and the seismic data node has been adjusted accordingly, small-granularity parallel efficiency is improved, for example by using multithreading technologies such as Pthreads (POSIX Threads) or OpenMP (Open Multi-Processing), to exploit the performance of SMP (symmetric multiprocessing) multi-core processors and reduce the algorithm's computation time. This reduces the first time, bringing it as close as possible to the fifth time or making it less than or equal to the fifth time, which further relieves or removes the I/O throughput bottleneck and achieves better load balance.
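Small-granularity parallelism inside one node can be illustrated with a thread pool that processes traces concurrently. This sketch swaps in Python's `concurrent.futures` for the Pthreads or OpenMP threading named in the text, and the per-trace computation `trace_energy` is a made-up placeholder. For CPU-bound pure-Python work, threads in the standard CPython interpreter do not run in parallel, so the sketch shows the structure rather than the speedup a C implementation would obtain.

```python
from concurrent.futures import ThreadPoolExecutor


def trace_energy(trace):
    """Placeholder per-trace computation: sum of squared samples."""
    return sum(sample * sample for sample in trace)


def process_traces(traces, n_threads=4):
    """Apply the per-trace computation across a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() preserves input order, so results line up with the traces.
        return list(pool.map(trace_energy, traces))


energies = process_traces([[1, 2], [3], []], n_threads=2)
```

The same division of work, one MPI process per node with several threads per process, is the usual hybrid layout on SMP multi-core nodes.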
In order to better alleviate the network bottleneck and promote the load balancing, in one embodiment, after the step of determining whether the fifth time is greater than the second time, the method further includes:
when the fifth time is less than the second time, calculating the optimal buffer area size of the calculation unit by giving different MPI process numbers and testing;
adjusting the size of the buffer area of the computing node according to the optimal size of the buffer area of the computing unit;
and adjusting the small-granularity parallel efficiency to realize load balance.
Specifically, the fifth time being less than the second time means that the sum of the time required to transmit the seismic data from node m to node 0 and the time required for the computing unit to complete one round of seismic data processing and computation is less than the time required to transmit the seismic data from node 0 or the storage device to node n. This indicates that some nodes are not running under saturated load balance, so a waste of the corresponding computing resources is unavoidable: the network cannot deliver data fast enough for the consumers, which is a network speed bottleneck. To address this, different numbers of MPI processes are tried and tested, the optimal buffer size of the computing unit node is calculated, and the buffer size of the computing unit node is adjusted accordingly. That is, for a known set of cluster nodes and a given implementation of the computing algorithm, tests are run with different numbers of MPI processes to determine the optimal buffer size of the computing unit node. The criterion of the test is that all nodes in the cluster come as close as possible to saturated load balance, which relieves the network bottleneck, improves load balancing, and thereby further addresses the difficulty of managing load balance in MPI programs. In addition, the small-granularity parallel efficiency is adjusted, that is, improved.
On top of testing different numbers of MPI processes, calculating the optimal buffer size, and adjusting the node's buffer accordingly, small-granularity parallel efficiency is improved, for example with multithreading technologies such as Pthread (POSIX Threads) and OpenMP (Open Multi-Processing, a shared-memory parallel programming interface), so that the performance of SMP (Symmetric Multi-Processing) multi-core processors is exploited, the algorithm's computation time is reduced, and the network bottleneck is relieved.
The above embodiments use two levels of buffers: a buffer on the seismic data reading node and a buffer on each computing node. The seismic data node buffer relieves the I/O bottleneck of the storage device, while each computing node buffer relieves the network bottleneck and reduces, as far as possible, the number of communications between the computing nodes and the data node. This addresses, to a certain extent, the difficulty of managing load balance in MPI programming and improves the efficiency and simplicity of high-performance computing.
To calculate the first time more accurately, referring to fig. 3, in one embodiment, the step of acquiring the first time, where the first time is the total time required for n nodes to read seismic data of a specified size from a storage device and n > 1, includes:
Step 111: performing multiple reading operations on the seismic data to obtain the seismic data reading speed.
Step 112: processing the seismic data reading speed with a seismic data reading analysis algorithm to obtain the average time for a node to read data from the storage device.
Step 113: obtaining the first time from the number of nodes n and the average time.
Specifically, the uncertainty a user process faces when reading and writing files on the storage device depends to a great extent on the access time of the magnetic head. Seismic data of the specified size is therefore read multiple times, and a seismic data reading speed analysis algorithm is applied to the measured speeds to obtain the average time for reading data from the storage device. This reduces the error of a single measurement, improves the accuracy of the first time, and so helps achieve better load balance. The first time is obtained from the number of nodes n and the average time by multiplying the two.
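As an illustrative sketch only, the calculation of the first time described above could be expressed in Python as follows. The seismic data reading speed analysis algorithm is not specified in this text, so a plain arithmetic mean is assumed in its place; the names `measure_read_times`, `first_time`, and `read_once` are hypothetical.

```python
import statistics
import time
from typing import Callable, List


def measure_read_times(read_once: Callable[[], None], repeats: int = 5) -> List[float]:
    """Time several reads of the same data block; one duration per read."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        read_once()  # hypothetical callable that reads one block of the specified size
        durations.append(time.perf_counter() - start)
    return durations


def first_time(read_durations: List[float], n_nodes: int) -> float:
    """First time = number of nodes n multiplied by the average read time.

    Assumption: the arithmetic mean stands in for the unspecified
    seismic data reading speed analysis algorithm.
    """
    if n_nodes <= 1:
        raise ValueError("the method assumes n > 1")
    return n_nodes * statistics.mean(read_durations)
```

For example, with measured read times of 0.5 s, 1.5 s, and 1.0 s and n = 4, the average is 1.0 s and the first time is 4.0 s.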
In one embodiment, the step of performing a plurality of reading operations on the seismic data to obtain a seismic data reading speed includes:
and performing multiple reading operations on the seismic data in the segy format to obtain the reading speed of the seismic data.
Specifically, seismic data is generally organized in units of seismic traces and stored in the segy file format. The segy format is one of the standard tape data formats proposed by the SEG (Society of Exploration Geophysicists) and is one of the most common seismic data formats in the oil exploration industry. Taking seismic data in the segy format as an example therefore makes the load balance achieved representative.
To calculate the second time accurately, in one embodiment, the step of acquiring the second time, where the second time is the time required to transmit the seismic data from node 0 or the storage device to node n, includes:
acquiring, multiple times, the time required to transmit the seismic data from node 0 or the storage device to node n, and taking a weighted average of the measured times to obtain the second time.
Specifically, the value of the second time depends on the transmission rate of the network, on various unknown network conditions, and on the disk read/write time that is part of the first time. It is therefore not analyzed in detail; instead, test runs are recorded using the communication functions provided in the MPI standard, and the time is calculated from the recorded results. Measuring the transmission time from node 0 or the storage device to node n multiple times and taking a weighted average avoids the chance effects of a single measurement and improves the accuracy of the second time.
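The weighted averaging of repeated transmission measurements can be sketched as below. This is a minimal illustration: the text does not say how the weights are chosen, so they are left to the caller, and the function name `weighted_average` is hypothetical. In practice the samples would come from timing MPI send/receive calls, for instance with `MPI_Wtime`.

```python
from typing import Sequence


def weighted_average(samples: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of measured transmission times.

    Assumption: the weighting scheme is chosen by the caller, since the
    source text does not specify one.
    """
    if not samples or len(samples) != len(weights):
        raise ValueError("samples and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("the weights must sum to a positive value")
    return sum(s * w for s, w in zip(samples, weights)) / total_weight
```

With samples of 2.0 s and 4.0 s and weights 1 and 3, the result is (2.0 × 1 + 4.0 × 3) / 4 = 3.5 s. The same function would serve for the third time, which is averaged in the same way.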
In one embodiment, the third time may likewise be obtained by taking a weighted average of the measured times required to transmit the seismic data from node m to node 0. The principle and effect are the same as for the second time and are not described again here.
The following is a specific example:
"I" in FIG. 2 denotes the time required to read seismic data of a specified size from the storage device. Because the seismic data file is stored on the storage device, accessing different portions of the seismic data may take different amounts of time. The way the file system operates on the storage device is analyzed here to explain how the "I" time is calculated. The uncertainty a user process faces when reading and writing files on the storage device depends largely on the head access time. Therefore, to calculate the "I" time in fig. 2 more accurately, this scheme reads the seismic data in the segy format multiple times, applies a seismic data reading speed analysis algorithm to the measured speeds, and takes the resulting average time for reading data from the storage device as the "I" time.
"X-II (1)" refers to the time required to transmit the specified seismic data from node 0 or the storage device to node X (X = 1 to n); "X-II (2)" refers to the time required to transmit the specified seismic data from node X (X = 1 to n) back to node 0. Because the amount of seismic data sent out from node 0 or the storage device and the amount received back are not necessarily equal, the two times are calculated separately.
The specific values depend on the transmission rate of the network, on various unknown network conditions, and on the disk read/write time discussed in the analysis of the "I" time. To avoid the analytical difficulty posed by a complex network environment, this scheme does not analyze these times in detail; instead, it records test runs using the communication functions provided in the MPI standard and calculates the times from the results. As with the "I" time, to avoid chance effects and improve the accuracy of the calculated values, the "X-II (1)" and "X-II (2)" times are obtained as weighted averages over multiple test measurements.
The "III" time refers to the time required for the computing unit to complete one round of processing and computation on the specified seismic data. It depends on the size of the current seismic data block, the complexity of the algorithm, and the techniques the computer uses to execute the algorithm. In this step, the algorithm is implemented in multithreaded form using Pthread/OpenMP or other multithreading means, a standard test routine is run on specific seismic data, and the time required to complete the computation of the specific algorithm is taken as the "III" time.
In the following description of seismic data distribution and reception, if the storage device is located on the master node, the description refers to distribution and reception of seismic data between the master node and the computing nodes. If the storage device is an independent node connected to the master node and the computing nodes through the network, the master node tells each computing node the position and size of the seismic data to be read, together with other control information, and the computing node reads and writes the seismic data on the storage device through the network.
The general steps of seismic data processing and computation are as follows:
(1) The master node works out control information, such as the position and size of the seismic data each computing unit needs to compute, and transmits it to the computing units. Each computing unit reads the seismic data from the designated storage device according to the control information and places it in an empty buffer allocated for that computing unit (an empty buffer is one that holds no seismic data ready to be processed by the corresponding computing unit). This step keeps reading seismic data into empty buffers: it is a single seismic data reading thread that continually reads the seismic data required by the current computing unit into an empty buffer, waits when no empty buffer is available, and resumes reading as soon as a buffer becomes free, until all the seismic data to be computed has been read.
(2) The task allocation unit (typically the master node) uses a communication technique (in this scheme, MPI message passing) to distribute the data in full buffers (buffers whose data has not yet been processed by the corresponding computing unit) to the buffer pool of each computing node, and the computing units (typically threads within a process on a node) perform the computation.
(3) The results obtained by each computing unit are received (reduced) at the result collecting node (generally the master node and/or the storage device). Because doing so would increase the number of MPI message transmissions, each computing node instead stores its results in a temporary file, and the master node controls and reduces them in a unified manner after the task is completed. If the computing task is finished, go to step (4); otherwise, return to step (2).
(4) Once all computations of the computing units are finished, the master node, exercising unified control, receives (reduces) all the seismic data and stores the resulting seismic data, yielding the final result of the task.
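Steps (1) to (4) follow a bounded producer/consumer pattern, which the following single-machine Python sketch illustrates. It is an analogy under stated assumptions, not the MPI implementation: threads stand in for computing nodes, a bounded `queue.Queue` stands in for the buffers, a list slot stands in for each node's temporary file, and summing a block stands in for the seismic computation. All function names are hypothetical.

```python
import queue
import threading
from typing import List

SENTINEL = None  # marks the end of the input stream for one worker


def reader(blocks: List[List[float]], buffer: "queue.Queue", n_workers: int) -> None:
    """Step (1): a single reader fills buffers; put() waits while none is free."""
    for block in blocks:
        buffer.put(block)
    for _ in range(n_workers):
        buffer.put(SENTINEL)  # one end marker per worker


def worker(buffer: "queue.Queue", partials: List[float], index: int) -> None:
    """Steps (2)-(3): consume full buffers and keep the result locally."""
    local_sum = 0.0
    while True:
        block = buffer.get()
        if block is SENTINEL:
            break
        local_sum += sum(block)  # placeholder for the real seismic computation
    partials[index] = local_sum  # stands in for the per-node temporary file


def run_pipeline(blocks: List[List[float]], n_workers: int = 3, buffer_slots: int = 2) -> float:
    """Step (4): after all workers finish, reduce the partial results once."""
    buffer: "queue.Queue" = queue.Queue(maxsize=buffer_slots)
    partials = [0.0] * n_workers
    threads = [
        threading.Thread(target=worker, args=(buffer, partials, i))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    reader(blocks, buffer, n_workers)  # here the reader runs in the calling thread
    for t in threads:
        t.join()
    return sum(partials)
```

Deferring the reduction to the end, as in step (3), means each worker reports once rather than after every block, which is the reason the text gives for using temporary files.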
Combining the above with the MPI mode used in fig. 2, achieving saturated load balance as far as possible requires the following:
If the "I" time is less than or equal to ("III" time + "m-II (2)" time), that is, the producer can produce in real time the seismic data required by the consumer, the processing and computation of the task should satisfy the following condition as far as possible: ("III" time + "m-II (2)" time) is equal to ("1-II (1)" time + … + "n-II (1)" time), that is, the consumer can consume the produced seismic data in real time. Here m is the node/process, among the n computing nodes/processes, that finishes receiving the seismic data first and starts computing.
If ("III" time + "m-II (2)" time) is greater than ("1-II (1)" time + … + "n-II (1)" time), computing nodes can be added or the number of bytes of seismic data transmitted each time can be increased, up to the point where ("I" time × n) equals ("III" time + "m-II (2)" time), so that the producer produces more seismic data for the consumer to consume. If ("III" time + "m-II (2)" time) is less than ("1-II (1)" time + … + "n-II (1)" time), some nodes are not running under saturated load balance, and a waste of the corresponding computing resources is unavoidable: the network cannot deliver data fast enough for the consumers, which is a network speed bottleneck. Here ("1-II (1)" time + … + "n-II (1)" time) is summed over all computing nodes/processes.
However, if the "I" time is greater than ("III" time + "m-II (2)" time), the producer cannot satisfy the consumer in real time and the consumer has to wait, which is an I/O bottleneck.
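The comparisons above amount to a small decision procedure, sketched below in Python for illustration. The function name, parameter names, and returned labels are hypothetical; the exact-equality test for the balanced case is an idealization, and a practical implementation would presumably compare within a tolerance.

```python
from typing import Sequence


def classify_load(
    t_read: float,                  # "I" time
    t_return_m: float,              # "m-II (2)" time
    t_compute: float,               # "III" time
    t_distribute: Sequence[float],  # "1-II (1)" ... "n-II (1)" times
) -> str:
    """Classify the current state according to the comparisons in the text."""
    consumer = t_compute + t_return_m  # "III" time + "m-II (2)" time
    distribution = sum(t_distribute)   # summed over all computing nodes/processes
    if t_read > consumer:
        return "io_bottleneck"         # producer cannot keep up; consumer waits
    if consumer > distribution:
        return "add_nodes_or_bytes"    # add computing nodes or enlarge each transfer
    if consumer < distribution:
        return "network_bottleneck"    # some nodes run below saturation
    return "balanced"
```

For instance, with an "I" time of 5, an "m-II (2)" time of 1, and a "III" time of 2, the consumer side totals 3, which is less than 5, so the state is classified as an I/O bottleneck regardless of the distribution times.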
In summary, the analyses above describe an idealized process; in a real cluster environment, many unknown factors may affect parallel computing efficiency. Nevertheless, the following two aspects are certainly key to improving parallel efficiency:
(1) For a known set of cluster nodes and a given implementation of the computing algorithm, tests are run with different numbers of MPI processes to determine the optimal buffer size of the computing unit. The criterion of the test is that all nodes in the cluster come as close as possible to saturated load balance, that is, an appropriate amount of space is traded for time to achieve the best parallel computing efficiency.
(2) On the basis of (1), small-granularity parallel efficiency is improved, that is, multithreading technologies such as Pthread/OpenMP are used to exploit the performance of SMP multi-core processors and reduce the algorithm's computation time.
Therefore, for algorithms with different computation speeds, it is necessary to use a suitable number of MPI processes, a reasonable and effective space-for-time strategy, multithreading, and similar techniques.
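The testing in aspect (1) can be pictured as a sweep over candidate MPI process counts and buffer sizes, as in the sketch below. It assumes a caller-supplied `run_benchmark` callable (hypothetical) that launches one test run and returns its elapsed time, and it uses the shortest elapsed time as a stand-in for the saturated-load-balance criterion, which the text does not quantify.

```python
from itertools import product
from typing import Callable, Iterable, Tuple


def best_configuration(
    process_counts: Iterable[int],
    buffer_sizes: Iterable[int],
    run_benchmark: Callable[[int, int], float],
) -> Tuple[int, int, float]:
    """Try every (process count, buffer size) pair and keep the fastest.

    Assumption: the shortest measured time approximates the state in which
    all nodes are closest to saturated load balance.
    """
    best = None
    for procs, size in product(list(process_counts), list(buffer_sizes)):
        elapsed = run_benchmark(procs, size)  # hypothetical test run
        if best is None or elapsed < best[2]:
            best = (procs, size, elapsed)
    if best is None:
        raise ValueError("no configurations were supplied")
    return best
```

An exhaustive sweep is the simplest choice and is reasonable when only a handful of process counts and buffer sizes are plausible; larger search spaces would call for a more selective strategy.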
The invention describes a seismic data load balancing model for MPI-based parallel computing and, through an analysis of common clusters, provides an MPI-based seismic data load balancing method for parallel computing. In particular, it analyzes the reading, writing, and computation of seismic data in a cluster mode in which a single MPI node manages reading and writing and multiple nodes perform computation. The method employs two levels of buffers: a buffer on the seismic data reading node and a buffer on each computing node. The seismic data node buffer relieves the I/O bottleneck of the storage device, while each computing node buffer relieves the network bottleneck and reduces, as far as possible, the number of communications between the computing nodes and the data node. The method addresses, to a certain extent, the difficulty of managing load balance in MPI programming and, by combining other technologies on the basis of MPI, also improves the efficiency and simplicity of high-performance computing.
In one embodiment, the MPI-based load balancing apparatus is implemented by using the MPI-based load balancing method according to any one of the above embodiments. In one embodiment, the MPI-based load balancing apparatus includes respective modules for implementing the steps of the MPI-based load balancing method.
In one embodiment, an MPI-based load balancing apparatus includes:
a first time acquisition module, configured to acquire a first time, where the first time is a total time required for n nodes to read seismic data of a specified size from a storage device, where n > 1;
a second time obtaining module, configured to obtain a second time, where the second time is a time required for transmitting the seismic data from node 0 or the storage device to node n;
a third time obtaining module, configured to obtain a third time, where the third time is a time required for transmitting the seismic data from a node m to a node 0, where m is a node that has received the seismic data first and starts to calculate, and n > m >1, in n nodes;
a fourth time obtaining module, configured to obtain a fourth time, where the fourth time is the time required for the computing unit to complete one round of seismic data processing and computation;
a fifth time obtaining module, configured to sum the third time and the fourth time to obtain a fifth time;
the first judging module is used for judging whether the first time is less than or equal to the fifth time;
the second judging module is used for judging whether the fifth time is greater than the second time or not when the first time is less than or equal to the fifth time;
and a load balancing module, configured to, when the fifth time is greater than the second time, achieve load balance by adjusting the number of computing nodes or the number of bytes of seismic data transmitted each time so that the difference between the fifth time and the second time is minimized, on the premise that the first time is less than or equal to the fifth time.
With the MPI-based load balancing device, it is determined whether the first time is less than the fifth time. If so, the capacity of the computing unit to receive and process seismic data exceeds the amount of seismic data read from the storage device, that is, the producer produces less seismic data than the consumer consumes. In that case, the number of computing nodes or the number of bytes of seismic data transmitted each time is adjusted so that more seismic data is read from the storage device to match the processing capacity of the computing unit. This achieves the best load balance state, improves the efficiency of seismic data processing and transmission, and improves system performance.
In one embodiment, the MPI-based load balancing apparatus further includes:
a first buffer size calculation module, configured to calculate the optimal buffer size of the seismic data node by trying and testing different numbers of MPI processes when the first time is greater than the fifth time;
and a first adjusting module, configured to adjust the buffer size of the seismic data node according to the optimal buffer size of the seismic data node.
In one embodiment, the MPI-based load balancing apparatus further includes:
a parallel efficiency adjusting module, configured to adjust the small-granularity parallel efficiency to achieve load balance.
In one embodiment, the MPI-based load balancing apparatus further includes:
a second buffer size calculation module, configured to calculate the optimal buffer size of the computing unit by trying and testing different numbers of MPI processes when the fifth time is less than the second time;
a second adjusting module, configured to adjust the buffer size of the computing node according to the optimal buffer size of the computing unit;
and a parallel efficiency adjusting module, configured to adjust the small-granularity parallel efficiency to achieve load balance.
In one embodiment, the first time acquisition module comprises:
a reading speed acquisition submodule, configured to perform multiple reading operations on the seismic data to obtain the seismic data reading speed;
an average time calculation submodule, configured to process the seismic data reading speed with a seismic data reading analysis algorithm to obtain the average time for a node to read data from the storage device;
and a first time obtaining submodule, configured to obtain the first time from the number of nodes n and the average time.
In one embodiment, the reading speed obtaining sub-module is configured to perform multiple reading operations on the seismic data in the segy format to obtain the seismic data reading speed.
In one embodiment, the second time obtaining module is configured to obtain the time required for transmitting the seismic data from node 0 or the storage device to node n for multiple times, and perform weighted average on the time required for transmitting the seismic data to obtain the second time.
In one embodiment, a computer device is provided, the internal structure of which may be as shown in FIG. 4. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an MPI-based load balancing method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device includes a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a first time, wherein the first time is the total time required for n nodes to read seismic data with a specified size from a storage device, and n is greater than 1;
acquiring second time, wherein the second time is the time required for transmitting the seismic data from the node 0 or the storage device to the node n;
acquiring third time, wherein the third time is time required for transmitting the seismic data from a node m to a node 0, m is a node which receives the seismic data firstly and starts to calculate in n nodes, and n > m > 1;
acquiring fourth time, wherein the fourth time is the time required by the computing unit to finish the seismic data processing and computing for one time;
summing the third time and the fourth time to obtain a fifth time;
judging whether the first time is less than or equal to the fifth time;
when the first time is less than or equal to the fifth time, judging whether the fifth time is greater than the second time;
when the fifth time is greater than the second time, achieving load balance by adjusting the number of computing nodes or the number of bytes of seismic data transmitted each time so that the difference between the fifth time and the second time is minimized, on the premise that the first time is less than or equal to the fifth time.
The computer device determines whether the first time is less than the fifth time. If so, the capacity of the computing unit to receive and process seismic data exceeds the amount of seismic data read from the storage device, that is, the producer produces less seismic data than the consumer consumes. In that case, the number of computing nodes or the number of bytes of seismic data transmitted each time is adjusted so that more seismic data is read from the storage device to match the processing capacity of the computing unit. This achieves the best load balance state, improves the efficiency of seismic data processing and transmission, and improves system performance.
In one embodiment, the processor, when executing the computer program, implements the steps of the MPI-based load balancing method of any of the above embodiments.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
when the first time is longer than the fifth time, calculating the optimal buffer area size of the seismic data node by giving different MPI process numbers and testing;
and adjusting the size of the buffer area of the seismic data node according to the optimal size of the buffer area of the seismic data node.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and adjusting the small-granularity parallel efficiency to realize load balance.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
when the fifth time is less than the second time, calculating the optimal buffer area size of the calculation unit by giving different MPI process numbers and testing;
adjusting the size of the buffer area of the computing node according to the optimal size of the buffer area of the computing unit;
and adjusting the small-granularity parallel efficiency to realize load balance.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
performing multiple reading operations on the seismic data to obtain seismic data reading speed;
calculating the seismic data reading speed according to a seismic data reading analysis algorithm to obtain the average time of the nodes for reading data from the storage equipment;
and obtaining the first time according to the number n of the nodes and the average time.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and performing multiple reading operations on the seismic data in the segy format to obtain the reading speed of the seismic data.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and acquiring the time required for transmitting the seismic data from the node 0 or the storage device to the node n for multiple times, and carrying out weighted average on the time required for transmitting the seismic data to obtain the second time.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon; when executed by a processor, the computer program implements the following steps:
acquiring a first time, wherein the first time is the total time required for n nodes to read seismic data with a specified size from a storage device, and n is greater than 1;
acquiring second time, wherein the second time is the time required for transmitting the seismic data from the node 0 or the storage device to the node n;
acquiring third time, wherein the third time is time required for transmitting the seismic data from a node m to a node 0, m is a node which receives the seismic data firstly and starts to calculate in n nodes, and n > m > 1;
acquiring fourth time, wherein the fourth time is the time required by the computing unit to finish the seismic data processing and computing for one time;
summing the third time and the fourth time to obtain a fifth time;
judging whether the first time is less than or equal to the fifth time;
when the first time is less than or equal to the fifth time, judging whether the fifth time is greater than the second time;
when the fifth time is greater than the second time, achieving load balance by adjusting the number of computing nodes or the number of bytes of seismic data transmitted each time so that the difference between the fifth time and the second time is minimized, on the premise that the first time is less than or equal to the fifth time.
With the storage medium, it is determined whether the first time is less than the fifth time. If so, the capacity of the computing unit to receive and process seismic data exceeds the amount of seismic data read from the storage device, that is, the producer produces less seismic data than the consumer consumes. In that case, the number of computing nodes or the number of bytes of seismic data transmitted each time is adjusted so that more seismic data is read from the storage device to match the processing capacity of the computing unit. This achieves the best load balance state, improves the efficiency of seismic data processing and transmission, and improves system performance.
In one embodiment, the computer program when executed by a processor implements the steps of the MPI-based load balancing method described in any of the above embodiments.
In one embodiment, the computer program when executed by the processor further performs the steps of:
when the first time is longer than the fifth time, calculating the optimal buffer area size of the seismic data node by giving different MPI process numbers and testing;
and adjusting the size of the buffer area of the seismic data node according to the optimal size of the buffer area of the seismic data node.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and adjusting the small-granularity parallel efficiency to realize load balance.
In one embodiment, the computer program when executed by the processor further performs the steps of:
when the fifth time is less than the second time, calculating the optimal buffer area size of the calculation unit by giving different MPI process numbers and testing;
adjusting the size of the buffer area of the computing node according to the optimal size of the buffer area of the computing unit;
and adjusting the small-granularity parallel efficiency to realize load balance.
In one embodiment, the computer program when executed by the processor further performs the steps of:
performing multiple reading operations on the seismic data to obtain seismic data reading speed;
calculating the seismic data reading speed according to a seismic data reading analysis algorithm to obtain the average time of the nodes for reading data from the storage equipment;
and obtaining the first time according to the number n of the nodes and the average time.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and performing multiple read operations on seismic data in SEG-Y format to obtain the seismic data reading speed.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and acquiring, multiple times, the time required to transmit the seismic data from node 0 or the storage device to node n, and taking a weighted average of the acquired times to obtain the second time.
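The weighted average in this step can be written out directly. In the Python sketch below, the weights are supplied by the caller; how they are chosen is an assumption left open here, since the text does not say how the samples are weighted.

```python
from typing import Sequence


def weighted_average(times: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of measured transmission times (the second time)."""
    if not times or len(times) != len(weights):
        raise ValueError("times and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("the weights must sum to a positive value")
    return sum(t * w for t, w in zip(times, weights)) / total_weight


if __name__ == "__main__":
    # Two samples; the second is weighted three times as heavily (invented).
    print(weighted_average([2.0, 4.0], [1.0, 3.0]))
```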
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. An MPI-based load balancing method, comprising:
acquiring a first time, wherein the first time is the total time required for n nodes to read seismic data of a specified size from a storage device, and n > 1;
acquiring a second time, wherein the second time is the time required to transmit the seismic data from node 0 or the storage device to node n;
acquiring a third time, wherein the third time is the time required to transmit the seismic data from node m to node 0, node m is the node among the n nodes that first receives the seismic data and starts computing, and n > m > 1;
acquiring a fourth time, wherein the fourth time is the time required for the computing unit to complete one round of seismic data processing and computation;
summing the third time and the fourth time to obtain a fifth time;
determining whether the first time is less than or equal to the fifth time;
when the first time is less than or equal to the fifth time, determining whether the fifth time is greater than the second time;
when the fifth time is greater than the second time, adjusting the computing nodes or the number of bytes of seismic data transmitted each time so that, on the premise that the first time is less than or equal to the fifth time, the difference between the fifth time and the second time reaches a minimum, thereby achieving load balance.
2. The MPI-based load balancing method of claim 1, further comprising, after the step of determining whether the first time is less than or equal to the fifth time:
when the first time is greater than the fifth time, determining the optimal buffer size of the seismic data node by testing with different numbers of MPI processes;
and adjusting the buffer size of the seismic data node according to the optimal buffer size of the seismic data node.
3. The MPI-based load balancing method of claim 2, further comprising, after the step of adjusting the buffer size of the seismic data node according to the optimal buffer size of the seismic data node:
and adjusting the fine-grained parallel efficiency to achieve load balance.
4. The MPI-based load balancing method of claim 1, further comprising, after the step of determining whether the fifth time is greater than the second time:
when the fifth time is less than the second time, determining the optimal buffer size of the computing unit by testing with different numbers of MPI processes;
adjusting the buffer size of the computing node according to the optimal buffer size of the computing unit;
and adjusting the fine-grained parallel efficiency to achieve load balance.
5. The MPI-based load balancing method of claim 1, wherein the acquiring a first time, the first time being the total time required for n nodes to read seismic data of a specified size from a storage device, where n > 1, comprises:
performing multiple read operations on the seismic data to obtain the seismic data reading speed;
analyzing the seismic data reading speed with a seismic data reading analysis algorithm to obtain the average time for a node to read data from the storage device;
and obtaining the first time according to the number n of nodes and the average time.
6. The MPI-based load balancing method of claim 5, wherein the step of performing multiple read operations on the seismic data to obtain the seismic data reading speed comprises:
and performing multiple read operations on seismic data in SEG-Y format to obtain the seismic data reading speed.
7. The MPI-based load balancing method of claim 1, wherein the acquiring a second time, the second time being the time required to transmit the seismic data from node 0 or the storage device to node n, comprises:
and acquiring, multiple times, the time required to transmit the seismic data from node 0 or the storage device to node n, and taking a weighted average of the acquired times to obtain the second time.
8. An MPI-based load balancing apparatus, comprising:
a first time obtaining module, configured to obtain a first time, where the first time is the total time required for n nodes to read seismic data of a specified size from a storage device, and n > 1;
a second time obtaining module, configured to obtain a second time, where the second time is the time required to transmit the seismic data from node 0 or the storage device to node n;
a third time obtaining module, configured to obtain a third time, where the third time is the time required to transmit the seismic data from node m to node 0, node m is the node among the n nodes that first receives the seismic data and starts computing, and n > m > 1;
a fourth time obtaining module, configured to obtain a fourth time, where the fourth time is the time required for the computing unit to complete the seismic data processing and computation;
a fifth time obtaining module, configured to sum the third time and the fourth time to obtain a fifth time;
a first judging module, configured to determine whether the first time is less than or equal to the fifth time;
a second judging module, configured to determine, when the first time is less than or equal to the fifth time, whether the fifth time is greater than the second time;
and a load balancing module, configured to, when the fifth time is greater than the second time, adjust the computing nodes or the number of bytes of seismic data transmitted each time so that, on the premise that the first time is less than or equal to the fifth time, the difference between the fifth time and the second time reaches a minimum, thereby achieving load balance.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202010580088.4A 2020-06-23 2020-06-23 MPI-based load balancing method, MPI-based load balancing device, computer equipment and storage medium Active CN113835869B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010580088.4A CN113835869B (en) 2020-06-23 2020-06-23 MPI-based load balancing method, MPI-based load balancing device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113835869A true CN113835869A (en) 2021-12-24
CN113835869B CN113835869B (en) 2024-04-09

Family

ID=78964048

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010580088.4A Active CN113835869B (en) 2020-06-23 2020-06-23 MPI-based load balancing method, MPI-based load balancing device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113835869B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220300321A1 (en) * 2021-03-19 2022-09-22 Regeneron Pharmaceuticals, Inc. Data pipeline

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040225443A1 (en) * 2003-05-08 2004-11-11 Moac Llc Systems and methods for processing complex data sets
US20090064166A1 (en) * 2007-08-28 2009-03-05 Arimilli Lakshminarayana B System and Method for Hardware Based Dynamic Load Balancing of Message Passing Interface Tasks
US20090064167A1 (en) * 2007-08-28 2009-03-05 Arimilli Lakshminarayana B System and Method for Performing Setup Operations for Receiving Different Amounts of Data While Processors are Performing Message Passing Interface Tasks
US20090327464A1 (en) * 2008-06-26 2009-12-31 International Business Machines Corporation Load Balanced Data Processing Performed On An Application Message Transmitted Between Compute Nodes
US20100095303A1 (en) * 2008-10-09 2010-04-15 International Business Machines Corporation Balancing A Data Processing Load Among A Plurality Of Compute Nodes In A Parallel Computer
US20120233486A1 (en) * 2011-03-10 2012-09-13 Nec Laboratories America, Inc. Load balancing on heterogeneous processing clusters implementing parallel execution
US8683099B1 (en) * 2012-06-14 2014-03-25 Emc Corporation Load balancing of read/write accesses on a single host device
CN106201673A (en) * 2016-06-24 2016-12-07 中国石油天然气集团公司 A kind of seismic data processing technique and device
CN109344135A (en) * 2018-10-18 2019-02-15 中国海洋石油集团有限公司 A kind of parallel seismic processing job scheduling method of the file lock of automatic load balancing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王一达;赵长海;李超;张建磊;晏海华;张威毅;: "异构计算环境下的三维Kirchhoff叠前深度偏移混合域并行算法", 石油地球物理勘探, no. 03, 1 June 2018 (2018-06-01) *
鲁金;马可;高剑;: "基于MPI的卷积计算并行实现", 计算机测量与控制, no. 01, 25 January 2016 (2016-01-25) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant