CN113391917B - Multi-machine heterogeneous parallel computing method and device for geophysical prospecting application - Google Patents

Info

Publication number
CN113391917B
CN113391917B CN202010173738.3A CN202010173738A CN113391917B CN 113391917 B CN113391917 B CN 113391917B CN 202010173738 A CN202010173738 A CN 202010173738A CN 113391917 B CN113391917 B CN 113391917B
Authority
CN
China
Prior art keywords
computing
heterogeneous
task
node
calculation
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010173738.3A
Other languages
Chinese (zh)
Other versions
CN113391917A (en)
Inventor
潘英杰
何宝庆
何永清
罗开云
杜清波
皮红梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China National Petroleum Corp
BGP Inc
Original Assignee
China National Petroleum Corp
BGP Inc
Application filed by China National Petroleum Corp and BGP Inc
Priority to CN202010173738.3A
Publication of CN113391917A
Application granted
Publication of CN113391917B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02 Agriculture; Fishing; Forestry; Mining
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Economics (AREA)
  • Animal Husbandry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mining & Mineral Resources (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Agronomy & Crop Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

The invention provides a multi-machine heterogeneous parallel computing method and device for geophysical prospecting applications. The method comprises the following steps: a user node sends a heterogeneous computing resource search command, which includes search parameter information, to a management node; the management node broadcasts the command to each computing node; each computing node generates and starts a plurality of computing task heterogeneous execution ends according to the heterogeneous computing resources it finds and the search parameter information, and feeds back resource information to the management node; the management node forwards the fed-back resource information to the user node; the user node screens the fed-back resource information and sends the selected resource information to the management node; the management node sends a use confirmation message to each computing node according to the selected resource information; and each computing node determines, according to the confirmation message, the computing task heterogeneous execution ends that take part, and these perform the multi-machine parallel computation. The technical solution fully exploits the performance of all heterogeneous computing resources within a computing node and achieves load balancing and high-performance computation of jobs.

Description

Multi-machine heterogeneous parallel computing method and device for geophysical prospecting application
Technical Field
The invention relates to the technical field of petroleum geophysical prospecting, in particular to a multi-machine heterogeneous parallel computing method and device for geophysical prospecting application.
Background
Petroleum seismic exploration involves many business requirements and algorithms that need high-performance computing, such as forward modeling and illumination analysis of two-dimensional models. To fully exploit the computing performance of a single machine, these geophysical prospecting applications usually compute shot by shot on the CPU (central processing unit) and the GPU (graphics processing unit) separately, using technologies such as multithreading and OpenCL (Open Computing Language) to make full use of all hardware resources. The complete forward modeling, illumination, or other business algorithm or computing process is therefore usually packaged as an independent dynamic library or an independent process, and an external program calls the library interface or invokes the process to perform the computation. As survey areas grow, the number of excitation points in a work area, that is, the number of shots, grows as well, and it becomes increasingly difficult for a single machine to complete the model forward modeling and model illumination computation for a whole work area. The key to multi-machine parallel computing is therefore to use existing multi-machine heterogeneous resources, and to call existing programs or algorithm libraries conveniently and quickly, to perform computations such as model illumination and model forward modeling for the whole work area.
In single-machine high-performance computing, different algorithms and applications use different hardware, methods, and strategies: some use heterogeneous non-cooperative computing on the CPU and GPU, and others use heterogeneous cooperative computing on the CPU and GPU. As a result, the high-performance computing needs of geophysical prospecting application jobs cannot be met.
In current multi-machine parallel computing, one physical machine usually serves as one computing node, and all of its computing resources are given to a single computing program. When multi-machine heterogeneous parallel computing is carried out on such a physical machine, two problems arise. If tasks are allocated by the minimum task unit, the shot, then only one computing device can be used, because multiple heterogeneous devices cannot cooperate on the same computation; the performance of all computing resources in the computing node cannot be fully exploited, and the heterogeneous computing resources are used unreasonably. If several shots are allocated as one computing task unit, the difference in computing performance between devices means that the GPU finishes quickly and then waits for the CPU to finish, so the load on the computing devices is unbalanced; at the same time, it is difficult to use a debugging (tuning) algorithm to divide an accurately sized set of shots for each computing node so as to eliminate the load imbalance and idle waiting of the computing devices within a computing node.
In view of the above technical problems, no effective solution has been proposed at present.
Disclosure of Invention
An embodiment of the invention provides a multi-machine heterogeneous parallel computing method for geophysical prospecting applications, which is used to fully exploit the performance of all heterogeneous computing resources in a computing node and to achieve load balancing and high-performance computation of jobs. The method comprises the following steps:
Before performing seismic exploration multi-machine parallel computation, a user node sends a heterogeneous computing resource searching command to a management node; the heterogeneous computing resource search command comprises search parameter information;
the management node broadcasts the heterogeneous computing resource searching command to each computing node;
After receiving the heterogeneous computing resource searching command, each computing node performs hardware scanning on the physical machine, generates and starts a plurality of computing task heterogeneous execution ends according to the searched heterogeneous computing resource conditions and the searching parameter information, and feeds back heterogeneous computing resource information to the management node after the starting of the plurality of computing task heterogeneous execution ends is completed; each heterogeneous execution end of the computing task corresponds to one type of heterogeneous computing resource;
The management node sends heterogeneous computing resource information fed back by each computing node to the user node;
The user node utilizes preset screening conditions to carry out screening treatment on heterogeneous computing resource information fed back by each computing node to obtain selected heterogeneous computing resource information, and the selected heterogeneous computing resource information is sent to the management node;
The management node sends heterogeneous computing resource use confirmation information to each computing node according to the selected heterogeneous computing resource information;
and each computing node determines a computing task heterogeneous execution end participating in multi-machine parallel computing according to the heterogeneous computing resource use confirmation message, and the determined computing task heterogeneous execution end performs seismic exploration multi-machine parallel computing.
An embodiment of the invention also provides a multi-machine heterogeneous parallel computing device for geophysical prospecting applications, which is used to fully exploit the performance of all heterogeneous computing resources in a computing node and to achieve load balancing and high-performance computation of jobs. The device comprises:
The user node is used for sending heterogeneous computing resource searching commands to the management node before performing seismic exploration multi-machine parallel computation; the heterogeneous computing resource search command comprises search parameter information; screening the heterogeneous computing resource information fed back by each computing node by utilizing preset screening conditions to obtain selected heterogeneous computing resource information, and sending the selected heterogeneous computing resource information to a management node;
The management node is used for broadcasting heterogeneous computing resource searching commands to all computing nodes; heterogeneous computing resource information fed back by each computing node is sent to the user node; sending heterogeneous computing resource use confirmation information to each computing node according to the selected heterogeneous computing resource information;
Each computing node is used for carrying out hardware scanning on the physical machine after receiving the heterogeneous computing resource searching command, generating and starting a plurality of computing task heterogeneous execution ends according to the searched heterogeneous computing resource condition and the searching parameter information, and feeding back heterogeneous computing resource information to the management node after the starting of the plurality of computing task heterogeneous execution ends is completed; each heterogeneous execution end of the computing task corresponds to one type of heterogeneous computing resource; according to the heterogeneous computing resource use confirmation message, determining a computing task heterogeneous execution end participating in multi-machine parallel computing, and performing seismic exploration multi-machine parallel computing by the determined computing task heterogeneous execution end.
The embodiment of the invention also provides computer equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the multi-machine heterogeneous parallel computing method aiming at geophysical prospecting application when executing the computer program.
The embodiment of the invention also provides a computer readable storage medium, which stores a computer program for executing the multi-machine heterogeneous parallel computing method aiming at geophysical prospecting application.
The technical solution provided by the embodiment of the invention is as follows: before seismic exploration multi-machine parallel computation is performed, a user node sends a heterogeneous computing resource search command, which includes search parameter information, to a management node; the management node broadcasts the heterogeneous computing resource search command to each computing node; after receiving the command, each computing node performs a hardware scan of its physical machine, generates and starts a plurality of computing task heterogeneous execution ends according to the heterogeneous computing resources found and the search parameter information, and, once they have started, feeds back heterogeneous computing resource information to the management node, each computing task heterogeneous execution end corresponding to one type of heterogeneous computing resource; the management node sends the heterogeneous computing resource information fed back by each computing node to the user node; the user node screens that information using preset screening conditions to obtain the selected heterogeneous computing resource information and sends it to the management node; the management node sends a heterogeneous computing resource use confirmation message to each computing node according to the selected heterogeneous computing resource information; and each computing node determines, according to the confirmation message, the computing task heterogeneous execution ends that take part in the multi-machine parallel computation, and those execution ends perform the seismic exploration multi-machine parallel computation. In this way, a plurality of computing task heterogeneous execution ends for different computing devices are generated automatically within a single machine, and the different heterogeneous computing devices are used separately for multi-machine heterogeneous parallel task computation.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of a multi-machine heterogeneous parallel computing method for geophysical prospecting applications in an embodiment of the present invention;
FIG. 2 is a schematic diagram of selecting heterogeneous computing resource information in accordance with an embodiment of the present invention;
FIG. 3 is a schematic diagram of a computing node in an embodiment of the invention;
FIG. 4 is a schematic diagram of a computing node in an embodiment of the invention;
FIG. 5 is a schematic diagram of a multi-machine heterogeneous workflow based on autonomous execution of computing tasks in an embodiment of the invention;
FIG. 6 is a schematic diagram of a multi-machine heterogeneous parallel computing device for geophysical prospecting applications according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Different geophysical prospecting applications currently place different requirements on multi-machine parallelism. To implement multi-machine parallelism while reusing existing geophysical algorithm programs or algorithm libraries as far as possible, to support multi-machine parallel computing for different algorithm flows, to exploit the performance of heterogeneous computing resources, and to achieve load balancing and high-performance computation of jobs, an embodiment of the invention provides a multi-machine heterogeneous parallel scheme that supports multi-machine heterogeneous parallel computing, customization of computing task flows, and autonomous execution of tasks. The scheme is described in detail below.
Fig. 1 is a flow chart of a multi-machine heterogeneous parallel computing method for geophysical prospecting applications according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step 101: before performing seismic exploration multi-machine parallel computation, a user node sends a heterogeneous computing resource search command to a management node; the heterogeneous computing resource search command comprises search parameter information;
Step 102: the management node broadcasts the heterogeneous computing resource search command to each computing node;
Step 103: after receiving the heterogeneous computing resource searching command, each computing node performs hardware scanning on the physical machine, generates and starts a plurality of computing task heterogeneous execution ends according to the searched heterogeneous computing resource conditions and the searching parameter information, and feeds back heterogeneous computing resource information to the management node after the starting of the plurality of computing task heterogeneous execution ends is completed; each heterogeneous execution end of the computing task corresponds to one type of heterogeneous computing resource;
Step 104: the management node sends heterogeneous computing resource information fed back by each computing node to the user node;
Step 105: the user node utilizes preset screening conditions to carry out screening treatment on heterogeneous computing resource information fed back by each computing node to obtain selected heterogeneous computing resource information, and the selected heterogeneous computing resource information is sent to the management node;
Step 106: the management node sends a heterogeneous computing resource use confirmation message to each computing node according to the selected heterogeneous computing resource information;
Step 107: each computing node determines, according to the heterogeneous computing resource use confirmation message, the computing task heterogeneous execution ends that take part in the multi-machine parallel computation, and the determined execution ends perform the seismic exploration multi-machine parallel computation.
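The message flow of steps 101 to 107 can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the patented implementation: the class names, the `types` search parameter, the `node/type` naming of execution ends, and the in-process method calls standing in for network messages are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """One physical machine; `devices` maps a device type to a count (hypothetical layout)."""
    name: str
    devices: dict
    executors: list = field(default_factory=list)
    active: list = field(default_factory=list)

    def on_search(self, params):
        # Step 103: scan hardware and start one execution end per matching device type.
        wanted = params.get("types", set(self.devices))
        self.executors = [f"{self.name}/{t}" for t in sorted(self.devices) if t in wanted]
        return [{"executor": e, "type": e.split("/")[1]} for e in self.executors]

    def on_confirm(self, selected):
        # Step 107: only the confirmed execution ends take part in the computation.
        self.active = [e for e in self.executors if e in selected]


class ManagementNode:
    def __init__(self, nodes):
        self.nodes = nodes

    def broadcast_search(self, params):
        # Steps 102 and 104: broadcast the command, then collect and return the feedback.
        return [info for n in self.nodes for info in n.on_search(params)]

    def confirm(self, selected):
        # Step 106: send the use confirmation to every computing node.
        for n in self.nodes:
            n.on_confirm(selected)


def user_node_run(manager, params, keep):
    # Step 101: send the search command; step 105: screen the feedback with `keep`.
    feedback = manager.broadcast_search(params)
    selected = {info["executor"] for info in feedback if keep(info)}
    manager.confirm(selected)
    return selected
```

With two nodes, one holding a CPU and GPUs and one holding only a CPU, a screening condition that keeps GPU execution ends leaves only the first node's GPU execution end active.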
The technical solution provided by the embodiment of the invention automatically generates, within a single machine, a plurality of computing task heterogeneous execution ends for different computing devices, and each execution end uses its own heterogeneous computing device to perform multi-machine heterogeneous parallel task computation.
In a specific embodiment, the commands in the embodiment of the present invention may include a message command, a plug-in command, and the like. Wherein:
The message command is mainly used for message communication between nodes and consists of a command name, command parameter items, and parameter values, for example the heterogeneous computing resource search command FindExecutror -GPUClient=true -DriverVersion>=1 0 -GPUMem>=1g. Here FindExecutror is the command name, and each item after it is a command parameter followed by its specific value.
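As a rough illustration of how such a message command might be taken apart, the sketch below splits a command string into its name and a list of (parameter, operator, value) triples. The exact wire format is not given in the text, so the whitespace-separated `-Key<op>Value` layout and the function name `parse_message_command` are assumptions made for this example.

```python
import re

# One parameter token: a leading dash, a name, a comparison operator, a value.
_PARAM = re.compile(r"^-(\w+)(>=|<=|=|>|<)(.+)$")


def parse_message_command(text):
    """Split a message command into its name and (key, operator, value) parameters."""
    name, *tokens = text.split()
    params = []
    for token in tokens:
        match = _PARAM.match(token)
        if match is None:
            raise ValueError(f"malformed parameter: {token!r}")
        params.append(match.groups())
    return name, params


name, params = parse_message_command("FindExecutror -GPUClient=true -GPUMem>=1g")
```

Keeping the operator as a separate field lets a computing node decide how to compare each value (for example, a memory size with a unit suffix) without the parser needing to know the parameter's meaning.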
A plug-in command encapsulates a function: it is declared in the plug-in's configuration file and implemented in the plug-in itself. In cmd_calculate $TASKDATA, for example, cmd_calculate is the command for the plug-in's computing function and the command parameter $TASKDATA represents the task data. The parallel platform provides commonly used basic plug-in commands, such as data transmission and compression, and the parallel computing plug-in of each business application provides application-specific plug-in commands, such as illumination computation and task result processing. In the task flow configuration for automatically run parallel heterogeneous computation, the plug-in commands can be invoked in the order required by the algorithm, which allows complex geophysical algorithm flows to be customized; see the description of the embodiments below.
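The idea of declaring plug-in commands and calling them in order from a task flow can be sketched as follows. Only the name cmd_calculate and the placeholder $TASKDATA come from the text; the registry, the decorator that stands in for the configuration-file declaration, the cmd_decompress command, and the rule that each command's output becomes the next command's task data are assumptions for illustration.

```python
PLUGIN_COMMANDS = {}


def plugin_command(name):
    """Register a function under a plug-in command name (stand-in for the config declaration)."""
    def register(func):
        PLUGIN_COMMANDS[name] = func
        return func
    return register


@plugin_command("cmd_decompress")  # hypothetical platform-provided basic command
def decompress(task_data):
    return task_data.strip()


@plugin_command("cmd_calculate")  # command name taken from the text
def calculate(task_data):
    return f"result({task_data})"


def run_task_flow(flow, task_data):
    """Run plug-in commands in order; `$TASKDATA` stands for the current task data."""
    for line in flow:
        name, arg = line.split()
        if arg != "$TASKDATA":
            raise ValueError(f"unsupported argument: {arg}")
        # Assumption: the output of one command is the task data of the next.
        task_data = PLUGIN_COMMANDS[name](task_data)
    return task_data


flow = ["cmd_decompress $TASKDATA", "cmd_calculate $TASKDATA"]
output = run_task_flow(flow, " shot-7 ")
```

Because the flow is plain data, a different algorithm order only needs a different list of command lines, which is the kind of customization the paragraph describes.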
The following describes the steps according to the embodiment of the present invention in detail with reference to fig. 2 to 5.
1. First, the multi-machine heterogeneous parallel computing method provided by the embodiment of the invention is a method of using multi-machine heterogeneous resources for computing programs in which the multiple heterogeneous devices within a single machine compute either cooperatively or non-cooperatively.
The inventors identified the following technical problem. In model forward computation (the model being the geological model required by geophysical forward modeling), different heterogeneous devices compute shot by shot, the shot being the minimum unit into which computing tasks are divided: on the CPU, model forward modeling is performed per shot using multiple threads, and on the GPU it is performed per shot using OpenCL or CUDA. The computing devices work independently and do not cooperate, that is, the forward modeling of a single shot is never completed by the CPU and the GPU together. The computing performance of a CPU and a GPU, and of different types of GPU, differs enormously: for example, in the time a CPU takes to complete the forward computation of one shot, a GPU completes several to tens of shots.
In multi-machine parallel computing, one physical machine usually serves as one computing node, and all of its computing resources are given to a single computing program. When multi-machine heterogeneous parallel computing is carried out on such a physical machine, two problems arise. If tasks are allocated by the minimum task unit, the shot, then only one computing device can be used, because multiple heterogeneous devices cannot cooperate on the same computation; the performance of all computing resources in the computing node cannot be fully exploited, and the heterogeneous computing resources are used unreasonably. If several shots are allocated as one computing task unit, the difference in computing performance between devices means that the GPU finishes quickly and then waits for the CPU to finish, so the load on the computing devices is unbalanced; at the same time, it is difficult to use a debugging (tuning) algorithm to divide an accurately sized set of shots for each computing node so as to eliminate the load imbalance and idle waiting of the computing devices within a computing node.
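The load imbalance described above can be made concrete with a small simulation. The timings are invented for illustration (a CPU that needs 10 s per shot and a GPU that needs 1 s), and the "dynamic" model assumes that shots, the minimum task unit, are handed one at a time to whichever device is free, which is one plausible reading of giving each device its own execution end rather than something the text spells out.

```python
import heapq


def static_makespan(num_shots, seconds_per_shot):
    """Split shots evenly across devices up front; the slowest device sets the finish time."""
    n = len(seconds_per_shot)
    shares = [num_shots // n + (1 if i < num_shots % n else 0) for i in range(n)]
    return max(s * t for s, t in zip(shares, seconds_per_shot))


def dynamic_makespan(num_shots, seconds_per_shot):
    """Each device takes the next shot as soon as it becomes free."""
    free_at = [(0.0, i) for i in range(len(seconds_per_shot))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_shots):
        t, i = heapq.heappop(free_at)          # earliest available device
        done = t + seconds_per_shot[i]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


speeds = [10.0, 1.0]  # hypothetical: CPU 10 s per shot, GPU 1 s per shot
even_split = static_makespan(22, speeds)       # 110.0: the GPU idles while the CPU works
per_device = dynamic_makespan(22, speeds)      # 20.0: both devices stay busy
```

In this toy setting the even split leaves the GPU idle for most of the run, while per-device assignment finishes the same 22 shots in less than a fifth of the time.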
In addition, before multi-machine parallel computing can be performed, a default computing task execution end must be installed on each physical machine that takes part in the computation, so that the physical machine joins the multi-machine parallel computation as a computing node. The computing task execution end provides basic network communication, data transmission, computing resource management, the task running environment, and so on for multi-machine parallel computing. Using the resources of a machine as a single whole in this way cannot effectively support users' multi-machine heterogeneous computing needs.
Having identified this technical problem, the inventors propose a multi-machine heterogeneous parallel computing method for geophysical prospecting applications. The method automatically generates, on demand, a plurality of computing task heterogeneous execution ends for the different computing devices in a single machine, and uses the different heterogeneous devices separately to perform task computation.
In one implementation, after receiving the heterogeneous computing resource search command, each computing node performs hardware scanning on the physical machine, generates and starts a plurality of computing task heterogeneous execution ends according to the searched heterogeneous computing resource condition and the search parameter information, and feeds back heterogeneous computing resource information to the management node after the start of the plurality of computing task heterogeneous execution ends is completed, which may include:
After receiving the heterogeneous computing resource search command, each computing node performs a hardware scan of the physical machine; according to the heterogeneous computing resources found and the search parameter information, each computing node copies its default computing task heterogeneous execution end program into the same-level directory to obtain a plurality of computing task heterogeneous execution ends;
the default computing task heterogeneous execution end of each computing node starts each of the copied computing task heterogeneous execution ends;
and after the copied computing task heterogeneous execution ends have started, each computing node feeds back heterogeneous computing resource information to the management node.
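The copy step above might look roughly like the following sketch, which clones the default execution end's directory into sibling directories, one per device, and marks each copy with the device it should use. The directory naming scheme, the `executor.json` file, and its `device` key are hypothetical; actually starting the copies (for example with `subprocess.Popen`) is deliberately left out.

```python
import json
import shutil
from pathlib import Path


def clone_executors(default_dir, device_ids):
    """Copy the default execution-end directory once per device, into sibling directories."""
    default_dir = Path(default_dir)
    clones = []
    for device_id in device_ids:
        # Sibling directory next to the default execution end, e.g. executor_gpu0.
        target = default_dir.parent / f"{default_dir.name}_{device_id}"
        shutil.copytree(default_dir, target)
        # Mark the copy so it binds to exactly one device (hypothetical config format).
        (target / "executor.json").write_text(json.dumps({"device": device_id}))
        clones.append(target)
    return clones
```

Note that `shutil.copytree` raises an error if the target directory already exists, so a real implementation would need to decide whether to reuse or replace copies left over from an earlier search.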
In a specific implementation, computing task heterogeneous execution ends for the different computing devices are generated automatically within a single machine, each corresponding to one type of heterogeneous device. The original physical machine is thereby divided into several abstract computing nodes according to the type and number of its computing devices, and the different heterogeneous computing devices are used separately for multi-machine heterogeneous parallel task computation. This meets the needs of both heterogeneous cooperative and heterogeneous non-cooperative computing within a single machine, fully exploits the performance of all heterogeneous computing resources in the computing node, and achieves load balancing and high-performance computation of jobs.
In one embodiment, the multi-machine heterogeneous parallel computing method for geophysical prospecting applications may further include: when a computing node determines from the heterogeneous computing resource use confirmation message that a copied computing task heterogeneous execution end has not been selected, it shuts that execution end down; when it determines that the default computing task heterogeneous execution end has not been selected, it sets the default execution end to the idle state.
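The rule in this embodiment reduces to a three-way decision per execution end, sketched below. The state names, the dictionary layout (execution-end id mapped to a flag saying whether it is the default one), and the function name are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    ACTIVE = "active"
    CLOSED = "closed"


def apply_confirmation(executors, selected):
    """Return the new state of every execution end on one computing node.

    `executors` maps an execution-end id to True if it is the default one and
    False if it is a copy; `selected` is the set of ids named in the confirmation.
    """
    states = {}
    for exec_id, is_default in executors.items():
        if exec_id in selected:
            states[exec_id] = State.ACTIVE
        elif is_default:
            states[exec_id] = State.IDLE      # the default end stays alive but idle
        else:
            states[exec_id] = State.CLOSED    # unselected copies are shut down
    return states
```

Keeping the default execution end alive in the idle state means the machine can still answer a later search command, while closing the unselected copies releases the devices they were bound to.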
In the implementation, the processing flow of each computing node after receiving the resource use confirmation message ensures the stable performance of multi-machine heterogeneous parallel computing aiming at geophysical prospecting application.
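The handling of the confirmation message can be sketched as a small state decision. The `State` enumeration, the `apply_confirmation` function and the `is_default` flag are hypothetical names chosen for illustration; the patent does not specify a message format.

```python
from enum import Enum


class State(Enum):
    WORKING = "working"  # selected: takes part in the job
    IDLE = "idle"        # default executor that was not selected
    CLOSED = "closed"    # copied executor that was not selected


def apply_confirmation(executors, selected_ids):
    """Decide the next state of every executor on one computing node.

    executors maps an executor id to {"is_default": bool};
    selected_ids is the set of ids named in the confirmation message.
    """
    states = {}
    for executor_id, info in executors.items():
        if executor_id in selected_ids:
            states[executor_id] = State.WORKING
        elif info["is_default"]:
            states[executor_id] = State.IDLE
        else:
            states[executor_id] = State.CLOSED
    return states
```

Keeping the default executor alive in the idle state is what lets the node answer the next search command.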
Specifically, as shown in fig. 2, the processing method by which the embodiment of the invention makes full use of all heterogeneous computing resources in the computing nodes and achieves load balancing and high-performance computing for the job comprises the following steps:
1. Before the user performs multi-machine parallel computation on the user node, the available heterogeneous computing resources must be searched for and screened. To search for computing resources, the user node sends the management node a heterogeneous computing resource search command that carries additional search parameters (the search parameter information).
2. On receiving the command sent by the user (the heterogeneous computing resource search command), the management node broadcasts it to each computing node.
3. A computing node is usually a physical machine or cluster node on which a computing task execution end is installed (such as the computing task execution end in fig. 2, also referred to as the default computing task heterogeneous execution end; its structure may be as shown in fig. 3). The execution end installed by default represents the physical machine or physical node on which it is located. After receiving the heterogeneous computing resource search command, it uses its heterogeneous device detection module (a module of the default computing task heterogeneous execution end, as shown in fig. 3) to detect heterogeneous devices. When available computing resources such as GPUs are detected, the default computing task heterogeneous execution end generates and starts further execution ends as follows (as shown in fig. 3): according to the parameter requirements (the search parameter information), it copies the computing task execution end program (the default computing task heterogeneous execution end, such as the computing task execution end shown in fig. 2) several times into the peer directory where that program is located. The copies (the copied computing task heterogeneous execution ends, such as the computing task heterogeneous execution ends shown in fig. 2) each correspond to a different heterogeneous device, and identifying marks are added to their configuration files and start-up parameters to distinguish them.
After the computing task heterogeneous execution end programs have been generated, the default computing task heterogeneous execution end starts each of them. One computing task heterogeneous execution end corresponds to one type of heterogeneous device, so the original physical machine is split into a plurality of abstract computing nodes according to the types and numbers of its computing devices. After each computing task execution end has started, it sends a registration message to the management node.
4. The management node aggregates the computing node information and sends it to the user node.
5. The user node parses the heterogeneous computing resource information fed back by each computing node and screens the computing resources using the screening conditions. Because GPU heterogeneous computing places many requirements on hardware resources, driver versions and the like, the screening conditions are used to select the available computing resources and to remove those that do not meet the requirements or will not be used, forming a computing resource list that is fed back to the management node.
6. After receiving the computing resource list selected by the user, the management node sends a confirmation message to each computing node.
7. After receiving the confirmation message, each computing node proceeds according to its content. If a computing task heterogeneous execution end of the computing node (either the default computing task execution end or a copied computing task heterogeneous execution end) has been selected by the user, that execution end begins participating in the seismic exploration job computation. If a copied computing task heterogeneous execution end has been excluded, it is closed; if the default computing task execution end (the default computing task heterogeneous execution end) has been excluded, it is set to an idle state.
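The screening performed by the user node in step 5 can be sketched as a filter over the reported resources. The record fields (`id`, `driver`, `memory_gb`), the dotted driver version format and the function names are assumptions made for this example; the patent names hardware resources and driver versions as screening criteria but does not define a data layout.

```python
def parse_version(text):
    """Turn a dotted version string such as '470.82' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def screen_resources(reported, min_driver, min_memory_gb, excluded=()):
    """Return the ids of reported resources that pass every screening condition.

    reported is a list of dicts with the assumed keys
    'id', 'driver' and 'memory_gb'; excluded lists ids the user
    does not want to use even if they qualify.
    """
    selected = []
    for resource in reported:
        if resource["id"] in excluded:
            continue
        if parse_version(resource["driver"]) < parse_version(min_driver):
            continue
        if resource["memory_gb"] < min_memory_gb:
            continue
        selected.append(resource["id"])
    return selected
```

The returned list plays the role of the computing resource list that the user node feeds back to the management node.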
2. Next, the processing method for autonomously executing computing tasks in the multi-machine heterogeneous parallel computing method provided by the embodiment of the invention is introduced.
In one embodiment, the multi-machine heterogeneous parallel computing method for geophysical prospecting application may further include:
The user node divides the job into computing tasks according to the selected heterogeneous computing resource information, sets a computing task flow configuration file, and, after the computing task flow configuration file is set, performs the job submission and start operations; the computing task flow configuration file comprises a computing task processing flow;
After receiving trigger instructions of job submission and starting operation, the management node sends the computing task flow configuration file to each computing node;
each computing node determines a computing task heterogeneous execution end participating in multi-machine parallel computing according to the heterogeneous computing resource use confirmation message, and the determined computing task heterogeneous execution end performs seismic exploration multi-machine parallel computing, and the method comprises the following steps:
and the heterogeneous execution end of the calculation task determined in each calculation node performs seismic exploration calculation task processing according to the calculation task processing flow.
In the implementation, after the heterogeneous computing resources are selected, the user node can automatically execute the seismic exploration computing task processing according to the preset computing task flow configuration file, so that the efficiency of multi-machine heterogeneous parallel computing processing for geophysical prospecting application is improved.
In one embodiment, the performing, by the heterogeneous execution end of the computing task determined in each computing node, the seismic exploration computing task processing according to the computing task processing flow may include:
The heterogeneous execution end of the computing task loads a corresponding computing task plug-in according to the processing flow of the computing task; the computing task processing flow comprises a predefined computing task plug-in and a calling sequence and a calling mode thereof;
and the computing task heterogeneous execution end calls, through the loaded computing task plug-in, the corresponding geophysical prospecting algorithm from a pre-established geophysical prospecting service algorithm library, thereby completing the seismic exploration computing task processing.
In specific implementation, the corresponding computing task plug-in is loaded according to the computing task processing flow, and the corresponding geophysical prospecting algorithm is then called from the pre-established geophysical prospecting service algorithm library to complete the seismic exploration computing task processing. Because the processing is completed by calling existing geophysical prospecting algorithms or programs, the labor and material costs of developing system software and the cost of system modification are reduced.
The processing method for autonomously executing the calculation task is described in detail below with reference to fig. 4 and 5.
In specific implementation, model forward modeling is implemented by an independent console program, while model illumination is implemented by calling an independent algorithm library. These programs and algorithms were developed for a single machine (the algorithms here are specific geophysical prospecting algorithms, such as model forward modeling illumination calculation, elastic wave illumination calculation and Gaussian illumination calculation, which belong to different geophysical prospecting services). They share a similar processing procedure: preparing data (the different data an algorithm needs to perform its function, including model file data, observation system file data, parameter file data and the like), setting calculation parameters (the parameters a specific algorithm requires, such as model range parameters, calculated shot range parameters, target stratum depth parameters and the like), and then performing calculation and result processing shot by shot. They differ, however, in calling details, calling sequence and the like. How to implement multi-machine parallel computing while changing the existing programs and algorithm libraries as little as possible is one of the problems the present invention needs to solve, and it can be addressed by the computing task implementing unit in fig. 4.
The invention provides a method for autonomous task processing on the computing nodes, which combines computing task plug-ins with task flow (computing task processing flow) configuration to call existing libraries and programs and thereby realize multi-machine parallel computing for different geophysical prospecting applications. The computing task plug-ins implement the calls to different algorithm interfaces and program functions, and the task flow configuration is used to customize and adjust the processing procedure of each service algorithm.
In specific implementation, as shown in fig. 4, a computing task plug-in (also referred to as a geophysical prospecting algorithm plug-in, as shown in fig. 4) uses a plug-in command interface to call different algorithm interfaces and application program functions. The function interface names are defined in the plug-in configuration file of the computing task plug-in, and the command parsing module and plug-in running module in the computing task execution end parse the plug-in commands and invoke the interfaces. Defining plug-in interfaces and then parsing and invoking them in this way makes it possible to call function interfaces with different names and different parameters. By defining and implementing different plug-in interfaces, the data preparation, parameter setting, calculation, result processing and other function interfaces of an algorithm library (such as the geophysical prospecting algorithm library in fig. 4) or an application program (such as the geophysical prospecting algorithm program in fig. 4) can be called. The command parsing module in fig. 4 may also be used to parse heterogeneous computing resource search commands.
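The plug-in command interface described above can be illustrated with a small dispatcher. Everything named here is hypothetical: the `ForwardModelingPlugin` class stands in for a wrapper around an existing algorithm library, `PLUGIN_CONFIG` stands in for the plug-in configuration file, and the `name key=value` command syntax is an assumption, since the patent does not define a command format.

```python
import shlex


class ForwardModelingPlugin:
    """Stand-in for a plug-in that wraps an existing algorithm library."""

    def prepare_data(self, model):
        return f"prepared {model}"

    def compute(self, shot):
        return f"computed shot {int(shot)}"


# Assumed plug-in configuration: command name -> function interface name.
PLUGIN_CONFIG = {"prepare": "prepare_data", "run": "compute"}


def parse_command(line):
    """Command parsing: split 'name key=value ...' into (name, kwargs)."""
    name, *pairs = shlex.split(line)
    kwargs = dict(pair.split("=", 1) for pair in pairs)
    return name, kwargs


def call_plugin(plugin, config, line):
    """Plug-in running: look up the interface by name and invoke it."""
    name, kwargs = parse_command(line)
    if name not in config:
        raise KeyError(f"unknown plug-in command: {name}")
    return getattr(plugin, config[name])(**kwargs)
```

Because the interface name is resolved from configuration at run time, a plug-in with differently named functions or different parameters needs only a different configuration, not a change to the execution end.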
In specific implementation, a computing node carries out the calculation of a geophysical prospecting algorithm such as model forward modeling or model illumination according to a certain algorithm flow, for example data preparation, calculation and result feedback. Different types of geophysical prospecting algorithms differ greatly in their processing steps and use different data and parameters. Their implementations also differ: some call algorithm libraries, some call processes, and some call other plug-ins. How to invoke these different geophysical prospecting algorithms directly on the existing multi-machine parallel platform during multi-machine parallel computing is one of the inventive points. The approach is to configure the task flow (computing task processing flow) of the computing node and to predefine the computing task plug-ins together with their calling sequence and calling mode. That is, the plug-in interfaces and their parameters are arranged in the plug-in configuration file of the computing task plug-in in a specified format and order, and the calling sequence and calling mode of the plug-in interfaces are defined in the task flow configuration, which controls the calculation process of each algorithm.
In specific implementation, as shown in fig. 4, a plug-in running module and a task flow parsing and execution module are provided in the computing task execution end (computing task heterogeneous execution end). The plug-in running module implements functions such as plug-in loading, plug-in interface parsing and plug-in function calling, and the task flow parsing and execution module in fig. 4 implements functions such as task flow parsing and task flow execution. To provide complex flow control, some basic process control primitives, such as conditional judgment, looping and synchronization, are implemented in the task flow configuration. Common multi-machine parallel computing functions, such as file transmission, compression and decompression, are built in as internal plug-in commands that application plug-ins can call directly. Used together, the two modules realize a computing task execution method in which the plug-in interface is the calling unit, the task flow gives the execution order, and the plug-in function is the carrier. In addition, the network communication module in fig. 3 may be used for communication between the computing node and the management node, and the flow parsing module in fig. 3 corresponds to the task flow parsing and execution module in fig. 4.
In specific implementation, as shown in fig. 5, to call different algorithms or programs at the computing task execution end, the task flow is first read, parsed and executed by the task flow parsing and execution module. The task flow must be configured separately according to the algorithm characteristics of each application. It generally includes steps such as fetching the input data and the application plug-in, loading the application plug-in, preprocessing data, requesting a computing task, executing the computation, returning the task result and unloading the application plug-in, and each step calls the relevant plug-in interface command and passes the corresponding interface parameters. In this way, different service algorithm functions such as two-dimensional model forward modeling and model illumination can be called through the computing task plug-in, and an algorithm program such as two-dimensional model illumination can also be invoked directly as a process to perform the calculation. This minimizes adjustment and modification of the original algorithms and programs, allows them to be ported to the multi-machine platform as directly as possible for multi-machine heterogeneous parallel computing, and lets the different algorithm tasks be executed automatically by calling the computing task plug-ins according to the preset computing task flow.
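A task flow of this kind can be sketched as a list of steps interpreted in order, with a loop primitive for shot-by-shot computation. The step layout (`call`, `args`, `loop`, `over`, `body`), the `$name` placeholder syntax and the function names are assumptions made for this example; the patent does not publish its configuration format, and only the loop primitive is shown here.

```python
def resolve(value, context):
    """Replace a '$name' placeholder with the loop variable it refers to."""
    if isinstance(value, str) and value.startswith("$"):
        return context[value[1:]]
    return value


def run_flow(steps, commands, context=None):
    """Execute flow steps in order and collect what each command returns.

    A step is either {"call": name, "args": {...}} or
    {"loop": variable, "over": [...], "body": [steps]}.
    commands maps a command name to a callable (the plug-in interface).
    """
    context = {} if context is None else context
    results = []
    for step in steps:
        if "loop" in step:
            for value in step["over"]:
                inner = {**context, step["loop"]: value}
                results.extend(run_flow(step["body"], commands, inner))
        else:
            args = {key: resolve(val, context)
                    for key, val in step.get("args", {}).items()}
            results.append(commands[step["call"]](**args))
    return results
```

Changing the order of the steps or the loop range changes the processing procedure without touching the plug-in code, which is the point of separating flow configuration from plug-in functions.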
3. Next, an exception handling method in the multi-machine heterogeneous parallel computing process provided by the embodiment of the invention is introduced.
In an embodiment, the multi-machine heterogeneous parallel computing method for geophysical prospecting application may further include:
When an abnormality of the management node is detected, the user node checks the computing task results it has received, merges the completed computing task results, and produces a list of the unfinished computing tasks.
In specific implementation, the exception handling implementation mode in the multi-machine heterogeneous parallel computing process ensures the system stability of the seismic exploration multi-machine heterogeneous parallel computing process and ensures the accuracy of results.
In an embodiment, the multi-machine heterogeneous parallel computing method for geophysical prospecting application may further include:
When an abnormality of the user node is detected, the user node restarts its program, checks the contents under the job result list, merges the completed computing task results, and produces a list of the unfinished computing tasks.
In specific implementation, the exception handling implementation mode in the multi-machine heterogeneous parallel computing process ensures the system stability of the seismic exploration multi-machine heterogeneous parallel computing process and ensures the accuracy of results.
In an embodiment, the multi-machine heterogeneous parallel computing method for geophysical prospecting application may further include:
When an abnormality of a computing node is detected, the management node reassigns the computing tasks of the abnormal computing node to other computing nodes.
In specific implementation, the exception handling implementation mode in the multi-machine heterogeneous parallel computing process ensures the system stability of the seismic exploration multi-machine heterogeneous parallel computing process and ensures the accuracy of results.
In specific implementation, various abnormalities can occur during multi-machine parallel computing, such as an abnormal computing node, an abnormal management node or an abnormal user node, and each case is handled differently. Abnormalities are detected through the network connections and heartbeat threads between nodes. If the management node is abnormal, the user node checks the task results it has received, merges the completed computing task results, and produces a list of the unfinished computing tasks. If the user node is abnormal, the user node program is restarted, automatically checks the contents under the job result list, merges the completed computing task results, and produces a list of the unfinished computing tasks. If a computing node is abnormal, the management node reassigns the computing tasks that node was computing to other nodes, which ensures stable operation of multi-machine heterogeneous parallel computing for geophysical prospecting applications.
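The three handling paths can be sketched with plain functions. This is an illustration only: the heartbeat timeout rule, the round-robin reassignment policy and all function names are assumptions, since the patent states that heartbeats are used and that tasks are reassigned but does not say how.

```python
def find_dead_nodes(last_heartbeat, now, timeout):
    """Return nodes whose last heartbeat is older than the timeout."""
    return sorted(node for node, seen in last_heartbeat.items()
                  if now - seen > timeout)


def split_finished(all_tasks, results):
    """Merge completed task results and list the unfinished tasks."""
    merged = {task: results[task] for task in all_tasks if task in results}
    unfinished = [task for task in all_tasks if task not in results]
    return merged, unfinished


def reassign(assignments, dead, alive):
    """Move the tasks of dead nodes onto alive nodes, round-robin."""
    orphaned = [task for node in dead for task in assignments.get(node, [])]
    if orphaned and not alive:
        raise ValueError("no alive node can take over the tasks")
    updated = {node: list(tasks) for node, tasks in assignments.items()
               if node not in dead}
    for node in alive:
        updated.setdefault(node, [])
    for position, task in enumerate(orphaned):
        updated[alive[position % len(alive)]].append(task)
    return updated
```

`split_finished` covers both the management node and user node cases, which differ only in where the results are read from; `reassign` covers the computing node case.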
4. Next, the multi-machine heterogeneous parallel computing method provided by the embodiment of the invention is introduced in whole.
In specific implementation, to realize multi-machine heterogeneous parallel computing for different geophysical prospecting applications such as model forward modeling, the embodiment of the invention provides a multi-machine heterogeneous parallel computing processing flow and method based on autonomous execution of computing tasks, which mainly comprises the following steps:
① Preparing job data: prepare the computing task flow configuration file, the computing task plug-in and its configuration file, the dynamic library programs it depends on (the geophysical prospecting service algorithm library, which may comprise a geophysical prospecting algorithm library and a geophysical prospecting algorithm program) and the like, and prepare the input data required by the multi-machine job, such as model forward modeling or model illumination. The computing task flow configuration file must correspond to the computing task plug-in and its configuration file; the calling sequence of the plug-in function interfaces, that is, the execution order and steps of the computing task, is defined in the computing task flow configuration file according to the algorithm flow.
② Heterogeneous computing resource screening: before multi-machine heterogeneous parallel computing, the required computing resources must be screened. A heterogeneous computing resource search command is sent to the management node. After the default computing task execution end on each computing node receives the command, it performs a hardware scan of the physical machine or physical node, and generates and starts a plurality of dedicated computing task execution ends (computing task heterogeneous execution ends) according to the heterogeneous resources found, with one dedicated computing task execution end corresponding to one heterogeneous resource. After a computing task heterogeneous execution end has started, it registers with the management node as an available computing resource. The user screens the available computing resources according to the computing needs and submits the final screening result to the management node. The management node processes the screening result: each selected computing task heterogeneous execution end participates in the job, an unselected default computing task execution end is restored to an idle state, and an unselected copied computing task heterogeneous execution end is closed.
③ Job submission and initiation: after heterogeneous computing resources are selected, task division is carried out on the jobs according to the data range to be computed, meanwhile, information of a computing task flow configuration file is set, and after setting of job information is completed, job submission and job starting work are carried out.
④ Job running: after the user submits and starts the job, the management node first sends the information of the computing task flow configuration file to each computing node. Each computing node reads the configuration file from the designated location to its local storage according to this information, then loads and parses it and processes tasks autonomously according to the task processing flow it defines, for example fetching the computing task plug-in or program, the program's input data and the like from a remote location into a local directory on the node, loading the computing task plug-in, requesting a computing task, computing the task and returning the task result. Each computing node thus carries out an autonomous, customized task processing flow. Calls to different geophysical prospecting algorithm libraries, or to calculation processes, can be made through the plug-in function interfaces.
⑤ Job completion: when the job is complete, the user processes the computing task results and then notifies the management node that the job is finished. The management node sends a job completion command to each computing node that took part in the job. After the computing task execution end of a computing node receives the job completion message, it completes its unloading and clean-up work; a default computing task execution end (default computing task heterogeneous execution end) is then reset to an idle state, while a copied computing task heterogeneous execution end exits.
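The task division in step ③ can be illustrated by splitting a shot range into contiguous tasks. Dividing by shot number and the `shots_per_task` parameter are assumptions for this sketch; the patent says only that the job is divided according to the data range to be computed.

```python
def divide_shots(first_shot, last_shot, shots_per_task):
    """Split the inclusive shot range into contiguous (start, end) tasks."""
    if shots_per_task <= 0:
        raise ValueError("shots_per_task must be positive")
    tasks = []
    start = first_shot
    while start <= last_shot:
        end = min(start + shots_per_task - 1, last_shot)
        tasks.append((start, end))
        start = end + 1
    return tasks
```

Each `(start, end)` pair would be one computing task that an execution end requests, computes and returns during step ④.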
The following is another example to facilitate an understanding of how the invention may be practiced.
The invention provides a multi-machine heterogeneous parallel method applicable to calculation nodes of different geophysical prospecting algorithms, which mainly comprises the following implementation processes:
1) Computing task plug-in development
Develop a computing task plug-in that implements function calls to the geophysical prospecting service algorithm library, such as the service algorithm library shown in fig. 5, which may include a geophysical prospecting algorithm library (as shown in fig. 4), a geophysical prospecting algorithm program (as shown in fig. 4), or other algorithm plug-ins.
Develop the configuration file of the computing task plug-in to define the plug-in interfaces.
Design the computing task processing flow, setting the interface calling sequence and parameters (the calling sequence and calling mode) of the computing task plug-in according to the processing steps of the geophysical prospecting algorithm.
2) Multi-machine heterogeneous parallel computing operation
① Screening heterogeneous computing resources (a multi-machine heterogeneous resource using and processing method compatible with a collaborative or non-collaborative computing program of multi-heterogeneous equipment in a single machine)
The user sends a heterogeneous computing resource search command to the management node, and the management node notifies the idle computing nodes to search for heterogeneous computing resources. The default computing task execution end (default computing task heterogeneous execution end) of each computing node receives the search command, detects the available heterogeneous computing resources in the physical machine, generates and starts computing task heterogeneous execution end programs representing the corresponding heterogeneous computing resources, and registers them with the management node. That is, the default computing task execution end program is copied and configured in its peer directory to generate a plurality of dedicated computing task execution ends (computing task heterogeneous execution ends) that compute on different heterogeneous devices, and these execution ends are started and registered with the management node for use. The management node receives the registration information of each computing task heterogeneous execution end and feeds it back to the user node, and the user screens the resources. The selected computing task execution ends (including the default computing task heterogeneous execution end and the copied computing task heterogeneous execution ends) take part in the parallel job computation, an unselected default computing task execution end is set to idle, and an unselected copied computing task heterogeneous execution end is closed.
② Job preparation and job submission (a method for autonomous operation of a computing node based on a computing task plug-in and a computing task flow design)
Before a multi-machine parallel job is run, the data the job requires must be prepared, including task input data and application program data (the computing task plug-in, plug-in configuration file, computing task flow, and the geophysical prospecting service algorithm library it depends on, which may also be called a dynamic library, and the like). After the job input data is prepared, the basic job information and task flow settings are submitted, and the job is then started.
That is, the computing task plug-in is used to call the function interfaces of the geophysical prospecting algorithm library, geophysical prospecting algorithm processes and geophysical prospecting algorithm plug-ins; the plug-in function interfaces are defined in the plug-in configuration file, and the calling sequence and parameter settings of those interfaces are defined in the computing task flow. In this way the parallel computing tasks of common geophysical prospecting algorithms can be encapsulated, and the plug-in running module and the task flow parsing and execution module of the computing task execution end (shown in fig. 4) load the plug-ins and call the plug-in functions according to the pre-designed task flow to carry out the geophysical prospecting computing task.
③ Operation of job
After the job is started, the management node distributes the task flow setting information (computing task processing flow) to the computing nodes. The computing task execution end of each computing node reads this information, then loads, parses and executes it, carrying out the computing tasks autonomously according to the task flow settings (computing task flow configuration file), for example reading task data, reading computing task plug-ins and programs, requesting computing tasks, loading plug-ins, performing the task computation and returning task results. The computing nodes run automatically according to the preset flow configuration (computing task flow configuration file), while the management node is responsible for the overall job progress.
④ Job completion
When the job is completed, the user node processes the task results and the management node notifies each computing node that the job is complete. Each default computing task execution end is set to an idle state and waits for the next job, and each copied computing task heterogeneous execution end is closed.
Based on the same inventive concept, the embodiment of the invention also provides a multi-machine heterogeneous parallel computing device for geophysical prospecting application, as described in the following embodiment. Since the principle of solving the problem by the multi-machine heterogeneous parallel computing device for geophysical prospecting application is similar to that of the multi-machine heterogeneous parallel computing method for geophysical prospecting application, implementation of the multi-machine heterogeneous parallel computing device for geophysical prospecting application can be referred to implementation of the multi-machine heterogeneous parallel computing method for geophysical prospecting application, and repeated parts are omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 6 is a schematic structural diagram of a multi-machine heterogeneous parallel computing device for geophysical prospecting according to an embodiment of the present invention, as shown in fig. 6, the device includes: a user node 01, a management node 02 and a plurality of computing nodes 03; wherein:
The user node 01 is configured to send a heterogeneous computing resource search command to the management node before seismic exploration multi-machine parallel computing is performed, the heterogeneous computing resource search command comprising search parameter information; and to screen the heterogeneous computing resource information fed back by each computing node using preset screening conditions to obtain selected heterogeneous computing resource information, and send the selected heterogeneous computing resource information to the management node;
The management node 02 is configured to broadcast the heterogeneous computing resource search command to each computing node; to send the heterogeneous computing resource information fed back by each computing node to the user node; and to send a heterogeneous computing resource use confirmation message to each computing node according to the selected heterogeneous computing resource information;
Each computing node 03 is configured to perform hardware scanning on its physical machine after receiving the heterogeneous computing resource search command, generate and start a plurality of computing task heterogeneous execution ends according to the heterogeneous computing resources found and the search parameter information, and feed back heterogeneous computing resource information to the management node after the plurality of computing task heterogeneous execution ends have been started, each computing task heterogeneous execution end corresponding to one type of heterogeneous computing resource; and to determine, according to the heterogeneous computing resource use confirmation message, the computing task heterogeneous execution ends participating in multi-machine parallel computing, the determined computing task heterogeneous execution ends performing the seismic exploration multi-machine parallel computing.
In particular, there may be a plurality of user nodes.
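The screening and confirmation steps performed by the user node and the management node can be illustrated with a minimal sketch. The record fields (`node`, `kind`, `count`, `memory_gb`), the two function names and the screening conditions are assumptions made for illustration only; the embodiment does not specify the format of the heterogeneous computing resource information or of the confirmation message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceInfo:
    """One fed-back record; all field names are hypothetical."""
    node: str         # computing node that reported the resource
    kind: str         # resource type, e.g. "cpu" or "gpu"
    count: int        # number of devices or cores of this type
    memory_gb: float  # memory available to this resource type


def screen_resources(reports, kinds, min_memory_gb):
    """User node: keep the reports that satisfy the preset screening conditions."""
    return [r for r in reports if r.kind in kinds and r.memory_gb >= min_memory_gb]


def build_confirmations(reports, selected):
    """Management node: map every reporting node to the resource kinds confirmed for use."""
    confirmations = {r.node: [] for r in reports}
    for r in selected:
        confirmations[r.node].append(r.kind)
    return confirmations


reports = [
    ResourceInfo("node-a", "cpu", 32, 128.0),
    ResourceInfo("node-a", "gpu", 2, 16.0),
    ResourceInfo("node-b", "cpu", 16, 64.0),
    ResourceInfo("node-b", "gpu", 1, 8.0),
]
selected = screen_resources(reports, kinds={"gpu"}, min_memory_gb=12.0)
confirmations = build_confirmations(reports, selected)
```

In this sketch a node whose resources were all screened out still receives a confirmation entry (an empty list), so that it can release the execution ends it started during the search.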
In one embodiment, the user node may be further configured to divide the job into computing tasks according to the selected heterogeneous computing resource information, set a computing task flow configuration file, and perform job submitting and starting operations after the computing task flow configuration file is set; the computing task flow configuration file comprises a computing task processing flow;
the management node may be further configured to send the computing task flow configuration file to each computing node after receiving a trigger instruction for submitting a job and starting an operation;
Each computing node may be specifically configured to have the determined computing task heterogeneous execution end perform seismic exploration computing task processing according to the computing task processing flow.
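As an illustration of what a computing task flow configuration file might contain, the following sketch parses a small JSON document into an ordered call plan. The JSON layout, the key names (`job`, `steps`, `order`, `plugin`, `mode`), the plug-in names and the two call modes `library` and `process` are all assumptions; the embodiment only states that the file contains a computing task processing flow.

```python
import json

# Hypothetical flow file content; steps are deliberately listed out of order.
FLOW_TEXT = """
{
  "job": "demo_job",
  "steps": [
    {"order": 2, "plugin": "migrate", "mode": "library"},
    {"order": 1, "plugin": "read_traces", "mode": "library"},
    {"order": 3, "plugin": "write_image", "mode": "process"}
  ]
}
"""

ALLOWED_MODES = ("library", "process")


def parse_flow(text):
    """Return the plug-ins in calling order as (plugin, mode) pairs."""
    flow = json.loads(text)
    steps = sorted(flow["steps"], key=lambda step: step["order"])
    for step in steps:
        if step["mode"] not in ALLOWED_MODES:
            raise ValueError(f"unknown call mode: {step['mode']}")
    return [(step["plugin"], step["mode"]) for step in steps]


call_plan = parse_flow(FLOW_TEXT)
```

Keeping the calling order explicit in the file lets each execution end run the flow on its own once the management node has distributed the file.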
In one embodiment, the determined heterogeneous execution end of the computing task may be specifically configured to:
loading a corresponding computing task plug-in according to the computing task processing flow, the computing task processing flow comprising predefined computing task plug-ins together with their calling sequence and calling mode;
and calling a corresponding geophysical prospecting algorithm from a pre-established geophysical prospecting service algorithm library through the loaded computing task plug-in, so as to complete the seismic exploration computing task processing.
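The plug-in mechanism described above can be sketched as follows. This is only an illustration under stated assumptions: the algorithm library is modelled as a plain dictionary, the two toy algorithms `scale` and `clip` stand in for real geophysical prospecting algorithms, and the class `TaskPlugin` and function `execute_flow` are hypothetical names.

```python
# Stand-in for the pre-established algorithm library (toy algorithms only).
ALGORITHM_LIBRARY = {
    "scale": lambda trace, factor=2.0: [x * factor for x in trace],
    "clip": lambda trace, limit=1.0: [max(-limit, min(limit, x)) for x in trace],
}


class TaskPlugin:
    """Hypothetical plug-in: binds one library algorithm to its parameters."""

    def __init__(self, algorithm_name, **params):
        self.algorithm = ALGORITHM_LIBRARY[algorithm_name]
        self.params = params

    def run(self, data):
        return self.algorithm(data, **self.params)


def execute_flow(flow, data):
    """Load each plug-in in the order given by the flow and chain the results."""
    for algorithm_name, params in flow:
        plugin = TaskPlugin(algorithm_name, **params)
        data = plugin.run(data)
    return data


flow = [("scale", {"factor": 2.0}), ("clip", {"limit": 3.0})]
result = execute_flow(flow, [0.5, 1.0, 2.0, -4.0])
```

Because the execution end only sees the plug-in interface, an existing algorithm can be reused by registering it in the library rather than by rewriting it.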
In one embodiment, each computing node may be specifically configured to:
after receiving the heterogeneous computing resource searching command, carrying out hardware scanning on the physical machine;
copying, according to the heterogeneous computing resources found and the search parameter information, the default computing task heterogeneous execution end program of the computing node in a peer directory of the computing node to obtain a plurality of computing task heterogeneous execution ends;
starting, by the default computing task heterogeneous execution end, each of the copied computing task heterogeneous execution ends;
and after the starting of the copied heterogeneous execution ends of the computing tasks is completed, feeding back heterogeneous computing resource information to the management node.
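The copy-and-start steps above might look like the following sketch. Several details are assumptions rather than part of the embodiment: the default program is represented by a tiny Python script named `exec_end.py`, each copy is named after the resource type it serves, and the resource type is passed as a command-line argument.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def spawn_execution_ends(default_program, resource_kinds):
    """Copy the default program next to itself, one copy per resource kind, and start each copy."""
    started = {}
    for kind in resource_kinds:
        copy_name = f"{default_program.stem}_{kind}{default_program.suffix}"
        copy_path = default_program.with_name(copy_name)  # same (peer) directory
        shutil.copy2(default_program, copy_path)
        started[kind] = subprocess.Popen(
            [sys.executable, str(copy_path), kind],
            stdout=subprocess.PIPE,
            text=True,
        )
    return started


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for the default execution end program (hypothetical content).
    default_program = Path(workdir) / "exec_end.py"
    default_program.write_text("import sys\nprint('ready ' + sys.argv[1])\n")

    processes = spawn_execution_ends(default_program, ["cpu", "gpu"])
    # Wait for every copy to report before feeding resource information back.
    ready = {kind: proc.communicate()[0].strip() for kind, proc in processes.items()}
    copies = sorted(path.name for path in Path(workdir).iterdir())
```

Waiting for every copy to report before replying mirrors the requirement that resource information is fed back only after all copied execution ends have started.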
In one embodiment, each computing node may also be configured to: according to the heterogeneous computing resource use confirmation message, close a copied computing task heterogeneous execution end when it is determined that the copied execution end is not selected, and set the default computing task heterogeneous execution end to an idle (free) state when it is determined that the default execution end is not selected.
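The handling of unselected execution ends can be summarised in a short sketch. The state names (`participating`, `idle`, `closed`) and the data layout are assumptions chosen for illustration.

```python
def apply_confirmation(execution_ends, selected):
    """Decide the state of every execution end on one computing node.

    execution_ends maps an execution end name to True if it is the default
    end and False if it is a copy; selected is the set of confirmed names.
    """
    states = {}
    for name, is_default in execution_ends.items():
        if name in selected:
            states[name] = "participating"
        elif is_default:
            states[name] = "idle"    # the default end stays alive but unused
        else:
            states[name] = "closed"  # an unselected copy is shut down
    return states


execution_ends = {"exec_end": True, "exec_end_cpu": False, "exec_end_gpu": False}
states = apply_confirmation(execution_ends, selected={"exec_end_gpu"})
```

Keeping the default end idle rather than closing it leaves the node able to answer a later search command.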
In one embodiment, the user node may comprise a first exception handling unit for:
When an abnormality of the management node is detected, check the computing task results already received, merge the completed computing task results, and output a list of unfinished computing tasks.
In one embodiment, the user node may comprise a second exception handling unit for:
When an abnormality of the user node is detected, restart the program, check the contents of the operation result directory, merge the completed computing task results, and output a list of unfinished computing tasks.
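Both exception handling units on the user node end with the same recovery step: merge what has finished and list what has not. A minimal sketch follows, assuming that each completed computing task leaves one file named `<task id>.result` in the result directory; that naming convention and the function name `recover_job` are hypothetical.

```python
import tempfile
from pathlib import Path


def recover_job(result_dir, task_ids):
    """Merge the results found on disk and list the tasks that have none."""
    merged, unfinished = {}, []
    for task_id in task_ids:
        result_file = Path(result_dir) / f"{task_id}.result"
        if result_file.is_file():
            merged[task_id] = result_file.read_text()
        else:
            unfinished.append(task_id)
    return merged, unfinished


with tempfile.TemporaryDirectory() as result_dir:
    # Simulate a job in which task2 had not finished when the failure occurred.
    (Path(result_dir) / "task1.result").write_text("image-part-1")
    (Path(result_dir) / "task3.result").write_text("image-part-3")
    merged, unfinished = recover_job(result_dir, ["task1", "task2", "task3"])
```

The unfinished list is what would be resubmitted, so completed work is not recomputed.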
In one embodiment, the management node may include a third exception handling unit for:
When an abnormality of a computing node is detected, reassign the computing tasks of the abnormal computing node to other computing nodes.
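One possible reassignment policy is sketched below. The embodiment does not say how tasks are redistributed; round-robin over the surviving nodes is an assumption made here, as are the function and variable names.

```python
from itertools import cycle


def reassign_tasks(assignments, failed_node):
    """Move the failed node's tasks to the surviving nodes in round-robin order."""
    survivors = [node for node in assignments if node != failed_node]
    if not survivors:
        raise RuntimeError("no computing node left to take over the tasks")
    updated = {node: list(assignments[node]) for node in survivors}
    for task, node in zip(assignments[failed_node], cycle(survivors)):
        updated[node].append(task)
    return updated


assignments = {
    "node-a": ["t1", "t2"],
    "node-b": ["t3"],
    "node-c": ["t4", "t5", "t6"],
}
updated = reassign_tasks(assignments, failed_node="node-c")
```

A load-aware policy could replace the round-robin step without changing the interface.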
An embodiment of the invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the multi-machine heterogeneous parallel computing method for geophysical prospecting applications when executing the computer program.
An embodiment of the invention also provides a computer-readable storage medium storing a computer program for executing the multi-machine heterogeneous parallel computing method for geophysical prospecting applications.
The technical solution provided by the embodiments of the invention has the following beneficial technical effects:
According to the different ways in which computing resources are used in multi-machine heterogeneous computing for geophysical prospecting applications, a multi-machine parallel computing method is abstracted that suits both heterogeneous cooperative computing within a single machine and heterogeneous non-cooperative computing. It can meet the multi-machine parallel heterogeneous computing requirements of different applications, uses existing computing resources to the greatest extent for load-balanced heterogeneous parallel computing, and achieves efficient multi-machine heterogeneous parallelism.
Meanwhile, to meet the processing flow requirements of different geophysical prospecting algorithms and to support different ways of implementing those algorithms, a multi-machine parallel computing method based on computing task plug-ins and computing task flows is implemented. Plug-in function interfaces call algorithm library functions or process interfaces to invoke the algorithms, the computing task flow is configured to run according to a specified algorithm flow, and computing tasks are executed autonomously at the computing nodes.
Based on these two points, a multi-machine heterogeneous parallel processing flow and method with a configurable computing task flow is provided: heterogeneous computing resource screening and computing task flow setting are prerequisites for job submission, jobs are processed by distributing and parsing the computing task flow, computing tasks are executed autonomously with processing according to a preset flow at the core, and the management node monitors and manages the running of the whole job at a higher, macroscopic level.
The invention can make maximal use of existing multi-machine heterogeneous computing resources and directly reuse existing geophysical prospecting algorithms to realize multi-machine heterogeneous computing.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. A multi-machine heterogeneous parallel computing method for geophysical prospecting applications, comprising:
Before performing seismic exploration multi-machine parallel computation, a user node sends a heterogeneous computing resource searching command to a management node; the heterogeneous computing resource search command comprises search parameter information;
the management node broadcasts the heterogeneous computing resource searching command to each computing node;
After receiving the heterogeneous computing resource searching command, each computing node performs hardware scanning on the physical machine, generates and starts a plurality of computing task heterogeneous execution ends according to the searched heterogeneous computing resource conditions and the searching parameter information, and feeds back heterogeneous computing resource information to the management node after the starting of the plurality of computing task heterogeneous execution ends is completed; each heterogeneous execution end of the computing task corresponds to one type of heterogeneous computing resource;
The management node sends heterogeneous computing resource information fed back by each computing node to the user node;
The user node utilizes preset screening conditions to carry out screening treatment on heterogeneous computing resource information fed back by each computing node to obtain selected heterogeneous computing resource information, and the selected heterogeneous computing resource information is sent to the management node;
The management node sends heterogeneous computing resource use confirmation information to each computing node according to the selected heterogeneous computing resource information;
Each computing node determines a computing task heterogeneous execution end participating in multi-machine parallel computing according to the heterogeneous computing resource use confirmation message, and the determined computing task heterogeneous execution end performs seismic exploration multi-machine parallel computing;
The multi-machine heterogeneous parallel computing method for geophysical prospecting application further comprises the following steps: the user node divides the operation into calculation tasks according to the selected heterogeneous calculation resource information, sets a calculation task flow configuration file, and performs operation of submitting and starting the operation after the calculation task flow configuration file is set; the computing task flow configuration file comprises a computing task processing flow; after receiving trigger instructions of job submission and starting operation, the management node sends the computing task flow configuration file to each computing node;
each computing node determines a computing task heterogeneous execution end participating in multi-machine parallel computing according to the heterogeneous computing resource use confirmation message, and the determined computing task heterogeneous execution end performs seismic exploration multi-machine parallel computing, and the method comprises the following steps: the heterogeneous execution end of the calculation task determined in each calculation node carries out seismic exploration calculation task processing according to the calculation task processing flow;
The heterogeneous execution end of the calculation task determined in each calculation node carries out seismic exploration calculation task processing according to the calculation task processing flow, and the heterogeneous execution end comprises the following steps: the heterogeneous execution end of the computing task loads a corresponding computing task plug-in according to the processing flow of the computing task; the computing task processing flow comprises a predefined computing task plug-in and a calling sequence and a calling mode thereof; the heterogeneous execution end of the calculation task calls a corresponding geophysical prospecting algorithm from a pre-established geophysical prospecting business algorithm library through a loaded calculation task plug-in, and the seismic exploration calculation task processing is completed;
After receiving the heterogeneous computing resource search command, each computing node performs hardware scanning on the physical machine, generates and starts a plurality of computing task heterogeneous execution ends according to the searched heterogeneous computing resource condition and the search parameter information, and feeds back heterogeneous computing resource information to the management node after the starting of the plurality of computing task heterogeneous execution ends is completed, wherein the method comprises the following steps: after each computing node receives the heterogeneous computing resource searching command, carrying out hardware scanning on a physical machine; each computing node copies the default computing task heterogeneous execution end program of the computing node on the peer directory to obtain a plurality of computing task heterogeneous execution ends according to the searched heterogeneous computing resource condition and the searching parameter information; the default heterogeneous execution end of the computing task of each computing node respectively starts the copied heterogeneous execution ends of the computing tasks; and after the starting of the copied heterogeneous execution ends of the computing tasks is completed, each computing node feeds back heterogeneous computing resource information to the management node.
2. The multi-machine heterogeneous parallel computing method for geophysical prospecting application according to claim 1, further comprising:
And each computing node performs closing processing on the copied computing task heterogeneous execution end when determining that the copied computing task heterogeneous execution end is not selected according to the heterogeneous computing resource use confirmation message, and sets the default computing task heterogeneous execution end to be in a free state when determining that the default computing task heterogeneous execution end is not selected.
3. The multi-machine heterogeneous parallel computing method for geophysical prospecting application according to claim 1, further comprising:
When the abnormality of the management node is detected, the user node detects the received calculation task results, combines the completed calculation task results and simultaneously gives an unfinished calculation task list.
4. The multi-machine heterogeneous parallel computing method for geophysical prospecting application according to claim 1, further comprising:
When the abnormality of the user node is detected, the user node performs program restarting processing, detects the content under the operation result directory, performs merging processing on the completed calculation task results, and simultaneously gives an unfinished calculation task list.
5. The multi-machine heterogeneous parallel computing method for geophysical prospecting application according to claim 1, further comprising:
When the abnormal occurrence of the computing node is detected, the management node reassigns the computing task of the computing node with the abnormal occurrence to other computing nodes.
6. A multi-machine heterogeneous parallel computing device for geophysical prospecting applications, comprising:
The user node is used for sending heterogeneous computing resource searching commands to the management node before performing seismic exploration multi-machine parallel computation; the heterogeneous computing resource search command comprises search parameter information; screening the heterogeneous computing resource information fed back by each computing node by utilizing preset screening conditions to obtain selected heterogeneous computing resource information, and sending the selected heterogeneous computing resource information to a management node;
The management node is used for broadcasting heterogeneous computing resource searching commands to all computing nodes; heterogeneous computing resource information fed back by each computing node is sent to the user node; sending heterogeneous computing resource use confirmation information to each computing node according to the selected heterogeneous computing resource information;
Each computing node is used for carrying out hardware scanning on the physical machine after receiving the heterogeneous computing resource searching command, generating and starting a plurality of computing task heterogeneous execution ends according to the searched heterogeneous computing resource condition and the searching parameter information, and feeding back heterogeneous computing resource information to the management node after the starting of the plurality of computing task heterogeneous execution ends is completed; each heterogeneous execution end of the computing task corresponds to one type of heterogeneous computing resource; determining a calculation task heterogeneous execution end participating in multi-machine parallel calculation according to the heterogeneous calculation resource use confirmation message, and performing seismic exploration multi-machine parallel calculation by the determined calculation task heterogeneous execution end;
The user node is also used for dividing the operation into calculation tasks according to the selected heterogeneous calculation resource information, setting a calculation task flow configuration file, and carrying out operation of submitting and starting the operation after the calculation task flow configuration file is set; the computing task flow configuration file comprises a computing task processing flow;
the management node is also used for sending the calculation task flow configuration file to each calculation node after receiving a trigger instruction of the operation submitting and starting operation;
Each computing node is specifically used for performing seismic exploration computing task processing by the determined computing task heterogeneous execution end according to the computing task processing flow;
The determined heterogeneous execution end of the computing task is specifically used for: loading a corresponding computing task plug-in according to the computing task processing flow; the computing task processing flow comprises a predefined computing task plug-in and a calling sequence and a calling mode thereof; calling a corresponding geophysical prospecting algorithm from a pre-established geophysical prospecting service algorithm library through a loaded calculation task plug-in, and completing seismic exploration calculation task processing;
Each computing node is specifically configured to: after receiving the heterogeneous computing resource searching command, carrying out hardware scanning on the physical machine; copying the default heterogeneous execution end program of the computing node on the same-level catalog of the computing node according to the searched heterogeneous computing resource condition and the searching parameter information to obtain a plurality of heterogeneous execution ends of the computing task; the default heterogeneous execution ends of the computing tasks respectively start the copied heterogeneous execution ends of the computing tasks; and after the starting of the copied heterogeneous execution ends of the computing tasks is completed, feeding back heterogeneous computing resource information to the management node.
7. The multi-machine heterogeneous parallel computing device for geophysical prospecting applications as recited in claim 6, wherein each computing node is further configured to: and according to the heterogeneous computing resource use confirmation message, when the fact that the copied computing task heterogeneous execution end is not selected is determined, closing the copied computing task heterogeneous execution end, and when the fact that the default computing task heterogeneous execution end is not selected is determined, setting the default computing task heterogeneous execution end to be in a free state.
8. The multi-machine heterogeneous parallel computing device for geophysical applications of claim 6 wherein the user node comprises a first exception handling unit to:
When the abnormality of the management node is detected, the received calculation task results are detected, the completed calculation task results are combined, and an unfinished calculation task list is given.
9. The multi-machine heterogeneous parallel computing device for geophysical applications of claim 6 wherein the user node comprises a second exception handling unit for:
When the abnormality of the user node is detected, program restarting processing is carried out, the content under the operation result directory is detected, and the completed calculation task results are combined and simultaneously an unfinished calculation task list is given.
10. The multi-machine heterogeneous parallel computing device for geophysical prospecting application according to claim 6, wherein the management node comprises a third exception handling unit for:
And when detecting that the computing node is abnormal, reassigning the computing task of the computing node with the abnormal occurrence to other computing nodes.
11. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1 to 5 when executing the computer program.
12. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program for executing the method of any one of claims 1 to 5.
CN202010173738.3A 2020-03-13 2020-03-13 Multi-machine heterogeneous parallel computing method and device for geophysical prospecting application Active CN113391917B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010173738.3A CN113391917B (en) 2020-03-13 2020-03-13 Multi-machine heterogeneous parallel computing method and device for geophysical prospecting application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010173738.3A CN113391917B (en) 2020-03-13 2020-03-13 Multi-machine heterogeneous parallel computing method and device for geophysical prospecting application

Publications (2)

Publication Number Publication Date
CN113391917A CN113391917A (en) 2021-09-14
CN113391917B CN113391917B (en) 2024-04-30

Family

ID=77615835

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010173738.3A Active CN113391917B (en) 2020-03-13 2020-03-13 Multi-machine heterogeneous parallel computing method and device for geophysical prospecting application

Country Status (1)

Country Link
CN (1) CN113391917B (en)

Also Published As

Publication number Publication date
CN113391917A (en) 2021-09-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant