WO2012024937A1

WO2012024937A1 - Method and system for realizing parallel computing

Info

Publication number: WO2012024937A1
Application number: PCT/CN2011/072818
Authority: WO
Inventors: 周扬; 胡媛; 张艺夕; 李桂萍; 黄翔
Original assignee: 中兴通讯股份有限公司
Priority date: 2010-08-27
Filing date: 2011-04-14
Publication date: 2012-03-01
Also published as: CN102385536A; CN102385536B

Abstract

A method for realizing parallel computing is disclosed. The method comprises: recording log information of the Worker nodes and Master nodes executing tasks after an overall task is initiated; when a fault occurs on a Worker node executing a task, a new Worker node obtains the recorded log information of the failed Worker node and, according to the log information, continues processing the operation flow of the failed Worker node from the breakpoint at which the fault occurred; and/or, when a fault occurs on a Master node executing a task, a new Master node, after being started, obtains the recorded log information of the failed Master node and, according to the log information, continues processing the operation flow of the failed Master node from the breakpoint at which the fault occurred. A system for realizing parallel computing is also disclosed. With the method and the system, when a node fails, tasks can continue to be executed from the breakpoint at which the fault occurred.

Description

Method and system for realizing parallel computing

The present invention relates to the field of cloud computing, and more particularly to a method and system for implementing parallel computing.

Background

MapReduce was first proposed by Google engineers. It is a system architecture that can process massive amounts of data in parallel. The MapReduce system works by automatically breaking a task into multiple subtasks and executing these subtasks in parallel; when all subtasks have been executed, the processing results are summarized.

Figure 1 shows the architecture of the existing MapReduce system. As can be seen from Figure 1, MapReduce divides data processing into two phases: the Map phase and the Reduce phase. The MapReduce system mainly includes a client (Client), a master (Master) node, and worker (Worker) nodes. The Client is used to submit a MapReduce task. The Master node automatically decomposes the MapReduce task into Map tasks and Reduce tasks and then schedules these tasks for execution on the Worker nodes. After a Worker node receives a Map or Reduce task request from the Master node, it performs the task in the request. The MapReduce system automatically implements parallel processing, data distribution, fault tolerance, and load balancing.
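The two phases described above can be illustrated with a minimal, single-process sketch. It is not taken from the patent: the word-count job and the names `map_phase`, `reduce_phase`, and `run_job` are assumptions chosen only to show how a task is decomposed into Map work and then summarized by a Reduce step.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map task: emit (word, 1) for every word in one input chunk."""
    return [(word, 1) for word in chunk.split()]

def reduce_phase(pairs):
    """Reduce task: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

def run_job(chunks):
    """Master role: give each chunk to a Map task, then reduce the results."""
    intermediate = []
    for chunk in chunks:  # in a real system these would run in parallel on Workers
        intermediate.extend(map_phase(chunk))
    return reduce_phase(intermediate)
```

In a real deployment the Map calls would be scheduled on separate Worker nodes; the loop here only stands in for that scheduling.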

In the existing MapReduce system, when a Worker node fails during the execution of a task, the Master node reassigns the task that the faulty Worker node was responsible for to other Worker nodes. After receiving the task, the other Worker nodes re-execute it from the beginning. When the Master node fails during the execution of the overall task, the overall task has to be re-executed from the beginning. Both cases reduce data processing efficiency and degrade the user experience.

Summary of the invention

In view of the above, it is a primary object of the present invention to provide a method and system for implementing parallel computing that can continue to perform tasks from a breakpoint at the time of failure when a node fails.

In order to achieve the above object, the technical solution of the present invention is achieved as follows:

The present invention provides a method for implementing parallel computing, the method comprising:

After the overall task is started, the log information of the Worker node and the Master node performing the task is recorded;

When the Worker node performing the task fails, the new Worker node obtains the recorded log information of the faulty Worker node and, according to the log information, continues to process the business process of the faulty Worker node from the breakpoint at which the fault occurred; and/or, when the Master node performing the task fails, the new Master node, after starting, obtains the recorded log information of the faulty Master node and, according to the log information, continues to process the business process of the faulty Master node from the breakpoint at which the fault occurred.

In the foregoing solution, the new Worker node obtains the log information of the faulty Worker node as follows:

The Master node performing the task sends information about performing the task to the new Worker node; after receiving the information, the new Worker node sends query request information to the global information monitoring function entity;

After receiving the query request information, the global information monitoring function entity searches the log information of the faulty worker node saved by itself according to the query request information, and returns the log information of the faulty worker node to the new worker node.

In the above solution, the new master node obtains the log information of the faulty master node, which is:

The new Master node sends query request information to the global information monitoring function entity;

After receiving the query request information, the global information monitoring function entity searches for the log information of the faulty Master node saved according to the query request information, and returns the log information of the faulty Master node to the new Master node.

In the foregoing solution, before recording the log information of the Master node and the Worker node, the method further includes:

After the user program starts the overall task by calling the client library, a node is selected as the master node for executing the task, and then the input data source to be processed is sent to the selected master node of the execution task;

After receiving the input data source to be processed, the master node performing the task performs segmentation processing on the input data source;

The Master node performing the task selects the Worker nodes that execute the task, and assigns a task to be executed to each Worker node that performs the task;

The Worker node performing the task reads the divided data block and performs the assigned task.

In the above solution, recording the log information of the Worker node and the Master node that perform the task is:

After the overall task is started, the worker node and the master node performing the task upload their own log information to the global information monitoring function entity in real time;

The global information monitoring function entity saves the log information of the Worker node and the Master node that perform the task.

In the foregoing solution, before the global information monitoring function entity saves the log information of the worker node and the master node that perform the task, the method further includes:

After receiving the log information uploaded by a Worker node, the global information monitoring function entity determines whether the identity information of the node carried in the log information of the Worker node is consistent with the saved identity information of the Worker node; when they are determined to be consistent, the log information of the Worker node is saved, and when they are determined to be inconsistent, the log information of the Worker node is discarded.

The present invention also provides a method for obtaining log information, the method comprising:

After the overall task is started, the log information of the Master node and the Worker node performing the task is saved in real time;

When the Worker node performing the task fails, after the query request information sent by the new Worker node is received, the saved log information of the faulty Worker node is searched for according to the query request information, and the log information of the faulty Worker node is returned to the new Worker node; and/or, when the Master node performing the task fails, after the query request information sent by the new Master node is received, the saved log information of the faulty Master node is searched for according to the query request information, and the log information of the faulty Master node is returned to the new Master node.

In the foregoing solution, before saving the log information of the master node and the worker node performing the task in real time, the method further includes:

It is determined whether the identity information of the node carried in the log information of the Worker node is consistent with the saved identity information of the Worker node. When they are determined to be consistent, the log information of the Worker node is saved; when they are determined to be inconsistent, the log information of the Worker node is discarded.

The present invention also provides a global information monitoring entity that obtains log information, where the global information monitoring entity includes: a storage module and a query module;

The storage module is configured to save the log information uploaded by the master node and the worker node performing the task in real time after the whole task is started;

The query module is configured to: when the Worker node performing the task fails, after receiving the query request information sent by the new Worker node, search for the log information of the faulty Worker node saved by the storage module according to the query request information, and return the log information of the faulty Worker node to the new Worker node; and/or, when the Master node performing the task fails, after receiving the query request information sent by the new Master node, search for the log information of the faulty Master node saved by the storage module according to the query request information, and return the log information of the faulty Master node to the new Master node.

In the foregoing solution, the global information monitoring entity further includes a determining module, configured to: when a Worker node uploads log information, determine whether the identity information of the node carried in the log information of the Worker node is consistent with the saved identity information of the Worker node; when they are determined to be consistent, save the log information of the Worker node, and when they are determined to be inconsistent, discard the log information of the Worker node.

In the above solution, the storage module is further configured to save the identity information of the worker node.

The present invention also provides a system for implementing parallel computing, the system comprising: a global information monitoring function entity, a first worker node, and/or a first master node;

The global information monitoring function entity is configured to record the log information of the worker node and the master node performing the task after the overall task is started;

The first Worker node is configured to: when the Worker node performing the task fails, obtain the log information of the faulty Worker node from the global information monitoring function entity and, according to the log information, continue to process the business process of the faulty Worker node from the breakpoint at which the fault occurred. The first Master node is configured to: when the Master node performing the task fails, obtain the log information of the faulty Master node from the global information monitoring function entity after starting and, according to the log information, continue to process the business process of the faulty Master node from the breakpoint at which the fault occurred.

In the above solution, the system further includes: a User Program unit, a second Master node, and a second Worker node; wherein

The User Program unit is configured to select the second master node as the master node for executing the task after initiating the overall task by calling the client library, and send the input data source to be processed to the second master node;

The second Master node is configured to: after receiving the input data source to be processed sent by the User Program unit, divide the input data source, then select the Worker nodes that perform the task, and assign a task to be executed to each Worker node that performs the task;

The second worker node is configured to perform the assigned task after receiving the task assigned by the second master node.

In the foregoing solution, the second master node is further configured to: when the second worker node fails, send information about performing the task to the first worker node;

The first worker node is configured to: after receiving the information sent by the second master node, send the query request information to the global information monitoring function entity, and receive the log information of the second worker node returned by the global information monitoring function entity;

The global information monitoring function entity is further configured to: after receiving the query request information sent by the first Worker node, search for the log information of the second Worker node saved by itself according to the query request information, and return the log information of the second Worker node to the first Worker node.

In the above solution, the first Master node is configured to: when the second Master node fails, send query request information to the global information monitoring function entity, and receive the log information of the second Master node returned by the global information monitoring function entity;

The global information monitoring function entity is further configured to: after receiving the query request information sent by the first Master node, search for the log information of the second Master node saved by itself according to the query request information, and return the log information of the second Master node to the first Master node.

In the above solution, the second worker node is further configured to upload its own log information to the global information monitoring function entity in real time after the whole task is started;

The second master node is further configured to upload its own log information to the global information monitoring function entity in real time after the overall task is started;

The global information monitoring function entity is further configured to save log information of the second worker node and the second master node.

In the above solution, the global information monitoring function entity is further configured to: before saving the log information of the second Worker node and the second Master node, determine whether the identity information of the node carried in the log information of the second Worker node is consistent with the saved identity information of the Worker node; when they are determined to be consistent, save the log information of the second Worker node, and when they are determined to be inconsistent, discard the log information of the second Worker node.

In the method and system for implementing parallel computing provided by the present invention, the new Worker node obtains the recorded log information of the faulty Worker node and, according to the log information, continues to process the business process of the faulty Worker node from the breakpoint at which the fault occurred; and/or the new Master node obtains the log information of the faulty Master node and, according to the log information, continues to process the business process of the faulty Master node from the breakpoint at which the fault occurred. In this way, when a node fails, the task continues from the breakpoint at which the fault occurred, thereby improving data processing efficiency, saving system resources, and improving the user experience.

Brief description of the drawings

FIG. 1 is a schematic structural diagram of an existing MapReduce system;

FIG. 2 is a schematic flowchart of a method for implementing parallel computing according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart of the steps performed before recording the log information of the Master node and the Worker node according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a system for implementing parallel computing according to an embodiment of the present invention.

Detailed description

The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

The method for implementing parallel computing according to the present invention, as shown in FIG. 2, includes the following steps:

Step 201: After the whole task is started, record the log information of the worker node and the master node that execute the task;

Here, before recording the log information of the Master node and the Worker node, as shown in FIG. 3, the method may further include the following steps:

Step 301: After the user program starts the overall task by calling the client program library, a node is selected as the Master node for executing the task, and the input data source to be processed is then sent to the selected Master node;

Step 302: After the master node performing the task receives the input data source to be processed, the input data source is segmented, and then step 303 is performed;

Here, the Master node can call the split function in the User Program to divide the input data source; the User Program can inform the Master node of the calling program parameters in advance, or can send the calling function to the Master node in advance by means of a message.
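The division in step 302 can be pictured as the Master node calling a split function supplied by the User Program. The following is a minimal sketch under that reading; the `Master` class, the `split_fixed` function, and the fixed block size are hypothetical and are not specified in the source.

```python
from typing import Callable, List

SplitFn = Callable[[List[str]], List[List[str]]]

def split_fixed(records: List[str], block_size: int = 3) -> List[List[str]]:
    """Hypothetical user-supplied split function: cut input into fixed-size blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]

class Master:
    def __init__(self, split_fn: SplitFn):
        # The User Program tells the Master in advance which function to call.
        self.split_fn = split_fn

    def divide(self, input_source: List[str]) -> List[List[str]]:
        """Divide the input data source into blocks for the Worker nodes."""
        return self.split_fn(input_source)

blocks = Master(split_fixed).divide([f"rec{i}" for i in range(7)])
```

Passing the function in at construction time mirrors the option of informing the Master node of the calling parameters in advance; sending it by message would be an alternative.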

Step 303: The Master node performing the task selects the Worker nodes that execute the task, and assigns a task to be executed to each Worker node that executes the task;

Step 304: The worker node performing the task reads the divided data block and performs the assigned task.

Steps 301 to 304 are the same as in the existing process and are not described further here.

The log information includes: status information of the node operation, and the status and key data of the business processing flow. The status information of the node operation may be: network status, CPU, memory, disk space, execution status of a Map task or a Reduce task, and so on. The status and key data of the business processing flow depend on the specific business being processed; for example, for a business process that uses MapReduce to send weather-forecast short messages to 100,000 mobile phone users in parallel, the status and key data of the business processing flow include the phone number information of the mobile phone users.

In actual application, a global information monitoring function entity may be added to the MapReduce system, and the log information of the Master node and the Worker node is recorded by the global information monitoring function entity. The identity information of the global information monitoring function entity is configured in advance on all nodes in the MapReduce system; this identity information may be an Internet Protocol (IP) address, an identity (ID) number, or any other information that can identify the global information monitoring function entity. A node in the MapReduce system can upload its own log information to the global information monitoring function entity according to the identity information of that entity. After the overall task is started, the Master node and the Worker node upload their own log information to the global information monitoring function entity in real time;
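One possible shape for a log entry carrying both kinds of information is sketched below. The field names (`node_status`, `business_state`, `last_phone_sent`) and the sample values are assumptions made for illustration; the source does not define a record format.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, Dict

@dataclass
class LogRecord:
    node_id: str                 # identity of the uploading node (IP, machine name, or ID)
    task_id: str
    node_status: Dict[str, Any]  # e.g. network, CPU, memory, disk, Map/Reduce task state
    business_state: Dict[str, Any] = field(default_factory=dict)  # business-specific key data

def build_record(node_id, task_id, cpu_percent, last_phone_sent):
    """Assemble one log entry for the weather-forecast SMS example."""
    return LogRecord(
        node_id=node_id,
        task_id=task_id,
        node_status={"cpu_percent": cpu_percent, "map_task": "running"},
        business_state={"last_phone_sent": last_phone_sent},
    )

record = build_record("worker-7", "job-1", 42.0, "13800000017")
payload = asdict(record)  # plain dict a node might upload to the monitoring entity
```

Keeping the business-specific data in its own field lets the monitoring entity store entries without knowing anything about the business being processed.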

To ensure that the entire log recording process is reliable, when the overall task is started, the Master node determines which Worker nodes the overall task is assigned to and sends the identity information of those Worker nodes to the global information monitoring function entity, which receives and saves it. When a Worker node uploads log information, the global information monitoring function entity decides whether to save that log information according to the saved identity information of the Worker nodes. Specifically, if the identity information of the node carried in the log information of the Worker node is consistent with the saved identity information of a Worker node, the log information of the Worker node is saved; otherwise, the log information is discarded. The identity information of a Worker node refers to information that can identify the Worker node, such as an IP address, a machine name, or an ID;
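The identity check described above amounts to a membership test against the identities registered by the Master node. A minimal sketch follows; the class name `IdentityFilter` and the `node_id` key are hypothetical.

```python
class IdentityFilter:
    """Keeps only log entries uploaded by Worker nodes the Master registered."""

    def __init__(self):
        self.registered = set()   # Worker identities sent by the Master
        self.saved_logs = []

    def register_workers(self, worker_ids):
        """Called when the Master reports which Workers were assigned the task."""
        self.registered.update(worker_ids)

    def on_upload(self, log):
        """Save the log if its node identity is registered; otherwise discard it."""
        if log["node_id"] in self.registered:
            self.saved_logs.append(log)
            return True
        return False
```

For example, after `register_workers(["10.0.0.2"])`, an upload carrying `node_id` `"10.0.0.2"` is saved, while one carrying `"10.0.0.9"` is discarded.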

The specific form of the global information monitoring function entity may be a log database, or may be an aggregate composed of one or more nodes;

The Worker node refers to a collection of all Worker nodes that perform the task.

Step 202: When the Worker node performing the task fails, the new Worker node obtains the recorded log information of the faulty Worker node and, according to the log information, continues to process the business process of the faulty Worker node from the breakpoint at which the fault occurred; and/or, when the Master node performing the task fails, the new Master node, after starting, obtains the recorded log information of the faulty Master node and, according to the log information, continues to process the business process of the faulty Master node from the breakpoint at which the fault occurred;
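Continuing from a breakpoint can be sketched with the short-message example: progress is written to the log after each send, and the replacement Worker starts from the recorded position. The `next_index` key, the simulated failure, and the in-memory `shared_log` dictionary standing in for the global information monitoring function entity are all assumptions of this sketch.

```python
def send_sms_batch(phone_numbers, log, send, fail_after=None):
    """Send to each number, logging progress; optionally simulate a crash."""
    start = log.get("next_index", 0)      # breakpoint recovered from the log
    for i in range(start, len(phone_numbers)):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated Worker failure")
        send(phone_numbers[i])
        log["next_index"] = i + 1         # record progress after each send

numbers = [f"1380000{n:04d}" for n in range(5)]
shared_log = {}   # stands in for the log held by the monitoring entity
sent = []

try:
    send_sms_batch(numbers, shared_log, sent.append, fail_after=3)  # original Worker
except RuntimeError:
    pass
send_sms_batch(numbers, shared_log, sent.append)                    # new Worker resumes
```

Because the new Worker starts at the logged index, each number is sent exactly once in this sketch; a real system would also have to handle a crash between sending and logging.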

Here, the Master node can learn that a Worker node performing the task has failed through heartbeat detection between itself and the Worker node. After the Worker node performing the task fails, the Master node can select a node as the new Worker node according to the load of the other nodes in the MapReduce system, that is, through the automatic load balancing processing of the existing MapReduce system. The new Worker node may be a healthy Worker node that is performing the task, or a healthy Worker node that is not performing the task;
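Heartbeat-based detection and load-based selection might look like the following. The timeout value, the load figures, and the "lowest load wins" rule are assumptions; the source only states that selection follows the existing load balancing of the MapReduce system.

```python
def find_failed_workers(last_heartbeat, now, timeout):
    """Return Workers whose last heartbeat is older than `timeout` seconds."""
    return sorted(w for w, t in last_heartbeat.items() if now - t > timeout)

def pick_replacement(loads, failed):
    """Choose the healthy node with the lowest reported load."""
    healthy = {n: load for n, load in loads.items() if n not in failed}
    return min(healthy, key=healthy.get)

last_heartbeat = {"w1": 100.0, "w2": 91.0, "w3": 99.5}   # seconds, hypothetical
failed = find_failed_workers(last_heartbeat, now=101.0, timeout=5.0)
replacement = pick_replacement({"w1": 0.7, "w2": 0.1, "w3": 0.3}, failed)
```

Here `w2` has been silent for 10 seconds and is treated as failed, so the least-loaded healthy node, `w3`, is chosen even though `w2` reported the lowest load.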

After the task is started, the User Program of the MapReduce system starts a timer. If the task execution result returned by the Master node has not been received when the timer expires, the Master node is considered to be faulty and a new node needs to be selected as the Master node. The selection can be based on the load of the other nodes in the MapReduce system, that is, on the automatic load balancing processing of the existing MapReduce system. The new Master node may be a Master node that is performing the task, or another Master node that is not performing the task;
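The timer on the User Program side can be sketched with a `threading.Event` that the Master would set when it returns a result. The function names and the 50 ms timeout are illustrative assumptions.

```python
import threading

def wait_for_master(result_ready: threading.Event, timeout_s: float) -> bool:
    """Return True if the Master reported a result before the timer expired."""
    return result_ready.wait(timeout=timeout_s)

def run_with_failover(master_responds: bool, timeout_s: float = 0.05) -> str:
    result_ready = threading.Event()
    if master_responds:
        result_ready.set()            # Master returned the task result in time
    if wait_for_master(result_ready, timeout_s):
        return "done"
    return "start new Master"         # timer expired: treat the Master as faulty
```

A timeout alone cannot distinguish a failed Master from a slow one, so the timer length would need to be chosen with the expected task duration in mind.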

The new Worker node obtains the log information of the faulty Worker node specifically as follows: the Master node sends information about performing the task to the new Worker node; after receiving the information, the new Worker node sends query request information to the global information monitoring function entity;

After receiving the query request information, the global information monitoring function entity searches for the log information of the faulty Worker node saved by itself according to the query request information, and returns the log information of the faulty Worker node to the new Worker node;

The information about the execution task includes a task data source, a task ID, and identity information of the faulty worker node;

The query request information includes the task ID, the identity information of the faulty Worker node, and the like; the identity information of the faulty Worker node may be an IP address, a machine name, an ID, or any other information that can identify the faulty Worker node;
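The two messages in the Worker recovery flow can be sketched as plain dictionaries. The key names and sample values are hypothetical; the source lists only which pieces of information each message carries.

```python
def make_task_info(data_source, task_id, faulty_worker_id):
    """Information the Master sends to the new Worker."""
    return {"data_source": data_source, "task_id": task_id,
            "faulty_worker_id": faulty_worker_id}

def make_query_request(task_info):
    """Query the new Worker sends to the monitoring entity.

    Only the fields needed to locate the faulty Worker's log are copied.
    """
    return {"task_id": task_info["task_id"],
            "faulty_worker_id": task_info["faulty_worker_id"]}

info = make_task_info("block-12", "job-1", "10.0.0.2")
query = make_query_request(info)
```

The task data source stays with the new Worker, which needs it to continue processing, while the query carries only the lookup keys.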

The new Master node obtains the log information of the faulty Master node specifically as follows: the new Master node sends query request information to the global information monitoring function entity; after receiving the query request information, the global information monitoring function entity searches for the log information of the faulty Master node saved by itself according to the query request information, and returns the log information of the faulty Master node to the new Master node;

The query request information includes information that can identify the log record of the faulty Master node, such as the identity information of the faulty Master node or the task ID; the identity information of the faulty Master node may be an IP address, a machine name, an ID, or any other information that can identify the faulty Master node.

When a Worker node completes its task, it calls an external interface to upload its own log information to the global information monitoring function entity and notifies the Master node that the task it is responsible for has been processed. After receiving the notification, the Master node marks the task of that Worker node as completed. After receiving notifications from all the Worker nodes that processing has been completed, the Master node ends the overall task.
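The completion bookkeeping on the Master node can be sketched as a small tracker. The class name `TaskTracker` and the status strings are hypothetical.

```python
class TaskTracker:
    """Master-side record of which Worker tasks have finished."""

    def __init__(self, worker_ids):
        self.status = {w: "running" for w in worker_ids}

    def on_worker_done(self, worker_id):
        """Mark one Worker's task completed; return True when all are done."""
        self.status[worker_id] = "completed"
        return all(s == "completed" for s in self.status.values())

tracker = TaskTracker(["w1", "w2"])
first = tracker.on_worker_done("w1")    # one Worker still running
second = tracker.on_worker_done("w2")   # all Workers have reported completion
```

When `on_worker_done` returns True, the Master node would end the overall task.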

In order to implement the foregoing method, the present invention further provides a global information monitoring entity that obtains log information, where the global information monitoring entity includes: a storage module and a query module;

a storage module, configured to save log information uploaded by the master node and the worker node performing the task in real time after the whole task is started;

a query module, configured to: when a Worker node performing a task fails, after receiving the query request information sent by the new Worker node, search for the log information of the faulty Worker node saved by the storage module according to the query request information, and return the log information of the faulty Worker node to the new Worker node; and/or, when the Master node performing the task fails, after receiving the query request information sent by the new Master node, search for the log information of the faulty Master node saved by the storage module according to the query request information, and return the log information of the faulty Master node to the new Master node.
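The storage and query modules can be sketched as two small classes. Keying the stored logs by `(task_id, node_id)` and the request key `faulty_node_id` are assumptions of this sketch, not details given in the source.

```python
from collections import defaultdict

class StorageModule:
    """Saves log entries uploaded by Master and Worker nodes."""

    def __init__(self):
        self._logs = defaultdict(list)   # (task_id, node_id) -> log entries

    def save(self, task_id, node_id, entry):
        self._logs[(task_id, node_id)].append(entry)

    def load(self, task_id, node_id):
        return list(self._logs.get((task_id, node_id), []))

class QueryModule:
    """Answers queries from replacement nodes using the storage module."""

    def __init__(self, storage):
        self.storage = storage

    def handle(self, request):
        """Look up the faulty node's log using the fields of the query request."""
        return self.storage.load(request["task_id"], request["faulty_node_id"])

storage = StorageModule()
storage.save("job-1", "master-A", {"phase": "map", "assigned": 4})
storage.save("job-1", "master-A", {"phase": "reduce", "assigned": 2})
logs = QueryModule(storage).handle({"task_id": "job-1", "faulty_node_id": "master-A"})
```

The same `handle` path serves both new Worker nodes and new Master nodes, since both identify the faulty node by task and identity.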

The global information monitoring entity may further include a determining module, configured to: when a Worker node uploads log information, determine whether the identity information of the node carried in the log information of the Worker node is consistent with the saved identity information of the Worker node; when they are determined to be consistent, the log information of the Worker node is saved; otherwise, the log information of the Worker node is discarded. The storage module is further configured to save the identity information of the Worker node.

In addition, the present invention further provides a system for implementing parallel computing. As shown in FIG. 4, the system includes: a global information monitoring function entity 41, a first worker node 42, and/or a first master node 43;

The global information monitoring function entity 41 is configured to record log information of the worker node and the master node that perform the task after the whole task is started;

The first Worker node 42 is configured to: when the Worker node performing the task fails, obtain the log information of the faulty Worker node from the global information monitoring function entity 41 and, according to the log information, continue to process the business process of the faulty Worker node from the breakpoint at which the fault occurred. The first Master node 43 is configured to: when the Master node performing the task fails, obtain the log information of the faulty Master node from the global information monitoring function entity 41 after starting and, according to the log information, continue to process the business process of the faulty Master node from the breakpoint at which the fault occurred.

Here, it should be noted that: the first worker node 42 may be a healthy worker node that is performing the task, and may also be a healthy worker node that does not perform the task; the first master node 43 may be a master node that performs the task. It can also be another Master node that does not perform this task.

The system may further include a User Program unit, a second Master node, and a second Worker node;

a User Program unit, configured to start a whole task by calling a client library, select a second master node as a master node for performing a task, and send an input data source to be processed to the second master node;

The second Master node is configured to receive the input data source to be processed sent by the User Program unit, divide the input data source, then select the Worker nodes that perform the task, and assign a task to be executed to each Worker node that performs the task;

Here, it should be noted that: the second worker node may be a collection of more than one worker node performing the task.

The second master node is further configured to send information about performing a task to the first worker node 42 when the second worker node fails;

The first Worker node 42 is specifically configured to: after receiving the information sent by the second Master node, send the query request information to the global information monitoring function entity 41, and receive the log information of the second Worker node returned by the global information monitoring function entity 41;

The global information monitoring function entity 41 is further configured to: after receiving the query request information sent by the first Worker node 42, search for the log information of the second Worker node saved by itself according to the query request information, and return the log information of the second Worker node to the first Worker node 42.

The first Master node 43 is specifically configured to: when the second Master node fails, send the query request information to the global information monitoring function entity 41, and receive the log information of the second Master node returned by the global information monitoring function entity 41;

The global information monitoring function entity 41 is further configured to: after receiving the query request information sent by the first Master node 43, search for the log information of the second Master node saved by itself according to the query request information, and return the log information of the second Master node to the first Master node 43.

The second worker node is further configured to upload its own log information to the global information monitoring function entity 41 in real time after the overall task is started;

The second master node is further configured to upload its own log information to the global information monitoring function entity 41 in real time after the overall task is started;

The global information monitoring function entity 41 is further configured to save log information of the second Worker node and the second Master node. The global information monitoring function entity 41 is further configured to determine, before saving the log information of the second Worker node and the second Master node, whether the identity information of the node carried in the log information of the second Worker node is consistent with the saved identity information of the Worker node; when they are determined to be consistent, the log information of the second Worker node is saved, and when they are determined to be inconsistent, the log information of the second Worker node is discarded.

The above description covers only the preferred embodiments of the present invention and is not intended to limit its scope. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims


1. A method for implementing parallel computing, wherein the method comprises:

After the overall task is started, record the log information of the worker node and the master node that perform the task;

2. The method according to claim 1, wherein the new worker node obtaining the log information of the faulty worker node comprises:

The master node performing the task sends information instructing execution of the task to the new worker node, and after receiving the information, the new worker node sends query request information to the global information monitoring function entity;

3. The method according to claim 1, wherein the new master node obtaining the log information of the faulty master node comprises:

The new master node sends query request information to the global information monitoring function entity;

After receiving the query request information, the global information monitoring function entity searches for the log information of the faulty master node saved by itself according to the query request information, and returns the log information of the faulty master node to the new master node.

4. The method according to claim 1, 2 or 3, wherein before the log information of the worker node and the master node performing the task is recorded, the method further comprises:

The worker node performing the task reads the divided data block and performs the assigned task.

5. The method according to claim 4, wherein recording the log information of the worker node and the master node that perform the task comprises:

After the overall task is started, the worker node and the master node performing the task upload their own log information to the global information monitoring function entity in real time;

The global information monitoring function entity saves log information of the Worker node and the Master node that perform the task.

6. The method according to claim 5, wherein before the global information monitoring function entity saves the log information of the worker node and the master node that perform the task, the method further comprises: after receiving the log information uploaded by the worker node, the global information monitoring function entity determines whether the identity information of the node carried in the log information of the worker node is consistent with the saved identity information of the worker node; when they are determined to be consistent, the log information of the worker node is saved, and when they are determined to be inconsistent, the log information of the worker node is discarded.

7. A method for obtaining log information, wherein the method comprises:

After the overall task is started, the log information of the Master node and the Worker node that execute the task is saved in real time;

When the worker node performing the task fails, and after the query request information sent by the new worker node is received, searching for the saved log information of the faulty worker node according to the query request information, and returning the log information of the faulty worker node to the new worker node; and/or, when the master node performing the task fails, and after the query request information sent by the new master node is received, searching for the saved log information of the faulty master node according to the query request information, and returning the log information of the faulty master node to the new master node.

8. The method according to claim 7, wherein, before the log information of the master node and the worker node performing the task are saved in real time, the method further includes:

It is determined whether the identity information of the node carried in the log information of the worker node is consistent with the saved identity information of the worker node. When they are determined to be consistent, the log information of the worker node is saved, and when they are determined to be inconsistent, the log information of the worker node is discarded.

9. A global information monitoring entity for obtaining log information, wherein the global information monitoring entity comprises: a storage module and a query module;

The query module is configured to: when the worker node performing the task fails, and after receiving the query request information sent by the new worker node, search for the log information of the faulty worker node saved by the storage module according to the query request information, and return the log information of the faulty worker node to the new worker node; and/or, when the master node performing the task fails, and after receiving the query request information sent by the new master node, search for the log information of the faulty master node saved by the storage module according to the query request information, and return the log information of the faulty master node to the new master node.

10. The global information monitoring entity according to claim 9, wherein the global information monitoring entity further comprises: a determining module, configured to determine, when the worker node uploads log information, whether the identity information of the node carried in the log information of the worker node is consistent with the saved identity information of the worker node, the log information of the worker node being saved when they are determined to be consistent.

11. The global information monitoring entity according to claim 9 or 10, wherein the storage module is further configured to save identity information of the worker node.

12. A system for implementing parallel computing, wherein the system comprises: a global information monitoring function entity, a first worker node, and/or a first master node; wherein

The first worker node is configured to: when the worker node performing the task fails, obtain the log information of the faulty worker node from the global information monitoring function entity, and continue to process the operation flow of the faulty worker node, according to the log information, from the breakpoint at which the fault occurred;

The first master node is configured to: when the master node performing the task fails, obtain, after being started, the log information of the faulty master node from the global information monitoring function entity, and continue to process the operation flow of the faulty master node, according to the log information, from the breakpoint at which the fault occurred.

13. The system of claim 12, wherein the system further comprises: a User Program unit, a second Master node, and a second Worker node;

The second master node is configured to receive the input data source to be processed sent by the User Program unit, split the input data source, select the worker nodes that are to perform the task, and assign a task to each selected worker node;

The second worker node is configured to receive the task assigned by the second master node and perform the assigned task.

14. The system of claim 13, wherein

The second master node is further configured to send information about performing a task to the first worker node when the second worker node fails;

The global information monitoring function entity is further configured to: after receiving the query request information sent by the first worker node, search for the log information of the second worker node saved by itself according to the query request information, and return the log information of the second worker node to the first worker node.

15. The system of claim 13, wherein

The first master node is configured to: when the second master node fails, send the query request information to the global information monitoring function entity, and receive the log information of the second master node returned by the global information monitoring function entity;

The global information monitoring function entity is further configured to: after receiving the query request information sent by the first master node, search for the log information of the second master node saved by itself according to the query request information, and return the log information of the second master node to the first master node.

16. The system of claim 13, 14 or 15, wherein

The second worker node is further configured to upload its own log information to the global information monitoring function entity in real time after the overall task is started;

17. The system of claim 16, wherein

The global information monitoring function entity is further configured to determine, before saving the log information of the second worker node and the second master node, whether the identity information of the node carried in the log information of the second worker node is consistent with the saved identity information of the worker node; when they are determined to be consistent, the log information of the second worker node is saved, and when they are determined to be inconsistent, the log information of the second worker node is discarded.