CN102385536A - Method and system for realization of parallel computing

Info

Publication number: CN102385536A
Application number: CN2010102693321A
Authority: CN
Inventors: 周扬; 胡媛; 张艺夕; 李桂萍; 黄翔
Original assignee: ZTE Corp
Current assignee: NANTONG JINGHAISHEN AQUATIC PRODUCT CO., LTD.
Priority date: 2010-08-27
Filing date: 2010-08-27
Publication date: 2012-03-21
Anticipated expiration: 2030-08-27
Also published as: CN102385536B; WO2012024937A1

Abstract

The invention discloses a method for implementing parallel computing. The method comprises: after an overall task starts, recording log information of the Worker nodes executing the task and of the Master node; when a Worker node executing the task fails, a new Worker node obtains the recorded log information of the failed Worker node and, according to that log information, continues processing the service flow of the failed Worker node from the breakpoint at the moment of failure; and/or, when the Master node executing the task fails, a new Master node, after starting, obtains the recorded log information of the failed Master node and, according to that log information, continues processing the service flow of the failed Master node from the breakpoint at the moment of failure. The invention also discloses a system for implementing parallel computing. With the method and the system, when a node fails, the task can continue from the breakpoint at the moment of failure.

Description

A method and system for implementing parallel computing

Technical field

The present invention relates to the field of cloud computing, and in particular to a system and method for implementing parallel computing.

Background technology

MapReduce was first proposed by engineers at Google. It is a system architecture capable of processing massive amounts of data in parallel. A MapReduce system works as follows: a task is automatically decomposed into multiple subtasks, the subtasks are executed in parallel, and after all subtasks have finished, their results are aggregated.
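The decompose, execute in parallel, then aggregate principle can be illustrated with a small word count. This is only a single-machine sketch using a thread pool; the function names `map_count`, `reduce_counts` and `run_job` are hypothetical and do not come from the patent or from any MapReduce library.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_count(chunk):
    """Subtask: count the words in one chunk of the input."""
    return Counter(chunk.split())


def reduce_counts(partials):
    """Aggregate the partial results produced by all subtasks."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(chunks):
    """Run one subtask per chunk in parallel, then combine the results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_count, chunks))
    return reduce_counts(partials)
```

Calling `run_job(["a b a", "b c"])` counts each chunk independently and merges the two partial counts into one result.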

Fig. 1 is an architecture diagram of an existing MapReduce system. As Fig. 1 shows, MapReduce divides data processing into two stages: the map (Map) stage and the reduce (Reduce) stage. The MapReduce system mainly comprises a client (Client), a master (Master) node and worker (Worker) nodes. The Client submits the MapReduce task. The Master node automatically decomposes the MapReduce task into Map tasks and Reduce tasks and then schedules these tasks onto Worker nodes for execution. A Worker node, after receiving a Map or Reduce task request sent by the Master, executes the task in the request. The MapReduce system provides functions such as parallel processing, data distribution, fault tolerance and automatic load balancing.

In the existing MapReduce system, when a Worker node fails while executing a task, the Master node reassigns the task that the failed Worker node was responsible for to another Worker node, and that Worker node, after receiving the task, executes it again from the beginning. When the Master node fails during execution of the overall task, the whole task has to be executed again from the beginning. This reduces data-processing efficiency and in turn degrades the user experience.

Summary of the invention

In view of this, the main purpose of the present invention is to provide a method and system for implementing parallel computing that, when a node fails, can continue executing the task from the breakpoint at the moment of failure.

To achieve the above purpose, the technical solution of the present invention is realized as follows:

The invention provides a method for implementing parallel computing, the method comprising:

After the overall task starts, recording the log information of the Worker nodes executing the task and of the Master node;

When a Worker node executing the task fails, a new Worker node obtains the recorded log information of the failed Worker node and, according to the log information, continues processing the service flow of the failed Worker node from the breakpoint at the moment of failure; and/or, when the Master node executing the task fails, a new Master node, after starting, obtains the recorded log information of the failed Master node and, according to the log information, continues processing the service flow of the failed Master node from the breakpoint at the moment of failure.

In the above solution, the new Worker node obtains the log information of the failed Worker node as follows:

The Master node sends task execution information to the new Worker node;

After receiving the information, the new Worker node sends a query request to the global information monitoring function entity;

After receiving the query request, the global information monitoring function entity looks up, according to the query request, the log information of the failed Worker node that it has saved, and returns that log information to the new Worker node.

In the above solution, the new Master node obtains the log information of the failed Master node as follows:

The new Master node sends a query request to the global information monitoring function entity;

After receiving the query request, the global information monitoring function entity looks up, according to the query request, the log information of the failed Master node that it has saved, and returns that log information to the new Master node.

In the above solution, before the log information of the Master node and Worker nodes is recorded, the method further comprises:

After starting the overall task by calling the client library, the User Program selects a node as the Master node and then sends the input data source to be processed to the Master node;

After receiving the input data source to be processed, the Master node splits the input data source;

The Master node selects the Worker nodes that will execute the task and assigns to each of them the task it is to execute;

Each Worker node executing the task reads its split data block and executes the assigned task.

In the above solution, recording the log information of the Worker nodes executing the task and of the Master node comprises:

After the overall task starts, the Worker nodes executing the task and the Master node upload their own log information to the global information monitoring function entity in real time;

The global information monitoring function entity saves the log information of the Worker nodes executing the task and of the Master node.

In the above solution, before the global information monitoring function entity saves the log information of the Worker nodes executing the task and of the Master node, the method further comprises:

After receiving log information uploaded by a Worker node, the global information monitoring function entity judges whether the node identification information carried in that log information is consistent with the saved identification information of the Worker nodes; if consistent, it saves the Worker node's log information; if inconsistent, it discards the Worker node's log information.

The present invention also provides a method for obtaining log information, the method comprising:

After the overall task starts, saving in real time the log information of the Master node executing the task and of the Worker nodes;

When a Worker node executing the task fails and a query request sent by a new Worker node is received, looking up the saved log information of the failed Worker node according to the query request and returning it to the new Worker node; and/or, when the Master node executing the task fails and a query request sent by a new Master node is received, looking up the saved log information of the failed Master node according to the query request and returning it to the new Master node.

In the above solution, before the log information of the Master node executing the task and of the Worker nodes is saved in real time, the method further comprises:

Judging whether the node identification information carried in a Worker node's log information is consistent with the saved identification information of the Worker nodes; if consistent, saving the Worker node's log information; if inconsistent, discarding the Worker node's log information.

The present invention also provides a global information monitoring entity for obtaining log information, the global information monitoring entity comprising a storage module and a query module, wherein:

The storage module is used to save in real time, after the overall task starts, the log information uploaded by the Master node executing the task and by the Worker nodes;

The query module is used to, after a Worker node executing the task fails and a query request sent by a new Worker node is received, look up according to the query request the log information of the failed Worker node saved by the storage module and return it to the new Worker node; and/or, after the Master node executing the task fails and a query request sent by a new Master node is received, look up according to the query request the log information of the failed Master node saved by the storage module and return it to the new Master node.

In the above solution, the global information monitoring entity further comprises a judgment module, used to judge, when a Worker node uploads log information, whether the identification information of that node carried in the log information is consistent with the saved identification information of the Worker nodes; if consistent, the Worker node's log information is saved; otherwise, it is discarded.

In the above solution, the storage module is also used to save the identification information of the Worker nodes.

The present invention also provides a system for implementing parallel computing, the system comprising a global information monitoring function entity, a first Worker node and a first Master node, wherein:

The global information monitoring function entity is used to record, after the overall task starts, the log information of the Worker nodes executing the task and of the Master node;

The first Worker node is used to, when a Worker node executing the task fails, obtain the log information of the failed Worker node from the global information monitoring function entity and, according to the log information, continue processing the service flow of the failed Worker node from the breakpoint at the moment of failure; and/or,

The first Master node is used to, when the Master node executing the task fails and after itself starting, obtain the log information of the failed Master node from the global information monitoring function entity and, according to the log information, continue processing the service flow of the failed Master node from the breakpoint at the moment of failure.

In the above solution, the system further comprises a User Program unit, a second Master node and a second Worker node, wherein:

The User Program unit is used to, after starting the overall task by calling the client library, select a node as the Master node and then send the input data source to be processed to the second Master node;

The second Master node is used to, after receiving the input data source to be processed sent by the User Program unit, split the input data source, then select the Worker nodes that will execute the task and assign to each of them the task it is to execute;

The second Worker node is used to execute the assigned task after receiving the task assigned by the second Master node.

In the above solution, the second Master node is also used to send task execution information to the first Worker node when the second Worker node fails;

The first Worker node is specifically used to: after receiving the information sent by the second Master node, send a query request to the global information monitoring function entity and receive the log information of the second Worker node returned by the global information monitoring function entity;

The global information monitoring function entity is also used to, after receiving the query request sent by the first Worker node, look up according to the query request the log information of the second Worker node that it has saved and return it to the first Worker node.

In the above solution, the first Master node is specifically used to: when the second Master node fails, send a query request to the global information monitoring function entity and receive the log information of the second Master node returned by the global information monitoring function entity;

The global information monitoring function entity is also used to, after receiving the query request sent by the first Master node, look up according to the query request the log information of the second Master node that it has saved and return it to the first Master node.

In the above solution, the second Worker node is also used to upload its own log information to the global information monitoring function entity in real time after the overall task starts;

The second Master node is also used to upload its own log information to the global information monitoring function entity in real time after the overall task starts;

The global information monitoring function entity is also used to save the log information of the second Worker node and the second Master node.

In the above solution, the global information monitoring function entity is also used to, before saving the log information of the second Worker node and the second Master node, judge whether the node identification information carried in the second Worker node's log information is consistent with the saved identification information of the Worker nodes; if consistent, it saves the second Worker node's log information; if inconsistent, it discards the second Worker node's log information.

With the method and system for implementing parallel computing provided by the invention, a new Worker node obtains the recorded log information of the failed Worker node and, according to the log information, continues processing the service flow of the failed Worker node from the breakpoint at the moment of failure; and/or a new Master node obtains the recorded log information of the failed Master node and, according to the log information, continues processing the service flow of the failed Master node from the breakpoint at the moment of failure. In this way, when a node fails, the task can continue from the breakpoint at the moment of failure, which improves data-processing efficiency, saves system resources and improves the user experience.

Description of drawings

Fig. 1 is an architecture diagram of an existing MapReduce system;

Fig. 2 is a schematic flowchart of the method for implementing parallel computing according to the present invention;

Fig. 3 is a schematic flowchart of the steps performed before the log information of the Master node and Worker nodes is recorded;

Fig. 4 is a schematic structural diagram of the system for implementing parallel computing according to the present invention.

Embodiment

The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

The method for implementing parallel computing according to the present invention, as shown in Fig. 2, comprises the following steps:

Step 201: After the overall task starts, record the log information of the Worker nodes executing the task and of the Master node;

Here, before the log information of the Master node and Worker nodes is recorded, as shown in Fig. 3, the method may further include the following steps:

Step 301: After starting the overall task by calling the client library, the User Program selects a node as the Master node and then sends the input data source to be processed to the Master node.

Step 302: After receiving the input data source to be processed, the Master node splits the input data source, and then step 303 is executed;

Here, the Master node can call the split function in the User Program to split the input data source. The User Program can tell the Master node about the function in advance through a call parameter, or can send the function to be called to the Master node in advance in a message.

Step 303: The Master node selects the Worker nodes that will execute the task and assigns to each of them the task it is to execute.

Step 304: Each Worker node executing the task reads its split data block and executes the assigned task;

Steps 301 to 304 are the same as the existing processing procedure and are not described further here;
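Steps 302 and 303 can be sketched roughly as follows. The patent does not specify how the input is split or how blocks are assigned, so the fixed block size and the round-robin assignment below are assumptions, and the names `split_input` and `assign_blocks` are hypothetical.

```python
def split_input(records, block_size):
    """Step 302 (sketch): split the input data source into fixed-size blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def assign_blocks(blocks, workers):
    """Step 303 (sketch): hand blocks to Worker nodes.

    Round-robin is an assumption made for illustration only.
    """
    plan = {worker: [] for worker in workers}
    for index, block in enumerate(blocks):
        plan[workers[index % len(workers)]].append(block)
    return plan
```

In step 304 each Worker node would then read the blocks listed for it in `plan` and run its Map or Reduce task on them.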

The log information comprises the status information of node operation and the state and key data of the service processing flow. The status information of node operation can be: network conditions, CPU, memory, disk space, the execution state of the Map task or Reduce task, and so on. The state and key data of the service processing flow depend on the specific service flow being processed. For example, for a service flow that uses MapReduce to send a weather-forecast short message to 100,000 mobile phone users in parallel, the state and key data of the service processing flow include the users' telephone numbers;
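One possible shape for such a log record is sketched below for the weather-forecast example. The patent lists the kinds of information but not a format, so the `LogRecord` class, every field name and the sample values are assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LogRecord:
    """Hypothetical log record uploaded by a node."""
    node_id: str                                         # e.g. IP address, machine name or ID
    task_id: str
    node_status: dict = field(default_factory=dict)      # network, CPU, memory, disk
    task_state: str = "RUNNING"                          # Map/Reduce execution state
    business_state: dict = field(default_factory=dict)   # service-specific key data

    def to_json(self):
        """Serialize the record so it can be uploaded or stored."""
        return json.dumps(asdict(self), sort_keys=True)


# Sample record for the SMS example; all values are made up.
record = LogRecord(
    node_id="worker-7",
    task_id="sms-weather-001",
    node_status={"cpu_percent": 35, "free_disk_mb": 20480},
    task_state="MAP_RUNNING",
    business_state={"last_sent_number": "13800000042", "sent_count": 42},
)
```

Keeping the service-specific data in its own field lets a replacement node read the breakpoint (here, the last number sent) without parsing the node-health fields.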

In practical applications, a global information monitoring function entity can be set up in the MapReduce system, and this entity records the log information of the Master node and Worker nodes. The identification information of the global information monitoring function entity is configured in advance on all nodes in the MapReduce system; it can be an Internet Protocol (IP) address, an identity (ID) number, or any other information that identifies the global information monitoring function entity. Using this identification information, every node in the MapReduce system can upload its own log information to the global information monitoring function entity. After the overall task starts, the Master node and the Worker nodes upload their own log information to the global information monitoring function entity in real time;

To keep the whole logging process reliable, after the overall task starts the Master node sends to the global information monitoring function entity the identification information of the Worker nodes to which it has assigned the overall task. The global information monitoring function entity receives and saves the identification information of these Worker nodes. When a Worker node uploads log information, the global information monitoring function entity decides, according to the saved identification information, whether to save that Worker node's log information. Specifically, if the node identification information carried in the log information is consistent with the saved identification information of a Worker node, the log information is saved; otherwise, it is discarded. The identification information of a Worker node is any information that can identify the Worker node, such as an IP address, machine name or ID;
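The save-or-discard check can be sketched as a small in-memory class. This is an illustration under assumptions, not the patent's implementation: the class name `GlobalMonitor`, its method names and the dictionary layout are all hypothetical.

```python
class GlobalMonitor:
    """Hypothetical in-memory stand-in for the global information monitoring function entity."""

    def __init__(self):
        self.registered_workers = set()   # identification info sent by the Master
        self.logs = {}                    # (task_id, node_id) -> list of log entries

    def register_workers(self, node_ids):
        """Save the identification information of the Worker nodes assigned to the task."""
        self.registered_workers.update(node_ids)

    def upload_worker_log(self, task_id, node_id, entry):
        """Save the entry if the node is registered; otherwise discard it."""
        if node_id not in self.registered_workers:
            return False                  # unknown node: log is discarded
        self.logs.setdefault((task_id, node_id), []).append(entry)
        return True
```

A log upload from an unregistered node returns `False` and leaves the stored logs untouched, which mirrors the discard branch in the text.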

The global information monitoring function entity can take the form of a log database, or of a cluster made up of one or more nodes;

The Worker node here refers to the set of all Worker nodes executing this task.

Step 202: When a Worker node executing the task fails, a new Worker node obtains the recorded log information of the failed Worker node and, according to the log information, continues processing the service flow of the failed Worker node from the breakpoint at the moment of failure; and/or, when the Master node executing the task fails, a new Master node, after starting, obtains the recorded log information of the failed Master node and, according to the log information, continues processing the service flow of the failed Master node from the breakpoint at the moment of failure;

Here, the Master node can learn through heartbeat detection between itself and the Worker nodes that a Worker node executing the task has failed. After a Worker node executing the task fails, the Master node can select a node as the new Worker node according to the load of the other nodes in the MapReduce system, that is, using the automatic load-balancing processing of the existing MapReduce system. The new Worker node can be a healthy Worker node that is executing this task, or a healthy Worker node that is not executing this task;
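Heartbeat-based failure detection and load-based selection of a replacement can be sketched as follows. The patent does not define the heartbeat timeout or the load metric, so the timeout comparison and the "lowest load wins" rule are assumptions, and both function names are hypothetical.

```python
def find_failed_workers(last_heartbeat, now, timeout):
    """Return the Worker nodes whose last heartbeat is older than the timeout."""
    return sorted(node for node, seen in last_heartbeat.items() if now - seen > timeout)


def pick_replacement(load_by_node, failed):
    """Pick the healthy node with the lowest load (an assumed balancing rule)."""
    candidates = {n: load for n, load in load_by_node.items() if n not in failed}
    if not candidates:
        raise RuntimeError("no healthy node available")
    return min(candidates, key=candidates.get)
```

The replacement may be a node already running part of the same task or an idle one; the sketch does not distinguish between the two, in line with the text.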

After the task starts, the User Program of the MapReduce system can start a timer. If the timer expires and the task execution result returned by the Master node has still not been received, the Master node is considered to have failed, and a new node must be selected as the Master node. The selection can be made according to the load of the other nodes in the MapReduce system, that is, using the automatic load-balancing processing of the existing MapReduce system. The new Master node can be a Master node that is executing this task, or another Master node that is not executing this task;
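The timer-based check in the User Program can be sketched with a blocking queue read that has a timeout. Using `queue.Queue` as the channel on which the Master returns its result is an assumption for illustration, and the function name `await_master_result` is hypothetical.

```python
import queue


def await_master_result(results, timeout_seconds):
    """Wait for the Master's result; report a failure if the timer expires.

    Returns (result, False) on success, or (None, True) when no result
    arrived before the timeout, in which case a new Master must be selected.
    """
    try:
        return results.get(timeout=timeout_seconds), False
    except queue.Empty:
        return None, True
```

When the second element of the returned tuple is `True`, the User Program would go on to select a new Master node as described above.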

The new Worker node obtains the log information of the failed Worker node specifically as follows:

After receiving the query request, the global information monitoring function entity looks up, according to the query request, the log information of the failed Worker node that it has saved, and returns that log information to the new Worker node;

The task execution information comprises the task data source, the task ID, the identification information of the failed Worker node, and so on;

The query request comprises the task ID, the node identification information of the failed Worker node, and so on; the node identification information of the failed Worker node can be an IP address, machine name, ID or any other information that can identify the failed Worker node;
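Querying by task ID and failed-node identifier, then resuming from the breakpoint, can be sketched for the short-message example as follows. The log entry layout (`number` and `status` keys), the dictionary used as the log store and both function names are assumptions made for illustration.

```python
def query_log(log_store, task_id, failed_node_id):
    """Look up the saved log entries of the failed Worker for one task.

    log_store is a hypothetical stand-in for the monitoring entity:
    a dict keyed by (task_id, node_id).
    """
    return log_store.get((task_id, failed_node_id), [])


def resume_from_breakpoint(numbers, log_entries, send):
    """Send only to numbers the failed Worker had not yet handled."""
    already_sent = {e["number"] for e in log_entries if e["status"] == "SENT"}
    resumed = []
    for number in numbers:
        if number in already_sent:
            continue  # finished before the failure; do not repeat
        send(number)
        resumed.append(number)
    return resumed
```

Compared with restarting the task, the new Worker skips every number the log marks as sent, so users do not receive the message twice.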

The new Master node obtains the log information of the failed Master node specifically as follows:

After receiving the query request, the global information monitoring function entity looks up, according to the query request, the log information of the failed Master node that it has saved, and returns that log information to the new Master node;

The query request comprises information that can identify the log information recorded for the failed Master node, such as the identification information of the failed Master node or the task ID. The identification information of the failed Master node can be an IP address, machine name, ID or any other information that can identify the failed Master node.
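How a new Master might use the returned log is not spelled out in the text. One plausible reading, sketched below, is to replay the log to recover the status of each subtask and then continue with the unfinished ones. The entry layout (`subtask` and `status` keys), the status values and the function name are all assumptions.

```python
def rebuild_master_state(master_log):
    """Replay the failed Master's log to recover per-subtask status.

    Entries are assumed to be in time order, so a later entry for the
    same subtask overrides an earlier one.
    """
    status = {}
    for entry in master_log:
        status[entry["subtask"]] = entry["status"]
    unfinished = sorted(s for s, st in status.items() if st != "COMPLETED")
    return status, unfinished
```

Only the subtasks in `unfinished` would need to be scheduled again, instead of re-running the whole task from the beginning.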

When a Worker node finishes its task, it can call an external interface to upload its own log information to the global information monitoring function entity and at the same time notify the Master node that the task it was responsible for has been completed. After receiving the notification, the Master node marks that Worker node's task as completed. After receiving completion notifications from all Worker nodes, the Master node ends the overall task.
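The Master's completion bookkeeping can be sketched as a small tracker. The class name `TaskTracker`, its methods and the status strings are hypothetical; the patent only states that tasks are marked completed and that the overall task ends once every Worker has reported.

```python
class TaskTracker:
    """Hypothetical record the Master keeps of each Worker's task status."""

    def __init__(self, worker_ids):
        self.status = {worker: "RUNNING" for worker in worker_ids}

    def mark_completed(self, worker_id):
        """Called when a Worker notifies the Master that its task is done."""
        self.status[worker_id] = "COMPLETED"

    def all_done(self):
        """True once every Worker has reported completion."""
        return all(state == "COMPLETED" for state in self.status.values())
```

The Master would end the overall task the first time `all_done()` returns `True`.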

To implement the above method, the present invention also provides a global information monitoring entity for obtaining log information, the global information monitoring entity comprising a storage module and a query module.

The global information monitoring entity can further include a judgment module, used to judge, when a Worker node uploads log information, whether the identification information of that node carried in the log information is consistent with the saved identification information of the Worker nodes; if consistent, the Worker node's log information is saved; otherwise, it is discarded.

The storage module is also used to save the identification information of the Worker nodes.

The present invention further provides a system for implementing parallel computing. As shown in Fig. 4, the system comprises a global information monitoring function entity 41, a first Worker node 42 and a first Master node 43, wherein:

The global information monitoring function entity 41 is used to record, after the overall task starts, the log information of the Worker nodes executing the task and of the Master node;

The first Worker node 42 is used to, when a Worker node executing the task fails, obtain the log information of the failed Worker node from the global information monitoring function entity 41 and, according to the log information, continue processing the service flow of the failed Worker node from the breakpoint at the moment of failure; and/or,

The first Master node 43 is used to, when the Master node executing the task fails and after itself starting, obtain the log information of the failed Master node from the global information monitoring function entity 41 and, according to the log information, continue processing the service flow of the failed Master node from the breakpoint at the moment of failure.

It should be noted here that the first Worker node 42 can be a healthy Worker node that is executing this task, or a healthy Worker node that is not executing this task; the first Master node 43 can be a Master node that is executing this task, or another Master node that is not executing this task.

The system can further include a User Program unit, a second Master node and a second Worker node.

It should be noted here that the second Worker node can be a set of one or more Worker nodes executing the task.

The second Master node is also used to send task execution information to the first Worker node 42 when the second Worker node fails;

The first Worker node 42 is specifically used to: after receiving the information sent by the second Master node, send a query request to the global information monitoring function entity 41 and receive the log information of the second Worker node returned by the global information monitoring function entity 41;

The global information monitoring function entity 41 is also used to, after receiving the query request sent by the first Worker node 42, look up according to the query request the log information of the second Worker node that it has saved and return it to the first Worker node 42.

The first Master node 43 is specifically used to: when the second Master node fails, send a query request to the global information monitoring function entity 41 and receive the log information of the second Master node returned by the global information monitoring function entity 41;

The global information monitoring function entity 41 is also used to, after receiving the query request sent by the first Master node 43, look up according to the query request the log information of the second Master node that it has saved and return it to the first Master node 43.

The second Worker node is also used to upload its own log information to the global information monitoring function entity 41 in real time after the overall task starts;

The second Master node is also used to upload its own log information to the global information monitoring function entity 41 in real time after the overall task starts;

The global information monitoring function entity 41 is also used to save the log information of the second Worker node and the second Master node.

The global information monitoring function entity 41 is also used to, before saving the log information of the second Worker node and the second Master node, judge whether the node identification information carried in the second Worker node's log information is consistent with the saved identification information of the Worker nodes; if consistent, it saves the second Worker node's log information; if inconsistent, it discards the second Worker node's log information.

The above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for implementing parallel computing, characterized in that the method comprises:

After the overall task starts, recording the log information of the worker (Worker) nodes executing the task and of the master (Master) node;

2. The method according to claim 1, characterized in that the new Worker node obtains the log information of the failed Worker node as follows:

3. The method according to claim 1, characterized in that the new Master node obtains the log information of the failed Master node as follows:

4. The method according to claim 1, 2 or 3, characterized in that, before the log information of the Master node and Worker nodes is recorded, the method further comprises:

5. The method according to claim 4, characterized in that recording the log information of the Worker nodes executing the task and of the Master node comprises:

6. The method according to claim 5, characterized in that, before the global information monitoring function entity saves the log information of the Worker nodes executing the task and of the Master node, the method further comprises:

7. A method for obtaining log information, characterized in that the method comprises:

8. The method according to claim 7, characterized in that, before the log information of the Master node executing the task and of the Worker nodes is saved in real time, the method further comprises:

9. A global information monitoring entity for obtaining log information, characterized in that the global information monitoring entity comprises a storage module and a query module; wherein,

10. The global information monitoring entity according to claim 9, characterized in that the global information monitoring entity further comprises a judgment module, used to judge, when a Worker node uploads log information, whether the identification information of that node carried in the log information is consistent with the saved identification information of the Worker nodes; if consistent, the Worker node's log information is saved; otherwise, it is discarded.

11. The global information monitoring entity according to claim 9 or 10, characterized in that the storage module is also used to save the identification information of the Worker nodes.

12. A system for implementing parallel computing, characterized in that the system comprises a global information monitoring function entity, a first Worker node and a first Master node; wherein,

13. The system according to claim 12, characterized in that the system further comprises a User Program unit, a second Master node and a second Worker node; wherein,

14. The system according to claim 13, characterized in that,

The second Master node is also used to send task execution information to the first Worker node when the second Worker node fails;

15. The system according to claim 13, characterized in that,

The first Master node is specifically used to: when the second Master node fails, send a query request to the global information monitoring function entity and receive the log information of the second Master node returned by the global information monitoring function entity;

16. The system according to claim 13, 14 or 15, characterized in that,

The second Worker node is also used to upload its own log information to the global information monitoring function entity in real time after the overall task starts;

17. The system according to claim 16, characterized in that,

The global information monitoring function entity is also used to, before saving the log information of the second Worker node and the second Master node, judge whether the node identification information carried in the second Worker node's log information is consistent with the saved identification information of the Worker nodes; if consistent, it saves the second Worker node's log information; if inconsistent, it discards the second Worker node's log information.