CN103294558A

CN103294558A - MapReduce scheduling method supporting dynamic trust evaluation

Info

Publication number: CN103294558A
Application number: CN2013102066155A
Authority: CN
Inventors: 沈晴霓; 刘龙; 杨雅辉; 吴中海
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2013-05-29
Filing date: 2013-05-29
Publication date: 2013-09-11
Anticipated expiration: 2033-05-29
Also published as: CN103294558B

Abstract

The invention discloses a MapReduce scheduling method supporting dynamic trust evaluation. The method includes the steps of: first, dividing the entities in a system to build a tree structure; second, initializing, on the master node, the tree structure, the trust threshold, the submission threshold, the inheritance factor, the feedback factor, and the trust value of each entity; third, having the system calculate the trust threshold required by a job according to the attributes of the job to be submitted, search for multiple entities whose trust values are larger than that threshold to serve as schedulable entities, and execute the job; fourth, having the system verify the integrity of the results produced by the schedulable entities, increasing an entity's trust value if verification passes and decreasing it otherwise; fifth, updating the trust values of the entity's parent entities through a feedback mechanism according to the change in the entity's trust value, up to the root of the tree structure. The method greatly increases the credibility of computation results.

Description

A MapReduce scheduling method supporting dynamic trust evaluation

Technical field

The invention belongs to the field of security in cloud computing environments and relates to a MapReduce scheduling method supporting dynamic trust evaluation, intended mainly for the MapReduce computing framework of Hadoop.

Background technology

Hadoop is not only a distributed file system for storage but also a framework designed for running distributed applications on large clusters of commodity computing devices; MapReduce is Hadoop's computing framework. The MapReduce framework processes massive data sets in a distributed, parallel environment. It decomposes a job into many fine-grained subtasks, which are scheduled onto idle processing nodes and processed quickly; their outputs are then merged by specific rules into the final result. The processing model somewhat resembles the decompose-and-combine methods of traditional programming models. The MapReduce model abstracts distributed computation into two steps, Map and Reduce, which makes efficient distributed applications possible. The Map step generates intermediate results from the key-value pairs supplied as input, and the intermediate results are likewise key-value pairs. The Reduce step merges all intermediate results by key and then produces the final result. All the developer has to do is implement the logic of the Map and Reduce functions and submit them to the MapReduce runtime.

Because of the way MapReduce works, the result of every subtask can affect the final result. Cloud computing environments are growing more complex: public and private clouds are merging and distributed computing systems are increasingly open, which gives attackers an opening. External attackers who gain control of one or more cluster nodes by various means, as well as insider attackers, both threaten the computational security of MapReduce. Attacks on MapReduce can be summarized as two forms of cheating:

Deception 1: Suppose the map function is f and a computation participant (mapper) is assigned a split D, with the requirement that f(x) be computed for every record x ∈ D = {x₁, ..., xₙ}. The attacker computes f only on the records of a subset D' of D, claims that the map computation was carried out on all records, and notifies the master through the heartbeat that the task is complete.

Deception 2: Suppose the map function is f and a computation participant (mapper) is assigned a split D, with the requirement that f(x) be computed for every record x ∈ D = {x₁, ..., xₙ}. The attacker computes a different function g on the records of D, claims that f was computed on all records, and notifies the master through the heartbeat that the task is complete.

According to the means of attack, attackers can be divided into the following three classes.

The first class is the reckless attack. The attacker controls a mapper and always returns wrong results. For this class, a single simple random check of the results is enough to reveal the attack. It is also the easiest attack to expose and to avoid.

The second class is the ordinary attack. The attacker returns wrong results with a certain probability; the first class is the extreme case of this class (probability 100%). This attack is more common, succeeds more often than the first class, and is harder to expose. It can be countered by voting on the integrity of results: several mappers are assigned to compute the same split at the same time, and their outputs are compared once the map phase finishes. If all mappers agree, the result is taken to be correct; if they differ, the result that occurs most often is chosen as the correct one. The drawback of this approach is its heavy performance cost.
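The voting check described above can be illustrated with a minimal sketch. The worker identifiers and the string form of the results are hypothetical; the source only states that the most frequent result is taken as correct, and tie handling is not specified there.

```python
from collections import Counter

def vote(results):
    """Return (winning_result, dissenting_worker_ids) for one split.

    `results` maps a hypothetical worker id to the output that worker
    reported for the same split. Ties are not resolved in any special way.
    """
    counts = Counter(results.values())
    winner, _ = counts.most_common(1)[0]
    # Workers whose output differs from the majority are suspects.
    dissenters = sorted(w for w, r in results.items() if r != winner)
    return winner, dissenters

winner, dissenters = vote({"w1": "a:3", "w2": "a:3", "w3": "a:7"})
```

Every split is computed once per voter, which is where the performance cost mentioned above comes from.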

The third class is the smart attack. The attacker assumes that the MapReduce job scheduling mechanism includes a trust system and returns correct results for a period of time, until the system relaxes its supervision or even trusts the attacker completely and accepts the attacker's results without checking. Only then does the attacker return wrong results. This is the most cunning kind of attack and the hardest to guard against.

The attacks actually mounted may be more complex than those above. An attacker may control multiple nodes, which makes it easier to evade the system's checks. Depending on whether the attack involves cooperation, attack scenarios fall into the following two kinds.

The first kind is the non-collusive attack. Most of the attacks discussed above belong to this kind. In the non-collusive model, the attacker controls only one worker, or controls several workers that each act independently during the attack, with no cooperation or interaction between them. For example, an attacker controls two workers that are assigned the same map operation on the same split; both return wrong results, but the two wrong results differ from each other.

The second kind is the collusive attack. The attacker controls several workers, and the behavior of each depends on the behavior of the workers colluding with it; the workers can exchange information and communicate. This attack mainly targets systems that use replication or voting for result integrity verification. Replication is a special case of voting: the system assigns two workers the same computation, and if their results differ, at least one of them is wrong. For example, an attacker controls two workers; after the master assigns tasks, the two workers can find out whether they were given the same input split. If so, they return the same wrong result and thereby avoid detection. Collusive attacks are harder to mount and less likely to succeed, but they pose a very serious threat.

The existence of the above attacks makes it particularly important to build a framework that can guarantee trustworthy computation, that is, a trustworthy MapReduce trust metric and scheduling strategy. Such a strategy should be compatible with existing applications and must also consider how to obtain trustworthy results on untrusted nodes. The following are the patents found so far that relate to result integrity protection and trusted scheduling in MapReduce.

The invention with publication number US20090651100, entitled "SUSPICIOUS NODE DETECTION AND RECOVERY IN MAPREDUCE COMPUTING", discloses a method for detecting and correcting malicious nodes in a cloud computing environment. The method uses a suspicion index and sets a suspicion threshold; during task execution it checks whether a node's suspicion index exceeds the threshold. If so, recovery is performed; otherwise the node's result is accepted.

Although that patent also concerns malicious-node detection, detecting malicious nodes through a suspicion threshold and recovering from them accordingly, it does not address integrity protection of MapReduce computation; its focus is on more general forms of attack.

The invention with patent number 200910311687.X, entitled "A self-adaptive job scheduling method based on MapReduce", provides adaptive task division and task scheduling based on the actual computing capability of the computing nodes. It concerns adaptive MapReduce job scheduling in the field of distributed parallel computing and comprises the following steps: calculating the capability index of a single CPU core of each computing node; calculating the block size of the MapReduce job; having the scheduling node partition the data of newly arrived MapReduce jobs; having the scheduling node dynamically assemble the data blocks of a MapReduce job into tasks and distribute them to the computing nodes; and dynamically collecting the resource utilization of each computing node, recalculating the block size of the MapReduce job if utilization falls below a threshold.

That patent schedules the computing and storage resources of a distributed system according to resource utilization, and the problem it solves is computing performance in MapReduce. Although it also schedules tasks under the MapReduce framework, it does not approach the problem from a security standpoint; its concern is the efficiency of distributed applications.

Summary of the invention

No existing patent addresses the problem of guaranteeing the integrity and trustworthiness of MapReduce computation results. Yet as MapReduce is applied ever more widely in big data processing, a trustworthy mechanism that guarantees the credibility of results has become an urgent need. To meet this need, the invention proposes, on the basis of the existing MapReduce computing framework, a scheduling method supporting dynamic trust evaluation. Its focus is a flexible dynamic trust evaluation system for scheduled entities, built on a result integrity verification scheme. The result integrity verification scheme combines quiz verification (Quiz) with checkpoint verification (Checkpoint); the two complement each other, avoiding both misjudgments and missed detections, and form the technical foundation of the whole system. The two verification methods can also be used selectively, which provides flexibility. The trust evaluation system quantifies the abstract concept of trust and, based on the results of the integrity verification, evaluates trust values dynamically through an inheritance mechanism and a feedback mechanism. The scheduling strategy builds on the previous two: it adds trust to the MapReduce scheduler as a scheduling factor and sets trust thresholds for different jobs, so as to reach the best balance between performance and trust.

The three main points of the invention are set out below:

One, result integrity verification: this comprises result integrity verification based on Quiz and result integrity verification based on Checkpoint. A Quiz is indistinguishable from a normal computation task, except that the result of a Quiz can be verified. The basic idea of Quiz-based verification is to insert Quizzes into normal tasks so that the computation participant cannot tell which part is a Quiz. By verifying the Quiz results, the client can choose to accept or reject the computed result. Specifically, let the size of a task be s: the client chooses real work of size t and sends a task of size t + m = s to the executor, where m is the size of the Quiz. After the task completes, the client checks whether the results of the hidden Quiz are correct. Only if the Quiz results are correct is the result of this computation accepted; otherwise the client discards all results and reschedules the computation. Quiz data is either supplied by the user or generated randomly by the system according to user-defined functions or rules. An agent inserts the Quiz data directly into the mapper's input data; after the map computation finishes, the Quiz results are compared against the expected values. Once verification is complete, the Quiz is removed: its position is located by pattern matching on the map output key/value pairs, and it is then deleted from the output file.
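A minimal sketch of the Quiz idea follows, assuming records are plain integers, the map function f squares its input, and outputs are keyed by input record. These choices, and the function names, are illustrative only and are not taken from the patent.

```python
import random

def build_input(task_records, quiz_records, rng):
    """Mix m quiz records into t real records so the worker cannot tell them apart."""
    mixed = list(task_records) + list(quiz_records)
    rng.shuffle(mixed)
    return mixed

def verify_and_strip(outputs, expected_quiz):
    """Check the quiz answers, then remove them from the result.

    `outputs` maps each input record to the value the worker reported.
    Returns the cleaned outputs, or None if any quiz answer is wrong.
    """
    for record, expected in expected_quiz.items():
        if outputs.get(record) != expected:
            return None  # reject everything; the caller reschedules the task
    return {k: v for k, v in outputs.items() if k not in expected_quiz}

def f(x):
    return x * x

expected_quiz = {101: f(101), 102: f(102)}                        # m = 2, answers known
mixed = build_input([1, 2, 3], expected_quiz, random.Random(0))   # t = 3, s = 5

honest = {x: f(x) for x in mixed}       # worker that really computes f
cheater = {x: x + 1 for x in mixed}     # worker that computes some g instead
accepted = verify_and_strip(honest, expected_quiz)
rejected = verify_and_strip(cheater, expected_quiz)
```

In this sketch the honest worker's output is accepted with the quiz entries stripped, while the worker that substitutes another function is rejected as a whole.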

In result integrity verification based on Checkpoint, the system uses redundant computation: the same task is assigned to two Workers that are not on the same node. When a Worker reaches certain specific positions in the input file (for example line 1, line 100, or line 1000), or reaches certain key points (for example when one file cannot hold all of the Worker's output and a new file must be written), its execution is suspended and a Checkpoint is set. For the two Workers executing the same task, the corresponding Checkpoints are selected, a hash value is computed over the data produced so far, and the system compares the two hash values at that Checkpoint. If the two hash values are equal, the result at this Checkpoint is accepted and the computation continues; if they differ, at least one result is wrong, so both Workers are stopped immediately and two new Workers are chosen to execute the task.
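The checkpoint comparison can be sketched as below. The patent does not name a hash algorithm, so SHA-256 is an assumption here, as are the function names and the representation of a Worker's output as a list of key/value tuples.

```python
import hashlib

def checkpoint_hash(records):
    """Hash the output produced so far (SHA-256 is an assumption of this sketch)."""
    h = hashlib.sha256()
    for rec in records:
        h.update(repr(rec).encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()

def first_failed_checkpoint(out_a, out_b, checkpoints):
    """Compare two workers' outputs at each checkpoint position.

    Returns the first checkpoint position whose hashes differ,
    or None if every checkpoint matches.
    """
    for pos in checkpoints:
        if checkpoint_hash(out_a[:pos]) != checkpoint_hash(out_b[:pos]):
            return pos  # stop both workers here and reschedule the task
    return None

good = [("k%d" % i, i) for i in range(10)]
bad = list(good)
bad[6] = ("k6", -1)   # one tampered record, after the checkpoint at position 5

ok = first_failed_checkpoint(good, good, [1, 5, 10])
failed_at = first_failed_checkpoint(good, bad, [1, 5, 10])
```

Because only hashes are exchanged, the comparison does not require shipping either Worker's intermediate output to the verifier.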

The system uses a verification scheme that combines Quiz and Checkpoint. When a task starts, two workers are assigned the same normal task and Quiz task at the same time. Checkpoint verification is carried out during task execution; if it passes, Quiz verification is carried out after the task completes. If all Checkpoint hash comparisons match and the Quiz results agree with the expected correct results, result integrity verification is deemed passed; otherwise it is deemed failed. The Checkpoint scheme detects non-collusive attacks and some collusive attacks as early as possible. For collusive attacks, the Quiz scheme does not rely on task replication and can therefore stop them. The advantage of the combination is that it effectively avoids missing any of these attacks.

Two, trust evaluation: the entities at different trust levels in a cloud computing system, from data centers and computing nodes down to processes, are organized into a tree-shaped partition structure according to physical or logical containment. One entity (the parent entity) can contain several other entities (child entities); for example, a cluster contains multiple computing nodes and a computing node contains multiple processes. An entity (child entity) is contained in exactly one other entity (its parent entity); for example, a process is contained in one computing node and a computing node is contained in one cluster. Fig. 4 shows the tree structure among the entities of a typical cloud computing system.

The trust evaluation system quantifies the abstract concept of trust according to the tree-structure attributes of the entities in the system and their historical behavior. Trust in the invention has three characteristics. First, trust can be inherited and fed back. The inheritance mechanism means that when a new entity joins the system, its initial trust value is inherited from the entity that contains it: trust value of the new entity = trust value of the parent entity * inheritance factor. The feedback mechanism means that a change in a child entity's trust value affects the trust value of its parent entity: new trust value of the parent entity = old trust value of the parent entity + change in the child entity's trust value * feedback factor. Second, trust is dynamic. The trust value of any entity changes as it is evaluated at run time, the trust values of related entities influence one another through the inheritance and feedback mechanisms, and trust values decay over time while the system runs (for example, the trust value drops by 1 for every hour that passes), although the other mechanisms mean that a trust value may still increase overall. Third, trust has a life cycle, and the trust life cycle differs from entity to entity. A system restart, a new system configuration, or a definition by the system administrator can start a new life cycle.

An entity begins its life cycle when it joins the system, which triggers initialization of its trust value. For the particular entities that run the system's management tasks, including the master node and the verifier node, the initial trust value is set by default: the service provider assigns a base value from experience. For other entities managed by the system (such as mappers), the initial value is governed by the inheritance mechanism: the parent entity's trust value multiplied by the inheritance factor is given to the child entity. While an entity executes, the Quiz and Checkpoint schemes verify the integrity of its results. If verification passes, the entity is rewarded and its trust value is increased; otherwise it is penalized and its trust value is decreased. The change in the entity's trust value affects the trust value of its parent entity through the feedback mechanism, and so on up to the top-level entity.

Three, scheduling strategy: the system's scheduling strategy adds a trust factor on top of Hadoop's original scheduling strategy so that all tasks can be executed reliably. The system assigns tasks different priorities according to the user's input and the service the user has purchased; critical tasks that are highly sensitive to integrity are scheduled preferentially onto highly trusted entities. The system sets the trust threshold and the submission threshold so as to meet clients' demands for different degrees of trust.

The scheduling strategy of the invention comprises a lazy committing mechanism (Lazy Committing) for tasks, a task rollback mechanism, and a trust scheduling mechanism. Lazy committing means that an entity's computation results are not submitted at once but are first placed in a global buffer (GRB). As the computation proceeds, if the entity keeps executing correctly under result integrity verification (Quiz verification and Checkpoint verification), its trust value keeps accumulating. Once the trust value exceeds the submission threshold, the results the entity produced earlier can be submitted, which corresponds to committing a mapper task in Hadoop and notifying the master that the task is complete; after submission, the entity (worker) enters a new life cycle. The submission threshold is an empirical value set by the system; it defaults to a uniform value but can be set separately for individual entities. If an entity is found cheating during result integrity verification, its trust value is reduced according to the system's penalty rules. When the entity's trust value becomes negative, the entity can no longer be trusted: its thread is terminated and a new thread is added to the thread pool to await scheduling. This is the rollback mechanism, and rollback is always accompanied by rescheduling.

The trust scheduling mechanism uses the trust threshold as a scheduling requirement. The trust threshold is a specific trust value set by the service provider according to customer demand; during job scheduling, the scheduler assigns tasks only to trusted entities whose trust value is higher than the job's trust threshold. The higher the trust threshold, the fewer entities satisfy the job's execution condition and the more trustworthy the result. The system allocates more computing resources to entities with high trust values, and jobs with low trust thresholds are assigned as far as possible to entities with low trust values, so as to balance the load across entities. In addition, when a cloud service provider offers computing services externally, the trust threshold can serve as a pricing criterion: users with high trustworthiness requirements can buy a more expensive service to obtain a higher trust threshold.

Compared with the prior art, the beneficial effects of the invention are:

One, two different result integrity verification schemes, based on Quiz and on Checkpoint, are provided. The client can choose a scheme according to the application scenario, and new result integrity verification schemes can easily be added as a basis for trust evaluation without changing the overall framework, which provides flexibility;

Two, earlier techniques often consider only factors such as CPU and disk in scheduling. The invention additionally quantifies trust and uses it as a factor in job scheduling, which effectively prevents attacks on computational security, including collusive attacks, and balances performance against trustworthiness;

Three, existing MapReduce programs can run on the invention with no modification or only very small modifications, giving both upward and downward compatibility.

Description of drawings

Fig. 1 is a flowchart of the method of the invention;

Fig. 2 is a flowchart of how Quiz data is generated;

Fig. 3 is a flowchart of Quiz comparison and removal;

Fig. 4 is the tree structure of the trust sets of a typical system.

Embodiment

The method of the invention is explained in further detail below with reference to the drawings.

The flow of the method is shown in Fig. 1. In this MapReduce system supporting dynamic trust evaluation, the master node maintains the information the system needs to run correctly, including the tree structure of trusted entities, the system blacklist, the trust threshold, the submission threshold, the inheritance factor, the feedback factor, and the trust value of each trusted entity. The trust value of a trusted entity is a field of the entity's attribute data structure and is modified dynamically during job scheduling. When the system starts running, these parameters are first initialized, and the trust value of each entity is calculated according to the inheritance mechanism and stored. Initial values have default reference values and are subject to the inheritance mechanism. Trust values are calculated on the master. The InitCredit() function first calls GetDomainParent() to obtain the parent entity of the entity being calculated, and then calls GetDomainCredit() to obtain the parent entity's trust value. The trust value of the entity is the trust value of the parent entity that contains it * the inheritance factor. Note that when the parent entity of the entity being calculated is on the blacklist, this calculation is not performed; instead the entity is directly assigned -1, that is, it is added to the blacklist.
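The initialization rule can be sketched as follows. The dictionaries `parent_of` and `credit` and the set `blacklist` are hypothetical stand-ins for the master's data structures and for GetDomainParent() and GetDomainCredit(); the numeric values are invented for illustration.

```python
BLACKLISTED = -1.0

def init_credit(entity, parent_of, credit, blacklist, inherit_factor):
    """Give `entity` its initial trust value from its parent (cf. InitCredit())."""
    parent = parent_of[entity]              # stands in for GetDomainParent()
    if parent in blacklist:
        # A child of a blacklisted parent is not evaluated; it is blacklisted too.
        credit[entity] = BLACKLISTED
        blacklist.add(entity)
    else:
        parent_credit = credit[parent]      # stands in for GetDomainCredit()
        credit[entity] = parent_credit * inherit_factor
    return credit[entity]

parent_of = {"node1": "cluster", "node2": "cluster",
             "proc1": "node1", "proc2": "node2"}
credit = {"cluster": 100.0, "node1": 80.0, "node2": BLACKLISTED}
blacklist = {"node2"}

a = init_credit("proc1", parent_of, credit, blacklist, inherit_factor=0.5)
b = init_credit("proc2", parent_of, credit, blacklist, inherit_factor=0.5)
```

Here proc1 inherits half of node1's trust value, while proc2 is blacklisted because its parent node2 is.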

When a user submits a job, the system calculates the required parameters (the job's trust threshold and the job priority) from the job's attributes. The scheduling strategy then traverses the entities, reads each entity's trust value, and finds the entities whose trust value is greater than the trust threshold; these form the set of schedulable entities. The schedulable entities are sorted by trust value in ascending order, and enough entities are taken from the front of the sorted list that their estimated computing capacity meets the client's requirement. The job is then distributed over these entities according to the original Hadoop scheduling factors (such as distance).
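A minimal sketch of this selection step is given below. The capacity figures, the entity names, and the choice to return an empty list when trusted capacity is insufficient are assumptions; the patent does not say what happens in that case.

```python
def select_entities(credit, capacity, trust_threshold, required_capacity):
    """Pick entities for a job: trust above the threshold, lowest trust first."""
    schedulable = [e for e, c in credit.items() if c > trust_threshold]
    schedulable.sort(key=lambda e: credit[e])   # ascending trust value
    chosen, total = [], 0
    for e in schedulable:
        if total >= required_capacity:
            break
        chosen.append(e)
        total += capacity[e]
    if total < required_capacity:
        return []   # not enough trusted capacity (behaviour assumed here)
    return chosen

credit = {"a": 90, "b": 40, "c": 60, "d": 75}
capacity = {"a": 4, "b": 8, "c": 4, "d": 4}   # hypothetical capacity estimates

chosen = select_entities(credit, capacity, trust_threshold=50, required_capacity=8)
too_much = select_entities(credit, capacity, trust_threshold=50, required_capacity=20)
```

Taking the least trusted qualifying entities first leaves the most trusted ones free for jobs with higher trust thresholds, which matches the load-balancing goal stated for the trust scheduling mechanism.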

Once tasks have been assigned, execution begins and the results undergo integrity verification. Result integrity verification uses the combined Quiz and Checkpoint scheme. For this combination, an agent layer, ProxyReader, is added between the raw data of the job being executed and Hadoop's RecordReader class, which guarantees that the workers (schedulable entities) executing the same map task receive identical input data.

In result integrity verification based on Quiz, Quiz data can be generated in two ways, as shown in Fig. 2. In the first, the client supplies Quiz data to the system directly. In the second, the client supplies rules and the system generates the data from them with a random number generator: for each field of a Quiz, the type and range are determined first, and the system then generates a random value within the given range. Quiz insertion is implemented by ProxyReader. The system adds ProxyReader, a data-reading proxy, between RecordReader and the data source. ProxyReader does the same work as the original RecordReader, and the new RecordReader is in effect a wrapper around ProxyReader, which ensures that the API through which a mapper reads data is unchanged. A hook function is added to ProxyReader and fires with a certain probability; when it fires, Quiz data is added to the output of ProxyReader (that is, to the map input). Because ProxyReader is the only data source of RecordReader, the different Workers executing the same task are guaranteed identical input. After the Quiz computation finishes, the key/value output is fuzzy-matched against the expected results, and the Quiz entries are removed according to the match (otherwise the Quiz results would affect subsequent computation). The comparison and removal of Quizzes are shown in Fig. 3. The master verifies the Quizzes after a map task has finished.
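The probabilistic hook can be sketched with a small iterator wrapper. This is not Hadoop's RecordReader API; the class below only borrows the name ProxyReader from the text. Using a fixed seed to give every Worker the same stream, and flushing unused Quiz records at the end, are assumptions of the sketch.

```python
import random

class ProxyReader:
    """Wraps a record source and occasionally emits a quiz record."""

    def __init__(self, records, quiz_records, probability, seed):
        self._records = list(records)
        self._quiz = list(quiz_records)
        self._probability = probability
        # Same seed => same stream for every worker on the same task (assumed).
        self._rng = random.Random(seed)

    def __iter__(self):
        for record in self._records:
            # Hook: fires with the configured probability while quizzes remain.
            if self._quiz and self._rng.random() < self._probability:
                yield self._quiz.pop(0)
            yield record
        # Emit quizzes the hook never triggered (an assumption of this sketch).
        while self._quiz:
            yield self._quiz.pop(0)

real = ["r1", "r2", "r3", "r4"]
quiz = ["q1", "q2"]
stream_a = list(ProxyReader(real, quiz, probability=0.5, seed=7))
stream_b = list(ProxyReader(real, quiz, probability=0.5, seed=7))
```

Both readers yield the same interleaving, so two Workers consuming them would see identical map input with the Quiz records hidden among the real ones.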

Checkpoints are set by the trigger function CheckpointTrigger() and the Freeze() function. CheckpointTrigger() is added to RecordReader, through which a Worker reads records. When a checkpoint is to be taken, CheckpointTrigger() calls Freeze(), which sends a PAUSE signal to RecordReader to suspend reading of the input file and waits until the records already read have been processed; the hash values are then computed and compared.

Based on the results of the integrity verification above, the trust evaluation system dynamically adjusts the trust value of each entity. If an entity passes result integrity verification (both Quiz verification and Checkpoint verification succeed), the CalCredit() function is triggered to increase the entity's trust value: new trust value of the entity = old trust value of the entity + trust reward value, where the trust reward value is a positive value defined by the system. A change in an entity's trust value affects the trust value of its parent entity through the feedback mechanism, which the system implements in the CreditFeedback() function. This function is triggered whenever an entity's trust value changes and, following the logical hierarchy of entities, works from the bottom up. Suppose an entity's trust value has changed by a and CreditFeedback() is triggered. It first uses GetDomainParent() to find the entity's parent; the parent's new trust value = old trust value + a * feedback factor. If the parent is the root of the tree structure (the entity with no parent), feedback ends; otherwise it continues until the root is reached. The changes in the trust values of all these entities in turn affect the next scheduling decision. Conversely, if an entity fails result integrity verification (at least one of Quiz verification and Checkpoint verification fails), an attack is deemed to exist inside the entity and the entity is put on the blacklist. The results of this integrity verification can serve as a reference for attack analysis, and once the attack has been eliminated, the entity can be taken off the blacklist. In this way the system as a whole implements a MapReduce scheduling strategy supporting dynamic trust evaluation.
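The reward and upward feedback can be sketched as below. The text says the function is triggered by every trust change, so this sketch reads it as damping the change by the feedback factor once more at each level; that reading, the function names, and the numeric values are assumptions.

```python
def credit_feedback(entity, delta, parent_of, credit, feedback_factor):
    """Propagate a trust change `delta` from `entity` up to the root (cf. CreditFeedback())."""
    parent = parent_of.get(entity)          # stands in for GetDomainParent()
    while parent is not None:
        delta = delta * feedback_factor     # each level sees a damped change
        credit[parent] += delta
        parent = parent_of.get(parent)      # stop once the root has been updated

def cal_credit(entity, reward, parent_of, credit, feedback_factor):
    """Reward an entity that passed verification, then feed the change upward (cf. CalCredit())."""
    credit[entity] += reward
    credit_feedback(entity, reward, parent_of, credit, feedback_factor)

parent_of = {"proc": "node", "node": "cluster"}    # "cluster" is the root
credit = {"proc": 10.0, "node": 50.0, "cluster": 100.0}
cal_credit("proc", reward=4.0, parent_of=parent_of, credit=credit, feedback_factor=0.5)
```

With a feedback factor of 0.5, a reward of 4 at the process raises the node by 2 and the cluster by 1; a penalty would propagate the same way with a negative change.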

In addition, the system defines a time decay function. Some entities (such as nodes) have long life cycles, and the older a historical behavior is, the less it says about the present. As the system runs, a decay value is therefore subtracted from the entity's current trust value at fixed intervals. The decay value is likewise set by the system from experience.
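As an illustration, the decay can be written as a step function of elapsed time. The linear form, the function name, and the parameter names are assumptions; the one-hour period and decay value of 1 mirror the example given earlier in the description.

```python
def decayed_credit(credit, elapsed_seconds, period_seconds, decay_value):
    """Subtract `decay_value` once for every full period that has elapsed."""
    periods = int(elapsed_seconds // period_seconds)
    return credit - periods * decay_value

# 7500 s is two full hours plus a remainder, so the value drops by 2.
after = decayed_credit(50, elapsed_seconds=7500, period_seconds=3600, decay_value=1)
```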

The scheduling strategy relies on the global result buffer (GRB, Global Result Buffer) and a rollback function to implement the lazy committing (Lazy Committing) mechanism and the task rollback mechanism. In this system, the results computed by a mapper are not handed to the reducers through the Shuffle process immediately on completion; they are first placed in the GRB, which is stored on the master node. Placement is done by the function RegisterGRB(). RegisterGRB() does not actually move any data; it only registers the results in the GRB. The results computed by a worker (a mapper, which performs the Map operation of MapReduce) stay on the node's local disk, and the master does not know the actual location of the data; translation between a registration entry in the GRB and the data's actual location on disk is handled by a mapping held in the TaskTracker on the slave node. After registration, the GRB contains entries recording results that have not yet been submitted. When the trust value accumulated by a worker exceeds the submission threshold, the data recorded in the GRB that belongs to it is submitted by the function TrustedSubmit(), and the subsequent shuffle and reduce operations are carried out. If the worker fails result integrity verification and is put on the blacklist, everything in the GRB that belongs to it since its last submission point is discarded. The RollBack() function uses the records in the GRB together with the system log to determine which task or tasks were not completed correctly. Once these tasks are identified, it notifies the master that they need to be rescheduled, and they are put into the task pool to await the next scheduling round.
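A minimal in-memory sketch of lazy committing and rollback follows. The class and its fields are hypothetical; the method names only echo RegisterGRB(), TrustedSubmit() and RollBack(), and the sketch relies on the buffer alone rather than on the system log mentioned above.

```python
class GlobalResultBuffer:
    """Registry of mapper results that have not been submitted yet."""

    def __init__(self, submit_threshold):
        self.submit_threshold = submit_threshold
        self.pending = {}     # worker -> task ids registered but not submitted
        self.submitted = []   # task ids released to shuffle/reduce
        self.task_pool = []   # task ids waiting to be rescheduled

    def register(self, worker, task_id):
        # Only an entry is recorded; the data itself stays on the worker's disk.
        self.pending.setdefault(worker, []).append(task_id)

    def trusted_submit(self, worker, credit):
        # Release a worker's results once its trust exceeds the threshold.
        if credit > self.submit_threshold:
            self.submitted.extend(self.pending.pop(worker, []))
            return True
        return False

    def roll_back(self, worker):
        # Discard a blacklisted worker's unsubmitted results and requeue its tasks.
        self.task_pool.extend(self.pending.pop(worker, []))

grb = GlobalResultBuffer(submit_threshold=20)
grb.register("w1", "t1")
grb.register("w1", "t2")
grb.register("w2", "t3")
early = grb.trusted_submit("w1", credit=15)   # below the threshold: held back
later = grb.trusted_submit("w1", credit=25)   # above the threshold: released
grb.roll_back("w2")                           # w2 failed verification
```

Holding results until trust has accumulated means that a worker caught cheating never has unverified output reach the reducers.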

Claims

1. A MapReduce scheduling method supporting dynamic trust evaluation, comprising the steps of:

1) dividing the entities in the MapReduce system into different trust sets, and building a tree structure of the MapReduce system;

2) initializing, on the master node of the MapReduce system, the tree structure, the trust threshold, the submission threshold, the inheritance factor, the feedback factor, and the trust value of each entity;

3) the MapReduce system calculating the trust threshold required by a submitted job from the attributes of that job, then searching the tree structure for several entities whose trust value is greater than this trust threshold to serve as schedulable entities, and executing the job;

4) the MapReduce system performing integrity verification on the execution results of the schedulable entities; if verification passes, increasing the trust value of the schedulable entity, and otherwise decreasing the trust value of the schedulable entity;

5) the MapReduce system, according to the change in the trust value of an entity, updating the trust values of the upper-layer entities of that entity layer by layer through a feedback mechanism, up to the root of the tree structure.

2. the method for claim 1, it is characterized in that described MapReduce system adopts the proof scheme that combines based on Quiz and Checkpoint that but the execution result of described scheduling entity is carried out integrity verification, its method is: but at first in the raw data of institute's submit job with carry out that to add the scheduling entity input data that an Agent layer guarantees to carry out same map task between the entity of map operation be the same; Carry out Checkpoint checking then in this operation implementation, if checking passes through, then this operation is carried out the Quiz checking after complete; Wherein, by described Agent layer the Quiz data are joined in the map input, carry out described Quiz checking.

3. The method of claim 1 or 2, characterized in that the schedulable entity first places the execution result of the job in a buffer; when the trust value of the schedulable entity exceeds the set submission threshold, the schedulable entity submits the execution results of the job produced so far and notifies the master node that the task is complete.

4. The method of claim 3, characterized in that if a schedulable entity is found cheating during the integrity verification, the trust value of that schedulable entity is decreased; when the trust value of the schedulable entity becomes negative, the schedulable entity is stopped and a new trusted entity is added to execute the job.

5. The method of claim 3, characterized in that the buffer is a global buffer located on the master node; the execution result is buffered as follows: the execution result is stored on the local disk of the schedulable entity and registered in the global buffer; a mapping between the position of the registration entry and the actual storage location of the execution result is then established at the master node.

6. The method of claim 5, characterized in that if the trust value accumulated by a schedulable entity while executing the job exceeds the set submission threshold, the data recorded in the global buffer that belongs to that schedulable entity is submitted, and the master node is notified that the task is complete.

7. The method of claim 5, characterized in that the master node includes a blacklist; if the execution result of a schedulable entity fails integrity verification, the schedulable entity is added to the blacklist, and the data in the global buffer that belongs to that schedulable entity after its last submission point is discarded.

8. The method of claim 3, characterized in that the trust value of each entity is initialized as follows: first, the initial trust value of the root of the tree structure is set; then, using an inheritance mechanism, the trust value of a parent entity is multiplied by an inheritance factor and passed to its child entities, so that the initial trust value of each child entity is calculated layer by layer.

9. The method of claim 8, characterized in that the master node includes a blacklist; if the parent entity of an entity is on the blacklist, the trust value of that entity is directly assigned -1.

10. the method for claim 1 is characterized in that each entity is divided into the different set of trusting according to physics or relation of inclusion in logic in the MapReduce system, sets up described tree structure; Wherein, father's entity comprises several fructifications in the described tree structure, and a fructification only is contained in some father's entities.

11. the method for claim 1 is characterized in that the MapReduce system according to the priority of job property calculating institute submit job, the operation high to priority distributes the high entity of trust value; Each trusted entities has a life cycle, the trust life cycle difference of different trusted entities; The trust value of described trusted entities was decayed along with the time.

12. the method for claim 1, it is characterized in that describedly by feedback mechanism the upper layer entity trust value of this entity successively being carried out method for updating and being: the trust value variable quantity of the fructification of entity be multiply by a feedback factor, add that then the former trust value of this entity is as the new trust value of this entity.