CN1687917A - Large scale data parallel computing main system and method under network environment - Google Patents

Large scale data parallel computing main system and method under network environment

Info

Publication number
CN1687917A
CN1687917A; application CN200510025730.8A
Authority
CN
China
Prior art keywords
cluster
computing node
computing
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200510025730.8A
Other languages
Chinese (zh)
Other versions
CN100357930C (en)
Inventor
陈庆奎
那丽春
图占乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CNB2005100257308A priority Critical patent/CN100357930C/en
Publication of CN1687917A publication Critical patent/CN1687917A/en
Application granted granted Critical
Publication of CN100357930C publication Critical patent/CN100357930C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Hardware Redundancy (AREA)
  • Multi Processors (AREA)

Abstract

The invention discloses a large-scale data parallel computing system and method under a network environment. The system is composed of multiple computer clusters interconnected by LAN or Intranet; the clusters and their computing nodes may have heterogeneous structures and multiple trust levels. All computing nodes are numbered in descending order of their comprehensive computing capability to form a computing node logic ring; likewise, all clusters are numbered in descending order of their comprehensive computing capability to form a cluster logic ring, and each cluster on the cluster logic ring is itself organized as a computing node logic ring. The method is a data parallel computing algorithm based on a dynamic redundancy mechanism; by dynamically constructing the logic rings of clusters and computing nodes and applying an m-redundancy allocation strategy, it effectively solves the technical problems of dynamic redundancy, dynamic load balancing and near-linear speedup.

Description

Large-scale data parallel computing system and method under a grid environment
Technical field
The present invention relates to computer computing technology, and in particular to a system and algorithm for general-purpose large-scale data parallel computing that exploits idle computational resources in a large-scale computing environment built from ordinary, inexpensive computer clusters.
Background technology
With the rapid development and growing popularity of information technology, the demands for massive-data processing and high-performance computing are becoming ever more urgent and arise in every field of national development. Finding cost-effective technologies for massive information processing and high-performance computing has become a pressing problem for both industry and academia. For this problem, grids and data grids, with their good autonomy, self-similarity, heterogeneity, diversity of management, powerful parallel I/O capability and very high performance-to-price ratio, are among the most feasible solutions. Intranets composed of multiple computer networks are now increasingly common, and cheap personal computing devices can be found everywhere, yet their resource utilization is very low. Related studies point out that in a typical network environment many resources remain unused for long periods: even during the busiest hours of a day, one third of the workstations are not fully used, and 70% to 85% of the network memory (the memory distributed over the nodes of a network) in a cluster is idle. Exploiting the idle computing, storage and communication resources of a grid composed of multiple computer clusters therefore yields a large amount of non-dedicated, cheap, large-scale high-performance processing and computing resources. However, as the numbers of cluster nodes and networks increase, the reliability of the system and the dynamic transfer capability of its resources decline. Research on cluster reliability and scalability has thus become a focus of this field; Microsoft's Wolfpack, Oracle's Failover and NCR's LifeKeeper are typical representatives of reliable cluster computing. Yet as the scale of grid and data grid resources keeps expanding and their services keep diversifying, these traditional reliability techniques cannot meet the management requirements of heterogeneous, multi-trust-level grid resources, so research on new large-scale parallel models and algorithms based on grids is increasingly urgent.
Summary of the invention
In view of the defects of the above prior art, the technical problem to be solved by the present invention is to provide a cheap, large-scale, reliable and stable large-scale data parallel computing system and method under a grid environment that can use existing computational and network resources and offers good fault tolerance, a good speedup characteristic and a high dynamic load-balancing capability.
In order to solve the above technical problem, the large-scale data parallel computing system under a grid environment provided by the present invention comprises:
a grid monitoring system DGSS (DATA GRID SUPERVISE SYSTEM), a grid management system that uses a multi-Agent cooperation mechanism to perform effective dynamic state monitoring of the DG;
and a computing system DGCS (DATA GRID COMPUTING SYSTEM) constituted by a cluster logic ring, wherein:
the cluster logic ring is formed by connecting the numbered computer clusters in numeric order, the logical successor of the highest-numbered cluster being the cluster numbered 1; the numbers are assigned to all clusters in descending order of the sum of the comprehensive computing capabilities of all computing nodes of each cluster; except between cluster 1 and the highest-numbered cluster, the closer two clusters are on the cluster logic ring, the closer their comprehensive computing capabilities are;
each computer cluster on the cluster logic ring is organized as a computing node logic ring, which is formed by connecting the numbered computing nodes in numeric order, the logical successor of the highest-numbered computing node being the node numbered 1; the numbers are assigned to all computing nodes in descending order of their comprehensive computing capabilities, each node's comprehensive computing capability being calculated from the weight vector W; except between node 1 and the highest-numbered node, the closer two nodes are on the computing node logic ring, the closer their comprehensive computing capabilities are;
the cluster logic ring and the computing node logic rings together constitute the dynamic data grid computing system DG; the grid monitoring system is connected to every cluster and every computing node of the dynamic grid computing system through an ordinary LAN or Intranet.
In order to solve the above technical problem, the present invention also provides the m-redundancy allocation strategy of the large-scale data parallel computing system under a grid environment, whose steps are as follows:
on a logic ring (a cluster ring or a computing node ring), let a computing unit (a computing node or a cluster) have logical number k and computing capability CP_k;
allocate a task amount of CP_k * M to this computing unit (* denotes multiplication);
then distribute the task amount CP_k * M evenly over the m computing units with logical numbers k+1, k+2, ..., k+m; in this way the task CP_k * M is dispatched by DG to be executed twice;
in the present invention this redundancy strategy is called the m-redundancy allocation strategy; in fact, the task CP_k * M is redundantly distributed only once.
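As an illustration only, the following minimal Python sketch distributes a unit's task amount CP_k * M to its m logical successors on a ring; the function and variable names are assumptions of this sketch, not part of the patent text.

```python
def m_redundancy_allocation(capabilities, M, m):
    """Return, for each unit k on the ring, its primary task amount CP_k * M
    and the redundant load it places on its m logical successors."""
    n = len(capabilities)                      # number of units on the logic ring
    primary = {k: capabilities[k] * M for k in range(n)}
    redundant = {k: 0.0 for k in range(n)}     # redundant load received by each unit
    for k in range(n):
        share = primary[k] / m                 # CP_k * M spread evenly over m successors
        for step in range(1, m + 1):
            redundant[(k + step) % n] += share # successors k+1 .. k+m (ring wrap-around)
    return primary, redundant

# Usage: 6 units, basic task unit M = 10, 2-redundancy
prim, red = m_redundancy_allocation([5, 4, 4, 3, 2, 1], M=10, m=2)
```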
In order to solve the above technical problem, the large-scale data parallel computing algorithm under a grid environment provided by the present invention uses the following key data structures of DGCS:
let the grid DG consist of c computer clusters, the number of computing nodes in each cluster changing dynamically; DPC is a data parallel calculation task on DG, |DPC| denotes its total task amount and W is its computational resource requirement weight vector; Q_TDT is the task distribution information queue of DPC; M is the basic task unit; DGSS is the grid monitoring system.
The steps of the large-scale data parallel algorithm are as follows:
1) Initialization:
a) decompose DPC according to the basic task unit M;
b) compute Finished = the termination condition of DPC;
c) broadcast the auxiliary data of DPC (such as the coefficient matrix) to all computing nodes of DG;
d) Count = 0; /* initialize the parallel round counter */
2) While (Finished does not hold) do
3) [the Master of DG distributes the DPC tasks] /* the loop executes the tasks */
a) obtain the resource state information of DG from DGSS;
b) construct the cluster logic ring of DG;
c) let all clusters construct their own computing node logic rings;
d) compute a dynamic two-dimensional address for every computing node;
e) obtain the overall computing capability CCP_i (0≤i≤c) of every cluster;
f) for every cluster CC_i (0≤i≤c) do:
{
compute the ratio CCP_i / ΣCCP_j (0≤j≤c);
compute the task amount of DPC distributed to cluster CC_i: T_i = (CCP_i / ΣCCP_j (0≤j≤c)) * |DPC| / M;
};
For i = 1 to c
g) transfer task T_i to cluster CC_i;
h) on the cluster ring, distribute T_i by the 1-redundancy allocation strategy onto CC_{i+1}, the logical successor of CC_i;
i) End for;
4) all clusters CC_i (0≤i≤c) perform steps 5) ~ 11) in parallel:
5) cluster CC_i computes the load of each of its computing nodes for subtask T_i according to their comprehensive computing capabilities CP_j (0≤j≤p, p being the number of computing nodes of CC_i), namely:
a) For j = 1 to p
b) T_ij = (CP_j / ΣCP_k (0≤k≤p)) * |T_i| / M;
c) send subtask T_ij to computing node C_j;
d) on the computing node ring, distribute T_ij by the m-redundancy allocation strategy onto the m computing nodes C_{j+1}, C_{j+2}, ..., C_{j+m} following C_j;
e) End for; /* data distribution on the computing nodes ends */
6) the Master of cluster CC_i constructs the local task distribution information queue Q_TDTi of this subtask and sends Q_TDTi to the Master of DG; the Master of DG constructs the global task distribution information queue Q_TDT;
7) the Master of CC_i starts all the computing nodes under its jurisdiction to finish this calculation task, repeating steps 8) 9) 10);
8) the Master of CC_i monitors the completion of the tasks of Q_TDTi of this cluster;
accepts from its successor cluster the completion status of the redundant computation of Q_TDTi;
forwards the intermediate results this cluster has computed redundantly for its predecessor to its predecessor cluster;
forwards the intermediate results of Q_TDTi of this cluster to the Master of DG;
9) if
((all results of this Q_TDTi have been obtained, by the cluster itself or by its successor)
or
(the finish command of the Master of DG has been received) /* the Master of DG has obtained all intermediate results through a redundant cluster */
)
then finish the computation of this subtask and go to step 11);
10) if the Master of CC_i learns from DGSS that some computing node has failed,
then { mark the node as failed;
compute the failed task amount according to the m-redundancy allocation strategy and put it into the failure task queue;
at the same time send a failure message to the Master of DG;
}
when a computing node has finished its own calculation task, it fetches the corresponding tasks from the failure task queue and continues executing, until the failure queue is empty;
11) accept the next task distribution from the Master of DG; /* this round of grid parallel computation ends */
Count++;
12) the Master of DG gathers the intermediate results of this round according to the global Q_TDT, revises the termination condition of the algorithm, and converts the intermediate results into the new global calculation task DPC;
13) End while;
14) output the calculation result and notify all computing nodes to finish this calculation.
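Purely for illustration, the following Python sketch shows how the Master of DG could split |DPC| over the clusters in proportion to their overall capabilities CCP_i, as in step 3) f); the function and variable names are assumptions of this sketch.

```python
def split_task_over_clusters(total_dpc, M, cluster_capabilities):
    """Compute T_i = (CCP_i / sum_j CCP_j) * |DPC| / M for every cluster,
    i.e. the number of basic task units assigned to each cluster."""
    total_ccp = sum(cluster_capabilities)
    return [ccp_i / total_ccp * total_dpc / M for ccp_i in cluster_capabilities]

# Usage: |DPC| = 5000 basic components, M = 1, three clusters
amounts = split_task_over_clusters(5000, 1, [30.0, 20.0, 10.0])  # -> [2500.0, 1666.66..., 833.33...]
```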
The cluster logic ring of step 3) b) is constructed as follows:
given a class of DPC on DG and its resource requirement weight vector W = (w1, w2, w3), for any computer cluster CC_i ∈ CCS of DG, the comprehensive computing capability of CC_i is the sum of the comprehensive computing capabilities of all its computing nodes, denoted CCP_i (0≤i≤c);
all clusters of DG are numbered in descending order of CCP_i (0≤i≤c);
a cluster logic ring is formed according to this numbering, the logical successor of the highest-numbered cluster being the cluster numbered 1.
The computing node logic ring of step 3) c) is constructed as follows:
given a class of DPC on DG and its resource requirement weight vector W = (w1, w2, w3), for any computer cluster CC_i ∈ CCS of DG with p computing nodes, the comprehensive computing capability of each computing node calculated from the weight vector W is CP_j (0≤j≤p);
within CC_i all computing nodes are numbered in descending order of CP_j (0≤j≤p);
a computing node logic ring is formed according to this numbering, the logical successor of the highest-numbered computing node being the node numbered 1.
The dynamic two-dimensional address computed in step 3) d) is defined as a two-tuple (r, o): from the constructed cluster logic ring and computing node logic rings every computing node on DG obtains a two-tuple (r, o) address, where r is the cluster ring number of the cluster the node belongs to and o is the logical number of the node within the computing node ring of cluster r; since the cluster ring and the computing node rings of DG change dynamically during parallel computation, (r, o) is called the dynamic two-dimensional address of the node.
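As a rough illustration (names assumed, not from the patent), the sketch below builds the cluster ring and the per-cluster node rings by sorting on comprehensive computing capability and then derives the dynamic two-dimensional address (r, o) of each node.

```python
def build_rings(clusters):
    """clusters: {cluster_name: {node_name: CP_j}}.
    Returns the cluster ring (ordered cluster names), the node ring of every
    cluster, and the dynamic two-dimensional address (r, o) of every node."""
    # Cluster ring: descending total capability; ring numbers start at 1.
    cluster_ring = sorted(clusters, key=lambda c: sum(clusters[c].values()), reverse=True)
    node_rings, addresses = {}, {}
    for r, cname in enumerate(cluster_ring, start=1):
        ring = sorted(clusters[cname], key=clusters[cname].get, reverse=True)
        node_rings[cname] = ring
        for o, node in enumerate(ring, start=1):
            addresses[node] = (r, o)           # dynamic two-dimensional address
    return cluster_ring, node_rings, addresses

# Usage with two small clusters
ring, nrings, addr = build_rings({"CC1": {"a": 3.0, "b": 2.0}, "CC2": {"c": 5.0, "d": 1.0}})
```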
The task distribution information queue of step 6): the TDTs of all basic task units form a task distribution information queue Q_TDT; DG has one global Q_TDT and each cluster has a local Q_TDTi.
The failure task queue of step 10): each computer cluster of DG constructs a queue storing the task information of failed local computing nodes; its format is the same as that of the task distribution information queue.
The large-scale data parallel computing system under a grid environment provided by the invention offers a feasible supporting system and implementation method for large-scale data parallel computing over the Internet. The present invention uses the idle resources of existing computer networks and computing nodes for large-scale parallel computation; the computational resources and software systems may be heterogeneous, and the network interconnection may use any technology. The parallel granularity is tuned to the actual conditions of the network by adjusting the size of the basic task block. By dynamically evaluating a capability function for every computational resource, dynamically constructing the cluster ring and the computing node rings, and then balancing the load according to the m-redundancy strategy, the data parallel algorithms supported by this system obtain good speedup, dynamic load balancing and effective fault tolerance.
The effective fault tolerance of the described data parallel algorithm based on the dynamic redundancy mechanism is shown in three steps. First, within a single cluster. Without loss of generality it suffices to prove the average fault tolerance of the algorithm; suppose every computing node of the cluster has the same comprehensive computing capability, so each node is assigned the same primary task amount T, and each of the m logical successors of a node redundantly stores T/m of that task. Let q be the failure probability of each computing node, and consider only the failure distribution among the m+1 neighbouring computing nodes and the resulting amount of data in the failure queue:
when 1 node fails, the failed data is 0;
when 2 nodes fail, the failed data is (T/m)q^2; when 3 nodes fail, the failed data is (T/m)q^3;
when k nodes fail, the failed data is (T/m)q^k;
……
so the average amount of failed data produced over the m failure cases is:
T_ave = ((T/m)(q^2 + 2q^3 + 3q^4 + … + (m-1)q^m)) / m
      = (T/m^2)(q^2 + 2q^3 + 3q^4 + … + (m-1)q^m)
      = T/(m^2(1-q)^2) - (T/m)(q^(m+1)/(1-q))        ………(1)
In formula (1), when q tends to 0, T_ave = T/m^2; when q tends to 0.5, T_ave ≈ 1.33(T/m^2).
If the computing node ring of a cluster can be divided into h segments of length m+1, the average failure queue length of the cluster is h·T_ave; hence the length of the fault-tolerant queue can be effectively controlled by suitably adjusting the length of the computing node logic ring. The redundancy scheme within a single cluster is therefore effective.
Second, since the 1-redundancy strategy is adopted between clusters, it is in fact a mirroring policy, and the mirroring redundancy scheme is effective.
Third, the value m of the m-redundancy strategy can be chosen from the calculation-to-communication ratio: the optimal amount of redundant information is the one at which the computing capability of a node just balances its network communication capability.
In summary, the described data parallel algorithm based on the dynamic redundancy mechanism provides an effective fault tolerance mechanism.
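For readers who want to experiment with formula (1), the following sketch (an illustration added in this description, not part of the algorithm) evaluates the series form of T_ave numerically for given T, m and q.

```python
def average_failed_data(T, m, q):
    """Series form of formula (1): T_ave = (T/m^2) * sum_{k=2..m} (k-1) * q^k."""
    return (T / m**2) * sum((k - 1) * q**k for k in range(2, m + 1))

# Usage: T = 100 basic units, 4-redundancy, node failure probability q = 0.1
print(average_failed_data(100.0, 4, 0.1))
```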
Proof that the described data parallel algorithm is dynamically load balanced:
in step 3) of the algorithm, at the start of every parallel round the cluster logic ring and the computing node logic rings of DG are constructed from the current computing capabilities of every cluster and every computing node.
Given the two logic rings and the description of step 3), the task load of each cluster is divided according to capability; likewise, by step 5) the load distribution within each cluster is also divided according to capability. Hence the load is balanced in every parallel round of the algorithm.
Moreover, because the two logic rings are reconstructed in real time at every parallel round, this load distribution is dynamic.
The described data parallel algorithm is therefore dynamically load balanced.
The large-scale data parallel computing system of the present invention, owing to the above advantages, is well suited to solving large-scale computing problems in grid environments built from ordinary computer clusters, and provides a valuable system implementation technique and method for performing high-performance computing with existing idle computational resources. The present invention provides a large-scale parallel computing system and method, oriented to data parallel computation and based on a dynamic redundancy mechanism, under a data grid environment composed of multiple computer clusters. Theoretical analysis and practice show that this system and method possess good dynamic load balancing, fault tolerance and speedup characteristics and can effectively support large-scale data parallel computing.
Description of drawings
Fig. 1 is a schematic diagram of the dynamic grid DG constituted by the two logic rings of the present invention;
Fig. 2 is a schematic diagram of the m-redundancy allocation strategy of the present invention;
Fig. 3 is a schematic diagram of the state transitions of a computing node under the multi-Agent model operating mechanism.
Embodiment
The embodiments of the invention are described in further detail below with reference to the drawings; the embodiments do not limit the present invention, and every analogous structure, method and similar variation adopting the present invention falls within its scope of protection.
In order to construct a grid environment that supports large-scale data parallel computing, ordinary computing networks and computer resources are used to form a multi-trust-level computing system. To describe the implementation of the DGCS system effectively, this specification makes the following definitions:
Definition 1, computer cluster: a computer cluster (Computer Cluster) is a two-tuple CC(Master, CS), where Master is the master controller of CC and CS = {C_1, C_2, ..., C_p} is the set of all computing nodes of CC.
Definition 2, data grid: a data grid (Data Grid) is a four-tuple DG(Master, CCS, N, R), where Master is the master controller of DG; CCS = {CC_1, CC_2, ..., CC_c} is the set of computer clusters; N = {N_1, N_2, ..., N_n} is the set of connecting networks, a connecting network being a high-speed switched network; and R is the connection rule. Every computing node of a DG has its own processor and external storage.
Definition 3, data parallel computation on DG: the process of data parallel computation on DG(Master, CCS, N, R) is as follows:
(1) scale the calculation task and decompose it into subtasks;
(2) start the computing nodes of all clusters in CCS;
(3) compute Finish = the termination condition of the task;
(4) i = 1;
(5) While (Finish does not hold) do
(6) decompose the global data Data into D_1, D_2, ..., D_p;
(7) send D_k to C_k (1≤k≤p) in parallel;
(8) drive all C_k (1≤k≤p) to solve subtask i simultaneously;
(9) synchronize the solving of subtask i by all C_k (1≤k≤p), and exchange local data to form the new global data New_Data;
(10) Data = New_Data;
(11) i = i + 1;
(12) End while;
(13) synthesize the calculation result;
(14) notify all C_k (1≤k≤p) to finish computing;
(15) finish this calculation.
Definition 4, resource requirement weight vector: every class of data parallel computation DPC (Data Parallel Computing) on DG has different demands on the computing (CPU) performance, storage (RAM) capacity and I/O (disk) speed of a computing node; by analysing each class of DPC, demand weights can be given for these three kinds of resources and expressed as a vector W = (w1, w2, w3), called the resource requirement weight vector of that class of DPC.
Definition 5, calculation-to-communication ratio: given a class of DPC on DG and its resource requirement weight vector W = (w1, w2, w3), for any computer cluster CC_i ∈ CCS of DG with p computing nodes, compute the comprehensive computing capability CP_j (0≤j≤p) of each computing node from the weight vector W; if the network bandwidth of CC_i is B, then the calculation-to-communication ratio of cluster CC_i for this DPC is defined as R = ΣCP_j (0≤j≤p) / (pB), i.e. the ratio of the average comprehensive processing capability of the computing nodes of cluster CC_i to the network bandwidth of the cluster.
Since the communication bandwidth of ordinary networks is currently of the order of 100 M or 1000 M while the comprehensive processing capability of computing nodes keeps improving, the value of R can exceed 1; the unit of computing-node capability can be set according to the demands of the DPC, for example every hundred megahertz of processor frequency as one CPU unit, every megabyte of memory as one storage unit, and the I/O speed per millisecond as the speed unit of the hard disk.
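To make the definition concrete, here is a small illustrative sketch (all numbers and names are hypothetical, and the form of the capability function is an assumption, since the patent leaves it open) that computes node capabilities from W and then the cluster's calculation-to-communication ratio R.

```python
def comprehensive_capability(cpu, ram, io, W):
    """CP_j as a W-weighted combination of CPU, memory and disk I/O capability."""
    w1, w2, w3 = W
    return w1 * cpu + w2 * ram + w3 * io

def calc_comm_ratio(node_caps, bandwidth):
    """R = sum_j CP_j / (p * B) for a cluster with p nodes and bandwidth B."""
    return sum(node_caps) / (len(node_caps) * bandwidth)

# Usage: three nodes, W favouring CPU, bandwidth expressed as 1.0 unit
W = (0.6, 0.3, 0.1)
caps = [comprehensive_capability(28, 2.56, 5.4, W),   # e.g. 2.8 GHz, 256 MB, 5400 rpm
        comprehensive_capability(24, 2.56, 7.2, W),
        comprehensive_capability(20, 2.56, 5.4, W)]
print(calc_comm_ratio(caps, bandwidth=1.0))
```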
Within the grid parallel architecture:
Definition 6, computing node logic ring: given a class of DPC on DG and its resource requirement weight vector W = (w1, w2, w3), for any computer cluster CC_i ∈ CCS of DG with p computing nodes, the comprehensive computing capability of each computing node calculated from the weight vector W is CP_j (0≤j≤p); within CC_i all computing nodes are numbered in descending order of CP_j (0≤j≤p) and connected in numeric order into a logic ring, the logical successor of the highest-numbered node being the node numbered 1; this ring is called the computing node logic ring.
From this definition it follows that, except between node 1 and the highest-numbered node, the closer two computing nodes are on the computing node logic ring, the closer their comprehensive computing capabilities are.
Definition 7, cluster logic ring: given a class of DPC on DG and its resource requirement weight vector W = (w1, w2, w3), for any computer cluster CC_i ∈ CCS of DG, the comprehensive computing capability of CC_i is the sum of the comprehensive computing capabilities of all its computing nodes, denoted CCP_i (0≤i≤c); all clusters of DG are numbered in descending order of CCP_i and connected into a logic ring, the logical successor of the highest-numbered cluster being the cluster numbered 1; this ring is called the cluster logic ring.
Likewise, except between cluster 1 and the highest-numbered cluster, the closer two clusters are on the cluster logic ring, the closer their comprehensive computing capabilities are.
As shown in Fig. 1, the cluster logic ring 1 and the computing node logic rings 2 form the dynamic DG as a double ring.
Definition 8, dynamic two-dimensional address of a computing node: from the two logic rings constructed according to Definitions 6 and 7, every computing node on DG obtains a two-tuple (r, o) address, where r is the cluster ring number of the cluster the node belongs to and o is the logical number of the node within the computing node ring of cluster r; since the cluster ring and the computing node rings of DG change dynamically during parallel computation, (r, o) is called the dynamic two-dimensional address of the node.
Definition 9, basic task unit: given a class of DPC on DG and its resource requirement weight vector W = (w1, w2, w3), the DPC is decomposed on DG, according to its characteristics, into several task blocks of size M; this specification calls M the basic task unit of the DPC.
Definition 10, m-redundancy allocation strategy: on a logic ring (a cluster ring or a computing node ring), let a computing unit (a computing node or a cluster) have logical number k and computing capability CP_k; after allocating the task amount CP_k * M (* denotes multiplication) to this unit, distribute CP_k * M evenly again over the m computing units with logical numbers k+1, k+2, ..., k+m; in this way the task CP_k * M is dispatched by DG to be executed twice; this redundancy strategy is called the m-redundancy allocation strategy, as shown in Fig. 2; in fact, on a computing node ring 3 the task CP_k * M is redundantly distributed only once.
Definition 11, task distribution information table: each basic task unit on DG is given a task distribution information table TDT(Mlink, Slink), where Mlink is the dynamic two-dimensional address list of the computing node to which the task unit is first distributed and Slink is the dynamic two-dimensional address list of the computing nodes to which it is redundantly distributed;
task distribution information queue: the TDTs of all basic task units form a task distribution information queue Q_TDT; DG has one global Q_TDT and each cluster has a local Q_TDTi;
failure task queue: each computer cluster of DG constructs a queue that stores the task information of failed local computing nodes; its format is the same as that of the task distribution information queue.
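A minimal sketch of these data structures is given below for illustration; field and class names are assumptions of this description.

```python
from dataclasses import dataclass, field
from collections import deque

@dataclass
class TDT:
    """Task distribution information table of one basic task unit."""
    mlink: tuple                                  # (r, o) address of the primary computing node
    slink: list = field(default_factory=list)     # (r, o) addresses of redundant nodes
    done: bool = False

# Global and per-cluster task distribution information queues, plus a failure queue
q_tdt = deque()                       # global Q_TDT held by the Master of DG
q_tdt_local = {"CC1": deque()}        # local Q_TDTi of each cluster
failure_queue = deque()               # tasks of failed nodes, same TDT format

q_tdt_local["CC1"].append(TDT(mlink=(1, 1), slink=[(1, 2), (1, 3)]))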
Grid monitoring system DGSS: to support the effective operation of the data parallel large-scale parallel algorithm on a DG of the above structure, a grid management system that monitors DG dynamically and in real time is required. Using a multi-Agent cooperation mechanism to study grid resource discovery, monitoring and dynamic scheduling guarantees the effective operation of the DG system; a description of this system is given at the end of this specification. In this specification the system is named DGSS (DG Supervise System).
In the large-scale data parallel computing system under a grid environment provided by the embodiment of the invention, let the grid DG consist of c computer clusters, the number of computing nodes in each cluster changing dynamically; DPC is a data parallel calculation task on DG, |DPC| denotes its total task amount and W is its computational resource requirement weight vector; Q_TDT is the task distribution information queue of DPC; M is the basic task unit; DGSS is the grid monitoring system.
The large-scale data parallel algorithm is described as follows:
1) Initialization:
a) decompose DPC according to the basic task unit M;
b) compute Finished = the termination condition of DPC;
c) broadcast the auxiliary data of DPC (such as the coefficient matrix) to all computing nodes of DG;
d) Count = 0; /* initialize the parallel round counter */
2) While (Finished does not hold) do
3) [the Master of DG distributes the DPC tasks] /* the loop executes the tasks */
a) obtain the resource state information of DG from DGSS;
b) construct the cluster logic ring of DG;
c) let all clusters construct their own computing node logic rings;
d) compute a dynamic two-dimensional address for every computing node;
e) obtain the overall computing capability CCP_i (0≤i≤c) of every cluster;
f) for every cluster CC_i (0≤i≤c) do:
{
compute the ratio CCP_i / ΣCCP_j (0≤j≤c);
compute the task amount of DPC distributed to cluster CC_i: T_i = (CCP_i / ΣCCP_j (0≤j≤c)) * |DPC| / M;
};
For i = 1 to c
g) transfer task T_i to cluster CC_i;
h) on the cluster ring, distribute T_i by the 1-redundancy allocation strategy onto CC_{i+1}, the logical successor of CC_i;
i) End for;
4) all clusters CC_i (0≤i≤c) perform steps 5) ~ 11) in parallel:
5) cluster CC_i computes the load of each of its computing nodes for subtask T_i according to their comprehensive computing capabilities CP_j (0≤j≤p, p being the number of computing nodes of CC_i), namely:
a) For j = 1 to p
b) T_ij = (CP_j / ΣCP_k (0≤k≤p)) * |T_i| / M;
c) send subtask T_ij to computing node C_j;
d) on the computing node ring, distribute T_ij by the m-redundancy allocation strategy onto the m computing nodes C_{j+1}, C_{j+2}, ..., C_{j+m} following C_j;
e) End for; /* data distribution on the computing nodes ends */
6) the Master of cluster CC_i constructs the local task distribution information queue Q_TDTi of this subtask and sends Q_TDTi to the Master of DG; the Master of DG constructs the global task distribution information queue Q_TDT;
7) the Master of CC_i starts all the computing nodes under its jurisdiction to finish this calculation task, repeating steps 8) 9) 10);
8) the Master of CC_i monitors the completion of the tasks of Q_TDTi of this cluster;
accepts from its successor cluster the completion status of the redundant computation of Q_TDTi;
forwards the intermediate results this cluster has computed redundantly for its predecessor to its predecessor cluster;
forwards the intermediate results of Q_TDTi of this cluster to the Master of DG;
9) if
((all results of this Q_TDTi have been obtained, by the cluster itself or by its successor)
or
(the finish command of the Master of DG has been received) /* the Master of DG has obtained all intermediate results through a redundant cluster */
)
then finish the computation of this subtask and go to step 11);
10) if the Master of CC_i learns from DGSS that some computing node has failed,
then { mark the node as failed;
compute the failed task amount according to the m-redundancy allocation strategy and put it into the failure task queue;
at the same time send a failure message to the Master of DG;
}
when a computing node has finished its own calculation task, it fetches the corresponding tasks from the failure task queue and continues executing, until the failure queue is empty;
11) accept the next task distribution from the Master of DG; /* this round of grid parallel computation ends */
Count++;
12) the Master of DG gathers the intermediate results of this round according to the global Q_TDT, revises the termination condition of the algorithm, and converts the intermediate results into the new global calculation task DPC;
13) End while;
14) output the calculation result and notify all computing nodes to finish this calculation.
Embodiment of the invention: Web Service technology is an effective development environment for heterogeneous environments. The embodiment of the invention was developed using Microsoft .Net and Sun's SunONE technology as the development environment, and effective analysis and testing were carried out.
To verify the validity of the algorithm, the embodiment of the invention decomposes the general iterative solution of a system of linear equations [13] and tests it with respect to speedup and fault tolerance.
The grid DG consists of 8 clusters, each cluster has 6 computing nodes, and the network is a cascade of 8 100M switches.
The embodiment of the invention divides the computing nodes into 3 classes; the configuration of each class is shown in Table 1, and each cluster contains 2 nodes of every class.
Table 1. Configuration of each class of computing node
Node class | Processor | Memory | Hard disk | Network card
CT1 | 2.8 GHz | 256 MB | 5400 rpm | 100 M
CT2 | 2.4 GHz | 256 MB | 7200 rpm | 100 M
CT3 | 2.0 GHz | 256 MB | 5400 rpm | 100 M
The embodiment of the invention solves a system of equations with a 5000 × 5000 coefficient matrix, taking 1000 iterations as one complete calculation. The DPC is the general iterative method, the basic task unit M is one component of the solution vector, and m = 2 is used in the m-redundancy strategy. The test was run with 1 to 8 clusters; the results are shown in Table 2, from which it can be seen that the speedup of the algorithm is nearly linear.
Table 2. Speedup of the algorithm
Number of clusters | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
Response time (s) | 6543 | 3331 | 2577 | 1755 | 1439 | 1148 | 1023 | 917
To test the fault tolerance of the model, the embodiment of the invention repeated the above test 100 times in each of 4 different time periods, under different load conditions of the computational resources, and measured the complete failure rate of the algorithm (the fraction of runs that could not finish normally). The results show that even during the busy morning and afternoon peaks of DG the complete failure rate of the algorithm remains very low, indicating that the fault tolerance of the algorithm is effective. The test results are shown in Table 3.
Table 3. Complete failure rate in different time periods
Time period | 6:00-8:30 | 9:00-11:30 | 1:00-2:30 | 20:00-22:30
Complete failure rate (%) | 3 | 6 | 7 | 1
Because the communication volume of each parallel stage of the iterative method is not large and the data volume is stable, the algorithm shows a good speedup. Testing of the parallel JOIN algorithm [12], whose per-stage communication volume is larger and unstable, has also been carried out and is described in another paper. In the implementation of large-scale parallel algorithms for grid-based data parallel computing, dynamic load balancing and an effective fault tolerance mechanism are essential; the dynamic redundancy strategy, the dynamic resource application strategy of the grid and the logic-ring-based load balancing strategy proposed here solve these problems effectively. This specification also finds in practice that obtaining stable grid computing resources further requires effective environment management mechanisms (such as multi-Master techniques and machine-room management policies) and hardware fault tolerance strategies.
To explain the large-scale data parallel computing system and method under a grid environment of the present invention effectively, this specification describes the grid monitoring system DGSS (DATA GRID SUPERVISE SYSTEM), which uses a multi-Agent cooperation mechanism, as follows:
One, basic definitions:
Definition 1, computer cluster: a computer cluster (Computer Cluster) is a two-tuple CC(Master, CS), where Master is the master controller of CC and CS = {C_1, C_2, ..., C_p} is the set of all computing nodes of CC.
Definition 2, data grid: a data grid (Data Grid) is a four-tuple DG(Master, CCS, N, R), where Master is the master controller of DG; CCS = {CC_1, CC_2, ..., CC_c} is the set of computer clusters; N = {N_1, N_2, ..., N_n} is the set of connecting networks, a connecting network being a high-speed switched network; and R is the connection rule. Every computing node of a DG has its own processor and external storage.
Definition 3, data parallel computation on DG: the process of data parallel computation on DG(Master, CCS, N, R) is as follows:
(1) scale the calculation task and decompose it into subtasks;
(2) start the computing nodes of all clusters in CCS;
(3) n = the number of subtasks;
(4) i = 1;
(5) decompose the data Data into D_1, D_2, ..., D_p;
(6) send D_k to C_k (1≤k≤p);
(7) While i < n do
(8) drive all C_k (1≤k≤p) to solve subtask i simultaneously;
(9) synchronize the solving of subtask i by all C_k (1≤k≤p);
(10) i = i + 1;
(11) End while;
(12) collect the calculation results of all C_k (1≤k≤p) and synthesize them;
(13) notify all C_k (1≤k≤p) to finish computing;
(14) finish this calculation.
Definition 4, multi-Agent cooperation model: the multi-Agent cooperation model MS on DG can be formally defined as a four-tuple MS = (Agents, Tm, Sm, Space), where Agents is the set of all cooperating Agents, Tm is the communication mechanism between Agents, Sm is the service mechanism between Agents, and Space is the space in which all Agents exist.
Definition 5, Agent: an Agent can be described by a tuple Agent = (A_id, A_type, A_area, A_desc, A_BDI, A_prg), where A_id is the unique identifier of the Agent, A_type is its type, A_area is its scope of activity on the grid, A_desc is its description vector, A_BDI is its rule base of beliefs, desires and intentions, and A_prg is its executable code.
According to the demands of grid-based data parallel computing, this specification classifies the Agents in the grid as follows:
Definition 6, resource management agent A_Rm: the resource management agent A_Rm on DG(Master, CCS, N, R) resides on every computing node of CCS and manages the dynamic changes of its resources; its belief is that its computing node is the most capable; its desire is to exploit the computing, storage and communication capability of its node and to raise the importance of its node in the grid as far as possible; its intention is to cooperate and compete with the other Agents in the grid according to the resource situation and state of its node, so as to reach the optimal working state of the computing node under its jurisdiction.
Definition 7, reliability management agent A_a: the reliability management agent A_a on DG resides on every computing node of CCS and monitors the stability of its resources, mainly detecting the working state of the CPU resources, memory resources, network resources and various services of its computing node and revising the reliability parameters of the node according to these states; its belief is that various faults may occur on its computing node; its desire is to discover the computing, storage and communication faults of its node and, contrary to A_Rm, to lower the importance of its node in the grid where appropriate; its intention is to cooperate and compete with the other Agents in the grid according to the resource state information of its node, so as to reach the optimal working state of the computing node under its jurisdiction.
Definition 8, cluster management agent A_Cc: the cluster management agent A_Cc on DG resides on the Master of each cluster in CCS and performs integrated management of the capabilities of all the computing nodes under its jurisdiction, ranking and managing them according to their computing, storage, communication and service capabilities, and is also responsible for coordinating and cooperating with the A_Cc of the other clusters; its belief is that the cluster under its jurisdiction is the most capable; its desire is to win as many computing, storage, communication and service resources as possible for its cluster and to raise its importance in the grid; its intention is to cooperate and compete with the A_Cc of the other clusters according to the resource state information of its cluster, so as to reach the optimal working state of the cluster under its jurisdiction.
Definition 9, user agent A_User: the user agent A_User on DG resides on the Master of DG and acts on behalf of the service requests of users; according to the computing, storage, communication and service requirements of a request it is responsible for coordinating and cooperating with A_Grid.
Definition 10, grid management agent A_Grid: the grid management agent A_Grid on DG resides on the Master of DG and performs integrated management of the capabilities of all the clusters under its jurisdiction, ranking and managing them according to their computing, storage, communication and service capabilities, and is also responsible for coordinating and cooperating with the A_Cc of the clusters; its belief is that the grid under its jurisdiction is the most capable; its desire is to discover as many computing, storage, communication and service resources of the grid as possible and to improve the throughput and efficiency of the grid; its intention is to cooperate with the service agents according to the resource state information of the grid, so as to reach the optimal working state of the grid under its jurisdiction.
Definition 11, grid service agent A_Service: a grid service agent A_Service resides on a computing node of DG and carries a specific computing function used to complete parallel computation, such as the subtask solving of step (8) in Definition 3; different grid service agents mainly differ in the computational problem they solve in parallel; they are the essential part of the external parallel service of the grid and usually form a set of agents; this specification denotes the set of grid service agents on DG by SAS.
Every grid service agent has its own BDI; in general, the belief of an A_Service is that it can offer the best service, its desire is to provide its own service as much as possible, and its intention is to seek the resources on its grid that are most useful to itself, migrate to the best resource node and provide optimized service.
Two, operating mechanism of the multi-Agent model
Because of the way DG is constituted, as the numbers of computer clusters, networks and computing nodes grow, the reliability of the grid becomes extremely important, so an effective fault tolerance mechanism is indispensable. Checkpoint-based fault tolerance plays a key role in traditional system fault tolerance, but it can produce a domino effect and is not suitable for grid-based large-scale computing. This specification solves the problem with a dynamic redundancy mechanism and a multi-Agent cooperation mechanism; fault tolerance is also the main concern of the cooperation model of this specification.
Redundancy-based fault tolerance means that an important grid service subtask is completed jointly by several grid service agents with the same function distributed on different computing nodes; one of them is the primary executor and the others are backup executors; when the primary executor fails, a backup executor takes over, which avoids the degradation of grid performance caused by a single point of failure.
1) States of a grid service agent
A grid service agent on DG has three states: master, backup and dormant. During service, when a grid service agent is the primary executor, this specification calls it a master-state agent; when it is a backup executor, it is called a backup-state agent; if a service agent on a computing node never takes part in the service, it is called a dormant agent.
2) States of a grid computing node
A computing node C_i on DG has four states: master, slave, backup and failed. In a given period, if all non-dormant service agents on C_i are in the master state, the node is called a master-state node; if the non-dormant service agents on C_i include both master-state and backup-state agents, the node is called a slave-state node; if all service agents on C_i are in the backup state, the node is called a backup-state node; if all service agents on C_i are dormant, the node is called a failed node.
3) Grid node queues
For the four states of the computing nodes of the grid, this specification constructs four node queues on the grid DG: the master-state queue Q_Master, the slave-state queue Q_Slave, the backup-state queue Q_Bak and the failure queue Q_Failure; their lengths are denoted LQ_Master, LQ_Slave, LQ_Bak and LQ_Failure respectively.
According to the BDI of the Agents described in Definitions 6 to 11, through cooperation and competition every Agent tries to make the computing node or cluster it serves play as important a role as possible, i.e. to be in the master-state queue of the grid whenever possible; following this driving mechanism, this specification ranks the four queues by priority as follows:
master-state queue > slave-state queue > backup-state queue > failure queue
According to this principle, this specification gives the state transition mechanism of computing nodes; the state transition model of a grid computing node is shown in Fig. 3.
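Just as an illustration of the four node states and the priority ordering of the queues (names invented for this sketch), a compact Python model might look as follows:

```python
from enum import IntEnum
from collections import deque

class NodeState(IntEnum):
    """Node states ordered by queue priority: MASTER > SLAVE > BACKUP > FAILED."""
    MASTER = 3
    SLAVE = 2
    BACKUP = 1
    FAILED = 0

queues = {s: deque() for s in NodeState}   # Q_Master, Q_Slave, Q_Bak, Q_Failure

def move_node(node, old, new):
    """Transfer a node between state queues when an agent applies for a transition."""
    if old is not None and node in queues[old]:
        queues[old].remove(node)
    queues[new].append(node)

move_node("C1", None, NodeState.MASTER)
move_node("C1", NodeState.MASTER, NodeState.SLAVE)   # e.g. demoted by rule 2 or 3
```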
4) Multi-Agent cooperation rules
a) Rules of the resource management agent A_Rm
The performance of a grid computing node C_i is generally measured by the following four quantities:
1. the currently available CPU computing capability P_cpu of C_i;
2. the currently available memory capability P_mem of C_i;
3. the currently available network communication capability P_net of C_i;
4. the currently available disk I/O capability P_I/O of C_i.
The resource management agent A_Rm dynamically monitors the changes of the four parameters P_cpu, P_mem, P_net and P_I/O of computing node C_i and uses them to form the capability function of the node at the current moment:
P_node = f(P_cpu, P_mem, P_net, P_I/O)    …………(1)
If the capability function value of the node at the previous moment is PL_node, the resource management agent A_Rm uses formula (1) to compute the capability function value PC_node of the current moment.
Rule 1, rule of the resource management agent A_Rm: if PC_node > PL_node, then A_Rm applies to A_Cc for a transition of the computing node to a higher-level state.
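The patent leaves the form of f open; the sketch below (a hypothetical weighted sum, with invented names) shows how A_Rm could evaluate P_node and apply Rule 1.

```python
def p_node(p_cpu, p_mem, p_net, p_io, weights=(0.4, 0.2, 0.2, 0.2)):
    """One possible capability function f(P_cpu, P_mem, P_net, P_I/O): a weighted sum."""
    return sum(w * p for w, p in zip(weights, (p_cpu, p_mem, p_net, p_io)))

def rule_1(pl_node, pc_node):
    """Rule 1: if the capability grew, A_Rm applies to A_Cc for promotion."""
    return "apply_for_promotion" if pc_node > pl_node else "no_action"

previous = p_node(0.5, 0.6, 0.9, 0.7)
current = p_node(0.8, 0.6, 0.9, 0.7)       # CPU load dropped, so capability rose
print(rule_1(previous, current))           # -> "apply_for_promotion"
```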
b) Rules of the reliability management agent A_a
The reliability of a grid computing node C_i is generally measured by the following two quantities:
1. the normal-state flag P_mark of the node: its value is 1 when the node is working normally and 0 when the node has failed;
2. the ratio of grid service agents the node completes successfully: P_succ = (number of services the node completes successfully) / (number of services the node accepts). Let l1, l2, l3 be three decimals with 0 < l1 < l2 < l3 < 1; then
Rule 2, rules of the reliability management agent A_a:
if P_mark of computing node C_i is false, then A_Rm notifies A_Cc that C_i has entered the failed state;
if P_succ ∈ (0, l1), then C_i enters the failed state;
if P_succ ∈ (l1, l2), then C_i enters the backup state;
if P_succ ∈ (l2, l3), then C_i enters the slave state;
if P_succ ∈ (l3, 1), then C_i enters the master state.
c) Rules of the cluster management agent A_Cc
The performance of a cluster CC_i is measured as follows:
P_cci = (number of computing nodes of CC_i that are in the master state) / LQ_Master.
P_cci reflects the share of cluster CC_i among the master-state computing nodes of DG; the larger P_cci is, the more important CC_i is.
For the whole DG, P_cc1 + P_cc2 + … + P_ccm = 1.
The A_Cc of CC_i receives the P_node, P_mark and P_succ values periodically sent by the A_Rm and A_a on all computing nodes of cluster CC_i,
and at the same time uses a heartbeat detection technique to detect the liveness of all nodes.
The capability function of cluster CC_i at the current moment is:
PC_cc = g(ΣP_node, ΣP_mark, ΣP_succ)    …………(2)
where Σ denotes aggregation over all computing nodes of cluster CC_i; PC_cc represents the number of computing nodes of the cluster in the master-state queue Q_Master of DG.
If the capability function value of cluster CC_i at the previous moment is PL_cc, the cluster management agent A_Cc uses formula (2) to compute the capability function value PC_cc of the current moment.
Rule 3, rules of the cluster management agent A_Cc:
if P_mark of a computing node C_i of CC_i is false, then C_i is put into the failed state;
if PC_cc > PL_cc or PC_cc < PL_cc, then A_Cc applies to A_Grid for a change of computing node states.
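For illustration only (the patent leaves g open; the aggregation below is an assumption with invented names), a sketch of how A_Cc could evaluate PC_cc and apply Rule 3:

```python
def pc_cc(node_reports):
    """One possible cluster capability function g: count the nodes that look healthy
    enough to be master-state, from the (P_node, P_mark, P_succ) reports of the nodes."""
    return sum(1 for p_node, p_mark, p_succ in node_reports
               if p_mark == 1 and p_succ > 0.9 and p_node > 0.5)

def rule_3(pl_cc, reports):
    """Rule 3: any change of the cluster capability triggers an application to A_Grid."""
    current = pc_cc(reports)
    return ("apply_for_state_change", current) if current != pl_cc else ("no_action", current)

reports = [(0.8, 1, 0.95), (0.6, 1, 0.97), (0.4, 0, 0.50)]   # one node has failed
print(rule_3(pl_cc=3, reports=reports))                       # -> ("apply_for_state_change", 2)
```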
d) Rules of the grid management agent A_Grid
A_Grid receives the P_node, P_mark, P_succ and PC_cc values periodically sent by the A_Cc of all clusters of DG,
and allocates LQ_Master according to the following rule.
Rule 4, allocation of the master-state queue by the grid management agent A_Grid (see the sketch after this rule):
For i = 1 to c
sort all master-state computing nodes of cluster CC_i by the resource P_node they provide, from large to small, into a temporary queue Q_cci;
allocate a counter C_count[i] = 0 for each cluster;
End for;
P_count = LQ_Master;
For i = 1 to c   /* c is the number of clusters of the grid DG */
get PC_cc of cluster CC_i;
If C_count[i] < PC_cc then
take the computing node with the largest P_node from the temporary queue Q_cci of cluster CC_i, add it to the master-state queue Q_Master, and delete it from Q_cci;
C_count[i]++;   /* the master-state node counter of cluster CC_i increases by 1 */
P_count--;      /* one allocated master-state node slot fewer */
End if;
End for;
For i = 1 to c
add the computing nodes remaining in the temporary queue Q_cci of cluster CC_i to the slave-state queue of the grid;
End for;
A_Grid broadcasts the new master-state, slave-state, backup-state and failure information to all clusters and computing nodes.
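The following Python sketch is one interpretation of Rule 4 under the round-robin reading described after it (all names are invented): master-state slots are allocated to clusters in proportion to PC_cc, best nodes first, and the leftovers go to the slave-state queue.

```python
def allocate_master_queue(cluster_master_candidates, pc_cc, lq_master):
    """cluster_master_candidates: {cluster: [(node, P_node), ...]};
    pc_cc: {cluster: PC_cc}. Fill Q_Master round-robin, each cluster contributing
    at most PC_cc nodes, best P_node first; the rest go to the slave-state queue."""
    temp = {c: sorted(nodes, key=lambda x: x[1], reverse=True)
            for c, nodes in cluster_master_candidates.items()}       # Q_cci
    count = {c: 0 for c in temp}                                     # C_count[i]
    q_master, remaining = [], lq_master                              # P_count
    progress = True
    while remaining > 0 and progress:                                # round-robin passes
        progress = False
        for c in temp:
            if remaining > 0 and count[c] < pc_cc[c] and temp[c]:
                q_master.append(temp[c].pop(0)[0])
                count[c] += 1
                remaining -= 1
                progress = True
    q_slave = [n for c in temp for n, _ in temp[c]]                  # leftovers
    return q_master, q_slave

qm, qs = allocate_master_queue({"CC1": [("a", 0.9), ("b", 0.7)], "CC2": [("c", 0.8)]},
                               pc_cc={"CC1": 2, "CC2": 1}, lq_master=2)
```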
In Fig. 3, items 5, 6 and 7 denote respectively the rule of the resource management agent A_Rm, the rule of the reliability management agent A_a and the rule of the cluster management agent A_Cc among the multi-Agent cooperation rules.
This allocation rule distributes the master-state slots in a round-robin manner according to the capability of each cluster at the current moment, so that the number of master-state nodes of each cluster matches its computing power; in this way the actual computing efficiency of every computing node can be exploited while the load remains evenly distributed over the grid, which helps the scalability of the grid.

Claims (8)

1. A large-scale data parallel computing main system under a grid environment, comprising:
a grid monitoring system DGSS, namely a grid management system that uses a multi-agent cooperation mechanism to implement effective dynamic state monitoring of DG;
characterized in that it further comprises a computing system DGCS constituted by a cluster logic ring, wherein:
the cluster logic ring is formed by logically connecting the numbered computer clusters in numbering order, the logical successor of the cluster with the largest number being the cluster numbered 1; the numbering is assigned to all clusters in descending order of the sum of the comprehensive computing capabilities of all computing nodes in each computer cluster;
each computer cluster on the cluster logic ring is composed of a computing node logic ring; the computing node logic ring is formed by logically connecting the numbered computing nodes in numbering order, the logical successor of the computing node with the largest number being the computing node numbered 1; the numbering is assigned to all computing nodes in descending order of the comprehensive computing capability of each computing node, the comprehensive computing capability of each computing node being calculated according to the weight vector W;
the cluster logic ring and the computing node logic rings together constitute the dynamic data computing system; and the grid monitoring system is connected through an ordinary LAN or Intranet to, and monitors, every computer cluster and computing node of the dynamic grid computing system.
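As a rough illustration of the two-level structure of claim 1, the sketch below models clusters and nodes numbered in descending order of comprehensive computing capability, with the successor of the highest-numbered element wrapping back to element 1. All class and function names are hypothetical, and the capability values are assumed to be already computed from the weight vector W.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Node:
        name: str
        capability: float      # comprehensive computing capability derived from W
        number: int = 0        # position on its cluster's node ring (1-based)

    @dataclass
    class Cluster:
        name: str
        nodes: List[Node] = field(default_factory=list)
        number: int = 0        # position on the cluster ring (1-based)

        @property
        def capability(self) -> float:
            # cluster capability = sum of its nodes' comprehensive capabilities
            return sum(n.capability for n in self.nodes)

    def build_rings(clusters: List[Cluster]) -> List[Cluster]:
        # Number nodes and clusters in descending order of capability.
        for c in clusters:
            for i, n in enumerate(sorted(c.nodes, key=lambda x: x.capability, reverse=True), 1):
                n.number = i
        ring = sorted(clusters, key=lambda c: c.capability, reverse=True)
        for i, c in enumerate(ring, 1):
            c.number = i
        return ring

    def successor(ring_len: int, k: int) -> int:
        # Logical successor on a ring numbered 1..ring_len; the successor of the last element is 1.
        return k % ring_len + 1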
2. An m-redundancy allocation strategy for the large-scale data parallel computing main system under a grid environment, whose steps are as follows:
on a computer cluster ring or a computing node ring, take a computing unit, namely a computer cluster or a computing node, whose logical number is k and whose computing power is CP_k;
allocate a task amount of CP_k * M to this computing unit;
characterized in that:
the task amount CP_k * M is additionally distributed evenly onto the m computing units whose logical numbers are k+1, k+2, ..., k+m; in this way the task CP_k * M is distributed by DG and executed twice simultaneously.
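A minimal Python sketch of the m-redundancy allocation of claim 2, under the assumption that the primary amount CP_k * M of unit k is split evenly over its m logical successors on the ring; the names capabilities, primary and redundant are illustrative.

    def m_redundant_allocation(capabilities, m, M=1):
        # Sketch of the m-redundancy allocation on a logic ring of n computing units.
        # capabilities: {k: CP_k} for units numbered 1..n
        # Returns (primary, redundant):
        #   primary[k]   -- the amount CP_k * M assigned to unit k itself
        #   redundant[k] -- extra amount unit k receives as redundant copies
        n = len(capabilities)
        primary = {k: capabilities[k] * M for k in range(1, n + 1)}
        redundant = {k: 0.0 for k in range(1, n + 1)}
        for k in range(1, n + 1):
            share = primary[k] / m                 # CP_k * M split evenly over m successors
            for step in range(1, m + 1):
                succ = (k + step - 1) % n + 1      # ring wrap-around: the successor of n is 1
                redundant[succ] += share
        return primary, redundant

For m = 1 this degenerates to the 1-redundancy distribution used between clusters in step 3) i) of claim 3, where a cluster's whole task amount is mirrored onto its single logical successor.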
3. A large-scale data parallel computing algorithm under a grid environment, whose key DGCS data structures are constituted as follows:
let grid DG consist of c computer clusters, the number of computing nodes in each cluster changing dynamically; DPC is a data parallel computing task on DG, |DPC| denotes its total task amount, and W is its computing resource requirement weight vector; Q_TDT is the task distribution message queue of DPC; M is the basic task unit; DGSS is the grid monitoring system;
characterized in that the steps of the large-scale data parallel algorithm are as follows:
1) Initialization:
a) decompose DPC according to M;
b) compute Finished, the termination condition of DPC;
c) broadcast the auxiliary data of DPC to all computing nodes of DG;
d) set the parallel computation round counter Count = 0;
2) While (Finished does not hold) do
3) Execute the task:
a) obtain the resource state information of DG from DGSS;
b) construct the cluster logic ring of DG;
c) start all clusters to construct their own computing node logic rings;
d) calculate the dynamic two-dimensional address of each computing node;
e) obtain the overall computing power CCP_i (0≤i≤c) of each cluster;
f) for each cluster CC_i (0≤i≤c) do:
{
calculate the ratio CCP_i / ΣCCP_j (0≤j≤c);
calculate the task amount of DPC assigned to cluster CC_i: T_i = (CCP_i / ΣCCP_j (0≤j≤c)) * |DPC| / M;
};
g) For i = 1 to c
h) transmit task T_i to cluster CC_i;
i) on the cluster ring, according to the 1-redundancy allocation strategy, distribute T_i onto CC_{i+1}, the logical successor cluster of CC_i;
j) End for;
4) All clusters CC_i (0≤i≤c) perform steps 5) to 11) concurrently:
5) Cluster CC_i calculates, according to the comprehensive computing capability CP_j of each of its computing nodes (0≤j≤p, where p is the number of computing nodes in CC_i), the share of subtask T_i borne by each computing node, as follows:
a) For j = 1 to p
b) T_ij = (CP_j / ΣCP_k (0≤k≤p)) * |T_i| / M;
c) transmit subtask T_ij to computing node C_j;
d) on the computing node ring, according to the m-redundancy allocation strategy, distribute T_ij onto the m computing nodes C_{j+1}, C_{j+2}, ..., C_{j+m} following C_j;
e) End for;
6) The Master of cluster CC_i constructs the local task distribution message queue Q_TDTi for this subtask and sends Q_TDTi to the Master of DG; the Master of DG constructs the global task distribution message queue Q_TDT;
7) The Master of CC_i starts all computing nodes under its jurisdiction to complete this computing task, repeatedly performing steps 8), 9) and 10);
8) The Master of CC_i monitors the completion status of this cluster's tasks in Q_TDTi;
accepts from its successor cluster the completion status of the redundant computation of Q_TDTi;
transmits to its predecessor cluster this cluster's intermediate results of the redundant computation done for that predecessor;
and transmits the intermediate results of this cluster's Q_TDTi computation to the Master of DG;
9) If
((all results of this Q_TDTi computation have been obtained, either by this cluster itself or by its successor)
or
(the finish command of the Master of DG has been received)),
then finish this subtask computation and go to step 11);
10) If the Master of CC_i learns from DGSS that some computing node has failed,
then { set this node to the failed state;
according to the m-redundancy allocation strategy, calculate the failed task amount and put it into the failure task queue;
and at the same time send a failure message to the Master of DG;
}
when a computing node finishes its own computing task, it fetches a corresponding task from the failure task queue and continues executing, until the failure queue is empty;
11) Accept the next task distribution from the Master of DG;   /* this round of grid parallel computation finishes */ Count++;
12) The Master of DG gathers the intermediate results of this round according to the global Q_TDT; revises the algorithm termination condition; and converts the intermediate results into a new global computing task DPC;
13) End while;
14) Output the computation result and notify all computing nodes that this computation is finished.
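Steps 3) f) and 5) b) of claim 3 both reduce to a capability-weighted division of a task amount. The short sketch below shows that arithmetic, with a plain capability dictionary and total amount standing in for CCP_i / CP_j and |DPC| (illustrative names, not the patent's identifiers).

    def proportional_split(capabilities, total, M=1):
        # Capability-weighted division of a task amount, as in
        # T_i = (CCP_i / sum_j CCP_j) * |DPC| / M  and the per-node analogue T_ij.
        s = sum(capabilities.values())
        return {k: (c / s) * total / M for k, c in capabilities.items()}

    # Example: three clusters with overall computing power 40, 35 and 25 and |DPC| = 1000
    # basic task units receive 400, 350 and 250 units respectively.
    shares = proportional_split({"CC1": 40, "CC2": 35, "CC3": 25}, total=1000)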
4. The large-scale data parallel computing algorithm according to claim 3, characterized in that the cluster logic ring in step 3) b) is constructed as follows:
given a class of DPC on DG and its resource requirement weight vector W = (w1, w2, w3), for any computer cluster CC_i ∈ CSS of DG, the comprehensive computing capability of CC_i is the sum of the comprehensive computing capabilities of all its computing nodes, denoted CCP_i (0≤i≤c);
in DG, number all clusters in descending order of CCP_i (0≤i≤c);
a cluster logic ring is formed from this numbering, the logical successor of the cluster with the largest number being the cluster numbered 1.
5. The large-scale data parallel computing algorithm according to claim 3, characterized in that the computing node logic ring in step 3) c) is constructed as follows:
given a class of DPC on DG and its resource requirement weight vector W = (w1, w2, w3), for any computer cluster CC_i ∈ CSS of DG whose number of computing nodes is p, the comprehensive computing capability of each computing node is calculated according to the weight vector W as CP_j (0≤j≤p);
in CC_i, number all computing nodes in descending order of CP_j (0≤j≤p);
a computing node logic ring is formed from this numbering, the logical successor of the computing node with the largest number being the node numbered 1.
6. The large-scale data parallel computing algorithm according to claim 3, characterized in that the dynamic two-dimensional address calculated in step 3) d) is defined as a two-tuple address (r, o); each computing node on DG can obtain a two-tuple address (r, o) from the constructed cluster logic ring and computing node logic ring, where r is the cluster-ring number of the cluster in which the computing node resides, and o is the logical number of the computing node in the computing node ring of cluster r.
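A small sketch of the two-tuple (r, o) addressing of claim 6, assuming the cluster ring is given as a list of ring numbers paired with node lists already sorted by descending capability; identifiers are illustrative.

    def two_dimensional_addresses(cluster_ring):
        # Assign each computing node its dynamic two-dimensional address (r, o):
        # r is the ring number of the node's cluster, o the node's number on that
        # cluster's node ring.  cluster_ring is a list of (r, [node_ids]) pairs.
        addresses = {}
        for r, node_ids in cluster_ring:
            for o, node_id in enumerate(node_ids, start=1):
                addresses[node_id] = (r, o)
        return addresses

    # two_dimensional_addresses([(1, ["a", "b"]), (2, ["c"])])
    # -> {"a": (1, 1), "b": (1, 2), "c": (2, 1)}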
7. The large-scale data parallel computing algorithm according to claim 3, characterized in that, for the task distribution message queue in step 6): the TDTs of all elementary-unit tasks constitute a task distribution message queue Q_TDT; DG has one global Q_TDT, and each cluster has one local Q_TDTi.
8. The large-scale data parallel computing algorithm according to claim 3, characterized in that, for the failure task queue in step 10): each computer cluster of DG constructs a queue that stores the task information left behind after a local computing node fails, its form being the same as that of the task distribution message queue.
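A minimal sketch of the failure task queue of claim 8 and of the fail-over behaviour in step 10) of claim 3, under the assumption that queue entries are plain task descriptors that finished nodes pick up; the class and method names are hypothetical.

    from collections import deque

    class FailureTaskQueue:
        # Per-cluster queue storing the task information left behind by failed
        # local nodes; its form mirrors the task distribution message queue.
        def __init__(self):
            self._q = deque()

        def node_failed(self, pending_tasks):
            # On a node failure, its unfinished task units are appended for redistribution.
            self._q.extend(pending_tasks)

        def next_task(self):
            # A node that has finished its own work picks up a task here;
            # returns None once the failure queue is empty.
            return self._q.popleft() if self._q else None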
CNB2005100257308A 2005-05-11 2005-05-11 Large scale data parallel computing main system and method under network environment Expired - Fee Related CN100357930C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100257308A CN100357930C (en) 2005-05-11 2005-05-11 Large scale data parallel computing main system and method under network environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100257308A CN100357930C (en) 2005-05-11 2005-05-11 Large scale data parallel computing main system and method under network environment

Publications (2)

Publication Number Publication Date
CN1687917A true CN1687917A (en) 2005-10-26
CN100357930C CN100357930C (en) 2007-12-26

Family

ID=35305958

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100257308A Expired - Fee Related CN100357930C (en) 2005-05-11 2005-05-11 Large scale data parallel computing main system and method under network environment

Country Status (1)

Country Link
CN (1) CN100357930C (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003280921A (en) * 2002-03-25 2003-10-03 Fujitsu Ltd Parallelism extracting equipment

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101263700B (en) * 2005-10-28 2011-02-02 三菱电机株式会社 A method for assigning addresses to nodes in wireless networks
CN100440891C (en) * 2005-12-26 2008-12-03 北京航空航天大学 Method for balancing gridding load
CN100386986C (en) * 2006-03-10 2008-05-07 清华大学 Hybrid positioning method for data duplicate in data network system
CN101217564B (en) * 2008-01-16 2012-08-22 上海理工大学 A parallel communication system and the corresponding realization method of simple object access protocol
CN102216922A (en) * 2008-10-08 2011-10-12 卡沃有限公司 Cloud computing lifecycle management for n-tier applications
US11418389B2 (en) 2008-10-08 2022-08-16 Jamal Mazhar Application deployment and management in a cloud computing environment
US10938646B2 (en) 2008-10-08 2021-03-02 Jamal Mazhar Multi-tier cloud application deployment and management
US9043751B2 (en) 2008-10-08 2015-05-26 Kaavo, Inc. Methods and devices for managing a cloud computing environment
US10454763B2 (en) 2008-10-08 2019-10-22 Jamal Mazhar Application deployment and management in a cloud computing environment
CN104035819B (en) * 2014-06-27 2017-02-15 清华大学深圳研究生院 Scientific workflow scheduling method and device
CN104035819A (en) * 2014-06-27 2014-09-10 清华大学深圳研究生院 Scientific workflow scheduling method and device
CN104184674B (en) * 2014-08-18 2017-04-05 江南大学 A kind of network analog task load balance method under heterogeneous computing environment
CN104184674A (en) * 2014-08-18 2014-12-03 江南大学 Network simulation task load balancing method in heterogeneous computing environment
CN105574152B (en) * 2015-12-16 2019-03-01 北京邮电大学 A kind of method and system of express statistic frequency
CN105574152A (en) * 2015-12-16 2016-05-11 北京邮电大学 Method and system for rapidly counting frequencies

Also Published As

Publication number Publication date
CN100357930C (en) 2007-12-26

Similar Documents

Publication Publication Date Title
CN1687917A (en) Large scale data parallel computing main system and method under network environment
CN1287282C (en) Method and system for scheduling real-time periodic tasks
CN1776622A (en) Scheduling in a high-performance computing (HPC) system
CN1777107A (en) On-demand instantiation in a high-performance computing (HPC) system
CN1157960C (en) Telecommunication platform system and method
CN1232071C (en) Communication network management
CN1509022A (en) Layer network node and network constituted throuth said nodes, the node and layer network thereof
CN1172239C (en) Method of executing mobile objects and recording medium storing mobile objects
CN1254994C (en) Network topologies
CN1021489C (en) Expert system development support system and expert system
CN1295583C (en) Method and system for realizing real-time operation
CN1669001A (en) Business continuation policy for server consolidation environment
CN1608257A (en) Aggregate system resource analysis including correlation matrix and metric-based analysis
CN101044498A (en) Workflow services architecture
CN1809815A (en) Managing locks and transactions
CN1838600A (en) Sensor network system and data transfer method for sensing data
CN1794729A (en) Data arrangement management method, data arrangement management system, data arrangement management device, and data arrangement management program
CN1734438A (en) Information processing apparatus, information processing method, and program
CN1783086A (en) System and method for query management in a database management system
CN1435043A (en) Method and device for call center operation
CN1760804A (en) Information processor, information processing method, and program
CN1879023A (en) Electric utility storm outage management
CN1684029A (en) Storage system
CN1805349A (en) Sensor network system and data retrieval method and program for sensing data
CN1779660A (en) Methods for duplicating among three units asynchronously

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20071226