Background technology
In distributed computing, and particularly in fields that require large volumes of computation such as power system security and stability analysis, the scale of the power grid keeps growing as the power system develops. Because online systems demand fast response times, computation speed has become the bottleneck that limits online application. Parallel computing is an effective way to perform online analysis of large-scale, complex power grids. Managing a large-scale computer cluster effectively, so that it suits the different computing functions of online analysis and delivers as much of its computing performance as possible while remaining reliable, has become the key to improving computing performance in the power domain.
Document 1, "Distributed parallel computing platform and computing task allocation method thereof" (application number 200810239104.2), provides a distributed parallel computing platform system and a method for allocating its computing tasks. In this system the distributed parallel computing platform receives the computation input files and forms the online and offline task allocation schemes. The method divides the cluster into an offline cluster and an online cluster according to data source: the online cluster computes only online tasks, while the offline cluster can compute both online and offline tasks. Switching the role of a cluster node requires manual configuration. The method uses a client/server architecture in which the cluster nodes uniformly receive computation data and other instructions from the server, and each cluster node filters and processes the computation data independently.
Document 2, "Fault-tolerance method using cluster nodes that back each other up" (application number 02159479.1), proposes a fault-tolerance method in which cluster nodes back each other up. The cluster nodes are connected in a heartbeat ring, communicate with one another and back each other up. The master node assigns a newly added node its position in the cluster and returns the service information that the new node is to take on. The new node starts the service-dependent processes one by one and sets the corresponding service IP; if a start fails, the master node selects another node to start that service. When a node in the cluster finds that an adjacent node is abnormal, it confirms this with the adjacent node, and the master node takes over the failed service. This method mainly solves the node hot-standby problem in cluster management, but it is not suited to the differentiated management of the different computing functions of a power system parallel computing cluster.
Document 3, "A cluster application management system and application management method thereof" (application number 201010286186), proposes a cluster application management system for managing large-scale clusters. The system comprises an execution engine module and a database module. The database module stores the execution result of each application in real time and maintains a monitoring table that records change information for the results of all applications associated with multiple applications. The execution engine module executes each application in the cluster and writes each result into the database module in real time. It also reads the monitoring table of the database module periodically at a set interval and, after each read of the change information, judges from the results read whether the trigger condition of each application is satisfied, triggering the corresponding application when it is. That invention also provides a corresponding cluster application management method. It is stated to reduce the number of database access connections and hence the overhead, to handle various complex logical relationships between applications, and to be easier to manage and operate.
Of the three methods above, Document 1 does not consider the effect on the system of emergencies that arise while the power system distributed computing management platform is running (such as changes in computation scale or abnormal operation of cluster nodes). It divides the cluster only by data source, and the computing function of each cluster node is fixed at initial allocation; during computation the nodes are not adjusted automatically as the computing functions' demand for node resources changes, so computing resources cannot be fully used. Documents 2 and 3 do not consider the parallel scheduling relationships among the different application functions of power system parallel computing, and cannot handle the differences in computation requirements, data sources and computation processes among the power system's computing functions. None of the three methods therefore solves the cluster management problem of the power system distributed computing management platform well, and the computing resources of the cluster nodes cannot be fully utilized.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the limitations of the prior art and, taking into account the computation characteristics and work cycles of the different computing functions and the resulting constantly changing demand for computing resources, to provide an adaptive adjustment method for the hierarchical management of a parallel computing cluster in a power system.
In the present invention the computer cluster is managed hierarchically at the levels "system - working domain - cluster node". A system distinguishes a data source, and a working domain serves computing functions of the system that have different work cycles. The cluster nodes of a system are divided among different working domains. Data processing and result collection are both performed per working domain, and data processing, task scheduling and result return are independent between working domains. Each working domain has its own management node. The cluster nodes of a system are adjusted dynamically among multiple working domains by the following method, whose steps are:
1. Deploy the distributed computing management platform on the cluster nodes to provide node role identification, data communication between nodes, task scheduling and management, data acquisition, result return and similar functions.
2. Deploy the cluster node adaptive adjustment program on the application server to monitor the state of the working domains and computing nodes and to manage the optimized allocation of cluster nodes.
3. When the running state of a working domain or the number of cluster nodes changes, the cluster node adaptive adjustment program on the application server readjusts the allocation of cluster nodes to the working domains according to parameters such as each working domain's reference work cycle, minimum reserved node number and maximum reserved node number, and distributes the adjusted allocation information to the cluster nodes.
4. The distributed computing management platform running on each cluster node switches the node dynamically between working domains according to the modified information on the working domain to which the node belongs.
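The three-level hierarchy described above can be pictured with a minimal data model. This is only a sketch: the class names, field names and the `move_node` helper are hypothetical and are not taken from the platform itself, and the convention that working domain 0 holds unassigned nodes follows the description of the allocation matrix.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    name: str
    domain_id: int = 0          # working domain 0 holds unassigned nodes
    switch_priority: int = 0    # assumed convention: lower value is switched first


@dataclass
class WorkingDomain:
    domain_id: int
    reference_cycle: float      # reference work cycle
    min_nodes: int = 0          # minimum reserved node number
    max_nodes: int = 0          # maximum reserved node number
    manager: str = ""           # name of the domain's own management node
    nodes: list = field(default_factory=list)


@dataclass
class System:
    data_source: str            # a system distinguishes one data source
    domains: dict = field(default_factory=dict)

    def move_node(self, node, target_id):
        """Switch a node to another working domain of the same system."""
        current = self.domains.get(node.domain_id)
        if current is not None and node in current.nodes:
            current.nodes.remove(node)
        node.domain_id = target_id
        self.domains[target_id].nodes.append(node)


# Example: one node switched from working domain 1 to working domain 2.
system = System("online")
system.domains[1] = WorkingDomain(1, 60.0, 1, 4)
system.domains[2] = WorkingDomain(2, 300.0, 0, 4)
node = ClusterNode("node-a")
system.move_node(node, 1)
system.move_node(node, 2)
```

Keeping the working domain as the unit that owns nodes mirrors the statement that data processing, task scheduling and result return are independent between working domains.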
Suppose the number of working domains in normal operation of the system and the number of cluster nodes in normal operation are given, and each working domain i has a set reference work cycle. The current assignment of the cluster nodes to the working domains is represented by an allocation relationship matrix. Its element for the i-th cluster node and the j-th working domain describes the assignment of that node to that domain (cluster nodes in the unassigned state belong to working domain 0 by default): a value of 1 means that the i-th node is switched from another working domain into working domain j; 0 means that the node does not belong to working domain j; -1 means that the i-th node is switched out of working domain j for use by another working domain.
Formula (1) is the objective function of the optimized allocation: subject to the constraints of formulas (4) to (7), it minimizes the number of cluster nodes that have to be adjusted. Formula (2) defines the number of cluster nodes allocated to the i-th working domain. Formula (3) ensures that a cluster node is assigned to only one working domain (when the number of cluster nodes is less than the sum of the maximum reserved node numbers of all working domains). Formula (4) ensures that each working domain is allocated cluster node resources according to its reference work cycle. Formula (5) ensures that, as far as possible, all available cluster nodes are assigned to working domains (if the number of cluster nodes exceeds the sum of the maximum reserved node numbers of all working domains, some nodes remain unassigned). Formula (6) uses the minimum number of cluster nodes to be allocated to the i-th working domain (0 by default) and expresses the minimum computing resource requirement of the working domain in normal operation. Formula (7) uses the maximum number of cluster nodes to be allocated to the i-th working domain (by default the total number of cluster nodes) and expresses the highest computing resource configuration of the working domain in normal operation.
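The role of the allocation matrix can be illustrated with a small sketch. It assumes that each row is a cluster node and each column a working domain (working domain 0 is omitted for brevity), that a domain's node count is the number of 1 entries in its column, and that a node counts as adjusted if its row contains -1. The function names are hypothetical, and the checks paraphrase the one-domain-per-node, minimum and maximum constraints rather than reproduce the original formulas.

```python
def domain_counts(matrix, num_domains):
    """Count nodes per working domain: entries equal to 1 in each column."""
    return [sum(1 for row in matrix if row[j] == 1) for j in range(num_domains)]


def adjusted_nodes(matrix):
    """Number of nodes that switch domain: rows containing a -1 entry."""
    return sum(1 for row in matrix if -1 in row)


def is_feasible(matrix, min_nodes, max_nodes):
    """Check one domain per node and the min/max reserved node bounds."""
    for row in matrix:
        if sum(1 for value in row if value == 1) > 1:
            return False
    counts = domain_counts(matrix, len(min_nodes))
    return all(lo <= c <= hi for c, lo, hi in zip(counts, min_nodes, max_nodes))


# Four nodes, two working domains; the third node leaves domain 1 for domain 2.
matrix = [
    [1, 0],
    [1, 0],
    [-1, 1],
    [0, 1],
]
counts = domain_counts(matrix, 2)
moved = adjusted_nodes(matrix)
```

Under these assumptions the quantity being minimized is `adjusted_nodes(matrix)`, and `is_feasible` is the kind of check a candidate allocation would have to pass.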
The optimized allocation of cluster nodes is solved iteratively. The specific steps are as follows:
(1) Maintain two sets: the set of working domains to be adjusted, which initially contains all running working domains, and the set of working domains whose allocation is complete, which is initially empty. Judge whether the number of cluster nodes in normal operation satisfies the minimum computing resource requirement of the working domains in normal operation, that is, whether formula (8) holds. If it holds, go to (2). If it does not hold, select from the set to be adjusted the working domain d_k with the lowest allocation priority and add it to the completed set, which leaves the new set to be adjusted {d_1, ..., d_{k-1}, d_{k+1}, ..., d_r}; set the number of cluster nodes allocated to d_k to 0, and go to (1).
(2) From the reference work cycle of each working domain in the set to be adjusted, compute with formula (9) the number of cluster nodes that each working domain is expected to receive, and round it down (if the result is zero, use 1). Distribute the surplus nodes one at a time to the working domains in descending order of priority.
(3) If the set of working domains to be adjusted is non-empty, check each working domain in turn, according to formula (7), to see whether the number of nodes pre-allocated to it satisfies the maximum reserved node number constraint. If a working domain does not satisfy it, set the final number of nodes of that working domain to its maximum reserved node number and move the working domain from the set to be adjusted into the completed set. The number of nodes taking part in allocation, less the maximum reserved node number of this working domain, is used as the number of nodes taking part in allocation in the next iteration; go to (2).
(4) If the set of working domains to be adjusted is non-empty, check each working domain in turn, according to formula (6), to see whether the number of nodes pre-allocated to it satisfies the minimum reserved node number constraint. If a working domain does not satisfy it, set the final number of nodes of that working domain to its minimum reserved node number and move the working domain from the set to be adjusted into the completed set. The number of nodes taking part in allocation, less the minimum reserved node number of this working domain, is used as the number of nodes taking part in allocation in the next iteration; go to (2).
(5) If the set of working domains to be adjusted is non-empty and the number of nodes allocated to every working domain in it satisfies both the maximum and the minimum reserved node constraints, move all working domains in the set to be adjusted into the completed set.
(6) By comparing the number of cluster nodes each working domain held in original normal operation with the number of cluster nodes assigned to each working domain in the completed set, pick out the set D_Del of working domains whose node count decreases and the set of working domains whose node count increases. For D_Del, record the number of working domains it contains and the number of nodes each of them must give up; for the node-increasing set, record the number of working domains it contains and the number of nodes each of them must gain. Nodes in the unassigned state are handled as members of working domain 0; working domain 0 belongs to D_Del by default, with its own number of nodes to give up.
(7) Initialize the optimized allocation matrix of cluster nodes to the allocation matrix before adjustment. Treat the nodes whose state is unassigned, that is, the nodes of working domain 0, as pre-allocated nodes, and set the corresponding positions in the allocation matrix to 1.
(8) Sort the nodes of each working domain in D_Del by switching priority from low to high, pick out the required number of lowest-priority nodes, and set the corresponding elements of the allocation matrix for those cluster nodes to -1. Collect all nodes that have an element of -1 in the allocation matrix into the set of nodes to be adjusted and sort them by node priority from low to high. Assign them one by one to the working domains in the node-increasing set (each node is adjusted only once), ensuring that the number of nodes finally added to each of those working domains equals the number it needs to gain. For each node in the set of nodes to be adjusted, set the element at its new working domain in the allocation matrix to 1, which yields the allocation matrix of cluster nodes.
(9) According to the allocation matrix of cluster nodes, modify in the database or text file the working domain number of each node in the set of nodes to be adjusted (for a node marked -1, the working domain is changed to the one whose column element in that node's row of the allocation matrix is 1), and update the node count of each working domain to the adjusted node count.
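Steps (1) to (5) above can be sketched as a short iterative routine. Several points are assumptions rather than statements of the original method: formula (9) is not reproduced in the text, so the share is taken to be proportional to the reference work cycle; priority is modelled as a number where a larger value means a more important working domain; and the function name and tuple layout are hypothetical. The sketch follows the order of the steps as described (maximum check before minimum check) and adds no safeguards beyond them.

```python
def allocate_nodes(total_nodes, domains):
    """Return {domain_id: node_count} for the given working domains.

    domains maps id -> (reference_cycle, min_nodes, max_nodes, priority);
    a larger priority value means a more important domain (assumption).
    Reference cycles are assumed to be positive.
    """
    pending = dict(domains)   # working domains still to be adjusted
    done = {}                 # working domains whose allocation is final

    # Step (1): while the minimum reserved nodes do not fit, give the
    # lowest-priority domain zero nodes and mark it as finished.
    while pending and sum(d[1] for d in pending.values()) > total_nodes:
        lowest = min(pending, key=lambda i: pending[i][3])
        done[lowest] = 0
        del pending[lowest]

    available = total_nodes
    while pending:
        order = sorted(pending, key=lambda i: pending[i][3], reverse=True)
        cycle_sum = sum(d[0] for d in pending.values())
        # Step (2): proportional share rounded down, zero replaced by 1.
        share = {i: max(1, int(available * pending[i][0] / cycle_sum))
                 for i in order}
        spare = available - sum(share.values())
        for i in order:                 # surplus goes to high priority first
            if spare <= 0:
                break
            share[i] += 1
            spare -= 1
        # Steps (3) and (4): fix the first domain that breaks its maximum,
        # otherwise the first that breaks its minimum, then iterate again.
        fixed = next(((i, pending[i][2]) for i in order
                      if share[i] > pending[i][2]), None)
        if fixed is None:
            fixed = next(((i, pending[i][1]) for i in order
                          if share[i] < pending[i][1]), None)
        if fixed is None:               # step (5): every bound is satisfied
            done.update(share)
            break
        domain_id, count = fixed
        done[domain_id] = count
        available -= count
        del pending[domain_id]
    return done


# Ten nodes, three working domains; domain 2 is capped at four nodes.
plan = allocate_nodes(10, {
    1: (60, 1, 4, 3),
    2: (300, 2, 4, 2),
    3: (240, 0, 10, 1),
})
```

In this example domain 2 would receive five nodes by proportion, is clamped to its maximum of four, and the remaining six nodes are shared again between domains 1 and 3, with the one surplus node going to the higher-priority domain 1.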
During computation, the reference work cycle of each working domain can be corrected online according to the actual online computing cycle. Formula (10) is the weighted average of the online work cycle of working domain i, in which P is the current round and the other two terms are the weighted average of the online work cycle over the previous P-1 rounds of working domain i and the computing time of working domain i in the current round. Formula (11) additionally supports a correction coefficient k for the reference work cycle, set according to a ratio preset by the user; its remaining term is the manually set reference work cycle (in the same unit as the online work cycle; if no correction based on the online work cycle is needed, k can be set to 0).
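The online correction can be illustrated with a short sketch. The exact forms of formulas (10) and (11) are not reproduced in the text, so both functions below are assumptions: the average is taken as a running mean over P rounds, and the correction as a linear blend that returns the manual value when k is 0, which matches the stated meaning of k. The function names are hypothetical.

```python
def update_average_cycle(previous_average, current_time, round_number):
    """Running average of the online work cycle after round P (round_number)."""
    return ((round_number - 1) * previous_average + current_time) / round_number


def corrected_reference_cycle(manual_cycle, online_average, k):
    """Blend the manual reference cycle with the online average.

    k = 0 keeps the manually set value; k = 1 uses the online average only.
    """
    return (1 - k) * manual_cycle + k * online_average


# Two earlier rounds averaged 60 s; the third round took 90 s.
average = update_average_cycle(60.0, 90.0, 3)
unchanged = corrected_reference_cycle(100.0, average, 0)
blended = corrected_reference_cycle(100.0, average, 0.5)
```

A running mean needs only the previous average and the round number, so no per-round history has to be stored on the application server.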
In addition, besides allocating the computing node resources of each working domain by reference work cycle according to formula (4), the allocation can also use the improvement in a working domain's computing performance as the objective function, choosing the working domain with the largest improvement in computing performance to receive nodes.
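One way to read this alternative is as a greedy allocation, sketched below. The text does not say how the performance improvement is measured, so the `gain` callable is purely hypothetical, as is the function name; the sketch only shows nodes being handed out one at a time to whichever working domain currently benefits most.

```python
def allocate_by_gain(total_nodes, domain_ids, gain):
    """Give nodes one at a time to the domain with the largest gain.

    gain(domain_id, n) is a hypothetical callable returning the performance
    improvement of giving that domain one more node when it already has n.
    """
    counts = {d: 0 for d in domain_ids}
    for _ in range(total_nodes):
        best = max(domain_ids, key=lambda d: gain(d, counts[d]))
        counts[best] += 1
    return counts


# Illustrative gain with diminishing returns per working domain.
base = {1: 10.0, 2: 6.0}
result = allocate_by_gain(3, [1, 2], lambda d, n: base[d] / (n + 1))
```

Minimum and maximum reserved node numbers are left out here; they could be enforced by skipping domains that have reached their maximum.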
The present invention is based on the idea of multi-level management: cluster nodes are managed in the three-layer structure "system - working domain - cluster node", the node redundancy mechanism is extended from within a single working domain, as is traditional, to between working domains, and the cluster node resources are allocated optimally and automatically according to the running state of the working domains and cluster nodes in the system. This improves system reliability while making much fuller use of cluster node resources, and it effectively avoids the idling of resources held by working domains that are outside their run period.
Embodiment
To make the purpose, technical solution and advantages of the embodiments of the present invention clearer, the invention is described in further detail below with reference to an embodiment and the accompanying drawings. The invention is not, however, limited to the example given.
Fig. 1 shows the system structure of the adaptive adjustment method for hierarchical management of a power system parallel computing cluster, together with the information that must be exchanged. The node and working domain information and the files (both data and results) that must be exchanged between the application server and the cluster nodes, and among the cluster nodes themselves, are all exchanged through network messages.
Fig. 2 shows a schematic of the processing logic of the cluster node optimized allocation module on the application server. Based on the working domain and cluster node information preconfigured by the user and on the associated constraints, this module adjusts the number of cluster nodes allocated to each working domain so as to improve the overall computing performance of the system. The specific steps are as follows:
(1) Step 1 of Fig. 2 describes how the working domains and cluster nodes that take part in adjustment are screened during the iterative computation. At initial start-up, all running working domains and cluster nodes are selected, and the minimum reserved node number of each working domain is used to check whether the cluster nodes currently taking part in adjustment satisfy the minimum reserved node constraints of all working domains. If not, the working domain with the lowest priority is selected and added directly to the set of working domains whose allocation is complete, and the minimum reserved node check is repeated for the remaining working domains and cluster nodes. This finally yields the set of working domains to be adjusted and the set of cluster nodes that take part in the iterative adjustment.
(2) Step 2 of Fig. 2 describes computing, from the reference work cycles, the number of nodes each working domain is expected to receive. Because the computed number for a working domain may not be an integer, the result is rounded down by discarding the fractional part, and the difference between the sum of the rounded results and the actual number of nodes is distributed evenly among the working domains that need more nodes.
(3) Step 3 of Fig. 2 describes checking the expected node numbers against the user-preset maximum and minimum reserved node limits. If a working domain violates its maximum reserved node number (formula (7) is not satisfied) or its minimum reserved node number (formula (6) is not satisfied), the number of nodes allocated to it is set to that maximum or minimum respectively, and the working domain is added to the set of working domains whose allocation is complete. The number of nodes taking part in allocation is reduced by the node number of the working domain whose allocation has just been completed, the iteration flag is set again, and step (1) is repeated. If the expected node numbers of all working domains in the set to be adjusted satisfy both the maximum and the minimum reserved node limits of their working domains, the procedure goes directly to step (4).
(4) Step 4 of Fig. 2 determines the number of nodes by which each working domain must be adjusted, from the difference between its expected node number and the number of nodes it currently holds. It picks out the set of working domains whose node count decreases after adjustment (unassigned nodes belong to working domain 0 by default, and working domain 0 belongs to this set) and the set of working domains whose node count increases. For each working domain in the node-decreasing set, the nodes it holds are sorted by node switching priority, and the required number of lowest-priority cluster nodes are picked out as nodes to be adjusted, forming the set of nodes to be adjusted. All nodes in that set are sorted by switching priority from low to high and assigned one by one to the working domains in the node-increasing set according to the number of nodes each of those domains needs to gain (each node is switched only once). Finally, the working domain recorded for each adjusted node in the local database or file is modified, and the actual number of cluster nodes currently held by each working domain is updated at the same time.
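The node selection in step (4) can be sketched as follows. This is an illustration under assumptions, not the module's actual code: the function and parameter names are hypothetical, a lower priority value is taken to mean "switched first", receiving domains are served in ascending id order, and any released node that no domain needs is assumed to become unassigned (working domain 0).

```python
def plan_switches(current, target, node_priority):
    """Return {node: new_domain} for the nodes that must switch.

    current: node -> current working domain id (0 means unassigned)
    target: working domain id -> node count after adjustment
    node_priority: node -> switching priority (lower is switched first)
    """
    owned = {}
    for node, dom in current.items():
        owned.setdefault(dom, []).append(node)

    to_move = []      # nodes released by domains that shrink
    needs = {}        # domain id -> number of nodes still to receive
    for dom in set(owned) | set(target):
        have = len(owned.get(dom, []))
        want = target.get(dom, 0)
        if have > want:
            nodes = sorted(owned[dom], key=lambda n: node_priority[n])
            to_move.extend(nodes[:have - want])
        elif want > have:
            needs[dom] = want - have

    # Hand out released nodes from low to high priority, each only once.
    to_move.sort(key=lambda n: node_priority[n])
    plan = {}
    for dom in sorted(needs):
        for _ in range(needs[dom]):
            if to_move:
                plan[to_move.pop(0)] = dom
    for node in to_move:
        if current[node] != 0:
            plan[node] = 0    # surplus nodes become unassigned (assumption)
    return plan


# Domain 1 shrinks from three nodes to one, domain 2 grows from one to three.
switches = plan_switches(
    {"a": 1, "b": 1, "c": 1, "d": 2, "e": 0},
    {1: 1, 2: 3},
    {"a": 3, "b": 1, "c": 2, "d": 5, "e": 0},
)
```

The returned mapping corresponds to the records that would be rewritten in the local database or file; nodes absent from it keep their current working domain.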
The specific embodiment described above merely elaborates the purpose, technical solution and beneficial effects of the present invention in further detail and is not intended to limit the scope of protection of the invention. Any modification made within the principle and basis of the invention shall fall within its scope of protection.