CN102724057B - A distributed hierarchical autonomous management method for cloud computing platforms - Google Patents
- Publication number
- CN102724057B (application CN201210042033.3A)
- Authority
- CN
- China
- Prior art keywords
- node
- management
- level
- module
- strategy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention provides a distributed hierarchical autonomous management method for cloud computing platforms. A large-scale cloud computing management system is divided into logical partitions; inside each partition, multi-level Autonomic Elements are built to realize autonomous management; above the partitions, a higher-level Autonomic Element is built to realize system-level management. During autonomous management, the rules corresponding to the current autonomous plan are read from the knowledge base, the analysis module checks whether a rule is satisfied and triggers a response event, and the event is then submitted to the event manager module, which caches it and schedules its execution. The invention applies a divide-and-conquer approach to the design of an autonomous management system for high-performance computers. Using dynamic management based on multiple logical partitions, a large-scale system is partitioned logically according to a given strategy and managed autonomously inside each partition, so that management adapts as the system scale grows. Multi-level Autonomic Elements manage each partition, and a higher-level Autonomic Element above the partitions provides system-level management. The Autonomic Elements at every level are extensible: when a device is added or a characteristic parameter is modified, the system is not shut down, which realizes self-configuration of the system.
Description
Technical field
The present invention relates to the field of cloud computing platform management and, specifically, provides a distributed hierarchical autonomous management method for cloud computing platforms.
Background art
Research on autonomic computing architecture addresses how multiple Autonomic Elements cooperate to reach a common system-level goal, including problem detection, repair, load management, and automatic installation and configuration.
Autonomic computing architecture research focuses on the relationships among multiple Autonomic Elements. Existing work mainly covers three structures: hierarchical, peer-to-peer, and hybrid, the last being a combination of the hierarchical and peer-to-peer structures. In a hierarchical structure, an upper-layer autonomic manager (AM) can send control information (CI) to its lower-layer AMs, and a lower-layer AM sends state information (SI) to its upper-layer AM. The upper-layer AM controls the macroscopic autonomic properties of the system; an AM whose CI out-degree is zero is a bottom-layer autonomic manager and performs micro-level management. An example is a two-layer autonomic computing system optimized using control theory and utility functions. In a peer-to-peer structure, the cooperating AMs have no hierarchical relationship, and control information and state information flow in both directions. The overall autonomic properties of the system typically "emerge" from local interactions among individuals, as in architectures based on self-organization and emergence theory. In such an architecture the AMs are peers and no AM manages global autonomic behavior; the macroscopic autonomic properties of the system arise from local interactions among the AMs. In a hybrid structure, an upper-layer AM can send control information (CI) to its lower-layer AMs, and a lower-layer AM sends state information (SI) to its upper-layer AM. The upper-layer AM controls the macroscopic autonomic properties of the system, while the lower-layer AMs, within the constraints given by the upper-layer AM, realize the macroscopic properties of their own layer through interaction. For example, an autonomic system is divided into two layers. The upper layer is a resource arbiter, which is responsible for global resource allocation and maximizes global utility. The lower layer consists of application managers, each of which adjusts local parameters to maximize local utility for the resources it is given. An application manager converts its local service-level utility function into a resource-level utility function used by the resource arbiter; the resource arbiter computes the system-level utility to obtain a global resource allocation scheme and uses it to adjust the behavior of the lower-layer application managers.
A high-efficiency computer system must be scalable. Scalability covers scale (resources), time (upgrades), performance, and software. The first three concern the high-performance computer itself; software scalability applies not only to the business software running on the high-performance computer system but also to its management system software.
Summary of the invention
To overcome the above shortcomings and make cloud computing platform management scalable, the invention provides a distributed hierarchical autonomous management method for cloud computing platforms.
In the distributed hierarchical autonomous management method for cloud computing platforms:
A large-scale cloud computing management system is divided into logical partitions; inside each partition, multi-level Autonomic Elements are built to realize autonomous management; above the partitions, a higher-level Autonomic Element is built to realize system-level management. During autonomous management, the rules corresponding to the current autonomous plan are read from the knowledge base, the analysis module checks whether a rule is satisfied and triggers a response event, and the event is then submitted to the event manager module, which caches it and schedules its execution.
Preferably, the autonomous plans include a logical partitioning plan, an election plan, and an alarm association plan.
Preferably, each Autonomic Element includes a knowledge base module, a resource monitoring module, an analysis module, an event manager module, a response module, a parallel execution module, and an autonomous plan module;
The knowledge base module interacts with the user through the knowledge base interface and provides customizable autonomous rules;
The resource monitoring module maintains a storage standard for resources and receives resource information from the Autonomic Elements of managed nodes; the resource information is stored in the database in a standard directory format for use by other Autonomic Elements;
The analysis module is used by each autonomous plan to judge whether the information stored in the database satisfies a rule condition in the knowledge base; when it does, the module produces an event to be executed, sets the event priority, and sends the event description to the event manager module;
The event manager module caches the event descriptions deposited by the analysis module and decides, according to a scheduling strategy, whether a cached event can be executed; if it can, the module generates a scheduled event executed by concurrent threads, and the response is executed in those threads to complete the concrete response;
The response module provides a method for registering a response action as a predefined response and manages the mapping table between the two;
The parallel execution module executes the scripts or rules produced by the response module on multiple nodes simultaneously;
The autonomous plan module logically controls the other six modules, giving the management system its autonomic properties.
More preferably, the autonomous rules include conditional rules and formula rules;
A conditional rule consists of a predefined condition and a predefined response;
The predefined condition is an expression, or a logical combination of expressions, over resource attributes; the resource attributes are stored in a resource attribute catalog managed by the resource monitoring module;
The predefined response is an action registered through the response module and triggered when the condition is satisfied.
More preferably, the knowledge base module can be extended online to realize passive learning; rules can also be deployed, updated, and deleted dynamically, so that system behavior is changed and improved without modifying software code or halting the system.
Preferably, the Autonomic Elements are divided into a parameter level, a component level, a node level, a partition level, and a system level. The smallest Autonomic Elements are defined at the parameter level; the component level is built on the parameter level, the node level on the component level, the partition level on the node level, and the system level on the partition level.
More preferably, a node-level Autonomic Element is deployed on each server node and is responsible for managing the node and all components in it; a partition-level Autonomic Element is deployed on the management node of each logical partition and is responsible for partition management; a system-level Autonomic Element is deployed at the top level. The Autonomic Elements communicate with one another by means of an extended CIM description standard. The Autonomic Element on a managed node collects the state information of each resource on the node, sends it to its superior Autonomic Element, and executes the commands issued by the superior. A base-level Autonomic Element has neither a global view nor a knowledge base of its own; the upper-level Autonomic Element alone judges whether a predefined condition is satisfied and executes the corresponding response.
More preferably, when the logical partitions are divided, the number of nodes in each partition is calculated according to a formula (the formula is not reproduced in this text), in which R1, R2, R3, R4, and R5 are the weights of the load values of the different resources and ΣRi = 1. During adaptive adjustment, a new coefficient Rnew_i is calculated (this formula is likewise not reproduced in this text), where Li denotes the current load value of the resource. If Rnew_i differs from the old coefficient Ri by more than a preset threshold, the newly calculated coefficient replaces the old one.
More preferably, the election plan proceeds as follows:
A fully connected topology is established among all nodes;
A priority is set for each node;
The node with the highest priority is elected leader node, and the result is broadcast to the other nodes;
If another node detects that the leader node has failed, an election is triggered;
A new leader node is elected according to priority and broadcast again.
More preferably, the alarm association plan uses time and space compression to exclude invalid alarms.
Preferably, the election uses a cloud-computing-oriented election algorithm, which adopts the distributed network management mechanism based on groups of agent nodes that is common in large-scale distributed network management. Let the number of managed nodes in a logical partition be n. Each node has a node agent, and each agent has a globally unique identifier that is known a priori to the agents of the other nodes in the partition. Any two agents in the partition can send messages to each other, that is, the topology is fully connected. The set of agents in the partition can be written {ID_0, ID_1, ID_2, ..., ID_(n-1)}. In each logical partition, one leader agent is set up to manage the agent nodes in the partition. The leader node and the agent nodes cooperate in a centralized management mode: the leader node instructs an agent node to perform a specific operation or provide specific information, and the agent node returns the operation result or the requested information. The leader nodes then complete management tasks among themselves according to a distributed cooperation mode.
Preferably, the cloud computing management system adopts a unified monitoring management strategy, whose content is as follows:
Strategy classification: the global monitoring management strategy is divided into several categories, including switch, disk array, operating system, tape library, database, and hardware information;
Strategy abstraction: each level of Autonomic Element abstracts, from the monitoring management strategies of products of the same type from different vendors, a unified monitoring management strategy format for that product type;
Strategy description: on the basis of the above classification, each level of Autonomic Element describes the monitoring management strategies of all categories in a unified way;
Strategy combination: monitoring management strategies are divided into direct strategies and indirect strategies. A direct strategy can be converted and applied directly to a concrete device or application, whereas an indirect strategy is a combination of a group of direct or indirect strategies;
Strategy configuration: monitoring management strategy processing modules are implemented to convert unified strategies into device-specific strategies, and device monitoring management driver and agent modules are implemented to distribute the device-specific strategies to devices or applications.
The invention applies a divide-and-conquer approach to the design of an autonomous management system for high-performance computers. Using dynamic management based on multiple logical partitions, a large-scale system is partitioned logically according to a given strategy and managed autonomously inside each partition, so that management adapts as the system scale grows. Inside each partition, multi-level Autonomic Elements are built to perform management; above the partitions, a higher-level Autonomic Element is built to realize system-level management. The Autonomic Elements at every level are extensible: when a device is added or a characteristic parameter is modified, the system is not shut down, which realizes self-configuration of the system.
Brief description of the drawings
Fig. 1 shows the framework of the autonomous management system of the invention
Fig. 2 shows the logical partitioning of the invention based on hierarchical Autonomic Elements
Fig. 3 shows the quantitative relationship between failed agents and transmitted messages
Fig. 4 shows the global unified monitoring management strategy
Detailed description
The framework of the distributed hierarchical autonomous management system is shown in Fig. 1.
The function of each component is as follows:
(1) Knowledge base: it is designed to provide customizable autonomous rules through interaction with the user. Through the knowledge base interface, the user can query, modify, delete, and add rules. There are two kinds of rules: conditional rules and formula rules. Each rule must carry descriptive information indicating which autonomous plan it belongs to.
A conditional rule has two parts: a predefined condition and a predefined response. A predefined condition is a simple expression over resource attributes or a logical combination of such expressions, for example cpu[temperature] > 80 °C. The resource attribute catalog is managed by the resource monitoring module. A predefined response is the action triggered when the condition is satisfied, for example no longer forwarding requests to the node. The response module is responsible for registering an action as a predefined response and for maintaining the response mapping table. A predefined condition and a predefined response are associated, either by user selection or in a predefined manner, to generate a conditional rule.
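The structure of a conditional rule described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names `Condition`, `ResponseRegistry`, and `ConditionalRule`, the plan label, and the response name `stop_forwarding` are all hypothetical, and only single-expression conditions are covered.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict

# Comparison operators allowed in a predefined condition (assumed set).
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq}


@dataclass(frozen=True)
class Condition:
    """A simple expression over one resource attribute, e.g. cpu[temperature] > 80."""
    resource: str
    attribute: str
    op: str
    threshold: float

    def holds(self, attrs: Dict[str, Dict[str, float]]) -> bool:
        return OPS[self.op](attrs[self.resource][self.attribute], self.threshold)


class ResponseRegistry:
    """Mapping table from a predefined response name to the registered action."""

    def __init__(self) -> None:
        self._table: Dict[str, Callable] = {}

    def register(self, name: str, action: Callable) -> None:
        self._table[name] = action

    def lookup(self, name: str) -> Callable:
        return self._table[name]


@dataclass(frozen=True)
class ConditionalRule:
    plan: str            # which autonomous plan the rule belongs to
    condition: Condition
    response: str        # name of a predefined response


def evaluate(rule: ConditionalRule, attrs, registry: ResponseRegistry):
    """Run the registered action if the condition holds; otherwise return None."""
    if rule.condition.holds(attrs):
        return registry.lookup(rule.response)(attrs)
    return None


# Hypothetical example: stop forwarding requests when the CPU is too hot.
registry = ResponseRegistry()
registry.register("stop_forwarding", lambda attrs: "stop forwarding requests to node")
rule = ConditionalRule("alarm_association",
                       Condition("cpu", "temperature", ">", 80.0),
                       "stop_forwarding")
```

Keeping the response behind a name rather than inside the rule mirrors the text: the rule only references a predefined response, and the mapping table decides which action that name currently resolves to.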
(2) Resource monitoring: this module maintains a storage standard for resources, that is, a resource directory service. It receives resource information from the Autonomic Elements of managed nodes and stores it in the database in a standard directory format (such as the CIM standard) for use by other modules.
(3) Analysis: each autonomous plan uses its own analysis module to judge whether the information stored in the database satisfies a rule condition in the knowledge base. When it does, the module produces an event to be executed, sets the event priority, and sends the event to the event manager module. Because different autonomous plans use different rule formats, the corresponding analysis procedures also differ; for example, judging whether CPU utilization has reached a threshold and judging whether a node's workload is too heavy cannot be handled by one unified procedure.
(4) Event management: this module caches the event descriptions deposited by the analysis module and decides, according to a scheduling strategy (such as priority), whether a cached event can be executed. It generates scheduled events executed by concurrent threads; the responses are executed in these threads to complete the concrete responses.
(5) Response: this module provides a method for registering a response action as a predefined response and manages the mapping table between the two. It also provides methods for adding, deleting, and modifying predefined responses. A thread started by event management calls the response module and passes in the predefined response as a parameter. The response module looks up the action corresponding to the predefined response in the mapping table, which may be a script or another group of rules, and executes that script or those rules.
(6) Parallel execution: executes the scripts or rules produced by the response module on multiple nodes simultaneously.
(7) Autonomous plan: this module logically controls the above six components, giving the management system its autonomic properties.
The general steps of the autonomous management system are as follows: the rules corresponding to the current plan are read from the knowledge base, the analysis module checks whether a rule is satisfied, and a response event is triggered. These events are submitted to the event manager module, which caches them and schedules their execution.
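The general flow above (analysis produces prioritized events, the event manager caches them and dispatches them to concurrent threads) can be sketched as follows. This is an illustrative sketch under assumptions: the names `EventManager` and `analyze`, the rule dictionary layout, and the convention that a lower number means a higher priority are not taken from the patent.

```python
import heapq
import itertools
from concurrent.futures import ThreadPoolExecutor


class EventManager:
    """Caches events in a priority queue and dispatches them to worker threads."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def submit(self, priority, response, target):
        # Assumption: a lower number means a higher priority.
        heapq.heappush(self._heap, (priority, next(self._counter), response, target))

    def dispatch(self, responses, max_workers=4):
        """Start cached events in priority order; return results in that order."""
        ordered = []
        while self._heap:
            _, _, response, target = heapq.heappop(self._heap)
            ordered.append((response, target))
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [pool.submit(responses[name], target) for name, target in ordered]
            return [f.result() for f in futures]


def analyze(rules, status, manager):
    """Check every rule against every node's status and submit matching events."""
    for rule in rules:
        for node, attrs in status.items():
            if rule["condition"](attrs):
                manager.submit(rule["priority"], rule["response"], node)
```

A possible use, with hypothetical rules and node names:

```python
rules = [
    {"condition": lambda a: a["cpu_util"] > 0.9, "priority": 1, "response": "throttle"},
    {"condition": lambda a: a["temperature"] > 80, "priority": 0, "response": "isolate"},
]
status = {"node1": {"cpu_util": 0.95, "temperature": 70},
          "node2": {"cpu_util": 0.50, "temperature": 85}}
responses = {"throttle": lambda n: f"throttle {n}", "isolate": lambda n: f"isolate {n}"}

manager = EventManager()
analyze(rules, status, manager)
results = manager.dispatch(responses)
```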
The autonomous management system includes multiple levels of Autonomic Elements. From a functional point of view, the Autonomic Elements are divided into a parameter level, a component level, a node level, a partition level, and a system level. The smallest Autonomic Elements are defined at the parameter level and serve as the basis for building the elements one level up; each level is built on the level below it, and so on, until the top-level, system-wide Autonomic Element is constructed. On each server node, a node-level Autonomic Element is deployed and is responsible for managing the node and all components in it; on the management node of each logical partition, a partition-level Autonomic Element is deployed and is responsible for partition management; at the top level, a system-level Autonomic Element is deployed. The Autonomic Elements communicate with one another by suitably extending the CIM description standard. Together they form a hierarchical autonomous management system that realizes global, unified resource monitoring and management for a high-performance computer.
The basic composition of an Autonomic Element of the autonomous management system includes a knowledge base part. The knowledge base defines the rules that control system behavior and stores relatively static strategy knowledge, such as association rules and the static topology of network connections. Extending the strategy knowledge of the knowledge base part online realizes the passive learning function. Because strategies can be deployed, updated, or deleted dynamically, system behavior can be changed and improved by updating strategy rules dynamically, without modifying software code or halting the system. The Autonomic Element on a managed node collects the state information of each resource on the node, sends it to its superior Autonomic Element, and executes the commands issued by the superior. A base-level Autonomic Element has neither a global view nor a knowledge base of its own; the upper-level Autonomic Element alone judges whether a predefined condition is satisfied and executes the corresponding response.
To ensure the scalability of the management system, choosing a suitable logical partitioning strategy is a key issue. A correct partitioning strategy ensures, on the one hand, that a management node is not overloaded and, on the other hand, that it is not so lightly loaded that management node resources are wasted. Within a logical partition, a node must be selected as the management node during system initialization. In addition, if the management node fails while the system is running, another node in the partition must be selected to take over management. Partitioning a large-scale system must take several factors into account: a) the number of nodes in the partition; b) the processing capability, communication capability, and external storage I/O capability of the node acting as management node; c) the volume of management data produced by management operations in the partition. Combining these factors, the number of nodes that can be managed in a logical partition is calculated by a formula (the formula is not reproduced in this text).
In that formula, R1, R2, R3, R4, and R5 are the weights of the load values of the different resources, with ΣRi = 1. The knowledge base holds a separate formula for calculating these weights, and the Autonomic Element adjusts them adaptively. A new coefficient Rnew_i is calculated by a formula that is likewise not reproduced in this text, where Li denotes the current load value of a given resource. If Rnew_i differs from the old coefficient Ri by more than a preset threshold, the newly calculated coefficient replaces the old one; the threshold prevents oscillation. Within the calculated node count, the nodes that make up a logical partition are selected according to a strategy. For example, nodes may be selected by physical proximity, taking nodes in the same rack or contiguous nodes across several racks, or by function, with some nodes dedicated to query services and others dedicated to high-intensity computing tasks. The partitioning scheme based on hierarchical Autonomic Elements is shown schematically in Fig. 2.
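The threshold-based replacement of weights can be sketched as follows. Because the formula for the new coefficients is not available in this text, the sketch takes the candidate coefficients as an input rather than computing them. The function name `adjust_weights` is hypothetical, and the final renormalization, which keeps the weights summing to 1, is an added assumption rather than something the patent states.

```python
def adjust_weights(old, candidate, threshold):
    """Replace a weight only when its candidate differs by more than `threshold`.

    old, candidate: equal-length sequences of weights R_i and Rnew_i.
    Small differences are ignored, which prevents the weights from oscillating.
    """
    if len(old) != len(candidate):
        raise ValueError("old and candidate weights must have the same length")
    updated = [new if abs(new - cur) > threshold else cur
               for cur, new in zip(old, candidate)]
    # Assumption (not in the source): renormalize so the weights still sum to 1.
    total = sum(updated)
    return [w / total for w in updated]
```

With five equal weights of 0.2 and a threshold of 0.05, a candidate set that moves each weight by 0.01 leaves the weights unchanged, while a candidate of 0.4 for the first resource replaces that weight and the remaining weights are rescaled.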
The invention proposes a cloud-computing-oriented election algorithm (Cloud Computing Based Election Algorithm, hereinafter the CCBE algorithm). The algorithm executes efficiently and addresses the election-triggering problem that other algorithms rarely solve. It also adapts to node failures, link failures, and node changes, and so has a degree of fault tolerance and dynamism.
The CCBE algorithm adopts the distributed network management mechanism based on groups of agent nodes that is common in large-scale distributed network management [Lee 04]. Under this mechanism, from the point of view of network management, a managed network is composed of basic managed elements, namely nodes. Let the number of managed nodes in a logical partition be n. Each node has a node agent (Agent), and each agent has a globally unique identifier (ID_i) that is known a priori to the agents of the other nodes in the partition. Any two agents in the partition can send messages to each other, that is, the topology is fully connected. The set of agents in the partition can be written {ID_0, ID_1, ID_2, ..., ID_(n-1)}. In each logical partition, one special agent node, the leader agent (Leader Agent), manages the agent nodes in the partition. The leader node and the agent nodes cooperate in a centralized management mode [MZH99]: the leader node instructs an agent node to perform a specific operation or provide specific information, and the agent node returns the operation result or the requested information. The leader nodes then complete management tasks among themselves according to a distributed cooperation mode.
The CCBE algorithm is divided into stages and assumes that message transmission and response times between nodes are known.
First stage: a priority list of the agent nodes in the partition is generated from the number of nodes in the partition and from each node's processing capability, communication capability, and external storage I/O capability. The node with the highest priority in the partition is elected leader node, the result is broadcast to all agents, and the first election ends.
Second stage: if any agent ID_i in the partition detects, through a timeout mechanism, that the leader node has failed, it triggers an election. Using the agent node priority list, agent ID_i sends an election message to all agents whose priority is higher than its own and waits for a reply from any of them. If no reply arrives, all agents with a higher priority than ID_i are considered to have failed; ID_i is then set as leader, and the updated agent priority list is broadcast to the other agents. If one or more replies arrive within the specified time, the agent priority list is updated according to the priorities of the replying agents, and the agent with the highest priority is set as leader node. When an agent with a higher priority than ID_i receives the election message from ID_i, it replies to ID_i and starts its own election by sending election messages to all agents with a higher priority than itself. If that agent itself has the highest priority, it immediately announces that it is the leader and updates the priority list. The second stage then repeats.
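The second stage can be sketched as a synchronous simulation. This is a simplified illustration under assumptions, not the CCBE algorithm itself: timeouts and messages are replaced by a set of reachable agents, priorities are assumed unique, a higher number means a higher priority, and the function name `elect` and the agent names are hypothetical.

```python
def elect(priorities, alive, initiator):
    """Return the leader chosen when `initiator` triggers an election.

    priorities: dict mapping agent id -> priority (higher wins, assumed unique).
    alive:      set of agent ids that would answer an election message in time.
    initiator:  the agent that detected the leader failure.
    """
    if initiator not in alive:
        raise ValueError("the initiating agent must itself be alive")
    current = initiator
    while True:
        # Send the election message to every agent with a higher priority.
        higher = [a for a in priorities if priorities[a] > priorities[current]]
        responders = [a for a in higher if a in alive]
        if not responders:
            # No reply: all higher-priority agents are treated as failed.
            return current
        # A responder takes over and runs its own election in turn.
        current = max(responders, key=priorities.get)


# Hypothetical partition: ID0 was the leader and has failed.
priorities = {"ID0": 5, "ID1": 1, "ID2": 3, "ID3": 4}
leader = elect(priorities, alive={"ID1", "ID2", "ID3"}, initiator="ID1")
```

In this example the lowest-priority agent starts the election, `ID2` and `ID3` reply, and `ID3` becomes leader because no live agent outranks it.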
The autonomous management system uses time and space compression to exclude invalid alarms. When time compression is used, an effective time window must be determined: a window that is too large admits invalid alarm information and degrades the accuracy of the alarm analysis, while a window that is too small misses valid alarm information and makes the analysis unreliable. Space compression considers multiple filtering rules, including network topology, business-logic topology, and associations among components at each level (node level, device level, and so on).
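Time compression with a fixed window can be sketched as follows. The rule used here, keeping the first alarm of each (source, type) pair and dropping repeats that arrive within the window, is one plausible reading and is an assumption; the function name `compress_alarms` and the tuple layout are hypothetical. Space compression based on topology is not covered.

```python
def compress_alarms(alarms, window):
    """Drop repeated alarms from the same source and type inside a time window.

    alarms: iterable of (timestamp_seconds, source, alarm_type) tuples.
    window: window length in seconds; a repeat is kept only if at least
            `window` seconds have passed since the last kept alarm of its kind.
    """
    last_kept = {}
    kept = []
    for ts, source, kind in sorted(alarms):
        key = (source, kind)
        if key not in last_kept or ts - last_kept[key] >= window:
            kept.append((ts, source, kind))
            last_kept[key] = ts
    return kept
```

The window length is the trade-off the text describes: a large value suppresses more repeats but may hide a genuinely new occurrence, while a small value lets duplicates through.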
Each partition-level Autonomic Element of the monitoring management system implements alarm association reasoning on Drools. Drools uses declarative programming and separates logic from data: data is kept in system objects and logic is kept in rules. Based on the Rete and Leaps algorithms, it matches rules against system data objects efficiently and centralizes knowledge. By establishing an object model and a DSL (domain-specific language), rules can be written in natural language, and an explanation facility is available.
The unified monitoring management strategy is the unified description and enforcement of monitoring management strategies. Its purpose is to hide the configuration details and differences of the concrete monitoring management strategies of the various managed devices and applications, and to present the user with a unified strategy configuration interface, so that the user can concentrate on mapping monitoring management strategies for a high-efficiency computer, described in natural language, to strategies that machines can enforce. Cloudview realizes the unified monitoring management strategy mechanism in the following steps.
Strategy classification: the global monitoring management strategy is divided into several categories, including switch, disk array, operating system, tape library, database, hardware information, and so on.
Strategy abstraction: each level of Autonomic Element abstracts, from the monitoring management strategies of products of the same type from different vendors, a unified monitoring management strategy format for that product type. For example, many vendors produce disk arrays, and each has its own private disk array MIB library; a monitoring management strategy expressed in one unified format is the unified disk array monitoring management strategy.
Strategy description: on the basis of the above classification, each level of Autonomic Element describes the monitoring management strategies of all categories in a unified way.
Strategy combination: monitoring management strategies are divided into direct strategies and indirect strategies. A direct strategy can be converted and applied directly to a concrete device or application; it corresponds to the strategy abstraction described above. An indirect strategy is a combination of a group of direct or indirect strategies. One benefit of indirect strategies is that a service can conveniently be managed directly as a whole. Managing a service usually involves several devices and applications. With indirect strategies, only one abstract indirect strategy description for the service needs to be provided; the strategy library then maps it step by step to direct strategies, which are converted and applied to concrete devices or applications. (Besides the original configuration strategy library of each monitoring management category, the strategy library can also be generated with the help of service dependencies discovered from the service topology; operations performed under specific conditions are turned into rules and stored in the knowledge base, so as to optimize the autonomous rules.)
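The step-by-step mapping of an indirect strategy to direct strategies can be sketched as a recursive expansion. This is a minimal sketch under assumptions: the strategy library is modeled as a plain dictionary from an indirect strategy name to its members, any name absent from the dictionary is treated as a direct strategy, and the function name `expand` and all strategy names are hypothetical.

```python
def expand(strategy, library, _path=()):
    """Resolve a strategy into the ordered list of direct strategies it implies.

    library: dict mapping an indirect strategy name to the list of direct or
             indirect strategies it combines. Names not in `library` are
             treated as direct strategies.
    """
    if strategy in _path:
        raise ValueError(f"cyclic strategy definition at {strategy!r}")
    members = library.get(strategy)
    if members is None:
        return [strategy]  # a direct strategy maps to itself
    result = []
    for member in members:
        for direct in expand(member, library, _path + (strategy,)):
            if direct not in result:  # keep each direct strategy once
                result.append(direct)
    return result


# Hypothetical service-level strategy built from device-level strategies.
library = {
    "web_service": ["db_service", "switch_port_monitor"],
    "db_service": ["database_monitor", "disk_array_monitor"],
}
direct = expand("web_service", library)
```

Here one service-level description resolves to three device-level strategies, which is the convenience the text attributes to indirect strategies.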
Strategy configuration: monitoring management strategy processing modules are implemented to convert unified strategies into device-specific strategies, and device monitoring management driver and agent modules are implemented to distribute the device-specific strategies to devices or applications.
The unified configuration of all kinds of strategies, that is, strategy description, is completed through the "strategy configuration interfaces" of the MC. The corresponding "strategy enforcement modules" in the MC convert the strategies into strategy configuration information that can be applied to concrete devices, and the DHC distributes this information to each managed object. In this way, the user can uniformly configure and enforce the global unified monitoring management strategy from monitoring management strategies described in natural language. The mechanism is shown in Fig. 4.
Claims (6)
1. A distributed hierarchical autonomous management method for a cloud computing platform, characterized in that:
the cloud computing management system is divided into logical partitions; within each partition, autonomous management is
realized by building multi-level autonomic elements; above the partitions, a higher-level autonomic element is built to realize
system-level management; during autonomous management, the rules corresponding to the autonomous plan are read from the
knowledge base, the lexical analysis module detects whether the rules are satisfied and triggers response events, and the
events are then submitted to the event management module, which caches them and schedules their execution;
The autonomous plans include a logical partitioning plan, an election plan, and an alarm association plan;
Each autonomic element includes a knowledge base module, a resource monitoring module, an analysis module, an event manager
module, a response module, a parallel execution module, and an autonomous schedule module;
The knowledge base module interacts with users through the knowledge base interface and provides customizable autonomous rules;
The resource monitoring module maintains a storage standard for resources and accepts resource information from the autonomic
elements of managed nodes; the resource information is stored in the database in the standard directory format for use by other autonomic elements;
The analysis module is used by each autonomous plan and judges whether the information stored in the database satisfies the rule
conditions in the knowledge base; when they are satisfied, it generates the event that needs to be executed, sets the event priority, and sends the event description to the event manager module;
The event manager module caches the event descriptions submitted by the analysis module and decides, according to its scheduling
policy, whether a cached event can be executed; if it can, a scheduled event is generated for execution by concurrent threads,
and the response is executed in a thread to complete the specific response;
The response module provides a method for registering a response action as a predefined response, and manages the mapping table between the two;
The parallel execution module is used to execute the scripts or rules produced by the response module on multiple nodes simultaneously;
The autonomous schedule module logically controls the other six modules, giving the management system its autonomic properties;
The knowledge base module can be extended online to realize passive learning; rules can also be dynamically deployed, updated,
and deleted, so that system behavior can be changed and refined without changing software code or halting the system;
The autonomic elements are divided into parameter level, component level, node level, partition level, and system level; the
smallest autonomic element is defined at the parameter level, the component level is built on the parameter level, the node
level on the component level, the partition level on the node level, and the system level on the partition level;
The node-level elements are deployed on server nodes and are responsible for managing all components within the node; a
partition-level autonomic element is deployed on the management node of each logical partition and is responsible for partition
management; a system-level autonomic element is deployed at the top level; the autonomic elements communicate with one another
by way of the extended ID description standard CIM; the autonomic element on a managed node is responsible for collecting the
status information of each resource on that node, sending it to its superior autonomic element, and executing the commands
issued by the superior autonomic element; each base-level autonomic element has neither a global view nor a knowledge base of
its own, and it is entirely the upper-level autonomic element that judges whether predefined conditions are met and executes the corresponding response;
The process of the election plan is:
a fully connected topology is established among all nodes;
a priority is set for each node;
the node with the highest priority is elected as the leader node, and the result is broadcast to the other nodes;
if another node detects that the leader node has failed, an election is triggered;
a new leader node is elected according to priority and the result is broadcast again.
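As an illustration of the election plan recited in claim 1, the following minimal single-process sketch simulates priority-based leader election with re-election on leader failure. It is an assumption-laden model, not the claimed method itself: the `Partition` class, its method names, the tie-break on the smallest node identifier, and the in-memory "broadcast" are all hypothetical, and real message passing between nodes is not modeled.

```python
class Partition:
    """Simulated logical partition whose nodes can all reach each other (fully connected)."""

    def __init__(self, priorities):
        self.priorities = dict(priorities)   # node id -> priority (higher wins)
        self.alive = set(self.priorities)    # nodes currently considered healthy
        self.leader = None
        self.known_leader = {}               # what each live node believes the leader is

    def elect(self):
        """Elect the live node with the highest priority and broadcast the result."""
        if not self.alive:
            raise RuntimeError("no live node available for election")
        # Assumption: ties are broken in favor of the smallest node id
        # (max() returns the first maximal element of the sorted sequence).
        self.leader = max(sorted(self.alive), key=self.priorities.get)
        self._broadcast()
        return self.leader

    def _broadcast(self):
        # In a fully connected topology every live node receives the announcement.
        self.known_leader = {node: self.leader for node in self.alive}

    def report_failure(self, node):
        """Called when some node detects that `node` has failed."""
        self.alive.discard(node)
        self.known_leader.pop(node, None)
        if node == self.leader:
            # A failed leader triggers a new election among the remaining nodes.
            self.elect()


partition = Partition({"n0": 1, "n1": 5, "n2": 3})
print(partition.elect())          # initial election
partition.report_failure("n1")    # leader fails, re-election is triggered
print(partition.leader)
```

Choosing the highest-priority live node resembles the classical bully-style approach; the patent text itself only states that the election is by priority.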
2. The method of claim 1, characterized in that: the autonomous rules include conditional rules and formula rules;
the conditional rules include predefined condition rules and predefined response rules;
a predefined condition rule is an expression, or a logical combination of expressions, over various resource attributes, the
resource attributes being stored in a resource attribute directory managed by the resource monitoring module;
a predefined response rule is generated by registering an action through the response module, and the action is triggered when the condition is met.
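A conditional rule of the kind described in claim 2 can be pictured with the sketch below: a condition is an expression tree over resource attributes, and a response is an action registered under a name. The tuple-based condition encoding, the `ResponseModule` class, the attribute names, and the example action are illustrative assumptions rather than anything specified in the claims.

```python
import operator

# Comparison operators allowed in leaf expressions (an assumed, minimal set).
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq}


def evaluate(condition, attributes):
    """Evaluate an expression or a logical combination of expressions.

    Leaves look like ("cmp", attribute, op, value); inner nodes are
    ("and", c1, c2, ...), ("or", c1, c2, ...) or ("not", c).
    """
    kind = condition[0]
    if kind == "and":
        return all(evaluate(c, attributes) for c in condition[1:])
    if kind == "or":
        return any(evaluate(c, attributes) for c in condition[1:])
    if kind == "not":
        return not evaluate(condition[1], attributes)
    _, attribute, op, value = condition
    return OPS[op](attributes[attribute], value)


class ResponseModule:
    """Keeps the mapping table between predefined response names and actions."""

    def __init__(self):
        self._actions = {}

    def register(self, name, action):
        self._actions[name] = action

    def trigger(self, name, attributes):
        return self._actions[name](attributes)


def apply_rule(rule, attributes, responses):
    """Trigger the registered response only when the rule's condition is met."""
    if evaluate(rule["when"], attributes):
        return responses.trigger(rule["then"], attributes)
    return None


responses = ResponseModule()
responses.register("throttle", lambda attrs: f"throttle {attrs['node']}")

rule = {
    "when": ("and", ("cmp", "cpu_load", ">", 0.9), ("cmp", "mem_used", ">", 0.8)),
    "then": "throttle",
}
print(apply_rule(rule, {"node": "n7", "cpu_load": 0.95, "mem_used": 0.85}, responses))
```

Keeping conditions as data rather than code is one way rules could be deployed, updated, or deleted without restarting the system, which is the property the claims ask of the knowledge base.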
3. The method of claim 1, characterized in that: when the logical partitions are divided, the number of nodes in each
partition is calculated according to a formula, the formula being:
where R1, R2, R3, R4, R5 are the weights of the load values of the different resources, with ΣRi = 1; during adaptive
adjustment, the new coefficient Rnewi is calculated by the formula: where Li represents the load value of the current resource;
if the difference between Rnewi and the old coefficient Ri exceeds a predetermined threshold, the newly calculated coefficient replaces the old coefficient.
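The threshold test in claim 3 can be sketched as follows. Because the formulas themselves are not reproduced in this text, the sketch deliberately does not compute the new coefficients: they are passed in by the caller. Two further points are assumptions of this sketch only: the comparison is taken as the absolute difference |Rnewi - Ri|, and the whole weight vector is replaced at once so that the weights keep summing to 1. The function name `adopt_new_weights` is hypothetical.

```python
import math


def adopt_new_weights(old, new, threshold):
    """Return the weight vector to use after an adaptive-adjustment step.

    old, new  -- sequences of weights R_i and R_new_i, each summing to 1
    threshold -- predetermined threshold on the change of a single weight
    """
    if len(old) != len(new):
        raise ValueError("old and new weights must have the same length")
    for weights in (old, new):
        if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
            raise ValueError("weights must sum to 1")
    # Assumption: a weight "exceeds the threshold" when its absolute change does.
    changed = any(abs(n - o) > threshold for o, n in zip(old, new))
    # Assumption: replace all weights together so that the sum stays 1.
    return list(new) if changed else list(old)


old_weights = [0.2, 0.2, 0.2, 0.2, 0.2]
print(adopt_new_weights(old_weights, [0.4, 0.15, 0.15, 0.15, 0.15], threshold=0.1))
print(adopt_new_weights(old_weights, [0.22, 0.2, 0.2, 0.19, 0.19], threshold=0.1))
```

The threshold keeps small load fluctuations from constantly changing the weights and hence the partition sizes.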
4. The method of claim 1, characterized in that: the alarm association plan uses time and space compression methods to
exclude invalid alarms.
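Claim 4 does not spell out how time and space compression work, so the sketch below shows only one plausible reading. It assumes that time compression drops repeats of the same alarm from the same node within a time window, and that space compression merges alarms of the same kind from nodes of the same partition within that window. The function name, the alarm tuple layout, and the window parameter are all hypothetical.

```python
def compress_alarms(alarms, window_s, partition_of):
    """Reduce a list of (timestamp_s, node, kind) alarms to merged alarm groups."""
    # Time compression (assumed meaning): keep an alarm only if the same
    # (node, kind) pair was not already kept within the last `window_s` seconds.
    last_kept = {}
    kept = []
    for ts, node, kind in sorted(alarms):
        key = (node, kind)
        if key in last_kept and ts - last_kept[key] < window_s:
            continue
        last_kept[key] = ts
        kept.append((ts, node, kind))

    # Space compression (assumed meaning): alarms of one kind raised by several
    # nodes of the same partition within the window become a single group.
    merged = []
    open_groups = {}
    for ts, node, kind in kept:
        key = (partition_of[node], kind)
        group = open_groups.get(key)
        if group is not None and ts - group["first_seen"] < window_s:
            if node not in group["nodes"]:
                group["nodes"].append(node)
        else:
            group = {"partition": key[0], "kind": kind, "first_seen": ts, "nodes": [node]}
            open_groups[key] = group
            merged.append(group)
    return merged


alarms = [
    (0, "n1", "link_down"), (2, "n1", "link_down"), (3, "n2", "link_down"),
    (4, "n3", "disk_full"), (100, "n1", "link_down"),
]
partition_of = {"n1": "p0", "n2": "p0", "n3": "p1"}
for group in compress_alarms(alarms, window_s=10, partition_of=partition_of):
    print(group)
```

With these inputs the five raw alarms collapse into three groups, which is the kind of reduction the alarm association plan aims for.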
5. The method of claim 1, characterized in that: the election uses a cloud-computing-oriented election algorithm, which
adopts the distributed network management mechanism based on agent node groups that is commonly used in large-scale distributed
network management; let the number of managed nodes in a logical partition be n; each node has a node agent, the agent has a
globally unique identifier, and this identifier is known a priori to the agents of the other nodes in the partition; any two
agents in the whole partition can communicate with each other by message passing, i.e. the topology is fully connected; the set
of all agents in the partition can be represented as {ID0, ID1, ID2, ..., IDn-1}; in each logical partition, a leader agent is
set up to manage the agent nodes in the partition; the leader node and the agent nodes cooperate in a centralized management
mode, i.e. the leader node instructs an agent node to perform a specific operation or provide specific information, and the
agent node returns the operation result or the requested information; the leader nodes in turn cooperate with one another in a
distributed collaboration mode to complete management tasks.
6. The method of claim 1, characterized in that: the cloud computing management system adopts a unified monitoring
management policy, the content of which is as follows:
Policy classification: the global monitoring management policies are divided into several categories, including: switch, disk
array, operating system, tape library, database, and hardware information;
Policy abstraction: from the monitoring management policies of products of the same type from different vendors, each level of
autonomic element abstracts a unified monitoring management policy format for that type of product;
Policy description: on the basis of the above policy classification, each level of autonomic element gives a unified
description of the various kinds of monitoring management policies;
Policy combination: monitoring management policies are divided into direct policies and indirect policies; a direct policy can
be applied directly to a concrete device or application through policy conversion, whereas an indirect policy is composed of a
group of direct or indirect policies;
Policy configuration: implement a monitoring management policy processing module that converts unified policies into
device-specific policies, and additionally implement device monitoring driver and agent modules that distribute the
device-specific policies to devices or applications.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210042033.3A CN102724057B (en) | 2012-02-23 | 2012-02-23 | A kind of distributed levelization autonomous management method towards cloud computing platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210042033.3A CN102724057B (en) | 2012-02-23 | 2012-02-23 | A kind of distributed levelization autonomous management method towards cloud computing platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102724057A CN102724057A (en) | 2012-10-10 |
CN102724057B true CN102724057B (en) | 2017-03-08 |
Family
ID=46949726
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210042033.3A Expired - Fee Related CN102724057B (en) | 2012-02-23 | 2012-02-23 | A kind of distributed levelization autonomous management method towards cloud computing platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102724057B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103473062B (en) * | 2013-09-13 | 2017-01-18 | Tcl移动通信科技(宁波)有限公司 | Method and system for mobile terminal customization based on user space file system |
WO2016070375A1 (en) | 2014-11-06 | 2016-05-12 | 华为技术有限公司 | Distributed storage replication system and method |
CN105407334A (en) * | 2015-12-29 | 2016-03-16 | 上海大学 | Self management method for multi-scenario monitoring videos |
CN105427545B (en) * | 2015-12-30 | 2018-07-17 | 山东中创软件商用中间件股份有限公司 | Device Alarm Management method and device based on drools |
CN105872068A (en) * | 2016-04-28 | 2016-08-17 | 国网浙江省电力公司信息通信分公司 | Cloud platform and automatic operation check method based on same |
CN107707431A (en) * | 2017-10-31 | 2018-02-16 | 河南科技大学 | The data safety monitoring method and system of a kind of facing cloud platform |
US10735529B2 (en) | 2017-12-07 | 2020-08-04 | At&T Intellectual Property I, L.P. | Operations control of network services |
CN108337315B (en) * | 2018-02-07 | 2019-10-08 | 平安科技(深圳)有限公司 | Dispositions method, device, computer equipment and the storage medium of monitoring system |
CN108847961B (en) * | 2018-05-28 | 2021-07-16 | 中国电子科技集团公司第五十四研究所 | Large-scale high-concurrency deterministic network system |
CN111078399B (en) * | 2019-11-29 | 2023-10-13 | 珠海金山数字网络科技有限公司 | Resource analysis method and system based on distributed architecture |
CN112379977A (en) * | 2020-07-10 | 2021-02-19 | 中国航空工业集团公司西安飞行自动控制研究所 | Task-level fault processing method based on time triggering |
CN111711702B (en) * | 2020-08-18 | 2020-12-18 | 北京东方通科技股份有限公司 | Distributed cooperative interaction method and system based on communication topology |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101118521A (en) * | 2006-08-01 | 2008-02-06 | 国际商业机器公司 | System and method for spanning multiple logical sectorization to distributing virtual input-output operation |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE112010003594B4 (en) * | 2009-10-19 | 2023-03-16 | International Business Machines Corporation | Apparatus, method and computer program for operating a distributed write storage network |
Non-Patent Citations (3)
Title |
---|
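Design and implementation of cluster management software based on autonomic computing; Li Yunchun et al.; Journal of Sun Yat-sen University (Natural Science Edition); 20090331; Vol. 48 (Issue S); pp. 248-251 * |
Autonomic computing based on adaptive control theory; Lü Ye et al.; Fujian Computer; 20090430 (Issue 4); pp. 101-102 * |
Conceptual model and implementation methods of autonomic computing*; Liao Beishui et al.; Journal of Software; 20080430 (Issue 4); pp. 779-802 * |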
Also Published As
Publication number | Publication date |
---|---|
CN102724057A (en) | 2012-10-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102724057B (en) | A kind of distributed levelization autonomous management method towards cloud computing platform | |
CN102103518B (en) | System for managing resources in virtual environment and implementation method thereof | |
CN100570569C (en) | Operation cross-domain control method under the grid computing environment | |
JP4304535B2 (en) | Information processing apparatus, program, modular system operation management system, and component selection method | |
CN103154926A (en) | Virtual resource cost tracking with dedicated implementation resources | |
CN105975378A (en) | Distributed layering autonomous monitoring and management system facing supercomputer | |
Goyal et al. | Adaptive and dynamic load balancing in grid using ant colony optimization | |
Tong et al. | Bloom filter-based workflow management to enable QoS guarantee in wireless sensor networks | |
CN106911540A (en) | The method and cloud platform of analysis power resource and service data | |
Kanbar et al. | Region aware dynamic task scheduling and resource virtualization for load balancing in IoT–fog multi-cloud environment | |
Skarlat et al. | FogFrame: a framework for IoT application execution in the fog | |
CN109587026A (en) | A method of large and medium-sized enterprise's Network Programe Design based on Java | |
Chhetri et al. | AWaRE-towards distributed self-management for resilient cyber systems | |
Hasanzadeh et al. | Distributed optimization grid resource discovery | |
CN106254452A (en) | The big data access method of medical treatment under cloud platform | |
CN106302656A (en) | The Medical Data processing method of cloud storage platform | |
Ribeiro et al. | A management architectural pattern for adaptation system in Internet of Things | |
Lv et al. | A hierarchical management architecture for virtual network mapping | |
Yahaya et al. | Dynamic load balancing policy with communication and computation elements in grid computing with multi-agent system integration | |
Huang et al. | Performance diagnosis for SOA on hybrid cloud using the Markov network model | |
Csorba et al. | A bio-inspired method for distributed deployment of services | |
Xu et al. | Cooperative autonomic management in dynamic distributed systems | |
CN100373883C (en) | Gridding service group establishing method and gridding service discovering method | |
Saxena et al. | A High Up-Time and Security Centered Resource Provisioning Model Towards Sustainable Cloud Service Management | |
Rahman et al. | An autonomic workflow management system for global grids |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170308 |