CN107368365A - Cloud platform automatic O&M method, system, equipment and storage medium - Google Patents

Info

Publication number
CN107368365A
Authority
CN
China
Prior art keywords
event
cloud platform
workflow
task
virtual machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710623425.1A
Other languages
Chinese (zh)
Inventor
周昕毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Travel Information Technology Shanghai Co Ltd
Original Assignee
Ctrip Travel Information Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Travel Information Technology Shanghai Co Ltd
Priority to CN201710623425.1A
Publication of CN107368365A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45591 Monitoring or debugging support

Abstract

The invention provides a cloud platform automatic O&M method, system, device, and storage medium. The method includes: creating multiple workflows from multiple O&M tasks; establishing trigger rules for the workflows; upon receiving a pending O&M event, determining its event type and obtaining the corresponding O&M event data; determining, according to the trigger rules of the workflows, the workflow triggered by the event type of the pending O&M event; and executing the triggered workflow according to the O&M event data and the execution order of the O&M tasks in that workflow. The invention implements an event-driven O&M mechanism based on event capture and workflow technology, which further increases the degree of automation of cloud platform O&M: O&M operations are driven by the tools themselves, which reduces manual work by engineers and avoids operational risks that human factors may introduce. The rule engine technology reduces communication cost and improves O&M efficiency.

Description

Cloud platform automatic O&M method, system, equipment and storage medium
Technical field
The present invention relates to the field of cloud service technology, and in particular to a cloud platform automatic O&M method, system, device, and storage medium that automatically execute cloud platform O&M tasks based on an event-driven mechanism.
Background
The cloud platform is an important underlying platform in the O&M architecture. It comprises a large number of servers and has high stability requirements, so its operation and management are very challenging. Cloud platform administrators need automated O&M tools to improve work efficiency, to standardize, automate, and optimize cloud platform O&M processes, to ensure cloud platform stability, and to reduce cloud platform O&M cost.
Traditional cloud platform automated O&M tools include cloud platform service management tools, monitoring tools, and the like. They monitor information such as the running status and resource allocation of the cloud platform in real time and send a fault alert to engineers when an anomaly occurs. Engineers handle the fault after receiving the alert.
Traditional cloud platform automated O&M tools have three main problems. 1) O&M operations are led by engineers: conventional automated O&M tools require an engineer to log in to a web interface and click to perform the relevant O&M operation, so the risk of misoperation remains. 2) Development cost is high: tool development engineers need to collect the requirements of multiple systems and then design and develop the tools; similar functional modules of different systems cannot be reused and must be developed again. 3) Communication efficiency is low: traditional cloud platform automated O&M tools must cooperate with one another to complete O&M work. This requires cloud platform administrators to be familiar with how each tool is used, and after an event occurs in one system, an administrator must decide manually how to handle it in another system. For example, when a host in the cloud platform crashes, the alerting tool emails the responsible O&M engineer, who then logs in to the monitoring system to check various metrics and finally decides to log in to the configuration management tool to restart the server. Every server crash is handled in a similar way, which means that engineers who use conventional automated O&M tools must do a large amount of repetitive work.
Summary of the invention
In view of the problems in the prior art, an object of the invention is to provide a cloud platform automatic O&M method, system, device, and storage medium that implement an event-driven O&M mechanism based on event capture and workflow technology, so as to fully automate cloud platform O&M, improve cloud platform O&M efficiency, and reduce manual work by engineers.
An embodiment of the invention provides a cloud platform automatic O&M method, which comprises the following steps:
creating multiple workflows from multiple O&M tasks, each workflow including at least one O&M task and the execution order of its O&M tasks;
establishing trigger rules for the workflows, the trigger rule of each workflow including the event type that triggers the workflow;
upon receiving a pending O&M event, determining the corresponding event type and obtaining the corresponding O&M event data;
determining, according to the trigger rules of the workflows, the workflow triggered by the event type of the pending O&M event;
executing the triggered workflow according to the O&M event data and the execution order of the O&M tasks in the triggered workflow.
Optionally, the O&M event types include alarm events, and the alarm events include host crash events, virtual machine crash events, and resource alarm events.
Optionally, the workflow triggered by a host crash event includes a host information acquisition task and an in-host virtual machine migration task;
the workflow triggered by a virtual machine crash event includes a virtual machine information acquisition task and an in-VM application migration task;
the workflow triggered by a resource alarm event includes a cloud platform scale-out task and/or a log cleanup task.
Optionally, the O&M event data corresponding to a host crash event includes a host identifier;
when the pending O&M event is a host crash event, executing the triggered workflow comprises the following steps:
executing the host information acquisition task: finding, according to the host identifier, the crashed host and the information of the virtual machines on that host, the virtual machine information including the virtual machine identifier and the services running in the virtual machine;
executing the in-host virtual machine migration task, which comprises the following steps:
removing the configuration of the virtual machine corresponding to the virtual machine identifier from the load balancer cluster;
restarting the virtual machine corresponding to the virtual machine identifier on another host in the cloud platform;
starting the running services on the restarted virtual machine;
calling the health check interface to check the availability of the running services;
adding the virtual machine that passes the check back to the load balancer cluster.
Optionally, the O&M event data corresponding to a host crash event further includes the cloud platform resource occupancy ratio;
when the pending O&M event is a host crash event, executing the triggered workflow further includes executing a virtual machine migration mode selection task, which comprises the following steps:
determining whether the cloud platform resource occupancy ratio is greater than a first preset threshold;
if so, selecting parallel migration of the multiple virtual machines as the migration mode;
otherwise, selecting serial migration of the multiple virtual machines as the migration mode.
Optionally, the O&M event data corresponding to a virtual machine crash event includes a virtual machine identifier;
when the pending O&M event is a virtual machine crash event, executing the triggered workflow comprises the following steps:
executing the virtual machine information acquisition task: finding, according to the virtual machine identifier, the crashed virtual machine, the services running in the crashed virtual machine, and the applications running in those services;
executing the in-VM application migration task, which comprises the following steps:
removing the configuration of the applications running in the services from the load balancer cluster;
restarting the crashed virtual machine;
starting the running services in the restarted virtual machine;
calling the health check interface to check the availability of the running applications;
adding the running applications that pass the check back to the load balancer cluster.
Optionally, the resource alarm events include virtual machine disk alarm events, virtual machine CPU alarm events, and cloud platform cluster resource alarm events;
when the pending O&M event is a virtual machine disk alarm event, the triggered workflow includes a disk cleanup task;
when the pending O&M event is a virtual machine CPU alarm event or a cloud platform cluster resource alarm event, the triggered workflow includes a cloud platform scale-out task.
Optionally, the O&M event types further include change events, and the change events include cloud platform service version change events and cloud platform capacity change events;
the workflow corresponding to a cloud platform service version change event includes a responsible-person notification task, and the workflow corresponding to a cloud platform capacity change event includes a scaling type selection task, a cloud platform scale-out task, and a cloud platform scale-in task.
Optionally, the O&M event data corresponding to a cloud platform service version change event includes the information of the hosts to be changed, the information of the virtual machines to be changed, and the version change information;
the responsible-person notification task comprises the following steps:
obtaining the contact information of the corresponding responsible person according to the information of the hosts to be changed and the information of the virtual machines to be changed;
sending a version change notice and the version change information to the responsible person using the contact information of the responsible person.
Optionally, the O&M event data corresponding to a cloud platform capacity change event includes the cloud platform cluster resource occupancy ratio;
when the pending O&M event is a cloud platform scaling event, executing the triggered workflow comprises the following steps:
executing the scaling type selection task: determining whether the cloud platform cluster resource occupancy ratio is greater than a second threshold and, if so, executing the cloud platform scale-out task;
if the cloud platform cluster resource occupancy ratio is less than or equal to the second threshold, determining whether the ratio is less than a third threshold, the third threshold being less than the second threshold, and, if so, executing the cloud platform scale-in task.
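The scaling type selection described above can be sketched as a small function. This is only an illustration: the function name, the returned labels, and the threshold values used in the example calls are assumptions and do not come from the patent text.

```python
def select_scaling_task(occupancy: float, second_threshold: float, third_threshold: float) -> str:
    """Pick the scaling task from the cluster resource occupancy ratio.

    The third threshold must be lower than the second threshold, as the text requires.
    """
    if third_threshold >= second_threshold:
        raise ValueError("third_threshold must be less than second_threshold")
    if occupancy > second_threshold:
        return "scale_out"      # occupancy above the second threshold
    if occupancy < third_threshold:
        return "scale_in"       # occupancy below the third threshold
    return "no_action"          # between the two thresholds nothing is triggered


# Hypothetical thresholds: scale out above 80%, scale in below 30%.
decision_high = select_scaling_task(0.85, 0.80, 0.30)
decision_low = select_scaling_task(0.20, 0.80, 0.30)
decision_mid = select_scaling_task(0.50, 0.80, 0.30)
```

An occupancy ratio exactly equal to the second threshold falls into the "less than or equal" branch and therefore does not trigger a scale-out.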
Optionally, the workflow further includes, for each O&M task, the boundary conditions that define whether its execution succeeded or failed, and the follow-up task to execute after success or after failure; executing the triggered workflow comprises the following steps:
executing the O&M tasks in sequence according to the execution order in the workflow that was found;
after each O&M task is executed, determining according to the corresponding boundary conditions whether the O&M task succeeded, and generating an execution result;
selecting the follow-up task according to the execution result of the O&M task.
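A minimal sketch of this success/failure branching follows. The `Step` structure, the task names, and the stubbed return values are hypothetical; the patent does not prescribe a data model. The loop assumes that the follow-up links do not form a cycle.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    run: Callable[[dict], object]         # the O&M task itself
    succeeded: Callable[[object], bool]   # boundary condition applied to the task output
    on_success: Optional[str] = None      # follow-up task after success
    on_failure: Optional[str] = None      # follow-up task after failure


def execute(steps: Dict[str, Step], start: str, data: dict) -> List[Tuple[str, bool]]:
    """Run tasks one by one, choosing each follow-up task from the execution result."""
    results: List[Tuple[str, bool]] = []
    current: Optional[str] = start
    while current is not None:
        step = steps[current]
        ok = step.succeeded(step.run(data))
        results.append((current, ok))
        current = step.on_success if ok else step.on_failure
    return results


# Stubbed tasks: the restart succeeds, the health check returns 503 and fails.
steps = {
    "restart_vm": Step(lambda d: 0, lambda code: code == 0,
                       on_success="health_check", on_failure="notify_engineer"),
    "health_check": Step(lambda d: 503, lambda status: status == 200,
                         on_success="add_to_load_balancer", on_failure="notify_engineer"),
    "add_to_load_balancer": Step(lambda d: True, lambda done: done is True),
    "notify_engineer": Step(lambda d: "sent", lambda r: r == "sent"),
}
trace = execute(steps, "restart_vm", {"vm_id": "VM01"})
```

In this run the failed health check routes the workflow to the notification task instead of returning the virtual machine to the load balancer.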
Optionally, pending O&M events are received from cloud platform O&M tools; the cloud platform O&M tools include at least one of a cloud platform server management tool, an asset data management database, a cloud platform monitoring tool, an alerting tool, a configuration management tool, a domain name server management tool, a code repository management tool, and a resource management tool.
Optionally, the method further comprises the following steps:
determining whether the obtained O&M events include O&M events to be merged, the O&M events to be merged belonging to the same O&M event type and corresponding to the same server;
if there are O&M events to be merged, merging them into a single O&M event.
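The merging rule above (same event type, same server) can be sketched as follows. The event field names and the `occurrences` counter are illustrative assumptions.

```python
def merge_events(events):
    """Merge events that share the same event type and the same server."""
    merged = {}
    for event in events:
        key = (event["type"], event["server"])
        if key in merged:
            merged[key]["occurrences"] += 1   # fold the duplicate into the first event
        else:
            merged[key] = {**event, "occurrences": 1}
    return list(merged.values())


events = [
    {"type": "vm_disk_alarm", "server": "VM01"},
    {"type": "vm_disk_alarm", "server": "VM01"},
    {"type": "vm_disk_alarm", "server": "VM02"},
    {"type": "vm_cpu_alarm", "server": "VM01"},
]
merged = merge_events(events)
```

Only the two disk alarms for VM01 are merged; an alarm of another type or for another server stays a separate event.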
Optionally, the method further comprises the following steps:
obtaining the event termination condition of each O&M event;
determining whether any O&M event currently being processed has reached its corresponding event termination condition;
if an O&M event has reached its corresponding event termination condition, ending the processing of that O&M event.
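One possible way to express the termination check is shown below. The text does not say what a termination condition looks like, so the per-type predicates (`host_recovered`, a free-space ratio of at least 10%) are purely hypothetical.

```python
def close_finished(in_progress, end_conditions):
    """Split in-progress events into those still open and those that reached their end condition."""
    still_open, closed = [], []
    for event in in_progress:
        reached = end_conditions[event["type"]](event)
        (closed if reached else still_open).append(event)
    return still_open, closed


# Hypothetical termination conditions, keyed by event type.
end_conditions = {
    "host_crash": lambda e: e["host_recovered"],
    "vm_disk_alarm": lambda e: e["free_ratio"] >= 0.10,
}
in_progress = [
    {"type": "host_crash", "server": "HOST01", "host_recovered": True},
    {"type": "vm_disk_alarm", "server": "VM01", "free_ratio": 0.05},
]
still_open, closed = close_finished(in_progress, end_conditions)
```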
Optionally, the method further comprises the following step:
recording the obtained O&M events, the workflow execution data, and the output data of workflow execution, and outputting the recorded data to a user terminal.
An embodiment of the invention also provides a cloud platform automatic O&M system for implementing the cloud platform automatic O&M method described above, the system comprising:
a workflow module, configured to create multiple workflows from multiple O&M tasks, each workflow including at least one O&M task and the execution order of its O&M tasks;
a rule engine module, configured to establish the trigger rules of the workflows, the trigger rule of each workflow including the event type that triggers the workflow;
a sensor module, configured to, upon receiving a pending O&M event, determine the corresponding event type and obtain the corresponding O&M event data;
an execution module, configured to determine, according to the trigger rules of the workflows, the workflow triggered by the event type of the pending O&M event, and to execute the triggered workflow according to the O&M event data and the execution order of the O&M tasks in the triggered workflow.
Optionally, the system further includes an audit module, configured to record the obtained O&M events, the workflow execution data, and the output data of workflow execution, and to output the recorded data to a user terminal.
An embodiment of the invention also provides a cloud platform automatic O&M device, comprising:
a processor;
a memory storing instructions executable by the processor;
wherein the processor is configured to perform the steps of the cloud platform automatic O&M method described above by executing the executable instructions.
An embodiment of the invention also provides a computer-readable storage medium for storing a program which, when executed, implements the steps of the cloud platform automatic O&M method described above.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the disclosure.
The cloud platform automatic O&M method, system, device, and storage medium provided by the invention have the following advantages:
The invention solves the problems of traditional cloud platform automated O&M tools, namely engineer-led O&M operations, high development cost, and low communication efficiency. It implements an event-driven O&M mechanism based on event capture and workflow technology, which further increases the degree of automation of cloud platform O&M: O&M operations are driven by the tools themselves, which reduces manual work by engineers and avoids operational risks that human factors may introduce. The rule engine technology reduces communication cost and improves O&M efficiency.
Brief description of the drawings
Other features, objects, and advantages of the invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings.
Fig. 1 is a flow chart of the cloud platform automatic O&M method of an embodiment of the invention;
Fig. 2 is a flow chart of the execution of the workflow triggered by a host crash event in an embodiment of the invention;
Fig. 3 is a flow chart of in-host virtual machine migration in an embodiment of the invention;
Fig. 4 is a flow chart of the execution of the workflow triggered by a virtual machine crash event in an embodiment of the invention;
Fig. 5 is a flow chart of the execution of the workflow triggered by a cloud platform scaling event in an embodiment of the invention;
Fig. 6 is a flow chart of event merging in cloud platform O&M in an embodiment of the invention;
Fig. 7 is a flow chart of event closing in cloud platform O&M in an embodiment of the invention;
Fig. 8 is a schematic structural diagram of the cloud platform automatic O&M system of an embodiment of the invention;
Fig. 9 is a schematic structural diagram of the cloud platform automatic O&M device of an embodiment of the invention;
Fig. 10 is a schematic structural diagram of the computer-readable storage medium of an embodiment of the invention.
Detailed description
Example embodiments are now described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In addition, the drawings are only schematic illustrations of the disclosure and are not necessarily drawn to scale. Identical reference numerals in the drawings denote identical or similar parts, so repeated descriptions of them are omitted. Some of the block diagrams shown in the drawings are functional entities that do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and/or processor devices and/or microcontroller devices.
As shown in Fig. 1, the cloud platform automatic O&M method comprises the following steps:
S100: creating multiple workflows from multiple O&M tasks, each workflow including at least one O&M task and the execution order of its O&M tasks;
That is, the script corresponding to an O&M task can be reused across workflows: several workflows can share the same O&M task script. For an identical O&M task the user therefore writes the script only once and can call it many times, instead of rewriting the O&M task script each time the task is executed;
S200: establishing trigger rules for the workflows, the trigger rule of each workflow including the event type that triggers the workflow;
S300: upon receiving a pending O&M event, determining the corresponding event type and obtaining the corresponding O&M event data;
S400: determining, according to the trigger rules of the workflows, the workflow triggered by the event type of the pending O&M event;
That is, a workflow can also be reused across multiple O&M executions: the same O&M event triggers the corresponding workflow each time, which greatly reduces O&M tool development cost. In addition, after an O&M event occurs, the workflow corresponding to it is called automatically, without manual selection by an engineer.
S500: executing the triggered workflow according to the O&M event data and the execution order of the O&M tasks in the triggered workflow.
It should be understood that the step labels here only distinguish the steps and do not necessarily limit their order of execution. In practice, the order of execution of the steps can be adjusted as needed, and such adjustments fall within the scope of protection of the invention.
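Steps S100 to S500 can be illustrated with a minimal sketch. The class names `Workflow` and `RuleEngine`, the event dictionary layout, and the two stub tasks are assumptions made for illustration and are not the patent's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# An O&M task is modeled as a function that receives the O&M event data.
Task = Callable[[dict], None]


@dataclass
class Workflow:
    name: str
    tasks: List[Task]  # S100: list order is the execution order


@dataclass
class RuleEngine:
    rules: Dict[str, Workflow] = field(default_factory=dict)

    def add_rule(self, event_type: str, workflow: Workflow) -> None:
        # S200: the trigger rule binds an event type to a workflow
        self.rules[event_type] = workflow

    def handle(self, event: dict) -> str:
        # S300: determine the event type and obtain the event data
        event_type, data = event["type"], event["data"]
        # S400: find the workflow triggered by this event type
        workflow = self.rules[event_type]
        # S500: execute the tasks in their execution order
        for task in workflow.tasks:
            task(data)
        return workflow.name


log: List[str] = []


def get_host_info(data: dict) -> None:
    log.append(f"get_host_info:{data['host_id']}")


def migrate_vms(data: dict) -> None:
    log.append(f"migrate_vms:{data['host_id']}")


engine = RuleEngine()
engine.add_rule("host_crash", Workflow("host_crash_workflow", [get_host_info, migrate_vms]))
triggered = engine.handle({"type": "host_crash", "data": {"host_id": "HOST01"}})
```

Because tasks are plain functions, the same task can appear in several workflows, which mirrors the script reuse described for S100.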
The embodiment of the invention thus solves the problem that traditional cloud platform automated O&M tools require engineers to lead O&M operations. It provides a new event-driven O&M mechanism in which O&M operations are initiated by the tools, which perform the corresponding O&M operation for each kind of event and so avoid misoperation that human factors may cause. The invention also solves the problem of the high development cost of traditional cloud platform automated O&M tools. It provides a new workflow development model in which O&M operation steps are described in a configuration file and defined as workflows; tool development engineers can select workflows as needed and reuse them across different O&M tools, which greatly reduces O&M tool development cost. In addition, the invention solves the problem of low communication efficiency encountered when using traditional cloud platform automated O&M tools. It provides a rule engine mechanism in which each O&M tool can create rules that bind the events produced by system operation to the workflows that handle them. Implementing event-driven O&M on the basis of rule engine technology reduces communication cost and improves communication efficiency.
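As an illustration of describing workflows and rules in a configuration file, the sketch below loads a JSON document and resolves task names against a registry of task functions. The JSON layout, the key names, and the registry are assumptions; the patent does not specify a configuration format.

```python
import json

# Hypothetical configuration: workflows list task names in execution order,
# and rules bind an event type to the workflow that handles it.
CONFIG_TEXT = """
{
  "workflows": {
    "host_crash": ["get_host_info", "migrate_vms"],
    "vm_crash": ["get_vm_info", "migrate_apps"]
  },
  "rules": [
    {"event_type": "host_crash_event", "workflow": "host_crash"},
    {"event_type": "vm_crash_event", "workflow": "vm_crash"}
  ]
}
"""

# Stub task scripts, keyed by the names used in the configuration.
TASK_REGISTRY = {
    "get_host_info": lambda data: f"host info for {data['id']}",
    "migrate_vms": lambda data: f"vms migrated for {data['id']}",
    "get_vm_info": lambda data: f"vm info for {data['id']}",
    "migrate_apps": lambda data: f"apps migrated for {data['id']}",
}


def load_rules(config_text, registry):
    """Return a mapping from event type to the ordered task functions of its workflow."""
    config = json.loads(config_text)
    workflows = {
        name: [registry[task_name] for task_name in task_names]
        for name, task_names in config["workflows"].items()
    }
    return {rule["event_type"]: workflows[rule["workflow"]] for rule in config["rules"]}


rules = load_rules(CONFIG_TEXT, TASK_REGISTRY)
outputs = [task({"id": "VM01"}) for task in rules["vm_crash_event"]]
```

Keeping the workflow definition in data rather than code lets different O&M tools select and reuse the same workflow without new development.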
Further, the O&M event types may include alarm events and change events. The alarm events mainly include host crash events, virtual machine crash events, and resource alarm events; the change events mainly include cloud platform service version change events and cloud platform capacity change events.
Further, the workflow triggered by a host crash event mainly includes a host information acquisition task and an in-host virtual machine migration task; the workflow triggered by a virtual machine crash event mainly includes a virtual machine information acquisition task and an in-VM application migration task; the workflow triggered by a resource alarm event mainly includes a cloud platform scale-out task and/or a log cleanup task.
Further, the O&M event data corresponding to a host crash event includes a host identifier, through which related information about the host can be obtained.
As shown in Fig. 2, when the pending O&M event is a host crash event, executing the triggered workflow comprises the following steps:
A1: executing the host information acquisition task: finding, according to the host identifier, the crashed host and the information of the virtual machines on that host, the virtual machine information including the virtual machine identifier and the services running in the virtual machine;
A3: executing the in-host virtual machine migration task;
As shown in Fig. 3, the in-host virtual machine migration task comprises the following steps:
A31: removing the configuration of the virtual machine corresponding to the virtual machine identifier from the load balancer cluster;
A32: restarting the virtual machine corresponding to the virtual machine identifier on another host in the cloud platform;
A33: starting the running services on the restarted virtual machine;
A34: calling the health check interface to check the availability of the running services;
A35: adding the virtual machine that passes the check back to the load balancer cluster.
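Steps A31 to A35 can be sketched against in-memory stand-ins for the load balancer cluster and the cloud platform. `FakeLoadBalancer`, `FakeCloud`, and all of their methods are hypothetical placeholders, not real APIs.

```python
class FakeLoadBalancer:
    """Hypothetical stand-in for the load balancer cluster API."""

    def __init__(self, members):
        self.members = set(members)

    def remove(self, vm_id):
        self.members.discard(vm_id)

    def add(self, vm_id):
        self.members.add(vm_id)


class FakeCloud:
    """Hypothetical stand-in for the cloud platform API."""

    def __init__(self):
        self.placement = {}
        self.started = []

    def restart_vm_on(self, vm_id, host_id):
        self.placement[vm_id] = host_id

    def start_service(self, vm_id, service):
        self.started.append((vm_id, service))

    def health_check(self, vm_id, service):
        # In this stub a service is healthy once it has been started.
        return (vm_id, service) in self.started


def migrate_vm(vm, target_host, lb, cloud):
    lb.remove(vm["id"])                          # A31: take the VM out of the cluster
    cloud.restart_vm_on(vm["id"], target_host)   # A32: restart it on another host
    for service in vm["services"]:
        cloud.start_service(vm["id"], service)   # A33: start its running services
    healthy = all(cloud.health_check(vm["id"], s) for s in vm["services"])  # A34
    if healthy:
        lb.add(vm["id"])                         # A35: only healthy VMs rejoin the cluster
    return healthy


lb = FakeLoadBalancer({"VM01", "VM02"})
cloud = FakeCloud()
ok = migrate_vm({"id": "VM01", "services": ["tomcat"]}, "HOST02", lb, cloud)
```

A virtual machine whose health check fails stays out of the load balancer cluster, so traffic is never routed to a service that is not yet available.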
Optionally, the O&M event data corresponding to a host crash event further includes the cloud platform resource occupancy ratio. When the pending O&M event is a host crash event, executing the triggered workflow further includes A2: executing a virtual machine migration mode selection task, which comprises the following steps:
determining whether the cloud platform resource occupancy ratio is greater than a first preset threshold;
if so, selecting parallel migration of the multiple virtual machines as the migration mode;
otherwise, selecting serial migration of the multiple virtual machines as the migration mode.
For example, after a host crashes, when the remaining resources of the overall cluster are > 20%, the virtual machine migration workflow is executed in parallel for the virtual machines currently running on that host:
removing the configuration of the virtual machine from the load balancer cluster;
restarting the virtual machine on another host;
starting the tomcat service on the virtual machine;
calling the health check interface to check the availability of the tomcat service;
pulling the virtual machine that passes the availability check into the load balancer cluster.
After a host crashes, when the remaining resources of the overall cluster are < 20%, the virtual machine migration workflow is executed serially for the virtual machines currently running on that host:
removing the configuration of the virtual machine from the load balancer cluster;
restarting the virtual machine on another host;
starting the tomcat service on the virtual machine;
calling the health check interface to check the availability of the tomcat service;
pulling the virtual machine that passes the availability check into the load balancer cluster;
This is only one application example. In practice, other O&M tasks can be used, or different execution relationships can be set between O&M tasks. The step numbers above only distinguish the steps and do not represent their order; the order of the steps can be adjusted as needed, and such adjustments fall within the scope of protection of the invention.
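The choice between parallel and serial migration in this example can be sketched as follows. The sketch follows the example's formulation in terms of remaining cluster resources and its 20% figure; treating a value of exactly 20% as serial, the function names, and the use of a thread pool for the parallel case are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

REMAINING_THRESHOLD = 0.20  # the 20% remaining-resource figure from the example


def choose_mode(remaining_ratio):
    # Assumption: exactly 20% remaining is treated as serial; the example leaves it open.
    return "parallel" if remaining_ratio > REMAINING_THRESHOLD else "serial"


def migrate_all(vm_ids, remaining_ratio, migrate_one):
    """Run the per-VM migration workflow for every VM, in parallel or one at a time."""
    mode = choose_mode(remaining_ratio)
    if mode == "parallel":
        with ThreadPoolExecutor(max_workers=len(vm_ids) or 1) as pool:
            results = list(pool.map(migrate_one, vm_ids))  # map keeps input order
    else:
        results = [migrate_one(vm_id) for vm_id in vm_ids]
    return mode, results


def fake_migrate(vm_id):
    # Hypothetical stand-in for the five-step migration workflow.
    return vm_id + ":migrated"


parallel_run = migrate_all(["VM01", "VM02"], 0.35, fake_migrate)
serial_run = migrate_all(["VM01", "VM02"], 0.10, fake_migrate)
```

Serial migration limits how many virtual machines are restarted at once, which is the cautious choice when the cluster has little spare capacity.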
As shown in Fig. 4, the O&M event data corresponding to a virtual machine crash event includes a virtual machine identifier; when the pending O&M event is a virtual machine crash event, executing the triggered workflow comprises the following steps:
B1: executing the virtual machine information acquisition task: finding, according to the virtual machine identifier, the crashed virtual machine, the services running in the crashed virtual machine, and the applications running in those services;
B2: executing the in-VM application migration task, which comprises the following steps:
B21: removing the configuration of the applications running in the services from the load balancer cluster;
B22: restarting the crashed virtual machine;
B23: starting the running services in the restarted virtual machine;
B24: calling the health check interface to check the availability of the running applications;
B25: adding the running applications that pass the check back to the load balancer cluster.
For example, virtual machine VM01 runs a tomcat program that serves application APP01.
When a VM01 crash event occurs, the following workflow tasks are triggered automatically:
Remove the APP01-VM01 configuration from the load balancer cluster;
Restart virtual machine VM01;
Start the tomcat service;
Call the APP01-VM01 health check interface to check the availability of APP01-VM01. The purpose of calling the health check interface is to test the availability of the application, ensuring that APP01 can serve normally once VM01 has been pulled back into the cluster;
Pull APP01-VM01, having passed the availability check, into the load balancer cluster. This operation is performed by the automated O&M tool of this scheme calling the API of the load balancer cluster, and involves no operation on virtual machine VM01 itself.
Here, VM01 and APP01 are the identifiers of the virtual machine and the application within the cloud platform. Tomcat is a free, open-source web application server and is a lightweight application server.
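The VM01 example can be sketched in Python as follows. This is only an illustrative sketch: the class FakeCloud and all of its methods are hypothetical in-memory stand-ins, not the API of any real load balancer or virtualization platform, and the behavior on a failed health check (leaving the application out of the cluster) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """Hypothetical in-memory stand-in for the load balancer and VM APIs."""
    lb_members: set = field(default_factory=set)
    log: list = field(default_factory=list)
    healthy: bool = True  # result the health check interface will report

    def remove_from_lb(self, member):
        self.lb_members.discard(member)
        self.log.append(f"lb-remove {member}")

    def restart_vm(self, vm_id):
        self.log.append(f"restart {vm_id}")

    def start_service(self, vm_id, service):
        self.log.append(f"start {service} on {vm_id}")

    def health_check(self, member):
        self.log.append(f"health-check {member}")
        return self.healthy

    def add_to_lb(self, member):
        self.lb_members.add(member)
        self.log.append(f"lb-add {member}")


def handle_vm_crash(cloud, vm_id, service, app):
    """Run steps B21 to B25 for one application on one crashed VM."""
    member = f"{app}-{vm_id}"
    cloud.remove_from_lb(member)         # B21: take the app out of rotation
    cloud.restart_vm(vm_id)              # B22: restart the crashed VM
    cloud.start_service(vm_id, service)  # B23: start the service again
    if not cloud.health_check(member):   # B24: availability check
        return False                     # assumption: stay out of the cluster
    cloud.add_to_lb(member)              # B25: pull back into the cluster
    return True
```

Calling handle_vm_crash(cloud, "VM01", "tomcat", "APP01") walks through the five steps in order and only re-adds APP01-VM01 when the health check passes.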
Further, the resource alarm events include virtual machine disk alarm events, virtual machine CPU alarm events, and cloud platform cluster resource alarm events;
When the pending O&M event is a virtual machine disk alarm event, the triggered workflow includes a disk cleanup task;
When the pending O&M event is a virtual machine CPU alarm event or a cloud platform cluster resource alarm event, the triggered workflow includes a cloud platform capacity expansion task.
For example, for an alarm that the free disk space of a virtual machine is below 10%, the system can automatically perform a cleanup operation (on log files) to release disk space. For an alarm event that the CPU usage of a virtual machine is above 70%, a cloud platform capacity expansion operation can be performed automatically.
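A minimal sketch of this mapping from resource metrics to tasks is given below. The 10% and 70% values are the example thresholds mentioned in the text; the function name, the task names, and the use of ratios between 0 and 1 are assumptions made for illustration.

```python
DISK_FREE_ALARM = 0.10  # example from the text: free disk space below 10%
CPU_USAGE_ALARM = 0.70  # example from the text: CPU usage above 70%


def tasks_for_metrics(disk_free_ratio, cpu_usage_ratio):
    """Return the names of the tasks triggered by the given VM metrics."""
    tasks = []
    if disk_free_ratio < DISK_FREE_ALARM:
        tasks.append("disk_cleanup")  # disk alarm -> disk cleanup task
    if cpu_usage_ratio > CPU_USAGE_ALARM:
        tasks.append("cloud_platform_expansion")  # CPU alarm -> expansion task
    return tasks
```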
Further, the O&M event types may also include change events, and the change events include cloud platform service version change events and cloud platform capacity change events;
The workflow corresponding to a cloud platform service version change event includes a responsible-person notification task, and the workflow corresponding to a cloud platform capacity change event includes a scaling type selection task, a cloud platform capacity expansion task, and a cloud platform capacity reduction task.
Further, the O&M event data corresponding to a cloud platform service version change event includes information on the host machines to be changed, information on the virtual machines to be changed, and the version change information;
The responsible-person notification task comprises the following steps:
Obtain the contact details of the corresponding responsible persons according to the information on the host machines and virtual machines to be changed;
Send the version change notice and the version change information to the responsible persons using their contact details.
For example, an O&M engineer initiates a change in which the cloud platform network management service is upgraded from version 1.0 to version 2.0. The change affects 10 host machines and 50 virtual machines and causes a 15-minute network outage between 22:00 and 22:15 tonight.
This embodiment can automatically obtain the application owners corresponding to the 10 host machines and 50 virtual machines and send them an email notice in advance.
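The two steps of the responsible-person notification task can be sketched as follows. The owners mapping, the function name, and the message format are hypothetical; actually sending the email is left out, and resources with no recorded owner are simply skipped, which is an assumption.

```python
def build_notifications(hosts, vms, owners, change_info):
    """Return (contact, message) pairs, one per distinct responsible person.

    owners is a hypothetical mapping from a host or VM identifier to the
    contact address of the person responsible for it.
    """
    # Step 1: look up the contacts for every affected host and VM.
    contacts = sorted({owners[r] for r in [*hosts, *vms] if r in owners})
    # Step 2: prepare one notice per contact, so nobody is notified twice.
    message = f"Version change notice: {change_info}"
    return [(contact, message) for contact in contacts]
```

Deduplicating the contacts matters in the example above, where one person may own several of the 50 affected virtual machines.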
Further, the O&M event data corresponding to a cloud platform capacity change event includes the cloud platform cluster resource occupancy ratio;
As shown in Fig. 5, when the pending O&M event is a cloud platform capacity change event, executing the triggered workflow comprises the following steps:
C1: Execute the scaling type selection task: judge whether the cloud platform cluster resource occupancy ratio is greater than a second threshold; if so, execute step C3, otherwise continue with step C2;
C2: If the cloud platform cluster resource occupancy ratio is less than or equal to the second threshold, judge whether it is less than a third threshold, the third threshold being less than the second threshold; if so, execute step C4;
C3: The resource occupancy of the cloud platform is currently high, so capacity must be expanded to meet further demand; therefore execute the cloud platform capacity expansion task;
C4: The resource occupancy of the cloud platform is currently low, so capacity can be reduced appropriately to cut resource consumption; therefore execute the cloud platform capacity reduction task.
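Steps C1 to C4 amount to a two-threshold decision, which might be sketched as below. The text does not give values for the second and third thresholds, so they are passed in as parameters; returning None when the occupancy lies between the two thresholds is an assumption, since the text does not state what happens in that case.

```python
def select_scaling_task(occupancy, second_threshold, third_threshold):
    """Pick the scaling task for a cluster resource occupancy ratio."""
    if third_threshold >= second_threshold:
        raise ValueError("third threshold must be less than second threshold")
    if occupancy > second_threshold:  # C1 -> C3: occupancy is high
        return "expand"
    if occupancy < third_threshold:   # C2 -> C4: occupancy is low
        return "reduce"
    return None                       # assumption: no scaling in between
```

Keeping a gap between the two thresholds avoids alternating between expansion and reduction when the occupancy hovers around a single cut-off.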
In addition, the cloud platform capacity expansion task may also be triggered when an application is actively changed. For example, when a new version of an application goes online and the number of concurrent online users is expected to increase tenfold, the cloud platform capacity expansion task can be triggered to cope with the surge in traffic.
This embodiment lists only some O&M event types. In practical applications there may be other types of O&M events, such as virtual machine request events, host machine online/offline events, application release events, and cloud platform version upgrade events. The protection scope of the present invention is not limited to what is described and enumerated here.
Similarly, O&M tasks are not limited to the types enumerated above. O&M tasks may also include any one or a combination of server operating system installation, disk array configuration, network configuration, system software installation, application software installation and configuration, database backup and upgrade, application software upgrade, and so on.
Further, the workflow also includes the boundary condition that determines whether each O&M task has succeeded or failed, and the follow-up task to be executed after each O&M task succeeds or fails. Executing the triggered workflow comprises the following steps:
Execute each O&M task in turn according to the execution order in the workflow that was found;
After each O&M task has been executed, judge according to the corresponding boundary condition whether the O&M task succeeded, and generate an execution result;
Select the follow-up task according to the execution result of the O&M task.
For each O&M task script, the workflow can specify the operation to perform after the script succeeds and the operation to perform after it fails. For example, a workflow includes O&M task A, O&M task B, and O&M task C. O&M task A is executed first; if it succeeds, O&M task B is executed, and if it fails, O&M task C is executed. By setting the execution logic between the O&M task scripts, diverse workflows can be built to correspond to the various O&M events. Once the execution logic between multiple O&M task scripts has been set, the O&M task scripts in each workflow that is found must be executed according to that logic: when a node that requires a success-or-failure judgment is reached, the subsequent operation is chosen according to the result of the judgment.
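The A/B/C example can be sketched as a small workflow executor. The Task structure, its field names, and the max_steps guard are assumptions made for illustration; the text only requires that each task has a boundary condition and a follow-up task for success and for failure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    name: str
    run: Callable[[], object]            # the O&M task script
    succeeded: Callable[[object], bool]  # boundary condition on its output
    on_success: Optional[str] = None     # follow-up task after success
    on_failure: Optional[str] = None     # follow-up task after failure


def execute_workflow(tasks, start, max_steps=100):
    """Run tasks from `start`, branching on each execution result."""
    history = []
    current = start
    while current is not None:
        if len(history) >= max_steps:
            raise RuntimeError("workflow exceeded max_steps; check for a cycle")
        task = tasks[current]
        try:
            ok = task.succeeded(task.run())
        except Exception:
            ok = False  # assumption: an exception counts as a failed task
        history.append((task.name, ok))
        current = task.on_success if ok else task.on_failure
    return history
```

The returned history is a list of (task name, success) pairs, which could also serve as the workflow execution record.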
Optionally, pending O&M events are received from cloud platform O&M tools. The cloud platform O&M tools include at least one of a cloud platform service management tool, an asset management database, a cloud platform monitoring tool, an alerting tool, a configuration management tool, a domain name service management tool, a code repository management tool, and a resource management tool.
The cloud platform service management tool can be used to manage and control the computing, network, and storage resources of the cloud platform. The asset management database records the allocation and life cycle of resources such as host machines, virtual machines, and containers. The monitoring tool tracks the monitored state of all cloud platform services in real time. The alerting tool notifies the relevant personnel promptly of abnormal states that occur in the cloud platform. The configuration management tool supports efficient batch operations and convergence of cloud platform service state. The domain name service management tool allows independent domain names to be configured for host machines, virtual machines, and containers. The code repository management tool supports automated code deployment and rollback. With the resource management tool, a cloud platform administrator can analyze whether historical resource usage of the cloud platform was reasonable and estimate future growth in cloud platform resources.
As shown in Fig. 6, the cloud platform O&M method of the present invention also includes an event merging step. Event merging means that events sharing a certain common feature are merged and handled together, which improves the processing efficiency of the tool. Specifically, event merging comprises the following steps:
D1: Judge whether the O&M events obtained include O&M events to be merged, where the O&M events to be merged are at least two O&M events that share the same preset merge feature;
D2: If there are O&M events to be merged, merge them into a single O&M event;
D3: If there are no O&M events to be merged, perform no merging.
Therefore, by using event merging, the embodiment of the present invention can reduce the number of events that need to be processed and so improve the processing efficiency of O&M events.
Optionally, multiple O&M event types are preset, and O&M events that correspond to the same server and belong to the same O&M event type are treated as O&M events to be merged.
Taking the alarm events sent by the alerting tool as an example, when a CPU alarm event, a memory alarm event, and a disk alarm event occur on the same host machine, the events can be merged along the host machine dimension.
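Steps D1 to D3, using the host machine as the merge feature as in the alarm example, might look like the sketch below. The event dictionaries and the shape of the merged event (a "merged" type carrying its source events) are assumptions; the text does not specify how a merged event is represented.

```python
from collections import defaultdict


def merge_by_host(events):
    """Merge events that share a host; leave single events untouched."""
    groups = defaultdict(list)
    for event in events:                # D1: group by the merge feature
        groups[event["host"]].append(event)
    merged = []
    for host, group in groups.items():
        if len(group) >= 2:             # D2: at least two events to merge
            merged.append({"host": host, "type": "merged", "sources": group})
        else:                           # D3: nothing to merge
            merged.append(group[0])
    return merged
```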
As shown in Fig. 7, optionally, the O&M method of the present invention also includes an event closing step. Event closing means managing the life cycle of O&M events: an end condition is set for each event, and when the state of an event meets its end condition the event is closed automatically and its life cycle ends. Engineers no longer need to pay attention to the event, and the O&M event no longer needs to be processed. Specifically, event closing comprises the following steps:
E1: Obtain the event end condition of each O&M event;
E2: Judge whether any O&M event currently being processed has reached its corresponding event end condition;
E3: If an O&M event has reached its corresponding event end condition, stop processing that O&M event;
E4: If no O&M event has reached its corresponding event end condition, close no O&M event.
The closing of O&M events can be carried out during automatic O&M by continuously monitoring, in real time, the O&M events currently being processed and closing an event as soon as it is found to meet its end condition. A newly obtained O&M event can also be compared with previously obtained O&M events that have not yet been processed; if the merge condition is met, the events can be merged, which improves event handling efficiency.
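One pass of the closing check in steps E1 to E4 could be sketched as follows. Representing an end condition as a predicate keyed by event type is an assumption, as is keeping open any event whose type has no registered end condition.

```python
def close_finished_events(active_events, end_conditions):
    """Split active events into (still_open, closed) lists.

    end_conditions is a hypothetical mapping from event type to a predicate
    that returns True once the event has reached its end condition.
    """
    still_open, closed = [], []
    for event in active_events:
        condition = end_conditions.get(event["type"])  # E1
        if condition is not None and condition(event):  # E2
            closed.append(event)                        # E3: stop processing
        else:
            still_open.append(event)                    # E4: keep open
    return still_open, closed
```

In a running system this function would be called repeatedly, for instance on a timer, to realize the continuous monitoring described above.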
Further, the cloud platform O&M method of the present invention may also comprise the following step:
Record the O&M events obtained, the workflow execution data, and the output data produced during workflow execution, and output the recorded data to a user terminal.
In practical applications, a case may arise in which no suitable workflow can be matched. An alert can then be sent asking an engineer to select a suitable workflow manually or to create a new one, and the correspondence between that workflow and the O&M event is recorded so that it can be invoked directly the next time the same O&M event occurs, without the engineer having to act again. Through this updating process the workflows and their triggering rules become increasingly complete and can cope with more diverse O&M events. During use, workflows newly added by engineers, together with their triggering rules, can also be obtained in real time to further improve the method.
Further, O&M task scripts uploaded by users can be obtained through a web page or a user interaction interface. An O&M task script can be an individually executable task (a Shell script, a Python script, or an API call) organized from an engineer's routine work. For example, an engineer manually breaks routine work (such as bringing a new cloud platform host machine online) into multiple individually executable tasks (system installation, disk array setup, network switching, monitoring client installation, system management client installation, and so on). Each task is organized into a corresponding O&M task script, so that a script is edited once and reused many times. Routine work may include any one or a combination of host operating system installation, system software installation, application software installation and configuration, database backup and upgrade, application software upgrade, and so on.
Once a database of multiple O&M task scripts exists, a new workflow can be created according to user input, and at least one O&M task script is added to the new workflow. In other words, engineers are given a function for orchestrating and managing task sequences.
Further, after a new workflow has been created, the triggering rule of the new workflow entered by the user must also be obtained. Triggering rules can likewise be created in two ways, through a web interface or through an API call, and can be exported through the web interface, making it easy for users to add, view, modify, and delete them.
Therefore, by providing a cloud platform automated O&M method based on an event-driven mechanism, the embodiment of the present invention reduces the development cost of automated O&M tools and strengthens cooperation between automated O&M tools. With an automated O&M tool based on this event-driven mechanism, repetitive routine work on the cloud platform can be handed over to the O&M tool for automatic execution; common cloud platform system anomalies can be repaired automatically by the O&M tool without manual intervention; new cloud platform functions can be brought online quickly; cloud platform maintenance work can be recorded and traced; the accumulation of cloud platform O&M knowledge and the optimization of processes can be accelerated; and team collaboration is promoted, achieving efficient delivery.
As shown in Fig. 8, the embodiment of the present invention also provides a cloud platform automatic O&M system for implementing the cloud platform automatic O&M method described above. The system includes:
Workflow module 100, for creating multiple workflows from multiple O&M tasks, each workflow including at least one O&M task and the execution order of the O&M tasks;
Rule engine module 200, for establishing the triggering rules of the workflows, the triggering rule of each workflow including the event type that triggers the workflow;
Sensor module 300, for determining the corresponding event type and obtaining the corresponding O&M event data when a pending O&M event is received;
Execution module 400, for determining, according to the triggering rules of the workflows, the workflow triggered by the event type of the pending O&M event, and for executing the triggered workflow according to the O&M event data and the execution order of the O&M tasks in the triggered workflow.
Each sensor module 300 communicates with the cloud platform O&M tools 500, passively receiving or actively detecting the events generated by the different cloud platform O&M tools 500, and may further be responsible for operations such as event delivery, event merging, and event closing.
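How the rule engine and execution modules fit together can be sketched as below. The class and function names are hypothetical, a workflow is reduced to an ordered list of callables that receive the event data, and the sensor module is represented only by the already classified event dictionary it would hand over.

```python
class RuleEngine:
    """Maps event types to the names of the workflows they trigger."""

    def __init__(self):
        self._rules = {}

    def add_rule(self, event_type, workflow_name):
        self._rules.setdefault(event_type, []).append(workflow_name)

    def match(self, event_type):
        return list(self._rules.get(event_type, []))


def dispatch(event, rules, workflows):
    """Run every workflow triggered by the event, tasks in listed order.

    `event` is assumed to carry the type determined by the sensor module
    and the O&M event data it collected.
    """
    results = {}
    for name in rules.match(event["type"]):
        results[name] = [task(event["data"]) for task in workflows[name]]
    return results
```

An event type with no matching rule yields an empty result, which is the point at which an engineer could be alerted to select or create a workflow.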
Optionally, the system also includes an audit module, which records the O&M events obtained, the workflow execution data, and the output data produced during workflow execution, and outputs the recorded data to a user terminal. That is, the audit module can record the O&M event history, the workflow execution history, and the output produced during workflow execution. Administrators or engineers can use the audit module to audit events, workflows, and rules. The data and logs of the audit module can be kept long term and made available for external audits, ensuring that cloud platform maintenance work can be recorded and traced.
This cloud platform automatic O&M system solves the problems of traditional cloud platform automated O&M tools, in which engineers drive the O&M operations, development cost is high, and communication efficiency is low. It realizes an event-driven O&M mechanism based on event capture and workflow technology, further raising the degree of automation of cloud platform O&M. The cloud platform automatic O&M system can handle most O&M events, and engineers can later add O&M task scripts, workflows, and workflow triggering rules to keep improving it. Manual operation by engineers is greatly reduced, workflows are invoked automatically by event triggers, and O&M efficiency is improved.
The embodiment of the present invention also provides a cloud platform automatic O&M device, including a processor and a memory in which instructions executable by the processor are stored, wherein the processor is configured to perform the steps of the cloud platform automatic O&M method by executing the executable instructions.
Those of ordinary skill in the art will understand that the various aspects of the present invention can be implemented as a system, a method, or a program product. Accordingly, the various aspects of the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be referred to collectively here as a "circuit", "module", or "platform".
The electronic device 600 according to this embodiment of the present invention is described below with reference to Fig. 9. The electronic device 600 shown in Fig. 9 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 9, the electronic device 600 takes the form of a general-purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one memory unit 620, a bus 630 connecting the different platform components (including the memory unit 620 and the processing unit 610), a display unit 640, and so on.
The memory unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to the various illustrative embodiments of the present invention described in the cloud platform automatic O&M method part of this specification above. For example, the processing unit 610 can perform the steps shown in Fig. 1.
The memory unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The memory unit 620 may also include a program/utility 6204 having a set of (at least one) program modules 6205. Such program modules 6205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 630 may represent one or more of several types of bus structure, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 600 may also communicate with one or more external devices 700 (such as a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (such as a router or a modem) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 650. In addition, the electronic device 600 may communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. The network adapter 660 may communicate with the other modules of the electronic device 600 through the bus 630. It should be understood that, although not shown in the drawings, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms.
This cloud platform automatic O&M device solves the problems of traditional cloud platform automated O&M tools, in which engineers drive the O&M operations, development cost is high, and communication efficiency is low. It realizes an event-driven O&M mechanism based on event capture and workflow technology, further raising the degree of automation of cloud platform O&M. The cloud platform automatic O&M device can handle most O&M events, greatly improving O&M efficiency, avoiding the mistakes that may occur in manual operation, and raising the success rate of O&M processing.
The embodiment of the present invention also provides a computer-readable storage medium for storing a program which, when executed, implements the steps of the cloud platform automatic O&M method. In some possible embodiments, the various aspects of the present invention can also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various illustrative embodiments of the present invention described in the cloud platform automatic O&M method part of this specification above.
With reference to Fig. 10, a program product 800 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
Program code for performing the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
The cloud platform automatic O&M method, system, device, and storage medium provided by the present invention have the following advantages:
The present invention solves the problems of traditional cloud platform automated O&M tools, in which engineers drive the O&M operations, development cost is high, and communication efficiency is low. It realizes an event-driven O&M mechanism based on event capture and workflow technology and further raises the degree of automation of cloud platform O&M. O&M operations are driven by the tool itself, which reduces manual operation by engineers and avoids the operational risks that human factors may cause. The rule engine technology reduces communication cost and improves O&M efficiency.
The above is a further detailed description of the present invention in combination with specific preferred embodiments, and the specific implementation of the present invention should not be regarded as limited to these descriptions. Those of ordinary skill in the technical field of the present invention may make a number of simple deductions or substitutions without departing from the concept of the present invention, and all of these should be regarded as falling within the protection scope of the present invention.

Claims (19)

  1. A cloud platform automatic O&M method, characterized in that it comprises the following steps:
    Creating multiple workflows from multiple O&M tasks, each workflow including at least one O&M task and the execution order of the O&M tasks;
    Establishing the triggering rules of the workflows, the triggering rule of each workflow including the event type that triggers the workflow;
    When a pending O&M event is received, determining the corresponding event type and obtaining the corresponding O&M event data;
    Determining, according to the triggering rules of the workflows, the workflow triggered by the event type of the pending O&M event;
    Executing the triggered workflow according to the O&M event data and the execution order of the O&M tasks in the triggered workflow.
  2. The cloud platform automatic O&M method according to claim 1, characterized in that the O&M event types include alarm events, and the alarm events include host machine crash events, virtual machine crash events, and resource alarm events.
  3. The cloud platform automatic O&M method according to claim 2, characterized in that the workflow triggered by a host machine crash event includes a host machine information acquisition task and an in-host virtual machine migration task;
    The workflow triggered by a virtual machine crash event includes a virtual machine information acquisition task and an in-VM application migration task;
    The workflow triggered by a resource alarm event includes a cloud platform capacity expansion task and/or a log cleanup task.
  4. The cloud platform automatic O&M method according to claim 2, characterized in that the O&M event data corresponding to a host machine crash event includes a host machine identification code;
    When the pending O&M event is a host machine crash event, executing the triggered workflow comprises the following steps:
    Executing the host machine information acquisition task: finding, according to the host machine identification code, the crashed host machine and the information on the virtual machines on that host machine, the virtual machine information including the virtual machine identification codes and the services running in the virtual machines;
    Executing the in-host virtual machine migration task, which comprises the following steps:
    Removing the configuration of the virtual machine corresponding to the virtual machine identification code from the load balancer cluster;
    Restarting the virtual machine corresponding to the virtual machine identification code on another host machine in the cloud platform;
    Starting the services on the restarted virtual machine;
    Calling the health check interface to check the availability of the running services;
    Adding the virtual machines that pass the check back into the load balancer cluster.
  5. The cloud platform automatic O&M method according to claim 4, characterized in that the O&M event data corresponding to a host machine crash event also includes the cloud platform resource occupancy ratio;
    When the pending O&M event is a host machine crash event, executing the triggered workflow also includes executing a virtual machine migration mode selection task, which comprises the following steps:
    Judging whether the cloud platform resource occupancy ratio is greater than a first preset threshold;
    If so, selecting parallel migration of multiple virtual machines as the virtual machine migration mode;
    Otherwise, selecting serial migration of multiple virtual machines as the virtual machine migration mode.
  6. The cloud platform automatic O&M method according to claim 2, characterized in that the O&M event data corresponding to the virtual machine crash event includes a virtual machine identification code;
    When the pending O&M event is a virtual machine crash event, executing the triggered workflow comprises the following steps:
    Perform a virtual machine information acquisition task: according to the virtual machine identification code, find the crashed virtual machine, the running services in the crashed virtual machine, and the running applications in those services;
    Perform an in-virtual-machine application migration task, which comprises the following steps:
    Remove the configuration of the running applications in the running services from the load balancer cluster;
    Restart the crashed virtual machine;
    Start the running services in the restarted virtual machine;
    Call the health check interface to check the availability of the running applications;
    Add the running applications that pass the check back to the load balancer cluster.
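
The step order in claim 6 can be sketched in Python as follows. This is an in-memory illustration only: `VirtualMachine`, `recover_crashed_vm`, the `inventory` dictionary, the `lb_pool` set standing in for the load balancer cluster, and the `health_check` callback are all hypothetical names, and the restart and service start are reduced to setting a flag.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class VirtualMachine:
    vm_id: str
    apps: List[str]        # running applications hosted by the VM's services
    running: bool = False  # False models the crashed state


def recover_crashed_vm(
    vm_id: str,
    inventory: Dict[str, VirtualMachine],
    lb_pool: Set[str],
    health_check: Callable[[str], bool],
) -> List[str]:
    """Run the claim's steps in order and return the applications put back in the pool."""
    # Information acquisition task: look up the crashed VM and its applications.
    vm = inventory[vm_id]
    # Remove the applications' configuration from the load balancer cluster.
    for app in vm.apps:
        lb_pool.discard(app)
    # Restart the VM and start its services (stand-in: flip a flag).
    vm.running = True
    # Health-check each application and re-add only those that pass.
    restored = [app for app in vm.apps if health_check(app)]
    lb_pool.update(restored)
    return restored
```

Applications that fail the health check stay out of the pool, which matches the claim's wording that only applications passing the check are added back.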
  7. The cloud platform automatic O&M method according to claim 2, characterized in that the resource alarm events include a virtual machine disk alarm event, a virtual machine CPU alarm event, and a cloud platform cluster resource alarm event;
    When the pending O&M event is a virtual machine disk alarm event, the triggered workflow includes a disk cleanup task;
    When the pending O&M event is a virtual machine CPU alarm event or a cloud platform cluster resource alarm event, the triggered workflow includes a cloud platform capacity expansion task.
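
The mapping in claim 7 amounts to a lookup from alarm event type to task. A minimal sketch, assuming string identifiers that do not appear in the patent (`vm_disk_alarm`, `disk_cleanup`, and so on are hypothetical labels):

```python
# Hypothetical identifiers; the claim names the events and tasks only in prose.
RESOURCE_ALARM_TASKS = {
    "vm_disk_alarm": "disk_cleanup",
    "vm_cpu_alarm": "capacity_expansion",
    "cluster_resource_alarm": "capacity_expansion",
}


def task_for_alarm(event_type: str) -> str:
    """Return the task that the triggered workflow includes for a resource alarm event."""
    try:
        return RESOURCE_ALARM_TASKS[event_type]
    except KeyError:
        raise ValueError(f"unknown resource alarm event type: {event_type!r}") from None
```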
  8. The cloud platform automatic O&M method according to claim 1, characterized in that the O&M event types further include change events, the change events including a cloud platform service version change event and a cloud platform capacity change event;
    The workflow corresponding to the cloud platform service version change event includes a responsible-person notification task, and the workflow corresponding to the cloud platform capacity change event includes a scaling type selection task, a cloud platform capacity expansion task, and a cloud platform capacity reduction task.
  9. The cloud platform automatic O&M method according to claim 8, characterized in that the O&M event data corresponding to the cloud platform service version change event includes information on the hosts to be changed, information on the virtual machines to be changed, and version change information;
    The responsible-person notification task comprises the following steps:
    Obtain the contact details of the corresponding responsible persons according to the information on the hosts to be changed and the virtual machines to be changed;
    Send a version change notice and the version change information to the responsible persons using their contact details.
  10. The cloud platform automatic O&M method according to claim 8, characterized in that the O&M event data corresponding to the cloud platform capacity change event includes a cloud platform cluster resource occupancy ratio;
    When the pending O&M event is a cloud platform capacity scaling event, executing the triggered workflow comprises the following steps:
    Perform the scaling type selection task: judge whether the cloud platform cluster resource occupancy ratio is greater than a second threshold, and if so, perform the cloud platform capacity expansion task;
    If the cloud platform cluster resource occupancy ratio is less than or equal to the second threshold, judge whether the ratio is less than a third threshold, the third threshold being less than the second threshold, and if so, perform the cloud platform capacity reduction task.
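
The two-threshold decision in claim 10 can be written as a small Python function. This is a sketch under assumptions: the function name, the string return values, and the choice to return `None` when the ratio lies between the two thresholds are illustrative, since the claim does not say what happens in that band.

```python
from typing import Optional


def select_scaling_task(
    occupancy: float, second_threshold: float, third_threshold: float
) -> Optional[str]:
    """Choose between capacity expansion and reduction from the cluster occupancy ratio.

    Returns "capacity_expansion", "capacity_reduction", or None when the ratio
    lies between the third and second thresholds (no action; an assumption here).
    """
    # The claim requires the third threshold to be below the second.
    if third_threshold >= second_threshold:
        raise ValueError("third threshold must be less than second threshold")
    if occupancy > second_threshold:
        return "capacity_expansion"
    if occupancy < third_threshold:
        return "capacity_reduction"
    return None
```

Keeping a gap between the two thresholds means a cluster whose occupancy hovers near one value does not alternate between expansion and reduction.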
  11. The cloud platform automatic O&M method according to claim 1, characterized in that the workflow further includes boundary conditions for the success or failure of each O&M task's execution, and the subsequent task to execute upon success or failure of each O&M task; executing the triggered workflow comprises the following steps:
    Execute each O&M task in turn according to the execution sequence in the found workflow;
    After each O&M task is executed, judge according to the corresponding boundary condition whether the O&M task executed successfully, and generate an execution result;
    Select the subsequent task to execute according to the execution result of the O&M task.
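
The execution loop of claim 11 can be sketched as below. The `Task` structure, the `execute_workflow` function, and the `max_steps` guard are hypothetical; the claim specifies only that each task has a boundary condition and a follow-up task for success and for failure.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class Task:
    name: str
    run: Callable[[Dict[str, Any]], Any]  # performs the O&M action on the event data
    succeeded: Callable[[Any], bool]      # boundary condition applied to the task output
    on_success: Optional[str] = None      # follow-up task name; None ends the workflow
    on_failure: Optional[str] = None


def execute_workflow(
    tasks: Dict[str, Task],
    first: str,
    event_data: Dict[str, Any],
    max_steps: int = 100,
) -> List[Tuple[str, bool]]:
    """Run tasks from `first`, following success/failure links; return (name, ok) pairs."""
    trace: List[Tuple[str, bool]] = []
    current: Optional[str] = first
    steps = 0
    while current is not None:
        if steps >= max_steps:  # guard against cyclic follow-up links
            raise RuntimeError("workflow exceeded max_steps")
        steps += 1
        task = tasks[current]
        output = task.run(event_data)
        ok = task.succeeded(output)  # judge the result against the boundary condition
        trace.append((task.name, ok))
        current = task.on_success if ok else task.on_failure
    return trace
```

For example, a disk cleanup task whose boundary condition is "at least 10 GB free afterwards" could name a notification task as its failure follow-up, so that an operator is alerted only when cleanup does not free enough space.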
  12. The cloud platform automatic O&M method according to claim 1, characterized in that the pending O&M event is received from cloud platform O&M tools; the cloud platform O&M tools include at least one of a cloud platform server management tool, an asset data management database, a cloud platform monitoring tool, an alarm tool, a configuration management tool, a domain name server management tool, a code repository management tool, and a resource management tool.
  13. The cloud platform automatic O&M method according to claim 1, characterized by further comprising the following steps:
    Judge whether there are O&M events to be merged among the acquired O&M events, the O&M events to be merged belonging to the same O&M event type and corresponding to the same server;
    If there are O&M events to be merged, merge them into one O&M event.
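
The merge step of claim 13 can be sketched as grouping events by the pair (event type, server). The dictionary keys `type` and `server` and the added `count` field are assumptions for illustration; the claim does not state what the merged event contains.

```python
from typing import Dict, List, Tuple


def merge_events(events: List[dict]) -> List[dict]:
    """Merge events that share both event type and server into a single event.

    The first event of each group is kept and gains a `count` field recording
    how many events were folded into it.
    """
    merged: Dict[Tuple[str, str], dict] = {}
    for event in events:
        key = (event["type"], event["server"])
        if key in merged:
            merged[key]["count"] += 1
        else:
            merged[key] = {**event, "count": 1}
    # Dicts preserve insertion order, so output follows first-seen order.
    return list(merged.values())
```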
  14. The cloud platform automatic O&M method according to claim 1, characterized by further comprising the following steps:
    Obtain the event termination condition of each O&M event;
    Judge whether any O&M event currently being processed has reached its corresponding event termination condition;
    If an O&M event has reached its corresponding event termination condition, stop processing that O&M event.
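
The termination check of claim 14 can be sketched as a periodic sweep over the events in progress. The claim does not say what a termination condition looks like, so the per-event predicate `should_terminate` and the `elapsed_seconds` field used in the example are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ActiveEvent:
    event_id: str
    elapsed_seconds: float
    should_terminate: Callable[["ActiveEvent"], bool]  # the event's termination condition


def sweep(active: List[ActiveEvent]) -> Tuple[List[ActiveEvent], List[ActiveEvent]]:
    """Split in-progress events into (still running, terminated) by their own condition."""
    still_running: List[ActiveEvent] = []
    terminated: List[ActiveEvent] = []
    for event in active:
        if event.should_terminate(event):
            terminated.append(event)
        else:
            still_running.append(event)
    return still_running, terminated
```

A timeout is one plausible condition: `lambda e: e.elapsed_seconds > 600` would end processing of any event that has been handled for more than ten minutes.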
  15. The cloud platform automatic O&M method according to claim 1, characterized by further comprising the following steps:
    Record the acquired O&M events, the workflow execution data, and the output data of workflow execution, and output the recorded data to a user terminal.
  16. A cloud platform automatic O&M system for implementing the cloud platform automatic O&M method according to any one of claims 1 to 15, characterized in that the system comprises:
    A workflow module, configured to establish multiple workflows from multiple O&M tasks, each workflow including at least one O&M task and the execution sequence of each O&M task;
    A rule engine module, configured to establish the triggering rules of the workflows, the triggering rule of each workflow including the event type that triggers the workflow;
    A sensor module, configured to, upon receiving a pending O&M event, determine the corresponding event type and obtain the corresponding O&M event data;
    An execution module, configured to determine, according to the triggering rules of the workflows, the workflow triggered by the event type of the pending O&M event, and to execute the triggered workflow according to the O&M event data and the execution sequence of each O&M task in the triggered workflow.
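
The division of labour among the four modules in claim 16 can be illustrated with one small Python class. This is a sketch, not the patented implementation: the class `AutoOMSystem`, its method names, and the event layout (`{"type": ..., "data": ...}`) are all hypothetical, and each module is reduced to a method or attribute.

```python
from typing import Any, Callable, Dict, List, Tuple

OMTask = Callable[[Dict[str, Any]], str]  # an O&M task takes event data, returns a result


class AutoOMSystem:
    def __init__(self) -> None:
        self.workflows: Dict[str, List[OMTask]] = {}  # workflow module: name -> ordered tasks
        self.rules: Dict[str, str] = {}               # rule engine: event type -> workflow name

    def create_workflow(self, name: str, tasks: List[OMTask]) -> None:
        """Workflow module: register an ordered sequence of O&M tasks."""
        self.workflows[name] = list(tasks)

    def add_rule(self, event_type: str, workflow_name: str) -> None:
        """Rule engine module: record which event type triggers which workflow."""
        self.rules[event_type] = workflow_name

    def sense(self, raw_event: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
        """Sensor module: determine the event type and extract the event data."""
        return raw_event["type"], raw_event.get("data", {})

    def handle(self, raw_event: Dict[str, Any]) -> List[str]:
        """Execution module: find the triggered workflow and run its tasks in order."""
        event_type, data = self.sense(raw_event)
        workflow = self.workflows[self.rules[event_type]]
        return [task(data) for task in workflow]
```

A usage example under the same assumptions: after `create_workflow("disk", [clean_disk])` and `add_rule("vm_disk_alarm", "disk")`, calling `handle({"type": "vm_disk_alarm", "data": {"vm": "vm-1"}})` runs `clean_disk` with the event data.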
  17. The cloud platform automatic O&M system according to claim 16, characterized by further comprising an audit module, the audit module being configured to record the acquired O&M events, the workflow execution data, and the output data of workflow execution, and to output the recorded data to a user terminal.
  18. A cloud platform automatic O&M device, characterized by comprising:
    A processor;
    A memory storing instructions executable by the processor;
    Wherein the processor is configured to perform, by executing the executable instructions, the steps of the cloud platform automatic O&M method according to any one of claims 1 to 15.
  19. A computer-readable storage medium for storing a program, characterized in that the program, when executed, implements the steps of the cloud platform automatic O&M method according to any one of claims 1 to 15.
CN201710623425.1A 2017-07-25 2017-07-25 Cloud platform automatic O&M method, system, equipment and storage medium Pending CN107368365A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710623425.1A CN107368365A (en) 2017-07-25 2017-07-25 Cloud platform automatic O&M method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710623425.1A CN107368365A (en) 2017-07-25 2017-07-25 Cloud platform automatic O&M method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN107368365A true CN107368365A (en) 2017-11-21

Family

ID=60308512

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710623425.1A Pending CN107368365A (en) 2017-07-25 2017-07-25 Cloud platform automatic O&M method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN107368365A (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107909164A (en) * 2017-12-08 2018-04-13 泰康保险集团股份有限公司 O&M processing method, system, electronic equipment and computer-readable medium
CN108093094A (en) * 2017-12-08 2018-05-29 腾讯科技(深圳)有限公司 Database instance access method, device, system, storage medium and equipment
CN108536447A (en) * 2018-04-11 2018-09-14 上海掌门科技有限公司 Operation management method
CN108768728A (en) * 2018-05-31 2018-11-06 康键信息技术(深圳)有限公司 O&M task processing method, device, computer equipment and storage medium
CN109189615A (en) * 2018-09-04 2019-01-11 郑州云海信息技术有限公司 A kind of delay machine treating method and apparatus
CN109634712A (en) * 2018-10-16 2019-04-16 平安普惠企业管理有限公司 API function services method, apparatus, equipment and readable storage medium storing program for executing
CN110008034A (en) * 2018-11-22 2019-07-12 阿里巴巴集团控股有限公司 Task automated execution method, apparatus, electronic equipment and storage medium
CN110097341A (en) * 2019-04-29 2019-08-06 重庆天蓬网络有限公司 A kind of automation O&M management-control method, device, medium and electronic equipment
CN110149283A (en) * 2019-05-22 2019-08-20 无锡华云数据技术服务有限公司 A kind of resource layout implementation method and device
CN110569140A (en) * 2019-08-29 2019-12-13 网宿科技股份有限公司 operation and maintenance method and device
CN111008035A (en) * 2019-11-27 2020-04-14 北京宝兰德软件股份有限公司 Software operation and maintenance method, electronic equipment and storage medium
CN111343017A (en) * 2020-02-22 2020-06-26 苏州浪潮智能科技有限公司 Method, system, equipment and medium for cloud platform resource alarm
CN111898899A (en) * 2020-07-25 2020-11-06 江苏锐创软件技术有限公司 Flow management and control method, device and system for automatically triggering workflow and storage medium
CN111915275A (en) * 2020-07-31 2020-11-10 上海燕汐软件信息科技有限公司 Application operation process management method, device and system
CN112398933A (en) * 2020-11-05 2021-02-23 携程旅游网络技术(上海)有限公司 Cloud native application publishing method, system, device and storage medium
CN113127162A (en) * 2019-12-31 2021-07-16 阿里巴巴集团控股有限公司 Automatic task execution method and device, electronic equipment and computer storage medium
CN113434366A (en) * 2021-06-28 2021-09-24 中国建设银行股份有限公司 Event processing method and system
CN113485542A (en) * 2021-07-23 2021-10-08 国网福建省电力有限公司 Operation and maintenance method and system based on big data automation operation and maintenance platform
TWI802388B (en) * 2022-04-27 2023-05-11 中華電信股份有限公司 Cloud platform-based storage service system, method and computer readable medium
CN116643950A (en) * 2023-07-19 2023-08-25 浩鲸云计算科技股份有限公司 FaaS-based cloud native application automatic operation and maintenance method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473117A (en) * 2013-09-18 2013-12-25 北京思特奇信息技术股份有限公司 Cloud-mode virtualization method
CN104516735A (en) * 2013-09-30 2015-04-15 上海宝信软件股份有限公司 Two-dimensional layering method for achieving automatic operation and maintenance of cloud computing environment
CN105915633A (en) * 2016-06-02 2016-08-31 北京百度网讯科技有限公司 Automated operational system and method thereof
CN106506215A (en) * 2016-11-11 2017-03-15 郑州云海信息技术有限公司 A kind of automation operational system based on CMDB
CN106875018A (en) * 2017-01-04 2017-06-20 北京百度网讯科技有限公司 A kind of method and apparatus of ultra-large Machine automated maintenance

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107909164A (en) * 2017-12-08 2018-04-13 泰康保险集团股份有限公司 O&M processing method, system, electronic equipment and computer-readable medium
CN108093094A (en) * 2017-12-08 2018-05-29 腾讯科技(深圳)有限公司 Database instance access method, device, system, storage medium and equipment
CN107909164B (en) * 2017-12-08 2021-11-26 泰康保险集团股份有限公司 Operation and maintenance processing method, system, electronic equipment and computer readable medium
CN108536447A (en) * 2018-04-11 2018-09-14 上海掌门科技有限公司 Operation management method
CN108536447B (en) * 2018-04-11 2021-07-16 上海掌门科技有限公司 Operation and maintenance management method
CN108768728A (en) * 2018-05-31 2018-11-06 康键信息技术(深圳)有限公司 O&M task processing method, device, computer equipment and storage medium
CN108768728B (en) * 2018-05-31 2022-09-02 康键信息技术(深圳)有限公司 Operation and maintenance task processing method and device, computer equipment and storage medium
CN109189615A (en) * 2018-09-04 2019-01-11 郑州云海信息技术有限公司 A kind of delay machine treating method and apparatus
CN109634712A (en) * 2018-10-16 2019-04-16 平安普惠企业管理有限公司 API function services method, apparatus, equipment and readable storage medium storing program for executing
CN110008034A (en) * 2018-11-22 2019-07-12 阿里巴巴集团控股有限公司 Task automated execution method, apparatus, electronic equipment and storage medium
CN110097341A (en) * 2019-04-29 2019-08-06 重庆天蓬网络有限公司 A kind of automation O&M management-control method, device, medium and electronic equipment
CN110149283A (en) * 2019-05-22 2019-08-20 无锡华云数据技术服务有限公司 A kind of resource layout implementation method and device
CN110569140A (en) * 2019-08-29 2019-12-13 网宿科技股份有限公司 operation and maintenance method and device
CN111008035A (en) * 2019-11-27 2020-04-14 北京宝兰德软件股份有限公司 Software operation and maintenance method, electronic equipment and storage medium
CN111008035B (en) * 2019-11-27 2023-07-04 北京宝兰德软件股份有限公司 Software operation and maintenance method, electronic equipment and storage medium
CN113127162A (en) * 2019-12-31 2021-07-16 阿里巴巴集团控股有限公司 Automatic task execution method and device, electronic equipment and computer storage medium
CN111343017A (en) * 2020-02-22 2020-06-26 苏州浪潮智能科技有限公司 Method, system, equipment and medium for cloud platform resource alarm
CN111343017B (en) * 2020-02-22 2022-12-09 苏州浪潮智能科技有限公司 Method, system, equipment and medium for cloud platform resource alarm
CN111898899A (en) * 2020-07-25 2020-11-06 江苏锐创软件技术有限公司 Flow management and control method, device and system for automatically triggering workflow and storage medium
CN111915275A (en) * 2020-07-31 2020-11-10 上海燕汐软件信息科技有限公司 Application operation process management method, device and system
CN112398933A (en) * 2020-11-05 2021-02-23 携程旅游网络技术(上海)有限公司 Cloud native application publishing method, system, device and storage medium
CN112398933B (en) * 2020-11-05 2023-05-23 携程旅游网络技术(上海)有限公司 Cloud native application release method, system, equipment and storage medium
CN113434366A (en) * 2021-06-28 2021-09-24 中国建设银行股份有限公司 Event processing method and system
CN113485542A (en) * 2021-07-23 2021-10-08 国网福建省电力有限公司 Operation and maintenance method and system based on big data automation operation and maintenance platform
TWI802388B (en) * 2022-04-27 2023-05-11 中華電信股份有限公司 Cloud platform-based storage service system, method and computer readable medium
CN116643950A (en) * 2023-07-19 2023-08-25 浩鲸云计算科技股份有限公司 FaaS-based cloud native application automatic operation and maintenance method
CN116643950B (en) * 2023-07-19 2023-10-20 浩鲸云计算科技股份有限公司 FaaS-based cloud native application automatic operation and maintenance method

Similar Documents

Publication Publication Date Title
CN107368365A (en) Cloud platform automatic O&M method, system, equipment and storage medium
CN105357038B (en) Monitor the method and system of cluster virtual machine
US9383900B2 (en) Enabling real-time operational environment conformity to an enterprise model
US7519711B2 (en) Method for middleware assisted system integration in a federated environment
CN106371975B (en) A kind of O&M automation method for early warning and system
CN109634728A (en) Job scheduling method, device, terminal device and readable storage medium storing program for executing
CN112866333A (en) Cloud-native-based micro-service scene optimization method, system, device and medium
Soldani et al. The μTOSCA toolchain: Mining, analyzing, and refactoring microservice‐based architectures
US20170004423A1 (en) Systems and methods for simulating orders and workflows in an order entry and management system to test order scenarios
US20150193317A1 (en) Recovery of a network infrastructure to facilitate business continuity
CN108334447A (en) A kind of system and method for test processes computer software exception
Dehraj et al. A review on architecture and models for autonomic software systems
CN111324599B (en) Block chain experiment system and management method
CN108563455A (en) Middleware portion arranging method, system and equipment in a kind of K-UX operating systems
CN112579288A (en) Cloud computing-based intelligent security data management system
Yue et al. Understanding digital twins for cyber-physical systems: A conceptual model
CN113570468A (en) Enterprise payment wind control service platform
CN115619162A (en) Power supply service command system based on cloud platform and micro-service architecture
CN109586946B (en) Exception handling method and device and computer readable storage medium
CN110048881A (en) Information monitoring system, information monitoring method and device
CN105681070A (en) Method and system for automatically collecting and analyzing computer cluster node information
Lin et al. Research on building an innovative electric power marketing business application system based on cloud computing and microservices architecture technologies
WO2022103685A1 (en) Continuous integration and development of code in a secure environment
CN114137861A (en) Intention-driven cloud security service system and method
CN105653423B (en) The automation collection analysis method and its system of distributed information system health status

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171121