Background technology
Cloud computing technology brings great convenience to traditional IT services in terms of ease of use and availability. One of its defining features is the virtualization of resources. Resource virtualization uses technology to hide the association between an operating system and the underlying hardware: the operating system does not know whether it is running on a dedicated set of hardware or sharing a set of hardware with other operating systems. Multiple operating systems can therefore run on one set of hardware at the same time, which greatly improves hardware utilization and reduces hardware procurement costs.
In a cloud computing environment, a traditional business system usually consists of multiple virtual hosts, and the running state of any one host can affect the availability of the business. To fully understand how a business system is running, it is therefore necessary to know the running condition of every virtual host and each metric of its operation, and to combine these metrics to obtain the overall running condition of the business system.
This, however, makes it much harder to assess the state of a business system. A business system usually consists of multiple operating systems, each of which is part of the business system, and a fault in any one of them can make the business system unavailable. In the traditional mode, each operating system runs on its own dedicated hardware and is not disturbed by other systems. In the cloud computing mode, one hardware device hosts multiple operating systems, so the running state of a single operating system is affected not only by its own software but possibly also by other operating systems on the same hardware device; for example, another system may occupy too many hardware resources, leaving insufficient resources for this system. Therefore, to assess the running state of a business system in a cloud computing environment, it is necessary to check not only the running condition of all virtual hosts in the business system, but also the running condition of the hardware devices on which those virtual hosts reside and of the other virtual hosts running on those devices. Suppose a business system has 5 virtual hosts running on 5 physical devices, and each physical device runs 30 virtual hosts at the same time; the 5 devices may then run more than a hundred virtual hosts. When the business system has a problem, locating it requires checking the running state of more than 100 virtual hosts before it can finally be determined whether some virtual host on some hardware device is causing a host in the key business system to run abnormally.
A typical virtual host has dozens of common runtime metrics, including overall CPU utilization, CPU core utilization, CPU wait time, CPU usage time, memory usage, memory allocation value, memory consumption value, memory active value, memory reclaim value, memory reclaim target, memory compression value, disk read requests, disk write requests, disk read rate, disk write rate, data storage read/write, and so on. These values reflect the running condition of a virtual host. In the traditional operation and maintenance approach, when a business system fails and cannot be accessed, the administrator must first determine the running condition of the physical devices hosting the virtual hosts of the target business system: check whether each physical device is powered on, then check whether the CPU, memory, and disk performance of the physical host has problems and, if so, check host by host which virtual host is causing them. If the physical host shows no usage problem, the administrator then starts checking the running state of the virtual hosts belonging to the target business system. The workload of this process grows sharply as the number of virtual hosts in the business system and the number of virtual hosts on each physical host increase.
Whether the task is detecting that the use of or access to a business system has gone wrong, or determining whether the problem is caused by a virtual host inside the business system or by other virtual hosts on the same physical device, a great deal of manpower has to be spent on investigation. The prior art therefore has the following deficiencies:
1. In a cloud computing environment, far more virtual hosts are multiplexed on a physical device than in a conventional environment. The environment is more complex and more factors are involved, and when a virtual host fails it is also necessary to consider whether the fault is caused by other virtual hosts on the same physical host.
2. In a cloud computing environment, because devices are multiplexed, there are more metric parameters describing the running state of a virtual host; many metric parameters are added compared with a dedicated-hardware environment.
3. Interpreting these metric parameters requires considerable knowledge of the underlying technology in order to analyze the meaning behind multiple coupled metrics, which is too difficult for an ordinary administrator.
4. Existing management software focuses on data display and shows the core parameters that affect the running state of a virtual host, but the data shown is incomplete. The displayed content must be switched manually, or switching is not supported at all, which makes operation complicated or makes it impossible to obtain the detailed cause of a fault.
5. Existing management software mostly treats a virtual host as an independent individual and monitors and manages each virtual host separately, ignoring the relationships between virtual hosts as parts of one business system. When another component fails, the business system cannot be accessed no matter how good the running state of a single virtual host is.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art. The present invention provides a service running state evaluation method based on a cloud computing environment, and a device thereof, which can reduce the workload of manually checking the running state of a business system, help administrators understand the running condition of a business system more intuitively, and reduce troubleshooting time.
To solve the above problems, the present invention proposes a service running state evaluation method based on a cloud computing environment, the method comprising:
Collecting metric data from all virtual hosts on the cloud computing platform;
Classifying the metric data belonging to the same virtual host to obtain the metric data of each subcategory;
Calculating the metric data of each subcategory separately to obtain a calculated value for each subcategory;
Displaying the value of each subcategory and the running state corresponding to that value.
Preferably, the step of calculating the metric data of each subcategory separately to obtain a calculated value for each subcategory comprises:
Performing a sub-item calculation on the metric data of each subcategory to obtain sub-item values;
Performing a weighted calculation on each sub-item value according to the calculation ratio of each subcategory to obtain a weighted value for each sub-item;
Obtaining the value of the subcategory from the weighted values of the sub-items.
Preferably, after the step of collecting metric data from all virtual hosts on the cloud computing platform, the method further comprises:
Storing all the metric data.
Preferably, after the step of calculating the metric data of each subcategory separately to obtain a calculated value for each subcategory, the method further comprises:
Weighting the values of the same subcategory across the virtual hosts managed under the same business system;
Obtaining the value of each subcategory of the business system from the weighted value of each subcategory after the weighted calculation.
Preferably, after the step of obtaining the value of each subcategory of the business system from the weighted value of each subcategory after the weighted calculation, the method further comprises:
Displaying the value of each subcategory of the business system and the running state corresponding to that value.
Correspondingly, the present invention also provides a service running state evaluation device based on a cloud computing environment, the device comprising:
An acquisition module, configured to collect metric data from all virtual hosts on the cloud computing platform;
A classification module, configured to classify the metric data belonging to the same virtual host to obtain the metric data of each subcategory;
A calculation module, configured to calculate the metric data of each subcategory separately to obtain a calculated value for each subcategory;
A display module, configured to display the value of each subcategory and the running state corresponding to that value.
Preferably, the calculation module comprises:
A calculation unit, configured to perform a sub-item calculation on the metric data of each subcategory to obtain sub-item values, and to perform a weighted calculation on each sub-item value according to the calculation ratio of each subcategory to obtain a weighted value for each sub-item;
An acquisition unit, configured to obtain the value of the subcategory from the weighted values of the sub-items.
Preferably, the device further comprises:
A storage module, configured to store all the metric data.
Preferably, the calculation module is further configured to weight the values of the same subcategory across the virtual hosts managed under the same business system, and to obtain the value of each subcategory of the business system from the weighted value of each subcategory after the weighted calculation.
Preferably, the display module is further configured to display the value of each subcategory of the business system and the running state corresponding to that value.
In the embodiments of the present invention, the metric data of all virtual hosts on the cloud computing platform is collected and analyzed by category to obtain the running state of the virtual hosts. This allows administrators to quickly understand the running state of the virtual hosts, facilitates troubleshooting, and reduces troubleshooting time; it also reduces the workload of manually checking the running state of a business system and lowers operating costs.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a schematic flowchart of the service running state evaluation method based on a cloud computing environment according to an embodiment of the present invention. As shown in Fig. 1, the method comprises:
S1, collecting metric data from all virtual hosts on the cloud computing platform;
S2, classifying the metric data belonging to the same virtual host to obtain the metric data of each subcategory;
S3, calculating the metric data of each subcategory separately to obtain a calculated value for each subcategory;
S4, displaying the value of each subcategory and the running state corresponding to that value.
In the embodiments of the present invention, the collected metric data includes: virtual host central processing unit (CPU) utilization (percentage), CPU usage (megahertz, MHz), CPU wait time (milliseconds), CPU ready time (milliseconds), CPU idle time (milliseconds), virtual host memory usage (percentage), memory granted amount (kilobytes, KB), memory active amount (KB), memory shared amount (KB), memory swap amount (KB), memory reclaim amount (KB), memory swap amount (KB), memory compression amount (KB), disk I/O utilization (percentage), disk read request count (number), disk write request count (number), disk read rate (kilobytes per second, KBps), disk write rate (KBps), disk read latency (milliseconds), disk write latency (milliseconds), CPU demand (MHz), CPU limit (MHz), network uplink rate (KBps), network downlink rate (KBps), network dropped packet count (number), network error packet count (number), memory quota (megabytes, MB), storage I/O limit (number), virtual host snapshot space (MB), and so on.
After collection, all the metric data is stored; specifically, it is stored in a local database.
In S2, the subcategories comprise three major subcategories: health, risk, and efficiency.
Specifically, the health subcategory includes: virtual host CPU utilization (percentage), CPU usage (megahertz, MHz), memory usage (percentage), memory granted amount (kilobytes, KB), memory active amount (KB), disk read rate (kilobytes per second, KBps), disk write rate (KBps), and disk I/O utilization (percentage);
The risk subcategory includes: CPU wait time (milliseconds), CPU ready time (milliseconds), virtual host memory usage (percentage), memory granted amount (kilobytes, KB), memory swap amount (KB), memory reclaim amount (KB), memory swap amount (KB), memory compression amount (KB), disk read rate (kilobytes per second, KBps), disk write rate (KBps), disk read latency (milliseconds), disk write latency (milliseconds), CPU demand (MHz), CPU limit (MHz), network dropped packet count (number), network error packet count (number), memory quota (megabytes, MB), storage I/O limit (number), and virtual host snapshot space (MB);
The efficiency subcategory includes: CPU idle time (milliseconds), memory active amount (KB), memory shared amount (KB), network uplink rate (KBps), network downlink rate (KBps), disk read request count (number), and disk write request count (number).
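The classification in S2 can be illustrated with a minimal Python sketch. The metric identifiers below (for example `mem_usage_pct`) are hypothetical names chosen for illustration and only a few metrics are listed; the point shown is that one metric may belong to more than one subcategory, as memory usage does for health and risk.

```python
# Hypothetical metric identifiers; only a handful of the listed metrics appear.
SUBCATEGORIES = {
    "health": {"cpu_usage_pct", "mem_usage_pct", "mem_active_kb"},
    "risk": {"cpu_ready_ms", "mem_usage_pct", "mem_swapped_kb"},
    "efficiency": {"cpu_idle_ms", "mem_active_kb"},
}


def classify(samples):
    """Split one virtual host's samples into per-subcategory dicts.

    A metric listed in several subcategories is copied into each of them;
    metrics that belong to no subcategory are ignored.
    """
    return {
        category: {name: samples[name] for name in sorted(names) if name in samples}
        for category, names in SUBCATEGORIES.items()
    }
```

Calling `classify` on the samples of one virtual host returns three dictionaries, one per subcategory, which the later calculation step can score independently.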
In a specific implementation, as shown in Fig. 2, S3 further comprises:
S31, performing a sub-item calculation on the metric data of each subcategory to obtain sub-item values;
S32, performing a weighted calculation on each sub-item value according to the calculation ratio of each subcategory to obtain a weighted value for each sub-item;
S33, obtaining the value of the subcategory from the weighted values of the sub-items.
While a virtual host is running, its running state fluctuates with the tasks it performs and the volume of access. The value of the health subcategory reflects whether the fluctuation of the virtual host's running state is within a reasonable range, that is, whether the state of the virtual host is under control. During troubleshooting, administrators can use the value of the health subcategory to judge whether the current state of a virtual host is normal and whether the fault is caused by the virtual host itself.
Specifically, the health subcategory is further divided into 4 sub-items: CPU, memory, storage, and network. Each sub-item is calculated separately, and the sum of the 4 sub-item values is the health score of the virtual host, which ranges from 0 to 100 points.
A CPU fault can directly cause a fault of the virtual host or even of the physical host. It has the greatest impact on the health of the virtual host, so its share is also the highest and is set to 50 points. The CPU score uses virtual host CPU utilization (percentage) and CPU usage (megahertz, MHz) as reference metrics. CPU utilization is divided into 5 scoring intervals: 0%-10%, 10%-50%, 50%-70%, 70%-90%, and 90%-100%. The utilization percentage is adjusted according to the absolute MHz value of the physical device. When CPU utilization falls in a given interval, the corresponding CPU load score is calculated; its range is 0-50 points.
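As a sketch of the interval lookup only, the following Python function maps a CPU utilization percentage to a score. The five interval bounds come from the text; the points assigned to each interval are invented for illustration, since the text states only that the result lies between 0 and 50, and the adjustment by the physical device's MHz value is omitted.

```python
from bisect import bisect_left

# Upper bounds (percent) of the five utilization intervals named in the text.
CPU_BOUNDS = [10, 50, 70, 90, 100]
# ASSUMPTION: the points per interval are illustrative; the text only states
# that the CPU score lies between 0 and 50.
CPU_POINTS = [45, 50, 35, 20, 5]


def cpu_score(usage_pct):
    """Return the CPU load score for a utilization percentage in 0..100."""
    if not 0 <= usage_pct <= 100:
        raise ValueError("usage_pct must be within 0..100")
    # bisect_left makes each upper bound inclusive: 10 falls in 0%-10%.
    return CPU_POINTS[bisect_left(CPU_BOUNDS, usage_pct)]
```

A table-driven lookup keeps the interval bounds and the points in one place, which fits the later statement that ranges and values can be customized.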
A memory fault can cause problems such as application crashes or an unresponsive host. Its impact on health is second only to that of the CPU, so it is set to 30 points. The memory score uses virtual host memory usage (percentage), memory granted amount (kilobytes, KB), and memory active amount (KB) as reference metrics. Memory usage is divided into 3 intervals: 0%-30%, 30%-80%, and 80%-100%; the memory usage score is calculated according to the interval in which the virtual host's memory usage falls. The memory active amount is the portion of the total memory occupied by the virtual host that is actually active; it accounts for applications that occupy a large amount of total memory but actually use very little of it. The memory granted amount is the amount of memory the underlying system actually uses to run the virtual host, and it is combined with the active amount in the calculation. The memory usage score and the score for the actually active memory resources are combined as two weighted sub-items to obtain the final memory load score, which ranges from 0 to 30 points.
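One possible reading of this two-part weighting is sketched below in Python. The three usage intervals are from the text; the rating per interval, the 0.6/0.4 weights, and the choice to rate a low active-to-granted ratio favourably are all assumptions made for illustration.

```python
# ASSUMPTION: ratings per usage interval and the 0.6/0.4 weights are illustrative.
USAGE_RATINGS = [(30, 1.0), (80, 0.8), (100, 0.3)]  # (upper bound in %, rating)


def memory_score(usage_pct, active_kb, granted_kb, w_usage=0.6, w_active=0.4):
    """Combine a usage rating and an active-memory rating into 0..30 points."""
    # Pick the rating of the first interval whose upper bound covers usage_pct.
    usage_rating = next(r for bound, r in USAGE_RATINGS if usage_pct <= bound)
    # Active ratio = active / granted, as described for the memory sub-item.
    active_ratio = min(active_kb / granted_kb, 1.0) if granted_kb > 0 else 0.0
    # ASSUMPTION: less actively used memory is rated as more headroom.
    active_rating = 1.0 - active_ratio
    return round(30 * (w_usage * usage_rating + w_active * active_rating), 1)
```

With these assumed weights, a host at 90% usage whose granted memory is entirely active scores far lower than one at 20% usage with no active memory, while the result always stays within the 0-30 range stated above.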
A storage fault can also make the services of the virtual host unavailable or even crash it. The storage score mainly uses disk read latency and disk write latency as reference metrics. The larger of the two values is taken and divided into 4 intervals: 0-10 milliseconds, 10-50 milliseconds, 50-100 milliseconds, and more than 100 milliseconds. The corresponding storage load score is calculated from the real-time latency value; its range is 0-10 points.
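The latency bucketing can be sketched as follows in Python. Taking the larger of the two latencies and the four interval bounds follow the text; the points per interval are an assumption, chosen only to stay within 0-10.

```python
from bisect import bisect_left

LATENCY_BOUNDS_MS = [10, 50, 100]  # interval upper bounds from the text
LATENCY_POINTS = [10, 7, 3, 0]     # ASSUMPTION: illustrative points in 0..10


def storage_score(read_latency_ms, write_latency_ms):
    """Score storage load from the worse of read and write latency."""
    worst = max(read_latency_ms, write_latency_ms)
    # Values above 100 ms fall past the last bound and get the last entry.
    return LATENCY_POINTS[bisect_left(LATENCY_BOUNDS_MS, worst)]
```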
For the network, the current network usage (current network rate) is used as the reference metric, and the network utilization percentage is calculated relative to the total transmission rate of the network interface card. The percentage is then converted into a network load score, which ranges from 0 to 10 points.
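A minimal Python sketch of this conversion follows. The text does not say how the percentage is converted into points, so a linear conversion is assumed here, and the parameter names are hypothetical.

```python
def network_score(current_kbps, link_capacity_kbps):
    """Convert network utilization into a 0..10 load score."""
    if link_capacity_kbps <= 0:
        raise ValueError("link_capacity_kbps must be positive")
    # Utilization = current rate / total NIC rate, clamped to 0..1.
    utilization = min(max(current_kbps / link_capacity_kbps, 0.0), 1.0)
    # ASSUMPTION: linear conversion; an idle link scores 10, a saturated one 0.
    return round(10 * (1.0 - utilization), 1)
```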
The weighted resource load score is derived from the calculations of the 4 sub-items above and their calculation ratios, and the value of the subcategory is obtained from the weighted values of the sub-items.
Alarm deduction: the alarms of different levels triggered by the virtual host are counted, and the deduction caused by the alarms is calculated. Red alarms (critical) and yellow alarms (warning) carry different deductions: each red alarm deducts 15 points and each yellow alarm deducts 5 points. Finally, the deduction produced by the alarms is subtracted from the overall resource load score to give the final running score; the full score is 100 points.
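The final health calculation can be sketched in Python as below. The 15-point and 5-point deductions and the 50/30/10/10 split come from the text; clamping the result at 0 is an assumption, since the text does not say what happens when the deductions exceed the load score.

```python
RED_PENALTY = 15    # points deducted per red (critical) alarm, from the text
YELLOW_PENALTY = 5  # points deducted per yellow (warning) alarm, from the text


def health_score(cpu, memory, storage, network, red_alarms=0, yellow_alarms=0):
    """Sum the four sub-item scores and subtract the alarm deductions."""
    load = cpu + memory + storage + network  # at most 50 + 30 + 10 + 10 = 100
    penalty = RED_PENALTY * red_alarms + YELLOW_PENALTY * yellow_alarms
    # ASSUMPTION: the score is clamped so that it never drops below 0.
    return max(0, load - penalty)
```

For example, a host with full marks on all four sub-items but one red and two yellow alarms would score 100 - 15 - 10 = 75.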
The risk subcategory is mainly used to show the risks that have currently arisen during operation; the higher the risk value, the more likely problems are to occur. The risk score is mainly calculated from the risk sub-items defined above. Each risk sub-item carries a score of 15 or 5 points; when the threshold of a risk sub-item is triggered, the risk is judged to have occurred and the risk value of the virtual host is increased.
Sub-items such as memory swap amount (KB), memory reclaim amount (KB), memory swap amount (KB), memory compression amount (KB), CPU limit (MHz), network dropped packet count (number), network error packet count (number), CPU limit (MHz), memory quota (megabytes, MB), and storage I/O limit (number) are switch quantities: when the value is not zero, the risk is judged to have occurred.
CPU wait time (milliseconds), CPU ready time (milliseconds), virtual host memory usage (percentage), memory granted amount (KB), disk read rate (KBps), disk write rate (KBps), disk read latency (milliseconds), disk write latency (milliseconds), CPU demand (MHz), and virtual host snapshot space (MB) are interval quantities: when the value of the metric reaches the alert interval, the virtual host is considered to have triggered that risk sub-item. Finally, the risk points of all risk sub-items are summed to calculate the overall risk score of the virtual host, which ranges from 0 to 100 points.
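The two kinds of risk sub-item can be modelled with a small Python sketch. The names `RiskItem`, `alert_threshold`, and the metric identifiers are hypothetical; modelling the alert interval as "value at or above a threshold" and capping the total at 100 are assumptions, and the threshold used in the example is invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RiskItem:
    name: str
    points: int                              # 15 or 5, as in the text
    alert_threshold: Optional[float] = None  # None marks a switch quantity

    def triggered(self, value):
        if self.alert_threshold is None:
            return value != 0                # switch quantity: any non-zero value
        # ASSUMPTION: the alert interval is everything at or above the threshold.
        return value >= self.alert_threshold


def risk_score(items, samples):
    """Sum the points of all triggered risk sub-items, capped at 100."""
    total = sum(i.points for i in items if i.triggered(samples.get(i.name, 0)))
    return min(total, 100)                   # ASSUMPTION: cap to keep 0..100
```

A usage example with an invented 2000 ms threshold for CPU ready time:

```python
items = [
    RiskItem("mem_swapped_kb", 15),
    RiskItem("net_dropped_packets", 5),
    RiskItem("cpu_ready_ms", 15, alert_threshold=2000.0),
]
score = risk_score(items, {"mem_swapped_kb": 128, "cpu_ready_ms": 2500})
```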
The efficiency subcategory helps administrators find virtual hosts that are idle or have been forgotten. Because the resource demand of some applications has periodic peaks and troughs (for example, a mail application has high utilization during working hours and very low utilization at night and after hours), the efficiency subcategory needs to be monitored over a longer period, and multiple metrics are combined to judge whether a virtual host is idle.
The CPU idle ratio is the key metric for judging whether a virtual machine is idle; it is the proportion of time during which CPU resources are available but the virtual host does not use them;
The memory active amount is another important metric. The criterion is the memory active amount divided by the memory granted amount; the resulting ratio reflects the memory the virtual host is actually using, so that a virtual host occupying a large amount of total memory does not by itself affect the efficiency score;
The criterion for the disk is disk utilization, expressed as a percentage, which shows how much the virtual host reads from and writes to the disk;
When the CPU idle ratio is above 98%, the memory active amount is below 20% of the granted amount, and I/O utilization is below 5%, the virtual host is marked as an idle host.
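The idle-host rule can be written directly as a predicate; the following Python sketch uses the three thresholds from the text. Averaging the samples over a monitoring window is an assumption about how the "longer period" might be applied, and the function and parameter names are hypothetical.

```python
from statistics import fmean


def is_idle(cpu_idle_pct, mem_active_kb, mem_granted_kb, io_usage_pct):
    """Apply the three thresholds: CPU idle > 98%, active < 20% of granted, I/O < 5%."""
    return (
        cpu_idle_pct > 98
        and mem_active_kb < 0.20 * mem_granted_kb
        and io_usage_pct < 5
    )


def is_idle_over_window(samples):
    """Judge idleness from samples collected over a monitoring window.

    samples: non-empty list of (cpu_idle_pct, mem_active_kb, mem_granted_kb,
    io_usage_pct) tuples. ASSUMPTION: each metric is averaged over the window.
    """
    averaged = [fmean(column) for column in zip(*samples)]
    return is_idle(*averaged)
```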
In addition, in a specific implementation, after the step of calculating the metric data of each subcategory separately to obtain a calculated value for each subcategory, the method further comprises:
Weighting the values of the same subcategory across the virtual hosts managed under the same business system;
Obtaining the value of each subcategory of the business system from the weighted value of each subcategory after the weighted calculation.
The business system is likewise divided into three subcategories: health, risk, and efficiency.
Each subcategory metric of the business system is calculated by weighting the values of the same subcategory across the virtual machines managed under that business system. The main calculation method is to assign a weight to each virtual host according to its purpose (HTTP server, database server, application server, and so on); virtual hosts with different purposes have different weights. The actual health, risk, and efficiency scores of the business system are calculated as the weighted averages of the health, risk, and efficiency subcategory values of the virtual hosts with their different purposes.
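The role-weighted average can be sketched in Python as follows. The three roles are the examples given in the text; the numeric weights, the dictionary layout, and the function name are assumptions made for illustration.

```python
# ASSUMPTION: illustrative weights per purpose; the text gives no numbers.
ROLE_WEIGHTS = {"database": 3.0, "application": 2.0, "http": 1.0}
CATEGORIES = ("health", "risk", "efficiency")


def system_scores(hosts):
    """Weighted average of each subcategory over a business system's hosts.

    hosts: list of dicts such as {"role": "http", "scores": {"health": 90, ...}}.
    """
    total_weight = sum(ROLE_WEIGHTS[h["role"]] for h in hosts)
    if total_weight == 0:
        raise ValueError("at least one weighted host is required")
    return {
        cat: round(
            sum(ROLE_WEIGHTS[h["role"]] * h["scores"][cat] for h in hosts)
            / total_weight,
            1,
        )
        for cat in CATEGORIES
    }
```

With these assumed weights, a database server at health 60 and an HTTP server at health 100 give a system health of (3 x 60 + 1 x 100) / 4 = 70, so the more heavily weighted host dominates the result.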
The value of each subcategory of the business system and the running state corresponding to that value are displayed.
Specifically, the calculated values and the running state of the business system are displayed; the health, risk, and efficiency subcategory values are each presented as a number in the range 0-100. Detailed content is shown beneath each value to indicate the source of the score and, when a score is abnormal, which item caused it.
The value of the health subcategory can be shown as a line chart of historical data, with the vertical axis showing the value of each item and the horizontal axis showing time, so that the fluctuation over time of each health-related metric can be seen.
The value of the risk subcategory can be shown as a list that displays all risk sub-items; when a risk is triggered, it is pinned to the top and highlighted.
The value of the efficiency subcategory can be shown as a historical line chart, with the vertical axis showing the value of each sub-item and the horizontal axis showing time, so that the fluctuation over time of each efficiency-related metric can be seen.
Through the graphical front end, all existing risks are displayed as a list in which the risks that have already occurred are highlighted, so that once an administrator notices a risk problem, the risk item at fault can be identified quickly.
In addition, the historical data of each metric is saved at the time of calculation. The administrator can view data from 1 hour, 1 week, 1 month, or even 1 year earlier, and by viewing the monitoring information over a period of time can dynamically follow the trend of historical changes.
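Selecting the stored history for one of these periods can be sketched in Python as below. The window keys, the function name, and the treatment of a month as 30 days and a year as 365 days are assumptions; a real implementation would query the local database rather than filter an in-memory list.

```python
from datetime import datetime, timedelta

# ASSUMPTION: a month is treated as 30 days and a year as 365 days.
WINDOWS = {
    "1h": timedelta(hours=1),
    "1w": timedelta(weeks=1),
    "1m": timedelta(days=30),
    "1y": timedelta(days=365),
}


def history(points, window, now):
    """Return the (timestamp, value) pairs inside the window, oldest first."""
    start = now - WINDOWS[window]
    return sorted((t, v) for t, v in points if start <= t <= now)
```

The sorted result can be passed directly to a line chart with time on the horizontal axis.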
It can also help users adjust and optimize resources: by comparing efficiency values, virtual hosts that have been idle for a long time can be found, so that forgotten resources can be identified and reclaimed.
Correspondingly, an embodiment of the present invention also provides a service running state evaluation device based on a cloud computing environment. As shown in Fig. 3, the device comprises:
An acquisition module 1, configured to collect metric data from all virtual hosts on the cloud computing platform;
A classification module 2, configured to classify the metric data belonging to the same virtual host to obtain the metric data of each subcategory;
A calculation module 3, configured to calculate the metric data of each subcategory separately to obtain a calculated value for each subcategory;
A display module 4, configured to display the value of each subcategory and the running state corresponding to that value.
Specifically, the health subcategory includes: virtual host CPU utilization (percentage), CPU usage (megahertz, MHz), memory usage (percentage), memory granted amount (kilobytes, KB), memory active amount (KB), disk read rate (kilobytes per second, KBps), disk write rate (KBps), and disk I/O utilization (percentage);
The risk subcategory includes: CPU wait time (milliseconds), CPU ready time (milliseconds), virtual host memory usage (percentage), memory granted amount (kilobytes, KB), memory swap amount (KB), memory reclaim amount (KB), memory swap amount (KB), memory compression amount (KB), disk read rate (kilobytes per second, KBps), disk write rate (KBps), disk read latency (milliseconds), disk write latency (milliseconds), CPU demand (MHz), CPU limit (MHz), network dropped packet count (number), network error packet count (number), memory quota (megabytes, MB), storage I/O limit (number), and virtual host snapshot space (MB);
The efficiency subcategory includes: CPU idle time (milliseconds), memory active amount (KB), memory shared amount (KB), network uplink rate (KBps), network downlink rate (KBps), disk read request count (number), and disk write request count (number).
Further, the calculation module 3 comprises:
A calculation unit, configured to perform a sub-item calculation on the metric data of each subcategory to obtain sub-item values, and to perform a weighted calculation on each sub-item value according to the calculation ratio of each subcategory to obtain a weighted value for each sub-item;
An acquisition unit, configured to obtain the value of the subcategory from the weighted values of the sub-items.
The calculation module 3 is further configured to weight the values of the same subcategory across the virtual hosts managed under the same business system, and to obtain the value of each subcategory of the business system from the weighted value of each subcategory after the weighted calculation.
The display module 4 is further configured to display the value of each subcategory of the business system and the running state corresponding to that value.
Further, the device also comprises:
A storage module (not shown), configured to store all the metric data.
In the device embodiment of the present invention, the function of each functional module can be understood by referring to the processing flow in the method embodiment of the present invention, and is not repeated here.
In the embodiments of the present invention, the subcategories, sub-items, value ranges, and metric data involved are chosen to illustrate the implementation of the method and device of the present invention, and can be customized according to actual requirements.
In the embodiments of the present invention, the metric data of all virtual hosts on the cloud computing platform is collected and analyzed by category to obtain the running state of the virtual hosts. This allows administrators to quickly understand the running state of the virtual hosts, facilitates troubleshooting, and reduces troubleshooting time; it also reduces the workload of manually checking the running state of a business system and lowers operating costs.
A person of ordinary skill in the art will appreciate that all or some of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium may include read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, and so on.
The service running state evaluation method based on a cloud computing environment and the device thereof provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. A person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.