Summary of the invention
In the related art, the real-time nature and validity of resource information in the system cannot be guaranteed, so the running state of the system cannot be judged. To address this problem, the present invention proposes an information collection method and device that collect information in a reasonable manner, keep the information valid and timely, and help to judge the running state of the system accurately.
According to one aspect of the invention, an information collection method is provided.
The method includes:
receiving status information sent by a virtual machine management platform;
performing statistical analysis on the status information according to the type of the received status information;
sending the statistically analyzed status information to a management node.
The method may further include pre-configuring a status information collection index. The virtual machine management platform then determines, according to the status information collection index, the type of status information to be collected, and collects the status information accordingly.
In addition, the collection index includes at least one of the following: the type of status information to be collected, the identifier of the status information to be collected, and the collection interval.
In addition, performing statistical analysis on the status information according to the type of the received status information includes at least one of the following:
for status information of one type, averaging the values of the status information of that type collected by the virtual machine management platform within a predetermined period, and using the average as the statistically analyzed status information;
for status information of one type, determining the times at which the virtual machine management platform collected the status information of that type within the predetermined period, and using the most recently collected status information of that type as the statistically analyzed status information.
In addition, performing statistical analysis on the status information according to the type of the received status information includes:
determining, according to the received status information, whether the system is abnormal, and, when the result is yes, sending the status information that indicates the abnormality to the management node and raising an alarm to the management node.
Preferably, receiving the status information sent by the virtual machine management platform includes:
receiving, at a predetermined time interval, the status information sent by the virtual machine management platform;
and the method further includes:
raising an alarm to the management node when the status information from the virtual machine management platform cannot be received.
In addition, the alarm may be raised to the management node when the number of times the status information from the virtual machine management platform cannot be received exceeds a threshold.
According to another aspect of the invention, an information collection device is provided.
The device includes:
a receiving module, configured to receive status information sent by a virtual machine management platform;
a statistical analysis module, configured to perform statistical analysis on the status information according to the type of the received status information;
a sending module, configured to send the statistically analyzed status information to a management node.
As an improvement of the information collection device of the present invention, the statistical analysis module is configured to:
for status information of one type, average the values of the status information of that type collected by the virtual machine management platform within a predetermined period, and use the average as the statistically analyzed status information; and/or
for status information of one type, determine the times at which the virtual machine management platform collected the status information of that type within the predetermined period, and use the most recently collected status information of that type as the statistically analyzed status information.
In addition, the receiving module may be configured to receive, at a predetermined time interval, the status information sent by the virtual machine management platform; and the sending module is further configured to raise an alarm to the management node when the status information from the virtual machine management platform cannot be received.
On the premise that performance is guaranteed, the present invention shortens the collection interval and has the information collection device process the collected data according to a predetermined policy. This can greatly improve the real-time nature and validity of the collected resource information and thereby ensures the stability of the cloud computing management system.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention fall within the scope of protection of the present invention.
According to an embodiment of the invention, an information collection method is provided.
As shown in Fig. 1, the information collection method according to the embodiment of the present invention includes:
Step S101, receiving status information sent by a virtual machine management platform (for example, the Hypervisor described below);
Step S103, performing statistical analysis on the status information according to the type of the received status information;
Step S105, sending the statistically analyzed status information to a management node.
The collection index includes at least one of the following: the type of status information to be collected, the identifier of the status information to be collected, and the collection interval.
Performing statistical analysis on the status information according to the type of the received status information includes at least one of the following:
for status information of one type, averaging the values of the status information of that type collected by the virtual machine management platform within a predetermined period, and using the average as the statistically analyzed status information;
for status information of one type, determining the times at which the virtual machine management platform collected the status information of that type within the predetermined period, and using the most recently collected status information of that type as the statistically analyzed status information.
For example, the collected CPU utilization can be reduced to the average of the data at all time points, whereas the remaining memory is reduced to the latest value among the data at all time points. Which policy applies is defined in the status information collection index.
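The two policies above can be sketched as follows. This is a minimal illustration only: the class name SampleAggregator, the Policy enum and the aggregate method are hypothetical names that do not appear in the embodiment, and the samples are assumed to arrive ordered from oldest to newest.

```java
import java.util.List;

/** Hypothetical sketch: reduce the samples of one collection period to one value. */
final class SampleAggregator {

    /** Assumed representation of the policy named in the collection index. */
    enum Policy { AVERAGE, LATEST }

    /**
     * @param samples values collected in one period, ordered oldest to newest
     * @param policy  AVERAGE (e.g. CPU utilization) or LATEST (e.g. remaining memory)
     * @return the single value that would be forwarded to the management node
     */
    static double aggregate(List<Double> samples, Policy policy) {
        if (samples.isEmpty()) {
            throw new IllegalArgumentException("no samples in this period");
        }
        if (policy == Policy.LATEST) {
            // The last element is the most recently collected value.
            return samples.get(samples.size() - 1);
        }
        double sum = 0.0;
        for (double v : samples) {
            sum += v;
        }
        return sum / samples.size();
    }
}
```

Under these assumptions, a CPU utilization index would be aggregated with Policy.AVERAGE and a remaining-memory index with Policy.LATEST.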
When implementing the solution of the present invention, the status information sent by the virtual machine management platform can be received at a predetermined time interval, and an alarm is raised to the management node when the status information from the virtual machine management platform cannot be received. For example, the alarm threshold for the number of times the status information cannot be received can be set to 3. Each time no data is collected, the count is incremented by 1 and then checked; when the count is greater than or equal to 3, alarm information is sent to the management node.
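The counting rule can be sketched as below. The class MissedCollectionCounter and its methods are hypothetical. Resetting the count after a successful collection is an assumption of this sketch, since the text does not say whether or when the count is cleared.

```java
/** Hypothetical sketch: count missed collections and report when an alarm is due. */
final class MissedCollectionCounter {
    private final int alarmThreshold;
    private int misses = 0;

    MissedCollectionCounter(int alarmThreshold) {
        this.alarmThreshold = alarmThreshold;
    }

    /**
     * Records one collection attempt.
     *
     * @param dataReceived whether status information arrived in this interval
     * @return true when alarm information should be sent to the management node
     */
    boolean record(boolean dataReceived) {
        if (dataReceived) {
            misses = 0; // assumption: a successful collection clears the count
            return false;
        }
        misses++;
        return misses >= alarmThreshold;
    }

    int misses() {
        return misses;
    }
}
```

With a threshold of 3, the first two misses return false and the third returns true, which is the point at which the agent would send the alarm.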
In addition, an alarm can also be raised when the collected information is abnormal. For example, the index value of the received status information can be compared with the alarm threshold corresponding to that index; when the index value exceeds the alarm threshold, the information collection device generates alarm information and sends it to the management node. The alarm threshold is a limit that the information collection device defines for a given index of the status information.
Fig. 2 shows a system that can implement the above solution. The system includes a management node, an agent node and a Hypervisor. The above steps S101, S103 and S105 can be performed by the agent node.
Specifically, in Fig. 2, the management node is the computer that runs the cloud computing management system; the Hypervisor, i.e. the virtual machine management platform, is a resource that runs the information collection plug-in and is registered in the system; the agent node acts as the information collection device and is the computer that processes and forwards the collected information. The real-time nature and validity of the collected information depend mainly on the data analysis and processing performed by the agent node. The working mechanism among the management node, the agent node and the Hypervisor is as follows. The collection agent running on the Hypervisor performs resource discovery on the Hypervisor node and sends the results to the management node in a specified format, where they are saved in the database as child resources of the Hypervisor;
during operation, the management node analyzes and processes the Hypervisor and its child resource objects, generates collection indexes according to the content predefined in a configuration file, and saves them in the database;
the collection plug-in running on the Hypervisor obtains the relevant indexes from the management node, collects the information of the Hypervisor and its child resources according to the index name, index content and collection interval, and sends this information to the agent node, where it is kept in a cache;
the agent node analyzes and processes the received data, filters it with a predetermined policy, organizes the valid information and the abnormal information, and sends them to the management node, where they are saved in the database. The agent node also monitors the collection period, and for any Hypervisor from which data has not been collected more than a specified number of times, it generates abnormality alarm information and sends it to the management node.
When data can be collected, the Hypervisor collects data once every 5 seconds and sends it to the agent node. To protect the performance of the management node, the agent node must evaluate the collected data after receiving it. When the data is normal, the 12 data points obtained in the past minute are reduced to an average or a latest value, depending on the index, and the result is then sent to the management node. When the data is abnormal, it is sent to the management node immediately.
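The batching behaviour described above might look like the following sketch. The class AgentWindow and its fields are hypothetical, the two lists merely stand in for messages sent to the management node, and only the averaging policy is shown. Treating a value above a single threshold as abnormal, and keeping abnormal samples in the one-minute window as well as forwarding them, are assumptions that the text does not settle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the agent node: batch normal samples, forward abnormal ones at once. */
final class AgentWindow {
    // One sample every 5 seconds gives 12 samples per one-minute window.
    static final int SAMPLES_PER_WINDOW = 12;

    private final double alarmThreshold;
    private final List<Double> window = new ArrayList<>();

    // Stand-ins for what would be sent to the management node.
    final List<Double> sentImmediately = new ArrayList<>();
    final List<Double> sentSummaries = new ArrayList<>();

    AgentWindow(double alarmThreshold) {
        this.alarmThreshold = alarmThreshold;
    }

    /** Handles one sample received from the Hypervisor. */
    void onSample(double value) {
        if (value > alarmThreshold) {
            // Abnormal data is forwarded without waiting for the window to fill.
            sentImmediately.add(value);
        }
        window.add(value); // assumption: abnormal samples still count toward the window
        if (window.size() == SAMPLES_PER_WINDOW) {
            double sum = 0.0;
            for (double v : window) {
                sum += v;
            }
            sentSummaries.add(sum / SAMPLES_PER_WINDOW);
            window.clear();
        }
    }
}
```

The effect is that the management node receives one summary per minute in the normal case instead of twelve raw values, while abnormal values still arrive within one 5-second sampling step.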
In practical applications, the above technical solution of the present invention can be implemented as follows.
As part of the cloud computing management system, the information collection system relies on the resource management system. Assuming that the relevant resources have been added in resource management and that the child resources and corresponding indexes have been generated, the information collection flow is as follows:
(1) The Hypervisor calls the interface for obtaining resource indexes:
public List<Metric> getMetricsByResId(Long resId);
Here the parameter resId is the resource ID of the Hypervisor, and Metric is the index object, which contains basic information such as the index ID, index name and collection interval. The interface returns a list of Metric objects.
(2) The Hypervisor obtains the index value according to the index information, that is, it collects the data:
public MetricValue getMetricValueByMetric(Metric metric);
Here MetricValue is the index value object, which contains the basic information of the index and the collected value. The collection interval is divided into several smaller intervals according to a predefined ratio, and data is collected at each of the resulting time points. For example, if the collection interval defined in the database for CPU utilization is 1 minute, the Hypervisor collects data once every 5 seconds, and all of the collected data must eventually be sent to the agent node for analysis and processing.
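The subdivision of a collection interval can be illustrated as below. The class SamplingSchedule and the method sampleOffsets are hypothetical. Expressing the predefined ratio as a number of equal divisions and sampling at the start of each sub-interval are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: split one collection interval into equally spaced sample points. */
final class SamplingSchedule {

    /**
     * @param intervalSeconds the collection interval defined for the index, e.g. 60
     * @param divisions       how many sub-intervals to split it into, e.g. 12
     * @return offsets in seconds from the start of the interval at which to sample
     */
    static List<Integer> sampleOffsets(int intervalSeconds, int divisions) {
        if (intervalSeconds <= 0 || divisions <= 0 || intervalSeconds % divisions != 0) {
            throw new IllegalArgumentException("interval must divide evenly into sub-intervals");
        }
        int step = intervalSeconds / divisions;
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < divisions; i++) {
            offsets.add(i * step); // assumption: sample at the start of each sub-interval
        }
        return offsets;
    }
}
```

For the CPU utilization example, sampleOffsets(60, 12) yields twelve offsets spaced 5 seconds apart, which matches the one-sample-every-5-seconds behaviour described in the text.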
(3) The Hypervisor calls the information sending interface to send the collected data to the agent node:
public void sendMetricValue(List<MetricValue> metricValueList);
This interface is defined by the agent node and published as a web service, which the Hypervisor obtains and calls. Here metricValueList is the list of index values.
(4) The agent node processes the received collection information. The processing flow is as follows:
(4.1) Determine whether the received collection data is empty, that is, traverse the index value list and check whether each index in metricValue has a corresponding index value. The interface can be:
public void dataInfoCheck(MetricValue metricValue);
If the index value exists, proceed to the next step. Otherwise, increment by 1 the count of missed collections recorded for the index and then check it; when the count is greater than or equal to 3, call the alarm sending interface to send alarm information to the management node. The alarm sending interface is defined by the alarm management subsystem.
(4.2) Obtain the alarm threshold from the management node according to the index ID by calling the interface:
public Threshold getThresholdByMetricId(Long metricId);
The alarm threshold is a limit that the system defines for a given index; when the value of that index for a resource exceeds the defined threshold, the system generates alarm information. The Threshold object stores the index ID to which the threshold applies, the threshold value and the threshold unit.
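A possible shape for the Threshold object is sketched below. Only the three stored items (index ID, threshold value, threshold unit) come from the text. The field names, the constructor and the isExceededBy helper are hypothetical, and reading "exceeds" as strictly greater than is an assumption.

```java
/** Hypothetical sketch of the Threshold object returned by getThresholdByMetricId. */
final class Threshold {
    final long metricId; // index ID the threshold applies to
    final double value;  // threshold value
    final String unit;   // threshold unit, e.g. "%"

    Threshold(long metricId, double value, String unit) {
        this.metricId = metricId;
        this.value = value;
        this.unit = unit;
    }

    /** Assumption: an index value equal to the threshold does not trigger an alarm. */
    boolean isExceededBy(double metricValue) {
        return metricValue > value;
    }
}
```

A threshold of 90 % on CPU utilization would, for instance, be exceeded by a collected value of 95 but not by a value of exactly 90.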
(4.3) The agent node analyzes and evaluates the received collection information and performs different processing according to the result. The evaluation interface is:
public void dataInfoHandle(List<MetricValue> metricValueList, Threshold threshold);
In this case, the following processing can be performed:
(a) First, loop over all indexes and compare each index value with the index threshold. When an index value exceeds the threshold limit, generate alarm information and call the alarm sending interface sendAlarmInfo() to send the alarm information to the management node;
(b) According to the collection interval of the index, obtain the collected data for all time points in that interval and apply a different policy depending on the index. For example, CPU utilization can be reduced to the average of the data at all time points, whereas the remaining memory is reduced to the latest value among the data at all time points. Which policy applies is defined in the index object: the variable is named policy and takes the value 0 or 1, where 0 denotes the average and 1 denotes the latest value.
(c) Send the processed collection data to the management node by calling the interface:
public void sendMetricInfo(List<MetricValue> metricList);
The above processing makes it possible to analyze effectively the various kinds of status information obtained from the virtual machine management platform, without increasing data overhead and while ensuring that the data is timely and valid.
Corresponding to the information collection method provided by the embodiment of the present invention, the information collection device provided by the embodiment of the present invention is shown in Fig. 3 and includes:
a receiving module 1, configured to receive status information sent by a virtual machine management platform;
a statistical analysis module 2, configured to perform statistical analysis on the status information according to the type of the received status information;
a sending module 3, configured to send the statistically analyzed status information to a management node.
Depending on the type of the collected status information, the statistical analysis module 2 decides whether to average the values of the status information of that type collected by the virtual machine management platform within a predetermined period and use the average as the statistically analyzed status information; and/or to determine the times at which the virtual machine management platform collected the status information of that type within the predetermined period and use the most recently collected status information of that type as the statistically analyzed status information.
As a preferred form of the information collection device of the present invention, the receiving module 1 is configured to receive, at a predetermined time interval, the status information sent by the virtual machine management platform, and the sending module 3 is further configured to raise an alarm to the management node when the status information from the virtual machine management platform cannot be received.
It can be seen that the present invention provides an agent node and configures its functions so that the collected data is processed and evaluated at an intermediate stage, which ensures the validity of the data. The agent node can also identify abnormal data, so that problems arising while the system is running can be found promptly.
In summary, with the above technical solution of the present invention, and on the premise that performance is guaranteed, shortening the collection interval and having the information collection device process the data according to a predetermined policy can greatly improve the real-time nature and validity of the status information, thereby ensuring the stability of the cloud computing management system. An alarm mechanism is also established for the case in which data cannot be collected. The technical solution not only enables the collection of large-scale computing resource information in a cloud computing environment, but is equally applicable to other types of platform and to small and medium-sized systems.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.