CN108268355A - Monitoring system and method for a data center - Google Patents
- Publication number
- CN108268355A (application CN201611268506.6A)
- Authority
- CN
- China
- Prior art keywords
- monitoring
- level
- monitoring data
- convergence
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
- G06F11/3072—Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
- G06F11/3082—Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting the data filtering being achieved by aggregating or compressing the monitored data
Abstract
This application relates to a monitoring system and method for a data center. The monitoring system includes a monitoring resource layer, a monitoring data convergence layer, a monitoring center, and a configuration center, and the monitoring data convergence layer has a multi-layer structure. The monitoring resource layer includes multiple monitoring resource groups; each monitoring resource group collects monitoring data according to rules issued by the configuration center and reports the monitoring data to the monitoring data convergence layer. The monitoring data convergence layer receives the monitoring data sent by the monitoring resource layer, classifies the monitoring data, and performs convergence processing on each class of monitoring data. The monitoring center stores the monitoring data. The configuration center configures the grouping information of the monitoring resource groups, the convergence node grouping information of the monitoring data convergence layer, and the convergence policy, and issues them to the monitoring resource layer and the monitoring data convergence layer.
Description
Technical Field
The invention relates generally to the field of computer and network technology, and more particularly to a monitoring system and method for a data center.
Background
With the continuous advance of information technology in society and the rapid spread of Internet technology, the number of computing devices keeps growing. In the near future, ultra-large data centers are expected to contain hundreds of thousands or even millions of devices, so the volume of data to be processed keeps growing as well, and demand for massive data processing is increasing in every field. Because the storage space and computing power of a single machine cannot meet this demand, distributed computing and parallel computing have developed rapidly and been widely applied, eventually evolving into grid computing. The monitoring information of a large-scale distributed system is massive, the monitored resources are multi-level and multi-source, and the dynamic, complex nature of a big data platform poses numerous difficulties for its monitoring system.
Summary of the Invention
According to one aspect of the application, a monitoring system for a data center is provided. The system includes a monitoring resource layer, a monitoring data convergence layer, a monitoring center, and a configuration center, and the monitoring data convergence layer has a multi-layer structure. The monitoring resource layer includes multiple monitoring resource groups; each monitoring resource group collects monitoring data according to rules issued by the configuration center and reports the monitoring data to the monitoring data convergence layer. The monitoring data convergence layer receives the monitoring data sent by the monitoring resource layer, classifies the monitoring data, and performs convergence processing on each class of monitoring data. The monitoring center stores the monitoring data. The configuration center configures the grouping information of the monitoring resource groups, the convergence node grouping information of the monitoring data convergence layer, and the convergence policy, and issues them to the monitoring resource layer and the monitoring data convergence layer.
According to another aspect of the application, a monitoring method for a data center is provided. The method is performed in a monitoring system that includes a monitoring resource layer, a monitoring data convergence layer, a monitoring center, and a configuration center, where the monitoring data convergence layer has a multi-layer structure. The method includes: the monitoring resource layer collects monitoring data and reports it to the monitoring data convergence layer, where the monitoring resource layer includes multiple monitoring resource groups and each monitoring resource group collects monitoring data according to rules issued by the configuration center; the monitoring data convergence layer receives the monitoring data sent by the monitoring resource layer, classifies the monitoring data, and performs convergence processing on each class of monitoring data; the monitoring center stores the monitoring data; and the configuration center configures the grouping information of the monitoring resource groups, the convergence node grouping information of the monitoring data convergence layer, and the convergence policy, and issues them to the monitoring resource layer and the monitoring data convergence layer.
The monitoring system and method for a data center according to embodiments of the application provide a technical solution that can reduce the load on the monitoring center.
Brief Description of the Drawings
The application may be better understood from the following description of its embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 shows a structural diagram of a monitoring system for a data center according to one embodiment of the application;
Fig. 2 shows a flowchart of a monitoring method for a data center according to one embodiment of the application;
Fig. 3 shows an operational flowchart of the monitoring center according to one embodiment of the application; and
Fig. 4 shows a structural diagram of an information processing device that can implement one or more of the monitoring resource layer node, the monitoring data convergence layer node, the monitoring center, and the configuration center of the application.
Detailed Description
Features and exemplary embodiments of various aspects of the application are described in detail below. The following description covers many specific details in order to provide a thorough understanding of the application. It will be apparent to those skilled in the art, however, that the application can be practiced without some of these details. The description of the embodiments below is intended only to provide a clearer understanding of the application by showing examples of it. The application is not limited to any specific configuration set forth below, but covers any modification, substitution, and improvement of the relevant features, structures, operations, and so on, without departing from the spirit of the application.
Monitoring is an important component of a big data platform. As the volume of monitoring data grows significantly, current systems and methods cannot meet the monitoring needs of increasingly large data centers. This leads to problems such as monitoring delay, partial loss of collected data, and persistently high resource consumption on monitoring center devices, so that effective monitoring of the data cannot be achieved.
In addition, when the server cluster of a big data system statistically analyzes and summarizes software and hardware information, the large number of servers, the many types of deployed software, and the excessive complexity of the relevant metrics may cause problems such as a heavy abnormal-alarm workload and inefficient monitoring and alerting.
The application adopts a multi-layer monitoring structure in which the corresponding resources are converged at each layer and the resources of each layer are analyzed and processed at that layer. This greatly reduces the load on the monitoring center and enables effective monitoring of big data.
Fig. 1 shows a structural diagram of a monitoring system 100 for a data center according to one embodiment of the application.
In one embodiment, the system 100 may include a monitoring resource layer 102, a monitoring data convergence layer 101, a monitoring center 103, and a configuration center 104. In one embodiment, the monitoring data convergence layer 101 has a multi-layer structure. In one embodiment, depending on the scale of the data center and the amount of data to be monitored, the monitoring data convergence layer may be divided into N layers according to business, region, network conditions, and so on, where N is a positive integer; for example, as shown in Fig. 1, monitoring data convergence layer one 101_1, monitoring data convergence layer two 101_2, monitoring data convergence layer three 101_3, ..., and monitoring data convergence layer N 101_N. The application does not limit the number of monitoring data convergence layers. However, each additional data processing layer adds monitoring delay, so the number of layers needs to be set according to the traffic volume, business characteristics, and so on of the data center.
In one embodiment, the monitoring resource layer 102 may serve as the base layer of the system 100. In one embodiment, the monitoring resource layer 102 may include multiple monitoring resource groups 1021, and each monitoring resource group 1021 may collect monitoring data according to rules issued by the configuration center 104 and report the monitoring data to the monitoring data convergence layer 101.
In one embodiment, at least one of the monitoring resource groups 1021 may include resources that need to be monitored. A resource may be, for example, software, hardware, a combination of the two, or any of their attributes, for example a service attribute.
In one embodiment, a monitoring node may be deployed for each resource that needs to be monitored, in order to monitor the resource and collect data. In one embodiment, the monitoring node may be an agent.
In one embodiment, the monitoring data convergence layer 101 receives the monitoring data sent by the monitoring resource layer 102, classifies the monitoring data, and performs convergence processing on each class of monitoring data.
In one embodiment, the monitoring center 103 is used to store monitoring data.
In one embodiment, the configuration center 104 configures the grouping information of the monitoring resource groups, the convergence node grouping information of the monitoring data convergence layer, and the convergence policy, and issues them to the monitoring resource layer and the monitoring data convergence layer.
In one embodiment, each monitoring node may report its collected data individually, for example to the first monitoring data convergence layer (for example, monitoring data convergence layer one 101_1). In one embodiment, a main monitoring node may be selected from the monitoring nodes in each monitoring resource group according to a predetermined rule; the other monitoring nodes gather their collected data at the main monitoring node, and the main monitoring node reports the data. In one embodiment, different monitoring resource groups may, according to a predetermined rule, use either of these two modes to report to the monitoring data convergence layer.
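The two reporting modes described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name build_reports, the string node identifiers, and the (sender, payload) message shape are all assumptions made for the example.

```python
def build_reports(group_samples, main_node=None):
    """Return the messages one monitoring resource group sends upward.

    group_samples maps a node id to the samples that node collected.
    With main_node=None every node reports its own data (direct mode).
    Otherwise all data is gathered at main_node, which reports once.
    Both parameter names are hypothetical.
    """
    if main_node is None:
        # Direct mode: one message per monitoring node.
        return [(node_id, list(samples)) for node_id, samples in group_samples.items()]
    if main_node not in group_samples:
        raise ValueError(f"main node {main_node!r} is not in this group")
    # Main-node mode: gather every node's samples, then send a single message.
    gathered = []
    for node_id in sorted(group_samples):
        gathered.extend(group_samples[node_id])
    return [(main_node, gathered)]
```

In direct mode the convergence layer receives as many messages as there are nodes; in main-node mode it receives one message per group, which trades some intra-group traffic for fewer upstream connections.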
In one embodiment, the monitoring data convergence layer 101 may include a first monitoring data convergence layer and a second monitoring data convergence layer, for example monitoring data convergence layer one 101_1 and monitoring data convergence layer two 101_2 (not shown in Fig. 1), where the second monitoring data convergence layer may be the upper convergence layer of the first monitoring data convergence layer.
In one embodiment, the first monitoring data convergence layer receives the monitoring data sent by the monitoring resource layer 102, converges the monitoring data to obtain first monitoring information, and sends the first monitoring information to the second monitoring data convergence layer.
In one embodiment, the second monitoring data convergence layer receives the first monitoring information from the first monitoring data convergence layer, converges the first monitoring information to obtain second monitoring information, and sends the second monitoring information to the monitoring center 103 or to the upper convergence layer of the second monitoring data convergence layer (for example, monitoring data convergence layer three 101_3).
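As a rough illustration of layered convergence, the sketch below has the first layer reduce raw samples to per-metric summaries and an upper layer merge those summaries. The source does not say what "convergence" computes, so the count/total/maximum summary and the names Summary, converge_first_layer, and converge_upper_layer are assumptions chosen only because such summaries can be merged again at every layer.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Mergeable per-metric summary (an assumed form of 'monitoring information')."""
    count: int = 0
    total: float = 0.0
    maximum: float = float("-inf")

    def add(self, value):
        self.count += 1
        self.total += value
        self.maximum = max(self.maximum, value)

    def merge(self, other):
        self.count += other.count
        self.total += other.total
        self.maximum = max(self.maximum, other.maximum)

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


def converge_first_layer(samples):
    """Reduce (metric, value) pairs from the resource layer to first monitoring information."""
    result = {}
    for metric, value in samples:
        result.setdefault(metric, Summary()).add(value)
    return result


def converge_upper_layer(lower_results):
    """Merge the results of several lower-layer nodes into the next layer's information."""
    result = {}
    for lower in lower_results:
        for metric, summary in lower.items():
            result.setdefault(metric, Summary()).merge(summary)
    return result
```

Because converge_upper_layer consumes and produces the same dictionary shape, it can be applied again at layer three and above without change.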
In one embodiment, the monitoring center 103 may be responsible, for example, for one or more of the following: storing and analyzing monitoring information, analyzing the actions of the automatic response mechanism for system faults, issuing automatic fault-handling actions, and sending alert information.
In one embodiment, the monitoring center 103 may include a monitoring data center 1031. In one embodiment, the monitoring data center 1031 may be used to store monitoring data. In one embodiment, all monitoring data is stored at the monitoring data center 1031. In one embodiment, part of the monitoring data is stored at the monitoring data center 1031.
In one embodiment, the monitoring center 103 may include a monitoring data analysis center 1032. In one embodiment, the monitoring data analysis center 1032 may be used to analyze monitoring data. In one embodiment, the monitoring data analysis center 1032 may perform one or more of the following: preliminary analysis and judgment of fault information, analysis of system performance information, analysis of business performance status data, resource trend analysis, device load capacity analysis, resource expansion demand analysis, and generation of various monitoring-related reports.
In one embodiment, the monitoring center 103 may include a monitoring notification and alert center 1033. In one embodiment, the monitoring notification and alert center 1033 may be responsible for sending the notification and alert information of the monitoring system. In one embodiment, the monitoring notification and alert center 1033 may send the relevant notification and alert information to the relevant personnel according to the corresponding configuration.
In one embodiment, the monitoring center 103 may include a monitoring custom self-healing analysis center 1034. In one embodiment, the monitoring custom self-healing analysis center 1034 may further analyze faults that the monitoring data analysis center 1032 has preliminarily analyzed, and generate instructions for resolving the corresponding faults (for example, device crashes, device system-level faults, service faults, and so on).
In one embodiment, the monitoring center 103 may include a monitoring custom action issuing center 1035. In one embodiment, the monitoring custom action issuing center 1035 issues the instructions generated by the monitoring custom self-healing analysis center 1034 to the corresponding device nodes in order to resolve the fault. In one embodiment, the issued instructions may include, but are not limited to: restarting the system, restarting the device, restarting the service, and hard-restarting the device by powering it off and on again (when a corresponding power management device is connected).
In one embodiment, the monitoring data convergence layer may include multiple data convergence groups 1011, and each data convergence group operates according to rules issued by the configuration center. In one embodiment, a data convergence group 1011 may include multiple nodes. In one embodiment, one of the nodes in a data convergence group 1011 may be selected as the main convergence node, which sends the monitoring information.
In one embodiment, the monitoring data convergence layer may classify the monitoring data, which specifically includes: dividing the monitoring data into real-time monitoring information, non-real-time monitoring information, and non-monitoring information according to the real-time monitoring attribute and the monitoring validity of the monitoring data.
In one embodiment, the monitoring data convergence layer performs convergence processing on each class of monitoring data, which specifically includes: if the received monitoring data is real-time monitoring information, sending it in real time; if the received monitoring data is non-real-time monitoring information, transmitting it asynchronously to the monitoring center; and if the received monitoring data is non-monitoring information, discarding it. This improves response speed and reduces the load on the monitoring center.
In one embodiment, real-time information has the highest priority and may be sent in real time by the main convergence node of the data convergence group 1011 to the monitoring center 103 or to a higher-level monitoring data convergence layer. In one embodiment, non-real-time information may be sent asynchronously by the main convergence node directly to the monitoring center 103, depending on resource usage, network traffic, and so on. In one embodiment, non-monitoring information (for example, invalid information or information that is no longer needed) may be discarded directly by each node or discarded by the main convergence node. This reduces the load on the monitoring center 103 and on the higher-level monitoring data convergence layers.
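The classification and per-class handling described above can be sketched as follows. The two boolean fields standing in for the "real-time monitoring attribute" and "monitoring validity", and the names Record, Category, and ConvergenceNode, are assumptions for illustration; real-time sending is modeled as appending to a list, and the asynchronous path as a queue that is flushed later.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    REAL_TIME = "real_time"
    NON_REAL_TIME = "non_real_time"
    NON_MONITORING = "non_monitoring"


@dataclass
class Record:
    metric: str
    value: float
    real_time: bool  # assumed stand-in for the real-time monitoring attribute
    valid: bool      # assumed stand-in for monitoring validity


def classify(record):
    """Map a record to one of the three classes named in the text."""
    if not record.valid:
        return Category.NON_MONITORING
    return Category.REAL_TIME if record.real_time else Category.NON_REAL_TIME


@dataclass
class ConvergenceNode:
    sent_now: list = field(default_factory=list)    # forwarded immediately
    deferred: deque = field(default_factory=deque)  # waiting for asynchronous upload
    dropped: int = 0                                # discarded non-monitoring records

    def handle(self, record):
        category = classify(record)
        if category is Category.REAL_TIME:
            self.sent_now.append(record)
        elif category is Category.NON_REAL_TIME:
            self.deferred.append(record)
        else:
            self.dropped += 1
        return category

    def flush_deferred(self):
        """Return and clear the queued non-real-time records, e.g. when load is low."""
        batch = list(self.deferred)
        self.deferred.clear()
        return batch
```

When flush_deferred is called is left open here; the text only says the asynchronous upload may depend on resource usage and network traffic.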
In one embodiment, the configuration center 104 may be used to store various configuration policies and issue them to the respective nodes. In one embodiment, the configuration policies may include: the grouping information of the monitoring resource layer 102, the main monitoring node selection mechanism of the monitoring resource layer 102, the monitoring policy, the convergence policy of the monitoring data convergence layer, and the main convergence node selection mechanism of the monitoring data convergence layer.
In one embodiment, the configuration center 104 may include a monitoring group configuration 1041. In one embodiment, the monitoring group configuration 1041 may store the grouping information configuration of the monitored resources. In other words, the monitoring group configuration 1041 may store information related to the monitored resources, including but not limited to the monitoring resource group to which each resource belongs. In one embodiment, the configuration center 104 may group the monitored resources according to business information, physical region, service logic, network region, and so on. In one embodiment, since a monitoring node (for example, an agent) is deployed for each resource, once the monitoring resource group to which each resource belongs has been configured, the monitoring node is assigned to the corresponding group as well.
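Grouping monitored resources by attributes such as business and region could look like the sketch below. The dictionary fields "id", "business", and "region" and the function name group_resources are hypothetical; the source names the grouping criteria but not a data model.

```python
from collections import defaultdict


def group_resources(resources, keys=("business", "region")):
    """Group resource ids by the values of the chosen attributes.

    Each resource is a dict with an "id" plus the attributes listed in
    keys (assumed field names). The returned dict maps a tuple of
    attribute values to the ids of the resources in that group.
    """
    groups = defaultdict(list)
    for resource in resources:
        group_key = tuple(resource[key] for key in keys)
        groups[group_key].append(resource["id"])
    return dict(groups)
```

Since each resource has its own monitoring node, the same mapping also tells each agent which group it belongs to.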
In one embodiment, the configuration center 104 may include a convergence group configuration 1042. In one embodiment, the convergence group configuration 1042 may store the node grouping information configuration of the monitoring data convergence layer. In other words, the convergence group configuration 1042 may store information related to the nodes of the monitoring data convergence layer, including but not limited to the data convergence group to which each node belongs and the layer to which each data convergence group belongs.
In one embodiment, the configuration center 104 may include a monitoring policy configuration 1043. In one embodiment, the monitoring policy configuration 1043 may store monitoring-related policy configurations. In one embodiment, the policy configuration includes, but is not limited to: the information to be collected, the frequency of collection, the method of collection, and which node of the monitoring data convergence layer the monitoring resource layer reports information to. In one embodiment, the policy configuration may also include the monitoring probe policies that the main monitoring node needs to actively initiate toward the resources in its own group, for example intra-group network latency checks and intra-group node liveness checks.
In one embodiment, the configuration center 104 may include a convergence policy configuration 1044. In one embodiment, the convergence policy configuration 1044 may store the convergence policy configuration of the monitoring data convergence layer. In one embodiment, the convergence policy configuration may include the monitoring data convergence layer's preliminary analysis policy for the converged information, the information classification policy (dividing information into real-time information, non-real-time information, non-monitoring information, and so on), the configuration of the upper-level receiving node of the main convergence node (for example, which node of which data convergence group receives the information), the policy for asynchronously uploading non-real-time monitoring information to the monitoring center (for example, the timing and manner of the asynchronous upload), and so on.
In one embodiment, the configuration center 104 may include a monitoring group/convergence group main node selection mechanism configuration 1045. In one embodiment, the monitoring group/convergence group main node selection mechanism configuration 1045 may store the selection mechanism configuration for the main monitoring node of a monitoring group and the main convergence node of a convergence group. Because the environment of a large-scale data center is complex and resources differ, the main node selection policy of each monitoring group/convergence group is constrained by the corresponding node resources, network resources, input/output (I/O) resources, and so on, as well as by the corresponding monitoring policy.
In one embodiment, each monitoring group/convergence group may set a selection mechanism according to its own business characteristics. For example, in a monitoring resource group under heavy I/O pressure, the node with lower I/O pressure may be preferred as the main monitoring node. For example, in a monitoring resource group with high consumption of processing resources (for example, processors), the node with lower processing resource utilization may be preferred as the main monitoring node. In more complex environments, the main node selection policy may need to combine multiple factors.
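One way to express such a selection mechanism is a weighted score over the load attributes, with the lowest score winning. This is only a sketch under stated assumptions: the normalized 0.0 to 1.0 load values, the weights, the tie-break on node id, and the names NodeLoad and select_main_node are not from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeLoad:
    node_id: str
    io_pressure: float  # assumed scale: 0.0 (idle) to 1.0 (saturated)
    cpu_usage: float    # assumed scale: 0.0 (idle) to 1.0 (saturated)


def select_main_node(nodes, io_weight=1.0, cpu_weight=0.0):
    """Pick the node with the lowest weighted load as the main node.

    The weights encode the group's business characteristics: an
    I/O-heavy group weights io_pressure, a CPU-heavy group weights
    cpu_usage, and a mixed environment can weight both.
    """
    if not nodes:
        raise ValueError("cannot select a main node from an empty group")

    def score(node):
        # The node id breaks ties deterministically.
        return (io_weight * node.io_pressure + cpu_weight * node.cpu_usage, node.node_id)

    return min(nodes, key=score).node_id
```

For example, with nodes a (I/O 0.9, CPU 0.1), b (I/O 0.2, CPU 0.8), and c (I/O 0.5, CPU 0.3), an I/O-only weighting selects b, a CPU-only weighting selects a, and equal weights select c.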
For example, the above policy may vary with the specific business; the policies for analysis-type services and for business-operations services differ. For example, in one embodiment, for analysis-type services the utilization of processing resources at a specific point in time or during a specific period need not be considered, whereas for business-operations services this attribute does need to be considered.
In one embodiment, for different kinds of business, the attributes to be considered in the corresponding policies include, but are not limited to, one or more of the following: I/O resource usage, processing resource usage, network traffic, number of connections, and so on. In one embodiment, some services may not need to consider any of the above attributes, while other services may need to consider all of them.
In one embodiment, the attributes considered may be real-time data or non-real-time historical data.
In one embodiment, the configuration center 104 may include a configuration policy issuing center 1046. In one embodiment, the configuration policy issuing center 1046 may be responsible for issuing configuration policies to the monitoring resource layer nodes and the monitoring data convergence layer nodes, that is, for distributing one or more of the monitoring group configuration 1041, the convergence group configuration 1042, the monitoring policy configuration 1043, the convergence policy configuration 1044, and the monitoring group/convergence group main node selection mechanism configuration 1045 to the respective nodes.
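A minimal sketch of policy distribution is shown below. The five policy names mirror configurations 1041 to 1045, but which subset each kind of node receives is an assumption (the source only says "one or more" are issued to the respective nodes), as are the names ConfigCenter, RELEVANT_POLICIES, and the node kind strings.

```python
from dataclasses import dataclass, field

RESOURCE_NODE = "resource"
CONVERGENCE_NODE = "convergence"

# Assumed mapping from node kind to the policies that node needs.
RELEVANT_POLICIES = {
    RESOURCE_NODE: ("monitoring_group", "monitoring_policy", "main_node_selection"),
    CONVERGENCE_NODE: ("convergence_group", "convergence_policy", "main_node_selection"),
}


@dataclass
class ConfigCenter:
    policies: dict = field(default_factory=dict)

    def set_policy(self, name, value):
        self.policies[name] = value

    def distribute(self, nodes):
        """Return, for each node id, the subset of stored policies relevant to its kind.

        nodes maps a node id to RESOURCE_NODE or CONVERGENCE_NODE.
        Policies that have not been configured yet are simply omitted.
        """
        return {
            node_id: {
                name: self.policies[name]
                for name in RELEVANT_POLICIES[kind]
                if name in self.policies
            }
            for node_id, kind in nodes.items()
        }
```

Keeping distribution as a pure function of the stored policies makes it easy to re-issue the configuration when the analysis center requests an update after a fault.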
Fig. 2 shows a flowchart 200 of a monitoring method for a data center according to one embodiment of the application. The monitoring method may be performed in a monitoring system that includes a monitoring resource layer, a monitoring data convergence layer, a monitoring center, and a configuration center, where the monitoring data convergence layer has a multi-layer structure.
At step 201, the configuration center configures the grouping information of the monitoring resource groups, the convergence node grouping information of the monitoring data convergence layer, and the convergence policy, and issues them to the monitoring resource layer and the monitoring data convergence layer.
At step 202, the monitoring resource layer collects monitoring data and reports it to the monitoring data convergence layer. The monitoring resource layer includes multiple monitoring resource groups, and each monitoring resource group collects monitoring data according to rules issued by the configuration center.
At step 203, the monitoring data convergence layer receives the monitoring data sent by the monitoring resource layer, classifies the monitoring data, and performs convergence processing on each class of monitoring data.
At step 204, the monitoring center stores the monitoring data.
In one embodiment, the monitoring resource layer may include multiple monitoring resource groups, and each monitoring resource group may include one or more nodes.
In one embodiment, the monitoring data convergence layer includes at least a first monitoring data convergence layer and a second monitoring data convergence layer, where the second monitoring data convergence layer is the upper convergence layer of the first monitoring data convergence layer. The method 200 may include: the first monitoring data convergence layer receives the monitoring data sent by the monitoring resource layer, converges the monitoring data to obtain first monitoring information, and sends the first monitoring information to the second monitoring data convergence layer; the second monitoring data convergence layer receives the first monitoring information from the first monitoring data convergence layer, converges the first monitoring information to obtain second monitoring information, and sends the second monitoring information to the monitoring center or to the upper convergence layer of the second monitoring data convergence layer.
In one embodiment, the classification of the monitoring data by the monitoring data convergence layer in step 203 may include: dividing the monitoring data into real-time monitoring information, non-real-time monitoring information, and non-monitoring information according to the real-time monitoring attribute and the monitoring validity of the monitoring data.
In one embodiment, the convergence processing of each class of monitoring data by the monitoring data convergence layer in step 203 may include: if the received monitoring data is real-time monitoring information, sending it in real time; if the received monitoring data is non-real-time monitoring information, transmitting it asynchronously to the monitoring center; and if the received monitoring data is non-monitoring information, discarding it.
In one embodiment, the method 200 further includes the configuration center configuring the following items: the main monitoring node selection mechanism of the monitoring resource layer, the monitoring policy, the convergence policy of the monitoring data convergence layer, and the main convergence node selection mechanism of the monitoring data convergence layer. For example, the monitoring-related policies are configured in the configuration center, as described above, and the configuration center issues them to the nodes in the monitoring resource layer. For example, the convergence policy configuration of the monitoring data convergence layer is configured in the configuration center, as described above, and the configuration center issues it to the nodes in each monitoring data convergence layer.
In one embodiment, the configuration center may configure the following items: the main monitoring node selection mechanism of the monitoring resource layer, the monitoring policy, the convergence policy of the monitoring data convergence layer, and the main convergence node selection mechanism of the monitoring data convergence layer.
In one embodiment, the monitoring data convergence layer may include multiple data convergence groups, and the method 200 may further include: each data convergence group operates according to rules issued by the configuration center.
In one embodiment, the method 200 may further include: selecting a main convergence node from the multiple nodes in a data convergence group to send the monitoring information.
In one embodiment, each node of a monitoring resource group in the monitoring resource layer gathers its collected data at the main monitoring node of that monitoring resource group, and the main monitoring node sends the collected data up to the monitoring data convergence layer. In one embodiment, each node of a monitoring resource group in the monitoring resource layer sends its collected data directly up to the monitoring data convergence layer.
In one embodiment, a node of the monitoring data convergence layer may classify the data received from the monitoring resource layer or from a lower-level monitoring data convergence layer, and then converge the corresponding data to the main convergence node of its data convergence group. In one embodiment, the main convergence node of the monitoring data convergence layer may classify the data after the data of the other nodes in its data convergence group has been converged to it.
Fig. 3 shows an operational flowchart 300 of the monitoring center according to one embodiment of the application.
In one embodiment, at 301, the monitoring data convergence layer 101 sends the collected real-time or non-real-time data to the monitoring center 103 synchronously or asynchronously, and the monitoring center 103 stores the data in the monitoring data center 1031.
In one embodiment, at 302, the monitoring data analysis center 1032 reads the data from the monitoring data center 1031 for analysis.
In one embodiment, at 303, the monitoring data analysis center 1032 stores the analysis result in the monitoring data center 1031.
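Steps 301 to 303 amount to a store, read, analyze, and store-result cycle. A minimal in-memory sketch is given below; the class `MonitoringDataCenter`, the function `analyze`, the record fields, and the CPU threshold are hypothetical and stand in for whatever storage and analysis an actual implementation would use.

```python
class MonitoringDataCenter:
    """Hypothetical in-memory stand-in for the monitoring data center 1031."""

    def __init__(self) -> None:
        self.raw: list[dict] = []
        self.results: list[dict] = []

    def store(self, records: list[dict]) -> None:      # step 301
        self.raw.extend(records)

    def read(self) -> list[dict]:                      # step 302
        return list(self.raw)

    def store_result(self, result: dict) -> None:      # step 303
        self.results.append(result)


def analyze(records: list[dict], cpu_threshold: float = 90.0) -> dict:
    """Flag nodes whose CPU value exceeds an assumed threshold."""
    faulty = sorted({r["node"] for r in records if r["cpu"] > cpu_threshold})
    return {"records": len(records), "faulty_nodes": faulty}


center = MonitoringDataCenter()
center.store([{"node": "n1", "cpu": 35.0}, {"node": "n2", "cpu": 97.0}])
center.store_result(analyze(center.read()))
```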
In one embodiment, the monitoring data analysis center 1032 analyzes the data read from the monitoring data center 1031; if a failure is found, then at 304 it notifies the configuration center 104 to update the configuration information of the nodes related to the faulty node.
In one embodiment, the monitoring data analysis center 1032 performs preliminary analysis and processing of the failure.
In one embodiment, at 305, if the failure is not within the scope of the self-healing strategy, the monitoring data analysis center 1032 notifies the monitoring notification and alarm center 1033 to send the corresponding notification and alarm information to the relevant personnel for manual handling.
In one embodiment, at 306, if the failure is within the scope of the self-healing strategy, the monitoring data analysis center 1032 sends the fault information to the monitoring custom self-healing analysis center 1034.
In one embodiment, at 307, the monitoring custom self-healing analysis center 1034 analyzes the fault information, generates the corresponding self-healing instruction, and issues the self-healing instruction to the monitoring custom action issuing center 1035.
In one embodiment, the monitoring custom action issuing center 1035 issues the corresponding self-healing operation instruction to the corresponding node, which performs the self-healing operation. For example, at 309, the monitoring custom action issuing center 1035 issues a self-healing operation instruction for a monitoring resource node to the corresponding monitoring resource node, which performs the self-healing operation. For example, at 309, the monitoring custom action issuing center 1035 issues a self-healing operation instruction for a monitoring data aggregation node to the corresponding monitoring data aggregation node, which performs the self-healing operation.
In one embodiment, if the self-healing operation of the faulty node succeeds at 310, then at 311 the configuration center 104 is notified to issue the strategy of the related nodes, and the corresponding node is brought back into the monitoring range.
In one embodiment, if the self-healing operation of the faulty node fails at 312, then at 313 the monitoring notification and alarm center 1033 is notified to send the corresponding notification and alarm to the relevant personnel for manual handling.
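The fault-handling branch of the flowchart can be summarized as a small decision routine. This is a sketch under stated assumptions: the function `handle_fault`, the `type` field of the fault, the representation of the self-healing strategy as a set of fault types, and the `heal` callback are all hypothetical; only the step numbers are taken from the description.

```python
from typing import Callable


def handle_fault(
    fault: dict,
    self_healing_scope: set[str],
    heal: Callable[[dict], bool],
) -> list[str]:
    """Return the ordered list of flowchart steps taken for one fault."""
    steps = ["304: notify configuration center to update related node configuration"]
    if fault["type"] not in self_healing_scope:
        steps.append("305: alarm center notifies personnel for manual handling")
        return steps
    steps.append("306: send fault information to self-healing analysis center")
    steps.append("307: generate self-healing instruction, pass to action issuing center")
    steps.append("309: issue self-healing operation instruction to the node")
    if heal(fault):
        steps.append("310/311: healing succeeded, node brought back into monitoring")
    else:
        steps.append("312/313: healing failed, alarm center notifies personnel")
    return steps


scope = {"process_crash"}
healed = handle_fault({"node": "n1", "type": "process_crash"}, scope, lambda f: True)
failed = handle_fault({"node": "n1", "type": "process_crash"}, scope, lambda f: False)
manual = handle_fault({"node": "n2", "type": "disk_failure"}, scope, lambda f: True)
```

Returning the step trace rather than performing side effects keeps the sketch easy to inspect; a real system would call the respective centers at each step.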
Fig. 4 shows a structural diagram of an information processing device 400. In the embodiments of the present application, one or more of the nodes of the monitoring resource layer, the nodes of the monitoring data convergence layer, the monitoring center, and the configuration center can be implemented by the information processing device 400. As shown in Fig. 4, the device 400 can include one or more of the following components: a processor 420, a memory 430, a power supply component 440, an input/output (I/O) interface 460, and a communication interface 480. These components can be communicatively connected, for example, via a bus 410.
The processor 420 controls the overall operation of the device 400, for example operations associated with data communication and computation. The processor 420 can include one or more processing cores and can execute instructions to carry out all or some of the steps of the methods described herein. The processor 420 can include various devices with processing capability, including but not limited to general-purpose processors, special-purpose processors, microprocessors, microcontrollers, graphics processing units (GPU), digital signal processors (DSP), application-specific integrated circuits (ASIC), programmable logic devices (PLD), field-programmable gate arrays (FPGA), and the like. The processor 420 can include a cache 425 or can communicate with a cache 425 to improve data access speed.
The memory 430 is configured to store various types of instructions and/or data to support the operation of the device 400. Examples of the data include instructions for any application program or method operating on the device 400, data, and the like. The memory 430 can be implemented by any type of volatile or non-volatile storage device or a combination thereof. The memory 430 can include semiconductor memory, such as random access memory (RAM), static random access memory (SRAM), dynamic random access memory (DRAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory, and the like. The memory 430 can also include any memory that uses, for example, paper media, magnetic media, and/or optical media, such as paper tape, hard disks, magnetic tape, floppy disks, magneto-optical disks (MO), CDs, DVDs, Blu-ray discs, and the like.
The power supply component 440 provides power to the various components of the device 400. The power supply component 440 can include an internal battery and/or an external power interface, and can include a power management system and other components associated with generating, managing, and distributing power for the device 400.
The I/O interface 460 provides an interface that allows a user to interact with the device 400. The I/O interface 460 can include, for example, interfaces based on technologies such as PS/2, RS-232, USB, FireWire, Lightning, VGA, HDMI, and DisplayPort, allowing the user to interact with the device 400 through peripheral devices such as a keyboard, mouse, touchpad, touch screen, joystick, buttons, microphone, loudspeaker, display, camera, or projector.
The communication interface 480 is configured to enable the device 400 to communicate with other devices in a wired or wireless manner. Through the communication interface 480, the device 400 can access a wireless network based on one or more communication standards, such as a Wi-Fi, 2G, 3G, or 4G communication network. In an exemplary embodiment, the communication interface 480 can also receive a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. An illustrative communication interface 480 can include interfaces based on communication technologies such as near-field communication (NFC), radio-frequency identification (RFID), Infrared Data Association (IrDA), ultra-wideband (UWB), and Bluetooth (BT).
The functional blocks shown in the structural block diagrams described above can be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, they can be, for example, electronic circuits, application-specific integrated circuits (ASIC), appropriate firmware, plug-ins, function cards, and the like. When implemented in software, the elements of the present invention are the programs or code segments used to perform the required tasks. The programs or code segments can be stored in a machine-readable medium, or transmitted over a transmission medium or communication link by a data signal carried in a carrier wave. A "machine-readable medium" can include any medium capable of storing or transmitting information. Examples of machine-readable media include electronic circuits, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical discs, hard disks, fiber-optic media, radio-frequency (RF) links, and so on. Code segments can be downloaded via computer networks such as the Internet or an intranet.
" one embodiment " is mentioned above, it should be understood, however, that the feature referred in various embodiments might not
The embodiment is can be only applied to, but is possibly used for other embodiment or is applied in combination with other embodiment.
The application is described above with reference to the specific embodiment of the application, but those skilled in the art are equal
Solution, implementation method mentioned herein are the application statement, and listed specific embodiment is only the applicating example of the application,
Do not represent the application be only limitted to it is such using example, and these specific embodiments can be carry out various modifications, combine and
Change, without departing from the spirit and scope limited by appended claims or its equivalent.
Claims (12)
1. A monitoring system for a data center, characterized in that the monitoring system includes a monitoring resource layer, a monitoring data convergence layer, a monitoring center, and a configuration center, the monitoring data convergence layer having a multilayer structure, wherein:
the monitoring resource layer includes multiple monitoring resource groups, each monitoring resource group collecting monitoring data according to the rules issued by the configuration center and reporting the monitoring data to the monitoring data convergence layer;
the monitoring data convergence layer receives the monitoring data sent by the monitoring resource layer, classifies the monitoring data, and performs convergence processing on each class of monitoring data;
the monitoring center is used to store the monitoring data;
the configuration center is used to configure the grouping information of the monitoring resource groups, the aggregation node grouping information of the monitoring data convergence layer, and the convergence strategy, and to issue them to the monitoring resource layer and the monitoring data convergence layer.
2. The monitoring system according to claim 1, characterized in that the monitoring data convergence layer includes at least a first monitoring data convergence layer and a second monitoring data convergence layer, the second monitoring data convergence layer being the upper convergence layer of the first monitoring data convergence layer;
the first monitoring data convergence layer receives the monitoring data sent by the monitoring resource layer, converges the monitoring data to obtain first monitoring information, and sends the first monitoring information to the second monitoring data convergence layer;
the second monitoring data convergence layer receives the first monitoring information from the first monitoring data convergence layer, converges the first monitoring information to obtain second monitoring information, and sends the second monitoring information to the monitoring center or to the upper convergence layer of the second monitoring data convergence layer.
3. The monitoring system according to claim 1 or 2, characterized in that the classification of the monitoring data by the monitoring data convergence layer specifically includes:
dividing the monitoring data into real-time monitoring information, non-real-time monitoring information, and non-monitoring information according to the real-time monitoring attribute and the effective monitoring attribute of the monitoring data.
4. The monitoring system according to claim 3, characterized in that the convergence processing performed by the monitoring data convergence layer on each class of monitoring data specifically includes:
if the received monitoring data is real-time monitoring information, sending the real-time monitoring information in real time;
if the received monitoring data is non-real-time monitoring information, sending the non-real-time monitoring information asynchronously to the monitoring center;
if the received monitoring data is non-monitoring information, discarding the non-monitoring information.
5. The monitoring system according to claim 1, characterized in that the configuration center is further used to configure:
the main monitoring node selection mechanism of the monitoring resource layer, the monitoring strategy, the convergence strategy of the monitoring data convergence layer, and the main aggregation node selection mechanism of the monitoring data convergence layer.
6. The monitoring system according to claim 1, characterized in that the monitoring data convergence layer includes multiple data convergence groups, and each data convergence group operates according to the rules issued by the configuration center.
7. A monitoring method for a data center, characterized in that the monitoring method is performed in a monitoring system that includes a monitoring resource layer, a monitoring data convergence layer, a monitoring center, and a configuration center, wherein the monitoring data convergence layer has a multilayer structure, the monitoring method including:
the configuration center configuring the grouping information of the monitoring resource groups, the aggregation node grouping information of the monitoring data convergence layer, and the convergence strategy, and issuing them to the monitoring resource layer and the monitoring data convergence layer;
the monitoring resource layer collecting monitoring data and reporting the monitoring data to the monitoring data convergence layer, the monitoring resource layer including multiple monitoring resource groups, each monitoring resource group collecting monitoring data according to the rules issued by the configuration center;
the monitoring data convergence layer receiving the monitoring data sent by the monitoring resource layer, classifying the monitoring data, and performing convergence processing on each class of monitoring data; and
the monitoring center storing the monitoring data.
8. The monitoring method according to claim 7, characterized in that the monitoring data convergence layer includes at least a first monitoring data convergence layer and a second monitoring data convergence layer, the second monitoring data convergence layer being the upper convergence layer of the first monitoring data convergence layer, the method including:
the first monitoring data convergence layer receiving the monitoring data sent by the monitoring resource layer, converging the monitoring data to obtain first monitoring information, and sending the first monitoring information to the second monitoring data convergence layer;
the second monitoring data convergence layer receiving the first monitoring information from the first monitoring data convergence layer, converging the first monitoring information to obtain second monitoring information, and sending the second monitoring information to the monitoring center or to the upper convergence layer of the second monitoring data convergence layer.
9. The monitoring method according to claim 7 or 8, characterized in that the classification of the monitoring data by the monitoring data convergence layer includes:
dividing the monitoring data into real-time monitoring information, non-real-time monitoring information, and non-monitoring information according to the real-time monitoring attribute and the effective monitoring attribute of the monitoring data.
10. The monitoring method according to claim 9, characterized in that the convergence processing performed by the monitoring data convergence layer on each class of monitoring data includes:
if the received monitoring data is real-time monitoring information, sending the real-time monitoring information in real time;
if the received monitoring data is non-real-time monitoring information, sending the non-real-time monitoring information asynchronously to the monitoring center;
if the received monitoring data is non-monitoring information, discarding the non-monitoring information.
11. The monitoring method according to claim 7, characterized in that the configuration center further configures the following items: the main monitoring node selection mechanism of the monitoring resource layer, the monitoring strategy, the convergence strategy of the monitoring data convergence layer, and the main aggregation node selection mechanism of the monitoring data convergence layer.
12. The monitoring method according to claim 7, characterized in that the monitoring data convergence layer includes multiple data convergence groups, the monitoring method including: each data convergence group operating according to the rules issued by the configuration center.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611268506.6A CN108268355A (en) | 2016-12-31 | 2016-12-31 | For the monitoring system and method for data center |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108268355A true CN108268355A (en) | 2018-07-10 |
Family
ID=62771238
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611268506.6A Pending CN108268355A (en) | 2016-12-31 | 2016-12-31 | For the monitoring system and method for data center |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108268355A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7318094B1 (en) * | 2002-02-13 | 2008-01-08 | Cisco Technology, Inc. | Apparatus, system and device for collecting, aggregating and monitoring network management information |
CN103092158A (en) * | 2012-12-31 | 2013-05-08 | 深圳先进技术研究院 | Large building energy consumption real-time monitoring system based on wireless sensor network |
CN103414571A (en) * | 2013-08-03 | 2013-11-27 | 东北大学 | Information collecting and transmitting convergence node used for industrial monitoring |
CN104052634A (en) * | 2014-05-30 | 2014-09-17 | 国家电网公司 | Information security monitoring system and method |
CN105991366A (en) * | 2015-03-05 | 2016-10-05 | 中国移动通信集团福建有限公司 | Service monitoring method and system |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109062699A (en) * | 2018-08-15 | 2018-12-21 | 郑州云海信息技术有限公司 | A kind of resource monitoring method, device, server and storage medium |
CN109204389A (en) * | 2018-09-12 | 2019-01-15 | 济南轨道交通集团有限公司 | A kind of subway equipment fault diagnosis and self-healing method, system |
CN110417597A (en) * | 2019-07-29 | 2019-11-05 | 中国工商银行股份有限公司 | For monitoring method and device, electronic equipment and the readable storage medium storing program for executing of certificate |
CN110417597B (en) * | 2019-07-29 | 2022-11-01 | 中国工商银行股份有限公司 | Method and device for monitoring certificate, electronic equipment and readable storage medium |
CN111934923A (en) * | 2020-07-30 | 2020-11-13 | 深圳市高德信通信股份有限公司 | CDN network quality monitoring system based on internet |
CN113923131A (en) * | 2021-09-10 | 2022-01-11 | 北京世纪互联宽带数据中心有限公司 | Monitoring information determination method and device, computing equipment and storage medium |
CN113923131B (en) * | 2021-09-10 | 2023-08-22 | 北京世纪互联宽带数据中心有限公司 | Monitoring information determining method and device, computing equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108268355A (en) | For the monitoring system and method for data center | |
Serhani et al. | Self-adapting cloud services orchestration for fulfilling intensive sensory data-driven IoT workflows | |
CN105653425B (en) | Monitoring system based on complex event processing engine | |
WO2021129367A1 (en) | Method and apparatus for monitoring distributed storage system | |
CN109214704A (en) | A kind of distributed intelligence operation platform, method, apparatus and readable storage medium storing program for executing | |
US10142242B2 (en) | Network support node traffic reduction for self-organizing networks | |
KR20150112357A (en) | Sensor data processing system and method thereof | |
US9772871B2 (en) | Apparatus and method for leveraging semi-supervised machine learning for self-adjusting policies in management of a computer infrastructure | |
CN106100868B (en) | A kind of project operation and maintenance device, system and method | |
CN110309108A (en) | Data acquisition and storage method, device, electronic equipment, storage medium | |
US20150081376A1 (en) | Customization of event management and incident management policies | |
CN107544832A (en) | A kind of monitoring method, the device and system of virtual machine process | |
CN108337127A (en) | application performance monitoring method, system, terminal and computer readable storage medium | |
US11743237B2 (en) | Utilizing machine learning models to determine customer care actions for telecommunications network providers | |
CN110096258A (en) | A method of the OpenStack infrastructure architecture management based on Terraform | |
US11388109B2 (en) | Hierarchical capacity management in a virtualization environment | |
CN107479974A (en) | A kind of dispatching method of virtual machine and device | |
US20230010417A1 (en) | Message oriented middleware cluster synchronization | |
CN107197002A (en) | Cloud computing system and cloud data processing method | |
Wladdimiro et al. | Disaster management platform to support real-time analytics | |
CN112925619A (en) | Big data real-time computing method and platform | |
CN103414717A (en) | Simulation monitoring method and system in regard to C / S structure service system | |
CN114756301B (en) | Log processing method, device and system | |
JP2021506010A (en) | Methods and systems for tracking application activity data from remote devices and generating modified behavioral data structures for remote devices | |
CN115719147A (en) | Power transmission line inspection data processing method, device and platform |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180710 |