CN103905238A - Data center abnormal information collection system and method - Google Patents

Data center abnormal information collection system and method

Info

Publication number
CN103905238A
Authority
CN
China
Prior art keywords
server
abnormal information
data center
list
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210584066.0A
Other languages
Chinese (zh)
Inventor
林明珉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hongfujin Precision Industry Shenzhen Co Ltd
Hon Hai Precision Industry Co Ltd
Original Assignee
Hongfujin Precision Industry Shenzhen Co Ltd
Hon Hai Precision Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hongfujin Precision Industry Shenzhen Co Ltd, Hon Hai Precision Industry Co Ltd filed Critical Hongfujin Precision Industry Shenzhen Co Ltd
Priority to CN201210584066.0A priority Critical patent/CN103905238A/en
Publication of CN103905238A publication Critical patent/CN103905238A/en
Pending legal-status Critical Current

Landscapes

  • Computer And Data Communications (AREA)

Abstract

The invention provides a data center abnormal information collection method applied to a monitoring server that is connected to each server of a data center. The method comprises: creating a list in the monitoring server, the list being used to store abnormal information sent by each server; receiving, at preset time intervals, the abnormal information sent by each server; storing the received abnormal information of each server in the list; sending, at preset time intervals, the abnormal information in the list to a client; and, when an abnormal server is restored, receiving a clear instruction from that server so as to remove its abnormal information from the list. The invention further provides a data center abnormal information collection system. With the invention, maintenance personnel do not need to check the servers one by one, which is convenient for the maintenance personnel, improves efficiency, and improves the stability of the data center.

Description

Data center abnormal information collection system and method
Technical field
The present invention relates to an information collection system and method, and in particular to a data center abnormal information collection system and method.
Background
A data center, also called a server farm, typically includes anywhere from a few servers to more than ten thousand. It is a facility that houses computer systems and associated components, such as telecommunications and storage systems. A data center usually includes redundant and backup power supplies, redundant data communication connections, environmental controls (for example, air conditioning and fire suppression), and security devices. The most important equipment in a data center is the servers used to store data.
Servers in a data center inevitably fail during operation; for example, a server's CPU may be damaged or its hard disk may fail. To keep the servers running normally, dedicated maintenance personnel must inspect the data center so that a faulty server can be repaired promptly. Because a data center contains a very large number of servers, however, finding out which server has failed takes the maintenance personnel time. This lowers maintenance efficiency, lengthens repair time, and reduces the stability of the data center.
Summary of the invention
In view of the above, it is necessary to provide a data center abnormal information collection system that makes it possible to learn promptly which server in the data center has failed, thereby improving maintenance efficiency, shortening maintenance time, and improving the stability of the data center.
In view of the above, it is also necessary to provide a data center abnormal information collection method that makes it possible to learn promptly which server in the data center has failed, thereby improving maintenance efficiency, shortening maintenance time, and improving the stability of the data center.
A data center abnormal information collection system runs on a monitoring server that is connected to each server of a data center. The system comprises: a creation module for creating a list in the monitoring server, the list being used to store the abnormal information sent by the servers; a receiver module for receiving, at preset time intervals, the abnormal information sent by each server; a storage module for storing the received abnormal information of each server in the list; and a sending module for sending, at preset time intervals, the abnormal information in the list to a client. The receiver module is further configured to receive, when a server that experienced an abnormal condition returns to normal, a clear instruction from that server, so as to remove that server's abnormal information from the list.
A data center abnormal information collection method is applied to a monitoring server that is connected to each server of a data center. The method comprises: creating a list in the monitoring server, the list being used to store the abnormal information sent by the servers; receiving, at preset time intervals, the abnormal information sent by each server; storing the received abnormal information of each server in the list; sending, at preset time intervals, the abnormal information in the list to a client; and, when a server that experienced an abnormal condition returns to normal, receiving a clear instruction from that server, so as to remove that server's abnormal information from the list.
Compared with the prior art, the data center abnormal information collection system and method provided by the invention make it possible to learn promptly which server in the data center has failed, which improves maintenance efficiency, shortens maintenance time, and improves the stability of the data center.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of a preferred embodiment of the data center abnormal information collection system of the present invention.
Fig. 2 is a structural diagram of a preferred embodiment of the monitoring server of the present invention.
Fig. 3 is a flowchart of a preferred embodiment of the data center abnormal information collection method of the present invention.
Fig. 4 is a structural diagram of the data center of the present invention.
Description of main reference numerals
Client 10
Monitoring server 20
Communication device 510
Database 30
Network 40
Data center 50
Server 500
Data center abnormal information collection system 200
Creation module 210
Distribution module 220
Receiver module 230
Storage module 240
Sending module 250
Memory 260
Processor 270
The following embodiments further describe the present invention with reference to the above drawings.
Detailed description
Fig. 1 shows the application environment of a preferred embodiment of the data center abnormal information collection system 200 of the present invention. The system 200 runs on the monitoring server 20, which communicates with the data center 50 over the network 40.
The network 40 may be the Internet, a local area network, or another communication network.
The data center 50 comprises a plurality of servers 500, which are blade servers. In this embodiment, the data center 50 is structured as shown in Fig. 4, with the servers 500 stacked together. Each server 500 also includes a communication device 510 for connecting to the monitoring server 20 through the network 40. The communication device 510 may be, but is not limited to, a Bluetooth module, a Wi-Fi module, a Wideband Code Division Multiple Access (WCDMA) module, or a Long Term Evolution (LTE) module. The communication device 510 may connect to the monitoring server 20 over either a wireless or a wired network. Because the data center 50 contains a large number of servers 500, wireless connections save space inside the data center 50; for that reason, in this embodiment the communication device 510 connects to the monitoring server 20 over a wireless network.
The monitoring server 20 provides a Dynamic Host Configuration Protocol (DHCP) service. Through the DHCP service, the monitoring server 20 can assign an Internet Protocol (IP) address to the communication device 510 of each server 500 in the data center 50, so that the monitoring server 20 can communicate with each communication device 510. The monitoring server 20 may be a personal computer, a network server, or any other suitable computer. The monitoring server 20 may also be placed inside the data center 50, or one of the servers 500 in the data center 50 may serve as the monitoring server.
The monitoring server 20 is connected to the database 30 through a database connection, which may be an Open Database Connectivity (ODBC) connection or a Java Database Connectivity (JDBC) connection. The database 30 stores the abnormal information sent by each server 500 of the data center 50. The abnormal information includes the number of the server 500 on which the abnormal condition occurred and the number of the affected hardware of that server 500 (for example, a CPU, memory module, hard disk, USB interface, power supply, fan, or optical drive). Each server 500 detects abnormal conditions itself: when one occurs, the server 500 records which hardware is abnormal, stores the number of that hardware in a log, and then sends it to the monitoring server 20 through the communication device 510.
Note that the database 30 may be separate from the monitoring server 20 or located within it; for example, the database 30 may be stored on a hard disk or flash drive of the monitoring server 20. For reasons of system security, the database 30 in this embodiment is separate from the monitoring server 20.
The client 10 provides an interactive interface for maintenance personnel, making it easy for them to operate the system and to store the data generated during operation in the monitoring server 20. The client 10 may be a personal computer, notebook computer, mobile phone, tablet computer, or any other device that can connect to the monitoring server 20. In this preferred embodiment, the client 10 is a mobile phone, because a phone is convenient for maintenance personnel to carry.
Fig. 2 is a structural diagram of a preferred embodiment of the monitoring server 20 of the present invention. In addition to the data center abnormal information collection system 200, the monitoring server 20 includes a memory 260 and a processor 270. The system 200 comprises a creation module 210, a distribution module 220, a receiver module 230, a storage module 240, and a sending module 250. The program code of modules 210 to 250 is stored in the memory 260, and the processor 270 executes this code to provide the functions of the system 200 described above.
The creation module 210 creates a list in the monitoring server 20; the list is used to store the abnormal information sent by the servers 500. Specifically, an operating system is installed on every server 500. While running, the operating system checks whether each piece of hardware is operating normally. For example, if the CPU of a server 500 fails while executing the instructions of a program and cannot continue working, the operating system records the CPU failure and saves it to a log file. The server 500 periodically sends the abnormal information in the log file to the monitoring server 20 through the communication device 510. A log file generally contains a great deal of information. If every server 500 sent its entire log file to the monitoring server 20, the network 40 would become congested and the storage load on the monitoring server 20 would grow. To avoid both, each server 500 extracts only part of the abnormal information in the log, such as the number of the server 500 and the number of the hardware on which the abnormal condition occurred, and sends that to the monitoring server 20.
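The extraction step on the server side can be sketched as follows. This is only an illustration under stated assumptions: the patent does not specify a log format, so the line layout ("timestamp level hardware-number message"), the "ERROR" level keyword, and the function name extract_abnormal_records are all hypothetical.

```python
from typing import Iterable, List, Tuple

Record = Tuple[int, int, str]  # (row, column, hardware number)


def extract_abnormal_records(log_lines: Iterable[str],
                             server_number: Tuple[int, int]) -> List[Record]:
    """Reduce a log to the compact records sent to the monitoring server."""
    row, column = server_number
    records: List[Record] = []
    for line in log_lines:
        fields = line.split()
        # Assumed layout: "<timestamp> <level> <hardware number> <message...>"
        if len(fields) >= 3 and fields[1] == "ERROR":
            record = (row, column, fields[2])
            if record not in records:  # report each faulty part only once
                records.append(record)
    return records
```

Sending only these short tuples, rather than the whole log file, is what keeps the network traffic and the storage load on the monitoring server small.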
The distribution module 220 assigns IP addresses, through the DHCP service of the monitoring server 20, to the communication devices 510 of the servers 500 in the data center 50, so as to establish a communication connection with each server 500.
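The bookkeeping behind address assignment can be illustrated with a small sketch. It does not implement the DHCP protocol itself (no discover/offer/request exchange and no lease timers); it only shows one address from a pool being handed to each communication device. The subnet 10.0.0.0/24 and the function name assign_addresses are assumptions made for the example.

```python
import ipaddress
from typing import Dict, Iterable, Tuple


def assign_addresses(server_numbers: Iterable[Tuple[int, int]],
                     subnet: str = "10.0.0.0/24") -> Dict[Tuple[int, int], str]:
    """Give each server's communication device the next free address in the pool."""
    pool = iter(ipaddress.ip_network(subnet).hosts())
    leases: Dict[Tuple[int, int], str] = {}
    for number in server_numbers:
        try:
            leases[number] = str(next(pool))
        except StopIteration:
            raise RuntimeError("address pool exhausted") from None
    return leases
```

In a real deployment an existing DHCP server would normally do this work; the sketch only makes the mapping from server number to IP address concrete.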
The receiver module 230 receives the abnormal information of each server 500 at preset time intervals (for example, every hour). The received abnormal information includes the number of the server 500 and the number of the hardware on which the abnormal condition occurred. Because the servers 500 of the data center 50 are stacked, the number of each server 500 consists of its row number and column number in the data center 50 and therefore reflects its physical position. As shown in Fig. 4, n denotes the column in which a server 500 is located and m denotes its row. For example, a server 500 numbered (20,1) is located in row 20, column 1 of the data center 50. A hardware number may consist of digits, letters, or a combination of both. For example, "01" denotes a CPU, "02" a hard disk, "03" a fan, "04" an optical drive, "05" a memory module, "06" a power supply, and "07" a USB interface. If a server 500 contains several pieces of the same hardware, a distinguishing number is appended to the base number to identify the specific piece. For example, for a server 500 with two CPUs, in the number "0102" the first two digits "01" denote a CPU and the last two digits "02" are the distinguishing number, identifying the second CPU in that server 500.
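The numbering scheme above can be decoded mechanically. The following sketch assumes the purely numeric variant given in the examples (a two-digit type code, optionally followed by a two-digit distinguishing number); the names HARDWARE_TYPES, decode_hardware_number and describe are illustrative and do not come from the patent.

```python
from typing import Tuple

# Type codes as listed in the description.
HARDWARE_TYPES = {
    "01": "CPU",
    "02": "hard disk",
    "03": "fan",
    "04": "optical drive",
    "05": "memory module",
    "06": "power supply",
    "07": "USB interface",
}


def decode_hardware_number(code: str) -> Tuple[str, int]:
    """Split a hardware number into (hardware type, instance index)."""
    if len(code) not in (2, 4) or not code.isdigit() or code[:2] not in HARDWARE_TYPES:
        raise ValueError(f"unsupported hardware number: {code!r}")
    # Without a distinguishing number, assume the server has a single such part.
    instance = int(code[2:]) if len(code) == 4 else 1
    return HARDWARE_TYPES[code[:2]], instance


def describe(record: Tuple[int, int, str]) -> str:
    """Turn a (row, column, hardware number) record into readable text."""
    row, column, code = record
    hardware, instance = decode_hardware_number(code)
    return f"row {row}, column {column}: {hardware} #{instance}"
```

For instance, describe((20, 1, "0102")) yields "row 20, column 1: CPU #2", which matches the second-CPU example in the text.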
The storage module 240 stores the received abnormal information of each server 500 in the list. Specifically, the storage module 240 stores the entries in the list in order of the row and column numbers of the servers 500.
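One way to keep the list ordered by row and column is a sorted insert. The sketch below assumes each entry is a (row, column, hardware number) tuple, so that ordinary tuple comparison yields the row-then-column order; the function name store_record and the duplicate check are assumptions, not details from the patent.

```python
import bisect
from typing import List, Tuple

Record = Tuple[int, int, str]  # (row, column, hardware number)


def store_record(abnormal_list: List[Record], record: Record) -> bool:
    """Insert a record so the list stays sorted by row, then column.

    Returns False when the identical record is already present.
    """
    index = bisect.bisect_left(abnormal_list, record)
    if index < len(abnormal_list) and abnormal_list[index] == record:
        return False
    abnormal_list.insert(index, record)
    return True
```

Keeping the list sorted means that entries for neighbouring servers appear next to each other when the list is sent to the client.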
The sending module 250 sends the abnormal information in the list to the client 10 at preset time intervals (for example, every ten minutes). Specifically, every ten minutes the sending module 250 checks whether the list contains abnormal information; if it does, the sending module 250 sends that information to the client 10 so that maintenance personnel know which hardware of which server 500 has a problem. For example, when maintenance personnel receive the message (20,1,0101), it indicates that the first CPU of the server 500 in row 20, column 1 of the data center 50 has failed.
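The check-then-send behaviour can be sketched as below. The message format follows the (20,1,0101) example in the text; the transport to the client is not specified in the patent, so it is represented here by a caller-supplied send callback, and the function names are hypothetical.

```python
from typing import Callable, List, Tuple

Record = Tuple[int, int, str]  # (row, column, hardware number)


def format_record(record: Record) -> str:
    """Render a record in the '(row,column,hardware number)' message form."""
    row, column, hardware_number = record
    return f"({row},{column},{hardware_number})"


def send_if_any(abnormal_list: List[Record], send: Callable[[str], None]) -> int:
    """Send one message per list entry; return how many were sent (0 if empty)."""
    for record in abnormal_list:
        send(format_record(record))
    return len(abnormal_list)
```

A scheduler or timer loop would call send_if_any once per preset interval, for example every ten minutes; nothing is sent when the list is empty.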
The receiver module 230 also receives, when a server 500 that experienced an abnormal condition returns to normal, a clear instruction from that server 500, so as to remove its abnormal information from the list. That is, after maintenance personnel have repaired the server 500 and it has again run normally for a certain period (for example, one hour), the server 500 sends a clear instruction to the monitoring server 20; on receiving the instruction, the monitoring server 20 removes the abnormal information of that server 500 from the list.
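Handling a clear instruction amounts to deleting every entry that belongs to the recovered server. A minimal sketch, assuming entries are (row, column, hardware number) tuples and that the clear instruction carries the server's row and column; the function name clear_server is illustrative.

```python
from typing import List, Tuple

Record = Tuple[int, int, str]  # (row, column, hardware number)


def clear_server(abnormal_list: List[Record], row: int, column: int) -> int:
    """Remove, in place, every entry for the server at (row, column).

    Returns the number of entries removed.
    """
    before = len(abnormal_list)
    abnormal_list[:] = [r for r in abnormal_list if (r[0], r[1]) != (row, column)]
    return before - len(abnormal_list)
```

Entries for other servers are left untouched, so the list continues to report any faults that are still outstanding.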
Fig. 3 is a flowchart of a preferred embodiment of the data center abnormal information collection method of the present invention.
Step S10: the creation module 210 creates a list in the monitoring server 20; the list is used to store the abnormal information sent by the servers 500. Specifically, an operating system is installed on every server 500. While running, the operating system checks whether each piece of hardware is operating normally. For example, if the CPU of a server 500 fails while executing the instructions of a program and cannot continue working, the operating system records the CPU failure and saves it to a log file. The server 500 periodically sends the abnormal information in the log file to the monitoring server 20 through the communication device 510. A log file generally contains a great deal of information. If every server 500 sent its entire log file to the monitoring server 20, the network 40 would become congested and the storage load on the monitoring server 20 would grow. To avoid both, each server 500 extracts only part of the abnormal information in the log, such as the number of the server 500 and the number of the hardware on which the abnormal condition occurred, and sends that to the monitoring server 20.
Step S20: the distribution module 220 assigns IP addresses, through the DHCP service of the monitoring server 20, to the communication devices 510 of the servers 500 in the data center 50, so as to establish a communication connection with each server 500.
Step S30: the receiver module 230 obtains the abnormal information sent by each server 500 at preset time intervals (for example, every hour). The received abnormal information includes the number of the server 500 and the number of the hardware on which the abnormal condition occurred. Because the servers 500 of the data center 50 are stacked, the number of each server 500 consists of its row number and column number in the data center 50 and therefore reflects its physical position. As shown in Fig. 4, n denotes the column in which a server 500 is located and m denotes its row. For example, a server 500 numbered (20,1) is located in row 20, column 1 of the data center 50. A hardware number may consist of digits, letters, or a combination of both. For example, "01" denotes a CPU, "02" a hard disk, "03" a fan, "04" an optical drive, "05" a memory module, "06" a power supply, and "07" a USB interface. If a server 500 contains several pieces of the same hardware, a distinguishing number is appended to the base number to identify the specific piece. For example, for a server 500 with two CPUs, in the number "0102" the first two digits "01" denote a CPU and the last two digits "02" are the distinguishing number, identifying the second CPU in that server 500.
Step S40: the storage module 240 stores the received abnormal information of each server 500 in the list. Specifically, the storage module 240 stores the entries in the list in order of the row and column numbers of the servers 500.
Step S50: the sending module 250 sends the abnormal information in the list to the client 10 at preset time intervals (for example, every ten minutes). Specifically, every ten minutes the sending module 250 checks whether the list contains abnormal information; if it does, the sending module 250 sends that information to the client 10 so that maintenance personnel know which hardware of which server 500 has a problem. For example, when maintenance personnel receive the message (20,1,0101), it indicates that the first CPU of the server 500 in row 20, column 1 of the data center 50 has failed.
Step S60: when a server 500 that experienced an abnormal condition returns to normal, the receiver module 230 receives a clear instruction from that server 500, so as to remove its abnormal information from the list. That is, after maintenance personnel have repaired the server 500 and it has again run normally for a certain period (for example, one hour), the server 500 sends a clear instruction to the monitoring server 20; on receiving the instruction, the monitoring server 20 removes the abnormal information of that server 500 from the list.
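Steps S10 and S30 to S60 can be tied together in one small sketch of the monitoring-server side. It is an illustration under stated assumptions rather than the patent's implementation: the class and method names are hypothetical, network transport and the DHCP step S20 are left out, time is passed in explicitly as seconds so the behaviour is easy to follow, and the send interval defaults to the ten-minute example from the text.

```python
import bisect
from typing import Callable, Iterable, List, Optional, Tuple

Record = Tuple[int, int, str]  # (row, column, hardware number)


class AbnormalInfoCollector:
    def __init__(self, send_interval: float = 600.0) -> None:
        self.abnormal_list: List[Record] = []   # S10: create the list
        self.send_interval = send_interval      # seconds between reports
        self._last_check: Optional[float] = None

    def receive(self, records: Iterable[Record]) -> None:
        """S30 + S40: accept records and store them sorted by row, then column."""
        for record in records:
            i = bisect.bisect_left(self.abnormal_list, record)
            if i == len(self.abnormal_list) or self.abnormal_list[i] != record:
                self.abnormal_list.insert(i, record)

    def maybe_send(self, now: float, send: Callable[[List[Record]], None]) -> bool:
        """S50: once per interval, send the list to the client if it is not empty."""
        if self._last_check is not None and now - self._last_check < self.send_interval:
            return False
        self._last_check = now
        if not self.abnormal_list:
            return False
        send(list(self.abnormal_list))
        return True

    def clear(self, row: int, column: int) -> None:
        """S60: on a clear instruction, drop every entry for that server."""
        self.abnormal_list = [r for r in self.abnormal_list
                              if (r[0], r[1]) != (row, column)]
```

A caller would invoke receive whenever a server reports, maybe_send from a periodic timer, and clear when a recovered server sends its clear instruction.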
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the invention has been described in detail with reference to the preferred embodiments above, those of ordinary skill in the art should understand that the technical solutions of the invention may be modified or replaced with equivalents without departing from their spirit and scope.

Claims (8)

1. A data center abnormal information collection system, running on a monitoring server that is connected to each server of a data center, characterized in that the system comprises:
a creation module for creating a list in the monitoring server, the list being used to store the abnormal information sent by the servers;
a receiver module for receiving, at preset time intervals, the abnormal information sent by each server;
a storage module for storing the received abnormal information of each server in the list; and
a sending module for sending, at preset time intervals, the abnormal information in the list to a client;
wherein the receiver module is further configured to receive, when a server that experienced an abnormal condition returns to normal, a clear instruction from that server, so as to remove that server's abnormal information from the list.
2. The data center abnormal information collection system of claim 1, characterized in that the abnormal information comprises the number of the server and the number of the hardware of the server on which the abnormal condition occurred.
3. The data center abnormal information collection system of claim 2, characterized in that the number of the server comprises the row number and column number of the server in the data center.
4. The data center abnormal information collection system of claim 1, characterized in that the servers of the data center are stacked together.
5. A data center abnormal information collection method, applied to a monitoring server that is connected to each server of a data center, characterized in that the method comprises:
creating a list in the monitoring server, the list being used to store the abnormal information sent by the servers;
receiving, at preset time intervals, the abnormal information sent by each server;
storing the received abnormal information of each server in the list;
sending, at preset time intervals, the abnormal information in the list to a client; and
when a server that experienced an abnormal condition returns to normal, receiving a clear instruction from that server, so as to remove that server's abnormal information from the list.
6. The data center abnormal information collection method of claim 5, characterized in that the abnormal information comprises the number of the server and the number of the hardware of the server on which the abnormal condition occurred.
7. The data center abnormal information collection method of claim 6, characterized in that the number of the server comprises the row number and column number of the server in the data center.
8. The data center abnormal information collection method of claim 5, characterized in that the servers of the data center are stacked together.
CN201210584066.0A 2012-12-28 2012-12-28 Data center abnormal information collection system and method Pending CN103905238A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210584066.0A CN103905238A (en) 2012-12-28 2012-12-28 Data center abnormal information collection system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210584066.0A CN103905238A (en) 2012-12-28 2012-12-28 Data center abnormal information collection system and method

Publications (1)

Publication Number Publication Date
CN103905238A true CN103905238A (en) 2014-07-02

Family

ID=50996395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210584066.0A Pending CN103905238A (en) 2012-12-28 2012-12-28 Data center abnormal information collection system and method

Country Status (1)

Country Link
CN (1) CN103905238A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060195561A1 (en) * 2005-02-28 2006-08-31 Microsoft Corporation Discovering and monitoring server clusters
CN101707632A (en) * 2009-10-28 2010-05-12 浪潮电子信息产业股份有限公司 Method for dynamically monitoring performance of server cluster and alarming real-timely
CN101854270A (en) * 2010-04-23 2010-10-06 山东中创软件工程股份有限公司 Multisystem running state monitoring method and system
CN102340411A (en) * 2010-07-26 2012-02-01 深圳市腾讯计算机系统有限公司 Server information data management method and system
CN102571441A (en) * 2012-01-18 2012-07-11 百度在线网络技术(北京)有限公司 Method, system and device for intelligently managing whole machine cabinet

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105898427A (en) * 2016-03-29 2016-08-24 天脉聚源(北京)传媒科技有限公司 Method and apparatus for positioning abnormal log-in of set top box
CN105898427B (en) * 2016-03-29 2019-07-26 天脉聚源(北京)传媒科技有限公司 A kind of method and apparatus that set-top box login positions extremely
CN106649555A (en) * 2016-11-08 2017-05-10 深圳市中博睿存科技有限公司 Memory unit state marking method and distributed memory system

Similar Documents

Publication Publication Date Title
CN103368785A (en) Server operation monitoring system and method
CN105808394B (en) Server self-healing method and device
CN102833083A (en) Data center power supply device control system and method
CN107682172B (en) Control center device, service system processing method and medium
CN102761528A (en) System and method for data management
CN102654836A (en) Virtual machine mounting system and method
CN102811141A (en) Method and system for monitoring running of virtual machines
CN111767173A (en) Network equipment data processing method and device, computer equipment and storage medium
CN103164277A (en) Dynamic resource planning distribution system and method
CN103905238A (en) Data center abnormal information collection system and method
JP2017536759A (en) Method and apparatus for self-healing after disconnection of base station
CN111130934A (en) Monitoring method, device and system of communication system
CN102006190B (en) High-availability cluster backup system and backup method thereof
CN103902310A (en) Scheduling system and method for starting of virtual machines
CN106850262B (en) Server network management method
CN102215518B (en) Method for backing-up and restoring configuration of network access equipment according to user permission
CN103064740A (en) Guest operating system predict migration system and method
CN110019536B (en) Database system based on medical block chain technology
CN104158843A (en) Storage unit invalidation detecting method and device for distributed file storage system
CN115242621B (en) Network private line monitoring method, device, equipment and computer readable storage medium
CN116737444A (en) Database server fault processing method and system
CN104079660A (en) Method and device for constant value interaction of information protection system and constant value pre-warning system
CN100452721C (en) Customer end management system and method
CN110012109B (en) Method for establishing engineering information capable of realizing high accuracy
CN102810067A (en) Virtual machine template updating system and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140702