CN106844145A - Server hardware fault early warning method and device
- Publication number
- CN106844145A (application number CN201611247164.XA)
- Authority
- CN
- China
- Prior art keywords
- server
- hardware
- early warning
- log
- hardware fault
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3037—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/324—Display of status information
- G06F11/327—Alarm or error message display
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a server hardware fault early warning method and device. The method includes: creating a hardware fault early warning list in advance, in which different hardware fault early warning information is stored together with the corresponding server log content; obtaining the server system operation log and matching the obtained log against the hardware fault early warning list; and, if there is a matching entry, determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry. This technical solution can issue a warning in good time before a server hardware fault occurs, so that the location of the problem can be identified from the early warning information and handled promptly. Little time is consumed, and the stability of the whole server hardware system is ensured.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a server hardware fault early warning method and device.
Background
As business demand on a server cluster grows, the amount of server hardware also keeps increasing. Among so many servers, once server hardware fails, for example because it has passed its warranty period (referred to as being out of warranty), the performance of the server hardware can degrade, or the server may even go down unexpectedly, which affects the operation of the whole server hardware system. In the maintenance of large numbers of servers, the prior art generally discovers a hardware fault only after it has occurred and only then resolves it. Server hardware faults cannot be found in time, nor can the location of the problem be identified in time, so the cycle for resolving the problem is long, which in turn affects the stability of the whole server hardware system.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a server hardware fault early warning method and device that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a server hardware fault early warning method is provided, including:
creating a hardware fault early warning list in advance, in which different hardware fault early warning information is stored together with the corresponding server log content;
obtaining the server system operation log, and matching the obtained server system operation log against the hardware fault early warning list;
if there is a matching entry, determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry.
Optionally, obtaining the server system operation log and matching the obtained server system operation log against the hardware fault early warning list includes:
obtaining the hardware-related logs in the server operation log;
matching the obtained hardware-related logs against the hardware fault early warning list.
Optionally, obtaining the hardware-related logs in the server operation log includes:
determining, according to the system configuration of the server, the names of the log files that store hardware-related logs;
obtaining the hardware-related logs from the corresponding log files according to the determined log file names.
Optionally, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, the method further includes:
if there are other servers that store the same data as the server and provide the same services, migrating the services on the server to those other servers.
Optionally, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, the method further includes:
if there are no other servers that store the same data as the server and provide the same services, migrating all of the data and services on the server to a designated standby server.
Optionally, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, the method further includes:
sending, through a designated channel to a designated location, a warning message containing the server's identifier and the fault early warning information.
Optionally, the method further includes:
receiving an early warning false-alarm notification regarding the server;
putting the server back into service.
Optionally, the method further includes:
when a server actually experiences a hardware fault, obtaining the hardware-related logs in the server system operation log within the time range corresponding to the occurrence of the hardware fault;
finding, in the obtained hardware-related logs, at least one log related to the hardware fault that the server actually experienced;
storing the log content found, together with the early warning information for the hardware fault that the server actually experienced, in the hardware fault early warning list.
According to another aspect of the present invention, a server hardware fault early warning device is provided, including:
a list maintenance unit, adapted to create a hardware fault early warning list in advance, in which different hardware fault early warning information is stored together with the corresponding server log content;
a log matching unit, adapted to obtain the server system operation log, match the obtained server system operation log against the hardware fault early warning list, and notify the fault early warning unit if there is a matching entry;
a fault early warning unit, adapted to determine, after receiving the notification from the log matching unit, that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry.
Optionally, the log matching unit is adapted to obtain the hardware-related logs in the server operation log, and to match the obtained hardware-related logs against the hardware fault early warning list.
Optionally, the log matching unit is adapted to determine, according to the system configuration of the server, the names of the log files that store hardware-related logs, and to obtain the hardware-related logs from the corresponding log files according to the determined log file names.
Optionally, the device further includes:
an early warning processing unit, adapted to judge, when the fault early warning unit determines that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, whether there are other servers that store the same data as the server and provide the same services, and, if so, to migrate the services on the server to those other servers.
Optionally, the early warning processing unit is further adapted to migrate all of the data and services on the server to a designated standby server when it judges that there are no other servers that store the same data as the server and provide the same services.
Optionally, the fault early warning unit is further adapted to send, through a designated channel to a designated location, a warning message containing the server's identifier and the fault early warning information.
Optionally, the fault early warning unit is further adapted to receive an early warning false-alarm notification regarding the server and to put the server back into service.
Optionally,
the list maintenance unit is further adapted, when a server actually experiences a hardware fault, to obtain the hardware-related logs in the server system operation log within the time range corresponding to the occurrence of the hardware fault; to find, in the obtained hardware-related logs, at least one log related to the hardware fault that the server actually experienced; and to store the log content found, together with the early warning information for the hardware fault that the server actually experienced, in the hardware fault early warning list.
In summary, according to the technical solution of the present invention, a hardware fault early warning list is created in advance that stores different hardware fault early warning information together with the corresponding server log content. The server system operation log is obtained in real time and matched against the previously created hardware fault early warning list. If there is no matching entry, the server is considered not to be heading for a hardware fault. If there is a matching entry, it is determined that the server hardware is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry. Server hardware maintenance personnel thus obtain the early warning information in time, can use it to identify the server hardware that is about to fail and the location of the problem, and can handle it promptly. It can be seen that the present invention can issue a timely warning before a server hardware fault occurs, so that the location of the problem can be identified from the early warning information and handled promptly. Little time is consumed, and the stability of the whole server hardware system is ensured.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered a limitation of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a schematic flowchart of a server hardware fault early warning method according to an embodiment of the present invention;
Fig. 2 shows a schematic structural diagram of a server hardware fault early warning device according to an embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of a server hardware fault early warning device according to another embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Fig. 1 shows a schematic flowchart of a server hardware fault early warning method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step S110: create a hardware fault early warning list in advance, in which different hardware fault early warning information is stored together with the corresponding server log content.
The system operation log on a server records the operating condition of the server, including abnormal information that arises while the server is running. A hardware fault early warning list is therefore created from known fault early warning information and the log information that corresponds to it. The fault early warning list can include different hardware fault early warning information and the corresponding server log content. For example, the list may include early warning information for a server crash and the server log content that corresponds to it.
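One way to picture the list created in step S110 is as a collection of pairs, each linking a piece of early warning information to the log content associated with it. The following is a minimal sketch under that assumption; the class name `WarningEntry`, the function `create_warning_list` and the log strings are hypothetical placeholders, since the patent does not specify a storage format or any concrete log text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarningEntry:
    """One row of the hardware fault early warning list."""
    warning_info: str  # description of the fault the server is expected to have
    log_content: str   # server log content associated with that fault


def create_warning_list():
    # Placeholder entries for illustration only; real entries would come
    # from known faults and the log content observed before them.
    return [
        WarningEntry("server may crash", "example: watchdog timeout"),
        WarningEntry("disk may fail", "example: disk read error"),
        WarningEntry("memory may fail", "example: correctable memory error"),
    ]
```

A frozen dataclass is used here so that entries are hashable and cannot be modified by accident once they are in the list.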
Step S120: obtain the server system operation log, and match the obtained server system operation log against the hardware fault early warning list.
The fault early warning list includes different hardware fault early warning information and the corresponding server log content. Whenever the server system operation log contains server log content that appears in the fault early warning list, the server may experience the corresponding hardware fault. Therefore, to detect whether a server is going to have a hardware fault, the server system operation log must be obtained and then matched against the hardware fault early warning list. If there is no matching entry, the server is considered to be at no risk of a hardware fault.
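The matching in step S120 can be sketched as a scan of each log line against every entry in the list. The patent does not say how a match is decided, so plain substring containment is assumed here; the function name `match_log` and the representation of the list as `(warning_info, log_content)` tuples are likewise assumptions made for illustration.

```python
def match_log(log_lines, warning_list):
    """Return (warning_info, line) for every log line that contains
    the log content of some entry in the warning list.

    warning_list is a list of (warning_info, log_content) tuples.
    Substring matching is an assumption of this sketch.
    """
    matches = []
    for line in log_lines:
        for warning_info, log_content in warning_list:
            if log_content in line:
                matches.append((warning_info, line))
    return matches
```

An empty result corresponds to the "no matching entry" case, and a non-empty result feeds step S130.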
Step S130: if there is a matching entry, determine that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry.
The system operation log content of each server is monitored for server log content that appears in the fault early warning list. Whenever such content is found, the server hardware is considered to be about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry. For example, the fault early warning list includes early warning information for a server crash and the server log content that corresponds to it. When the system log content obtained from server A contains log content that matches the server log content corresponding to a server crash in the fault early warning list, server A is considered likely to experience the crash hardware fault described in the out-of-warranty early warning information.
After a match is found, the corresponding early warning information is output so that maintenance personnel can determine the problem and handle it promptly. Because the early warning information describes hardware faults that may occur on the corresponding server, maintenance personnel can use it to locate the problem in time, judge whether the server can continue to be used, and handle it accordingly, which prevents the server from affecting system stability after a problem occurs. For example, if early warning information for a disk fault on server B appears, the business on server B's disk can first be moved out. Maintenance personnel then inspect the server, determine the problem and resolve it promptly. If server B can continue to be used, the business can be moved back. If server B is no longer usable, a new server is added to replace it.
It can be seen that the present invention can issue a timely warning before a server hardware fault occurs, so that the location of the problem can be identified from the early warning information and handled promptly. Little time is consumed, and the stability of the whole server hardware system is ensured.
The system operation log on a server records the operating condition of the server, including abnormal information that arises while the server is running. However, the volume of system operation logs on a server is enormous, and for the sake of efficiency it is not feasible to traverse all of them. In one embodiment of the present invention, obtaining the server system operation log and matching it against the hardware fault early warning list in step S120 includes: obtaining the hardware-related logs in the server operation log, and matching the obtained hardware-related logs against the hardware fault early warning list. Because the aim is early warning of hardware faults, only the hardware-related logs in the server operation log need to be obtained, for example the logs relating to server memory and the logs of hardware such as the server's disks, CPU, mainboard and power supply.
Because hardware-related logs are updated continuously, they can be obtained in real time to achieve real-time monitoring of the server. Alternatively, they can be obtained at a predetermined interval, for example 1 minute: every time 1 minute elapses, the hardware-related logs are obtained once.
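The periodic variant described above can be sketched as a loop that, at each interval, reads only the part of a log file written since the previous read. This is an illustrative sketch rather than the patent's implementation: the function names `read_new_lines` and `poll`, the byte-offset bookkeeping and the `rounds` parameter (used only to stop the loop in a demonstration) are all assumptions.

```python
import time


def read_new_lines(path, offset):
    """Read the lines appended to `path` after byte position `offset`.

    Returns (lines, new_offset). The file is opened in binary mode so the
    offset is an exact byte count.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(data)


def poll(path, handle_lines, interval_s=60.0, rounds=None):
    """Call handle_lines(new_lines) every interval_s seconds.

    rounds=None polls forever; a number stops after that many reads.
    """
    offset = 0
    done = 0
    while rounds is None or done < rounds:
        lines, offset = read_new_lines(path, offset)
        if lines:
            handle_lines(lines)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
```

The sketch does not handle log rotation or a line that is only partly written at the moment of reading; a production collector would need to account for both.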
Specifically, obtaining the hardware-related logs in the server operation log includes: determining, according to the system configuration of the server, the names of the log files that store hardware-related logs; and obtaining the hardware-related logs from the corresponding log files according to the determined log file names.
For example, the memory-related information in the server's system configuration is used to determine the name of the memory-related log file on the server, and the memory-related logs are then obtained from the corresponding log file according to that name.
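The lookup from system configuration to log file names might look like the sketch below. The configuration layout (a `log_dir` key plus a `hardware_logs` mapping from component to file name) and the function name `hardware_log_paths` are hypothetical; the patent only states that the file names are determined from the server's system configuration.

```python
from pathlib import PurePosixPath


def hardware_log_paths(system_config, components):
    """Map each requested hardware component to its log file path.

    system_config is assumed to look like:
        {"log_dir": "/var/log", "hardware_logs": {"memory": "mem.log"}}
    Components without a configured log file are skipped.
    """
    log_dir = PurePosixPath(system_config["log_dir"])
    names = system_config["hardware_logs"]
    return {c: str(log_dir / names[c]) for c in components if c in names}
```

Restricting collection to the returned paths is what keeps the matcher from traversing every log on the server.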
In one embodiment of the present invention, after step S130 determines that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, the method shown in Fig. 1 further includes: if there are other servers that store the same data as the server and provide the same services, migrating the services on the server to those other servers.
Once it has been determined that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, the services on the server are first migrated to other servers. This guards against the server actually experiencing the fault and ensures the stability of the services that the server carries. The other servers referred to here are servers that store the same data as the server and provide the same services, which ensures that the business keeps running normally.
Migrating the services on the server to other servers first requires checking whether there are other servers that store the same data as the server and provide the same services. If there are none, then after step S130 determines that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, the method shown in Fig. 1 further includes: if there are no other servers that store the same data as the server and provide the same services, migrating all of the data and services on the server to a designated standby server.
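The two migration cases can be expressed as a single decision: move only the services when an equivalent server exists, otherwise move both data and services to the standby. The sketch below assumes a hypothetical `replicas` mapping from each server to the servers that store the same data and provide the same services; the function name and the shape of the returned plan are illustrative only.

```python
def choose_migration_plan(server, replicas, standby):
    """Decide where to move a server's workload after a warning.

    replicas maps a server name to the list of servers that store the
    same data and provide the same services (an assumed data source).
    """
    peers = [s for s in replicas.get(server, []) if s != server]
    if peers:
        # An equivalent server already holds the data: move services only.
        return {"target": peers[0], "migrate": ["services"]}
    # No equivalent server: move data and services to the standby server.
    return {"target": standby, "migrate": ["data", "services"]}
```

Picking the first peer is an arbitrary choice in this sketch; a real scheduler would probably weigh load or locality.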
In one embodiment of the present invention, after step S130 determines that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, the method shown in Fig. 1 further includes: sending, through a designated channel to a designated location, a warning message containing the server's identifier and the fault early warning information.
After it is determined that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry, the relevant warning message and the information about the corresponding server need to be output to a designated location so that the relevant personnel receive the early warning information in time. For example, they can be sent by email to the mailbox of the maintenance personnel.
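For the email example, the warning message could be assembled with Python's standard `email.message.EmailMessage`. This sketch only builds the message; the function name `build_alert`, the addresses and the subject format are assumptions, and actually delivering it (for instance with `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_alert(server_id, warning_info, recipient,
                sender="hw-warning@example.com"):
    """Build a warning email carrying the server identifier and the
    fault early warning information. Addresses are placeholders."""
    msg = EmailMessage()
    msg["Subject"] = f"[hardware fault warning] {server_id}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Server: {server_id}\n"
        f"Predicted fault: {warning_info}\n"
    )
    return msg
```

Keeping message construction separate from delivery makes it straightforward to swap email for another designated channel.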
However, the possibility that a warning message is wrong, that is, a false alarm, cannot be ruled out. A false alarm may occur after the data and services on the corresponding server have already been migrated away or after the server has already been taken out of use. To ensure that the server is put back into service in that case, the method further includes: receiving an early warning false-alarm notification regarding the server; and putting the server back into service, or migrating the moved data and services back again. For example, after an unstable supply voltage triggers a warning message that the server may power off, the server may be taken out of use. If investigation then finds that the warning was caused by normal voltage fluctuation, the server needs to be put back into service, and the relevant personnel send an early warning false-alarm notification for the server. After that notification is received, the server is put back into service.
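Handling a false-alarm notification amounts to undoing whatever the earlier warning caused. A minimal sketch, assuming a hypothetical per-server state record with `in_service` and `migrated_to` fields (neither of which is defined in the patent), could return the list of actions to carry out:

```python
def handle_false_alarm(state, server):
    """Undo the effects of a warning that turned out to be a false alarm.

    state maps a server name to a record such as
        {"in_service": False, "migrated_to": "standby-1"}
    Returns the list of actions taken and updates the record in place.
    """
    record = state[server]
    actions = []
    if record["migrated_to"] is not None:
        actions.append(
            f"migrate data and services back from {record['migrated_to']}"
        )
        record["migrated_to"] = None
    if not record["in_service"]:
        actions.append("put server back into service")
        record["in_service"] = True
    return actions
```

Calling the function a second time for the same server returns an empty list, so a duplicate notification does no harm.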
Because the fault early warning information and corresponding server log content in the created hardware fault early warning list cannot cover every situation, the list also needs to be updated continuously. In one embodiment of the present invention, the method shown in Fig. 1 further includes:
When a server actually experiences a hardware fault, the fact that the fault has already occurred indicates that the hardware fault early warning list does not hold early warning information and corresponding log content for that fault. It is therefore necessary to obtain the hardware-related logs in the server system operation log within the time range corresponding to the occurrence of the hardware fault; to find, in the obtained hardware-related logs, at least one log related to the hardware fault that the server actually experienced; and to store the log content found, together with the early warning information for that hardware fault, in the hardware fault early warning list, thereby updating the list.
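The update procedure can be sketched as: filter the log records to a time window before the fault, select the ones related to the fault, and append them to the list. How "related" logs are identified is not specified in the patent, so a keyword filter is assumed here; the function name `learn_from_fault`, the 30-minute default window and the tuple representation are also assumptions.

```python
from datetime import datetime, timedelta


def learn_from_fault(warning_list, log_records, fault_time, fault_info,
                     keyword, window=timedelta(minutes=30)):
    """Add log content seen shortly before an actual fault to the list.

    log_records is a list of (timestamp, text) pairs from hardware-related
    logs. A record counts as related if it falls within `window` before
    fault_time and contains `keyword` (an assumed relevance test).
    Returns the related log texts; warning_list is extended in place with
    (fault_info, text) pairs, skipping pairs that are already present.
    """
    start = fault_time - window
    related = [text for ts, text in log_records
               if start <= ts <= fault_time and keyword in text]
    for text in related:
        entry = (fault_info, text)
        if entry not in warning_list:
            warning_list.append(entry)
    return related
```

In practice the stored log content would likely need normalizing (for example, stripping timestamps or counters) so that future log lines can match it.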
Fig. 2 shows a schematic structural diagram of a server hardware fault early warning device according to an embodiment of the present invention. As shown in Fig. 2, the server hardware fault early warning device 200 includes:
List maintenance unit 210, adapted to create a hardware fault early warning list in advance, in which different hardware fault early warning information is stored together with the corresponding server log content.
The system operation log on a server records the operating condition of the server, including abnormal information that arises while the server is running. A hardware fault early warning list is therefore created from known fault early warning information and the log information that corresponds to it. The fault early warning list can include different hardware fault early warning information and the corresponding server log content. For example, the list may include early warning information for a server crash and the server log content that corresponds to it.
Log matching unit 220, adapted to obtain the server system operation log, match the obtained server system operation log against the hardware fault early warning list, and notify the fault early warning unit if there is a matching entry.
The fault early warning list includes different hardware fault early warning information and the corresponding server log content. Whenever the server system operation log contains server log content that appears in the fault early warning list, the server may experience the corresponding hardware fault. Therefore, to detect whether a server is going to have a hardware fault, the server system operation log must be obtained and then matched against the hardware fault early warning list. If there is no matching entry, the server is considered to be at no risk of a hardware fault.
Fault early warning unit 230, adapted to determine, after receiving the notification from the log matching unit, that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry.
The system operation log content of each server is monitored for server log content that appears in the fault early warning list. Whenever such content is found, the server hardware is considered to be about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching entry. For example, the fault early warning list includes early warning information for a server crash and the server log content that corresponds to it. When the system log content obtained from server A contains log content that matches the server log content corresponding to a server crash in the fault early warning list, server A is considered likely to experience the crash hardware fault described in the out-of-warranty early warning information.
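The three units of device 200 and the notification path between them can be sketched as small classes. This is only one possible arrangement: the class and method names are hypothetical, substring matching is assumed, and the fault early warning unit here merely records the warnings it receives.

```python
class ListMaintenanceUnit:
    """Holds the hardware fault early warning list."""

    def __init__(self):
        self.entries = []  # (warning_info, log_content) pairs

    def add(self, warning_info, log_content):
        self.entries.append((warning_info, log_content))


class FaultWarningUnit:
    """Receives notifications and records the predicted faults."""

    def __init__(self):
        self.warnings = []  # (server_id, warning_info) pairs

    def notify(self, server_id, warning_info):
        self.warnings.append((server_id, warning_info))


class LogMatchingUnit:
    """Matches log lines against the list and notifies on a match."""

    def __init__(self, list_unit, warning_unit):
        self.list_unit = list_unit
        self.warning_unit = warning_unit

    def process(self, server_id, log_lines):
        for line in log_lines:
            for warning_info, log_content in self.list_unit.entries:
                if log_content in line:
                    self.warning_unit.notify(server_id, warning_info)
```

For instance, after adding an entry to a `ListMaintenanceUnit` and passing a server's log lines to `LogMatchingUnit.process`, any matches appear in `FaultWarningUnit.warnings`.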
After a match is found, the corresponding early warning information is output so that maintenance personnel can determine the problem and handle it promptly. Because the early warning information describes hardware faults that may occur on the corresponding server, maintenance personnel can use it to locate the problem in time, judge whether the server can continue to be used, and handle it accordingly, which prevents the server from affecting system stability after a problem occurs. For example, if early warning information for a disk fault on server B appears, the business on server B's disk can first be moved out. Maintenance personnel then inspect the server, determine the problem and resolve it promptly. If server B can continue to be used, the business can be moved back. If server B is no longer usable, a new server is added to replace it.
It can be seen that the present invention can issue a timely warning before a server hardware fault occurs, so that the location of the problem can be identified from the early warning information and handled promptly. Little time is consumed, and the stability of the whole server hardware system is ensured.
The system operation log on a server records the operating condition of the server, including abnormal information that arises while the server is running. However, the volume of system operation logs on a server is enormous, and for the sake of efficiency it is not feasible to traverse all of them. In one embodiment of the present invention, log matching unit 220 is adapted to obtain the hardware-related logs in the server operation log and to match the obtained hardware-related logs against the hardware fault early warning list. Because the aim is early warning of hardware faults, only the hardware-related logs in the server operation log need to be obtained, for example the logs relating to server memory and the logs of hardware such as the server's disks, CPU, mainboard and power supply.
Because hardware-related logs are updated continuously, they can be obtained in real time to achieve real-time monitoring of the server. Alternatively, they can be obtained at a predetermined interval, for example 1 minute: every time 1 minute elapses, the hardware-related logs are obtained once.
Specifically, log matching unit 220 is adapted to determine, according to the system configuration of the server, the names of the log files that store hardware-related logs, and to obtain the hardware-related logs from the corresponding log files according to the determined log file names.
For example, the memory-related information in the server's system configuration is used to determine the name of the memory-related log file on the server, and the memory-related logs are then obtained from the corresponding log file according to that name.
Fig. 3 shows a schematic structural diagram of a server hardware fault early warning device according to another embodiment of the present invention. As shown in Fig. 3, the server hardware fault early warning device 300 includes: list maintenance unit 310, log matching unit 320, fault early warning unit 330 and early warning processing unit 340. List maintenance unit 310, log matching unit 320 and fault early warning unit 330 have the same functions as the corresponding list maintenance unit 210, log matching unit 220 and fault early warning unit 230 shown in Fig. 2, and the identical parts are not repeated here.
The early warning processing unit 340 is adapted to, when the fault early warning unit determines that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, judge whether there is another server that stores the same data as the server and provides the same services, and if so, migrate the services on the server to that other server.
Once it has been determined that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, the services on the server are first migrated to another server, in order to guard against the server actually experiencing that fault and to ensure the stability of the services the server carries. The other server referred to here is one that stores the same data as the server and provides the same services, which ensures that the business keeps running normally.
Before the services on the server are migrated to another server, it is first necessary to find out whether there is another server that stores the same data as the server and provides the same services. If there is none, then in one embodiment of the invention the early warning processing unit 340 is further adapted to, when it judges that no other server stores the same data as the server and provides the same services, migrate all the data and services on the server to a designated standby server.
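The migration decision described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Server` record, the use of sets to represent stored data and provided services, and the function name are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical server record: a name, the data it stores, the services it provides."""
    name: str
    data: frozenset
    services: frozenset


def choose_migration_target(failing, others, standby):
    """Pick where to migrate and what to migrate for a server expected to fail.

    If another server stores the same data and provides the same services,
    only the services are migrated to it; otherwise both data and services
    are moved to the designated standby server.
    """
    for candidate in others:
        if (candidate.name != failing.name
                and candidate.data == failing.data
                and candidate.services == failing.services):
            return candidate, "services"
    return standby, "data_and_services"
```

The function only decides the target; performing the migration itself is outside the scope of this sketch.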
In one embodiment of the invention, the fault early warning unit 330 is further adapted to send, through a designated channel to a designated location, an alarm message containing the server identifier and the fault early warning information.
After it is determined that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, the alarm message and the information of the corresponding server must be output to a designated location so that the relevant personnel receive the early warning information promptly, for example by sending it by email to the mailbox of the maintenance personnel.
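As an illustration of the email channel mentioned above, the alarm message could be assembled with Python's standard `email.message.EmailMessage`. The function name, subject format, and sender address below are assumptions made for the sketch, and the message is only built, not sent.

```python
from email.message import EmailMessage


def build_alarm_message(server_id, warning_info, recipient,
                        sender="monitor@example.com"):
    """Build an alarm email containing the server identifier and the warning.

    The sender address and subject format are placeholders for illustration.
    """
    msg = EmailMessage()
    msg["Subject"] = f"[Hardware fault warning] {server_id}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"Server: {server_id}\nWarning: {warning_info}\n")
    return msg
```

Delivery could then be handled by any mail transport, for instance `smtplib.SMTP.send_message`, configured for the site's own mail server.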
However, the possibility that an alarm message is wrong, that is, a false alarm, cannot be ruled out. If a false alarm occurs but the data and services on the corresponding server have already been migrated away, or the server has already been taken out of use, the server needs to be put back into use. Specifically, the fault early warning unit 330 is further adapted to receive a false alarm notification for the server, and to put the server back into use or migrate the moved data and services back. For example, after an unstable supply voltage triggers an alarm message that the server may power off, the server may be taken out of use; if investigation then finds that the alarm message was caused by normal voltage fluctuation, the server needs to be put back into use. The relevant personnel then send a false alarm notification for the server, and after that notification is received, the server is put back into use.
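The false-alarm handling above can be sketched as a small state transition. The `Status` values, action names, and function name are hypothetical labels chosen for this sketch; the patent does not prescribe them.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical server states after an early warning was handled."""
    IN_SERVICE = "in_service"
    MIGRATED = "migrated"   # data and services were moved away
    STOPPED = "stopped"     # server was taken out of use


def on_false_alarm(status):
    """Return (action, new_status) when a false alarm notification arrives."""
    if status is Status.MIGRATED:
        return "migrate_back", Status.IN_SERVICE
    if status is Status.STOPPED:
        return "resume", Status.IN_SERVICE
    # The server was never withdrawn, so nothing has to be undone.
    return "none", status
```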
Because the fault early warning information and the corresponding server log content included in the created hardware fault early warning list cannot cover every situation, the hardware fault early warning list also needs to be updated continuously. In one embodiment of the invention, the list maintenance unit 310 is further adapted to update the list when a server actually experiences a hardware fault. Since the server has already experienced the hardware fault, the hardware fault early warning list evidently did not contain the early warning information and corresponding log content for that fault. It is therefore necessary to: obtain the hardware-related logs in the server system running log within the time range corresponding to the occurrence of the hardware fault; find, in the obtained hardware-related logs, at least one log related to the hardware fault that the server actually experienced; and store the found log content in the hardware fault early warning list in correspondence with the early warning information of that hardware fault, thereby updating the hardware fault early warning list.
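The update procedure above can be sketched as follows. The sketch assumes the early warning list is a dictionary from log content to warning information, that log entries carry timestamps, and that a caller-supplied predicate `is_related` decides which logs relate to the fault; the one-hour window is an arbitrary placeholder. None of these details are specified by the patent.

```python
from datetime import datetime, timedelta


def update_warning_list(warning_list, log_entries, fault_time, fault_info,
                        is_related, window=timedelta(hours=1)):
    """Add log content observed before an actual fault to the warning list.

    warning_list: dict mapping log content -> hardware fault warning info.
    log_entries:  iterable of (timestamp, log_line) pairs, hardware-related only.
    is_related:   hypothetical predicate selecting logs related to the fault.
    """
    start = fault_time - window
    for timestamp, line in log_entries:
        if start <= timestamp <= fault_time and is_related(line):
            # Keep an existing entry if this log content is already listed.
            warning_list.setdefault(line, fault_info)
    return warning_list
```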
In summary, according to the technical solution of the invention, a hardware fault early warning list is created in advance that stores different pieces of hardware fault early warning information in correspondence with the corresponding server log content; the server system running log is obtained in real time and matched against the pre-created hardware fault early warning list. If there is no matching item, this indicates that the server will not experience a hardware fault. If there is a matching item, it is determined that the server hardware is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, so that server hardware maintenance personnel obtain the early warning information in time, can promptly identify from it the server hardware that is about to fail and where the problem lies, and can handle it in time. It can be seen that the invention gives timely warning before the server hardware fails, so that the problem can be located from the early warning information and handled promptly; little time is consumed, and the stability of the whole server hardware system is ensured.
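The overall matching step can be sketched as follows. This is a minimal sketch assuming substring matching between listed log content and log lines; the patent does not state how the comparison is performed, and the list entries below are invented examples rather than real log messages.

```python
# Hypothetical early warning list: log content -> hardware fault warning info.
WARNING_LIST = {
    "memory correctable error threshold exceeded": "Memory module may fail",
    "power supply voltage out of range": "Server may power off",
}


def match_logs(log_lines, warning_list):
    """Return (log_line, warning_info) for every log line that matches the list.

    An empty result means no matching item was found, so no fault is predicted.
    """
    matches = []
    for line in log_lines:
        for content, info in warning_list.items():
            if content in line:
                matches.append((line, info))
    return matches
```

Each returned pair would then be handed to the alarm and migration steps described earlier.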
It should be noted that:
The algorithms and displays provided herein are not inherently related to any particular computer, virtual device, or other apparatus. Various general-purpose devices may also be used with the teachings herein. From the description above, the structure required to construct such devices is apparent. Furthermore, the present invention is not directed to any particular programming language. It should be understood that various programming languages may be used to implement the content of the invention described herein, and that the above description of a specific language is given to disclose the best mode of the invention.
In the specification provided herein, numerous specific details are set forth. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single previously disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and may furthermore be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the server hardware fault early warning device according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order. These words may be interpreted as names.
The invention discloses A1, a server hardware fault early warning method, comprising:
pre-creating a hardware fault early warning list, in which different pieces of hardware fault early warning information are stored in correspondence with the corresponding server log content;
obtaining a server system running log, and matching the obtained server system running log against the hardware fault early warning list;
if there is a matching item, determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item.
A2. The method of A1, wherein obtaining the server system running log and matching the obtained server system running log against the hardware fault early warning list comprises:
obtaining the hardware-related logs in the server running log;
matching the obtained hardware-related logs against the hardware fault early warning list.
A3. The method of A2, wherein obtaining the hardware-related logs in the server running log comprises:
determining, according to the system configuration of the server, the names of the log files that store hardware-related logs;
obtaining the hardware-related logs from the corresponding log files according to the determined log file names.
A4. The method of A1, wherein, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, the method further comprises:
if there is another server that stores the same data as the server and provides the same services, migrating the services on the server to that other server.
A5. The method of A4, wherein, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, the method further comprises:
if there is no other server that stores the same data as the server and provides the same services, migrating all the data and services on the server to a designated standby server.
A6. The method of A1, wherein, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, the method further comprises:
sending, through a designated channel to a designated location, an alarm message containing the server identifier and the fault early warning information.
A7. The method of A6, wherein the method further comprises:
receiving a false alarm notification for the server;
putting the server back into use.
A8. The method of any one of A1-A7, wherein the method further comprises:
when a server actually experiences a hardware fault, obtaining the hardware-related logs in the server system running log within the time range corresponding to the occurrence of the hardware fault;
finding, in the obtained hardware-related logs, at least one log related to the hardware fault that the server actually experienced;
storing the found log content in the hardware fault early warning list in correspondence with the early warning information of the hardware fault that the server actually experienced.
The invention also discloses B9, a server hardware fault early warning device, comprising:
a list maintenance unit, adapted to pre-create a hardware fault early warning list, in which different pieces of hardware fault early warning information are stored in correspondence with the corresponding server log content;
a log matching unit, adapted to obtain a server system running log, match the obtained server system running log against the created hardware fault early warning list, and notify the fault early warning unit if there is a matching item;
a fault early warning unit, adapted to determine, after receiving the notification from the log matching unit, that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item.
B10. The device of B9, wherein
the log matching unit is adapted to obtain the hardware-related logs in the server running log, and to match the obtained hardware-related logs against the hardware fault early warning list.
B11. The device of B10, wherein
the log matching unit is adapted to determine, according to the system configuration of the server, the names of the log files that store hardware-related logs, and to obtain the hardware-related logs from the corresponding log files according to the determined log file names.
B12. The device of B9, wherein the device further comprises:
an early warning processing unit, adapted to, when the fault early warning unit determines that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, judge whether there is another server that stores the same data as the server and provides the same services, and if so, migrate the services on the server to that other server.
B13. The device of B12, wherein
the early warning processing unit is further adapted to, when it judges that no other server stores the same data as the server and provides the same services, migrate all the data and services on the server to a designated standby server.
B14. The device of B9, wherein
the fault early warning unit is further adapted to send, through a designated channel to a designated location, an alarm message containing the server identifier and the fault early warning information.
B15. The device of B14, wherein
the fault early warning unit is further adapted to receive a false alarm notification for the server, and to put the server back into use.
B16. The device of any one of B9-B15, wherein
the list maintenance unit is further adapted to, when a server actually experiences a hardware fault, obtain the hardware-related logs in the server system running log within the time range corresponding to the occurrence of the hardware fault; find, in the obtained hardware-related logs, at least one log related to the hardware fault that the server actually experienced; and store the found log content in the hardware fault early warning list in correspondence with the early warning information of the hardware fault that the server actually experienced.
Claims (10)
1. A server hardware fault early warning method, comprising:
pre-creating a hardware fault early warning list, in which different pieces of hardware fault early warning information are stored in correspondence with the corresponding server log content;
obtaining a server system running log, and matching the obtained server system running log against the hardware fault early warning list;
if there is a matching item, determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item.
2. The method of claim 1, wherein obtaining the server system running log and matching the obtained server system running log against the hardware fault early warning list comprises:
obtaining the hardware-related logs in the server running log;
matching the obtained hardware-related logs against the hardware fault early warning list.
3. The method of claim 2, wherein obtaining the hardware-related logs in the server running log comprises:
determining, according to the system configuration of the server, the names of the log files that store hardware-related logs;
obtaining the hardware-related logs from the corresponding log files according to the determined log file names.
4. The method of claim 1, wherein, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, the method further comprises:
if there is another server that stores the same data as the server and provides the same services, migrating the services on the server to that other server.
5. The method of claim 4, wherein, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, the method further comprises:
if there is no other server that stores the same data as the server and provides the same services, migrating all the data and services on the server to a designated standby server.
6. The method of claim 1, wherein, after determining that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item, the method further comprises:
sending, through a designated channel to a designated location, an alarm message containing the server identifier and the fault early warning information.
7. The method of claim 6, wherein the method further comprises:
receiving a false alarm notification for the server;
putting the server back into use.
8. The method of any one of claims 1-7, wherein the method further comprises:
when a server actually experiences a hardware fault, obtaining the hardware-related logs in the server system running log within the time range corresponding to the occurrence of the hardware fault;
finding, in the obtained hardware-related logs, at least one log related to the hardware fault that the server actually experienced;
storing the found log content in the hardware fault early warning list in correspondence with the early warning information of the hardware fault that the server actually experienced.
9. A server hardware fault early warning device, comprising:
a list maintenance unit, adapted to pre-create a hardware fault early warning list, in which different pieces of hardware fault early warning information are stored in correspondence with the corresponding server log content;
a log matching unit, adapted to obtain a server system running log, match the obtained server system running log against the created hardware fault early warning list, and notify the fault early warning unit if there is a matching item;
a fault early warning unit, adapted to determine, after receiving the notification from the log matching unit, that the server is about to experience the hardware fault described by the hardware fault early warning information corresponding to the matching item.
10. The device of claim 9, wherein
the log matching unit is adapted to obtain the hardware-related logs in the server running log, and to match the obtained hardware-related logs against the hardware fault early warning list.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611247164.XA CN106844145A (en) | 2016-12-29 | 2016-12-29 | A kind of server hardware fault early warning method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106844145A (en) | 2017-06-13 |
Family
ID=59113429
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611247164.XA Pending CN106844145A (en) | 2016-12-29 | 2016-12-29 | A kind of server hardware fault early warning method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106844145A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102279775A (en) * | 2011-08-19 | 2011-12-14 | 西安交通大学 | Method for processing failure of hard disk under Linux system |
JP2016091125A (en) * | 2014-10-30 | 2016-05-23 | 株式会社日立システムズ | Failure log detection and transfer system, failure log detection and transfer method, and program |
CN105740121A (en) * | 2016-01-26 | 2016-07-06 | 中国银行股份有限公司 | Log text monitoring and early-warning method and apparatus |
CN106254100A (en) * | 2016-07-27 | 2016-12-21 | 腾讯科技(深圳)有限公司 | A kind of data disaster tolerance methods, devices and systems |
Non-Patent Citations (1)
Title |
---|
邵必林: "《海量信息存储安全技术及其应用》", 30 April 2014, 西北工业大学出版社 * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109558272A (en) * | 2017-09-26 | 2019-04-02 | 北京国双科技有限公司 | The fault recovery method and device of server |
CN111108481A (en) * | 2017-09-29 | 2020-05-05 | 华为技术有限公司 | Fault analysis method and related equipment |
CN111108481B (en) * | 2017-09-29 | 2021-08-13 | 华为技术有限公司 | Fault analysis method and related equipment |
CN108040159B (en) * | 2017-11-30 | 2021-01-29 | 江苏觅丰电商科技有限公司 | Restart positioning method based on hardware drive, mobile terminal and readable storage medium |
CN108040159A (en) * | 2017-11-30 | 2018-05-15 | 努比亚技术有限公司 | Localization method, mobile terminal and readable storage medium storing program for executing are restarted based on hardware driving |
CN108959038A (en) * | 2018-07-16 | 2018-12-07 | 郑州云海信息技术有限公司 | A kind of method and device of distributed application services monitoring |
CN109828868A (en) * | 2019-01-04 | 2019-05-31 | 新华三技术有限公司成都分公司 | Date storage method, device, management equipment and dual-active data-storage system |
CN109828868B (en) * | 2019-01-04 | 2023-02-03 | 新华三技术有限公司成都分公司 | Data storage method, device, management equipment and double-active data storage system |
CN110780646B (en) * | 2019-09-21 | 2021-11-26 | 苏州浪潮智能科技有限公司 | Memory quality early warning method based on MES system |
CN110780646A (en) * | 2019-09-21 | 2020-02-11 | 苏州浪潮智能科技有限公司 | Memory quality early warning method based on MES system |
CN113094224A (en) * | 2019-12-20 | 2021-07-09 | 中移全通系统集成有限公司 | Server asset management method and device, computer equipment and storage medium |
CN113094224B (en) * | 2019-12-20 | 2022-07-29 | 中移全通系统集成有限公司 | Server asset management method and device, computer equipment and storage medium |
CN111367397A (en) * | 2020-03-02 | 2020-07-03 | 无锡华云数据技术服务有限公司 | Cloud host migration method, cloud host downtime determination system and storage medium |
CN111778551A (en) * | 2020-07-14 | 2020-10-16 | 哈尔滨科友半导体产业装备与技术研究院有限公司 | Cloud computing-based PVT method crystal growth system automatic early warning system |
CN113010375A (en) * | 2021-02-26 | 2021-06-22 | 腾讯科技(深圳)有限公司 | Equipment alarm method and related equipment |
CN113010375B (en) * | 2021-02-26 | 2023-03-28 | 腾讯科技(深圳)有限公司 | Equipment alarm method and related equipment |
CN112948217A (en) * | 2021-03-29 | 2021-06-11 | 腾讯科技(深圳)有限公司 | Server repair checking method and device, storage medium and electronic equipment |
CN113268377A (en) * | 2021-04-25 | 2021-08-17 | 山东英信计算机技术有限公司 | Abnormal state data backup method, system and storage medium |
CN114003461A (en) * | 2021-09-26 | 2022-02-01 | 苏州浪潮智能科技有限公司 | Server failure prediction method, system, terminal and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170613 |