CN106598790A - Server hardware failure detection method, apparatus of server, and server - Google Patents
- Publication number: CN106598790A (application number CN201510673005.5A)
- Authority: CN (China)
- Prior art keywords
- server
- hardware
- output system
- basic input
- fault
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The present invention provides a server hardware failure detection method, an apparatus of a server, and a server. In the method, a basic input output system (BIOS) apparatus of the server detects that the server has entered a startup phase and then begins to perform failure detection and analysis on the hardware of the server in each working phase, where the working phases include the startup phase. The hardware failure information detected by the BIOS apparatus covers failures detected in advance across the whole running period of the server, so that failures occurring while the server runs can be handled promptly, which improves the running stability and reliability of the server. The BIOS apparatus further stores the hardware failure information obtained through detection and analysis, so that operators can handle failures conveniently and the hardware failure information is stored and managed in a unified manner.
Description
Technical field
The present invention relates to the field of computers and communications, and more particularly to a server hardware fault detection method, an apparatus thereof, and a server.
Background
On current mid-range and high-end servers, the server typically has a partial black-box function that records fault information when the operating system crashes. Various kernel exceptions of the OS (operating system), such as kernel faults, restarts and resets, and exception type information, can be recorded, and some simple hardware errors can be recorded through the SEL (System Event Log). Alternatively, the error scene is collected after the fault occurs by out-of-band means (such as a joint test link), or device exceptions are monitored passively through an in-band exception trigger mechanism, which records only when an exception condition triggers its exception recording module. These methods can help maintenance personnel determine the cause of a fault to some extent, but they still have the following defects:
1. All of these methods record through passively triggered detection and lack active detection of the server, in particular active screening and monitoring of server hardware faults. When the system starts and runs normally but service quality declines significantly, the system does not trigger fault information recording, so fault information is missed and maintenance personnel have difficulty tracing it during maintenance.
2. Because fault information is detected and recorded only when the system crashes or an exception is triggered, the ability to collect and analyze hardware faults while the system (service) is running is seriously insufficient. This weakens the early-warning capability of the system and reduces its stability and reliability.
3. The recorded fault information is too simple and scattered, with no accurate, unified record management, so fault analysis cannot be completed in one step. A large amount of later analysis, screening, and cross-validation is needed to find the main fault source.
4. Collecting fault information by out-of-band means is limited by the availability of specialists, the site environment, information security, and so on, and the costs of environment deployment, personnel coordination, and environment recovery are high.
Therefore, in current server fault information recording schemes, fault information can be detected and recorded only under specific conditions, and the recorded fault information is simple and scattered and requires extensive later analysis.
Summary of the invention
The main technical problem to be solved by the present invention is to provide a server hardware fault detection method, an apparatus thereof, and a server, so as to solve the problem that the prior art cannot detect, record, and store fault information for the server hardware in real time in each working phase.
To solve the above technical problem, the present invention provides a server hardware fault detection method, including:
The basic input output system (BIOS) device of the server detects that the server has entered a startup phase;
The BIOS device starts to perform hardware fault detection on the server in each working phase, the working phases including the startup phase;
The BIOS device stores the hardware fault information obtained by detection.
In an embodiment of the present invention, the startup phase includes an initialization phase, and the basic input output system (BIOS) device performing hardware fault detection on the server in the initialization phase includes:
The BIOS device, according to the hardware detection mechanism provided by the server, pre-detects at least one of the CPU, memory, chipset, and power supply of the server to obtain current hardware information, selects the information of faulty hardware from the hardware information, and analyzes it to obtain the corresponding hardware fault information.
In another embodiment of the present invention, the startup phase further includes a device enumeration phase, and the BIOS device performing hardware fault detection on the server in the device enumeration phase includes:
The BIOS device obtains the status information and resource information of each piece of hardware on the server and identifies from it the fault information of the hardware that has failed.
In another embodiment of the present invention, the startup phase is a cold-start phase or a warm-start phase.
In another embodiment of the present invention, the working phases further include at least one of an operating system pre-boot phase and an operating system service running phase.
In another embodiment of the present invention, when the working phases include the operating system pre-boot phase, the BIOS device performing hardware fault detection on the server in the operating system pre-boot phase includes:
The BIOS device pre-detects the hardware devices outside the server band that are about to be booted;
Obtaining the current hardware information of the hardware devices;
Selecting, from the current hardware information, the fault information of the hardware devices that have failed;
When the working phases include the operating system service running phase, the BIOS device performing hardware fault detection on the server in the operating system service running phase includes: the BIOS device judges whether a hardware interrupt signal of the server has arrived and, if so, detects the hardware related to the operating system and obtains the fault information of that hardware.
In another embodiment of the present invention, before the BIOS device stores the fault information obtained by detection, the method further includes allocating, on the serial flash memory of the server, a fault storage area for storing the hardware fault information.
To solve the above technical problem, the present invention also provides a basic input output system (BIOS) device, including:
A fault information detection trigger module, configured to detect whether the server has entered a startup phase;
A fault information detection module, configured to start performing hardware fault detection on the server in each working phase when the fault information detection trigger module detects that the server has entered the startup phase, the working phases including the startup phase;
A fault information storage module, configured to store the hardware fault information obtained by the fault information detection module.
In another embodiment of the present invention, the device further includes a storage setup module, configured to allocate, on the serial flash memory of the server, a fault storage area for storing the hardware fault information before the fault information storage module stores the hardware fault information.
To solve the above technical problem, the present invention also provides a server that includes the BIOS device described above.
The beneficial effects of the present invention are:
In the server hardware fault detection method, apparatus thereof, and server provided by the present invention, when the basic input output system (Basic Input Output System, BIOS) device of the server detects that the server has entered the startup phase, it starts fault detection and analysis on the hardware of the server in each working phase and obtains the corresponding hardware fault information. Because the BIOS device of the server itself is used, the hardware faults that may occur throughout the whole running cycle of the server can be detected, which improves the comprehensiveness and accuracy of hardware fault detection and makes unified storage and management of server hardware fault information easier. Maintenance personnel can therefore accurately obtain the hardware fault information when maintaining the server and learn the location of the hardware that needs troubleshooting and the cause of the fault, which further improves the stability and reliability of the server.
Brief description of the drawings
Fig. 1 is a flowchart of the server hardware fault detection method provided by the present invention;
Fig. 2 is a flowchart of hardware fault detection in the server initialization phase provided by the present invention;
Fig. 3 is a flowchart of hardware fault detection in the device enumeration phase of the present invention;
Fig. 4 is a flowchart of hardware fault detection in the operating system pre-boot phase of the present invention;
Fig. 5 is a flowchart of hardware fault detection in the operating system service running phase of the present invention;
Fig. 6 is a structural block diagram of the basic input output system device provided by the present invention.
Detailed description
The present invention is described in further detail below through specific embodiments in combination with the accompanying drawings.
Embodiment one:
Referring to Fig. 1, which is a flowchart of the server hardware fault detection method provided by the present invention, the method provided by this embodiment should be understood as the basic input output system (BIOS) device actively performing fault detection on the server hardware. "Actively" here means that, according to a detection mechanism preset on the server, the BIOS device performs fault detection on the server hardware immediately when the server starts, or performs hardware fault detection in each working phase of the server. The method specifically includes the following steps:
S101: the basic input output system (BIOS) device of the server detects that the server has entered the startup phase.
In this embodiment, the startup phase of the server is a cold-start phase or a warm-start phase. The BIOS device detects that the server has entered the startup phase as follows. When the startup phase is a cold-start phase, the BIOS device may detect whether the startup phase has been entered in, but not limited to, the following ways: detecting whether the power switch key on the server has been pressed, detecting whether the power supply circuit of the server is connected to the server's power interface, or checking the status flag bit of the power supply. If so, the server has entered the startup phase and is running, and step S102 is performed; otherwise, detection continues.
When the startup phase is a warm-start phase, the BIOS device detects whether a reset start signal has been input to the server. If so, the server begins the warm start and step S102 is performed; otherwise, detection continues. The reset start signal may be input by a hardware trigger, for example through a reset key; by software, for example periodically by code or a tool; or by a user actively issuing a command or operating a "restart" button.
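The checks in step S101 can be sketched as follows. This is only an illustration under stated assumptions: the `PlatformSignals` fields are hypothetical names for the signals the text lists, and checking the reset signal before the power signals is a choice made for this sketch, not something the description specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StartType(Enum):
    COLD = "cold"
    WARM = "warm"


@dataclass
class PlatformSignals:
    """Hypothetical snapshot of the signals named in step S101."""
    power_button_pressed: bool = False
    power_circuit_connected: bool = False
    power_status_flag: bool = False
    reset_signal: bool = False


def detect_start_phase(sig: PlatformSignals) -> Optional[StartType]:
    """Return the start type, or None if the startup phase has not been entered."""
    # Warm start: a reset signal from a reset key, from software, or from a
    # user "restart" command.
    if sig.reset_signal:
        return StartType.WARM
    # Cold start: any one of the three power-related checks is sufficient.
    if (sig.power_button_pressed
            or sig.power_circuit_connected
            or sig.power_status_flag):
        return StartType.COLD
    # Neither condition holds, so the caller keeps polling.
    return None
```

A caller would poll `detect_start_phase` and proceed to step S102 as soon as it returns a value other than `None`.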
S102: the basic input output system (BIOS) device starts to perform fault detection on the hardware of the server in each working phase, the working phases including the startup phase.
S103: the BIOS device stores the hardware fault information obtained by detection.
In this embodiment, before step S103, the method further includes allocating, on the serial flash memory of the server, a fault storage area for storing the hardware fault information. Further, the fault information recorded and stored by the BIOS device includes: the time, the event that occurred, the severity, the specific location or fault details, and the suggested handling.
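The record layout and the reserved storage area described above might be modeled as in the sketch below. The field names, the JSON-lines encoding, and the 4096-byte default capacity are assumptions chosen for illustration; the description fixes only the five kinds of content in a record and the fact that an area is allocated on the serial flash before anything is stored.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class FaultRecord:
    time: str        # when the fault was detected
    event: str       # what happened
    severity: str    # how serious it is
    location: str    # specific location or fault details
    suggestion: str  # suggested handling


class FaultStorageArea:
    """In-memory stand-in for a fixed-size region reserved on serial flash."""

    def __init__(self, capacity_bytes: int = 4096) -> None:
        self.capacity_bytes = capacity_bytes  # hypothetical size
        self._buffer = bytearray()

    def append(self, record: FaultRecord) -> bool:
        """Append one record; return False if the reserved area is full."""
        encoded = json.dumps(asdict(record)).encode("utf-8") + b"\n"
        if len(self._buffer) + len(encoded) > self.capacity_bytes:
            return False
        self._buffer += encoded
        return True

    def read_all(self) -> List[FaultRecord]:
        """Decode every stored record, oldest first."""
        return [FaultRecord(**json.loads(line.decode("utf-8")))
                for line in self._buffer.splitlines() if line]
```

Keeping all records in one bounded area is what allows an out-of-band reader to retrieve them in a single pass.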
In this embodiment, after the above steps have detected and stored the hardware fault information, maintenance personnel who need to maintain the server can directly access the stored hardware fault information through an out-of-band control platform or network user interface connected to the storage area. This makes it convenient to trace how the fault occurred, handle it on site, and replace the faulty hardware (for example, directly replace a specific CPU, a specific memory module, or a faulty bus interface card). On mid-range and high-end servers, hot-plug technology (including but not limited to CPU hot plug, memory hot plug, and bus interface hot plug) can keep the system running without interruption, achieving early discovery, early warning, early prevention, and early handling. Even if the server hangs during a cold or warm start and its peripherals are unusable (for example, the network port is down, the screen stays dark, or the keyboard and mouse do not respond), valid fault information can still be obtained.
In this embodiment, after obtaining the hardware fault information through the control platform, operation and maintenance personnel can, besides handling it promptly, also dump the stored hardware fault information to another out-of-band storage device.
In this embodiment, the startup phase includes an initialization phase. The steps by which the basic input output system (BIOS) device performs fault detection on the server hardware in the initialization phase are shown in Fig. 2 and specifically include:
S201: the BIOS device initializes the CPU, memory, chipset, and power supply.
S202: the BIOS device detects and obtains current hardware information of at least one of the CPU, memory, chipset, and power supply.
In this embodiment, the BIOS device, according to the hardware detection mechanism provided by the server, pre-detects at least one of the CPU, memory, chipset, and power supply of the server to obtain current hardware information, selects the information of faulty hardware from the hardware information, and analyzes it to obtain the corresponding hardware fault information.
Specifically, in this embodiment, when the basic input output system (BIOS) device detects that the server has entered the initialization phase, the BIOS device actively initiates detection of the faults and configuration of server hardware such as the CPU, memory, chipset, and power supply. It may do so using the BIOS device itself, actively applied stress, the hardware detection mechanisms provided by the CPU and chipset, or in-band integrated tools (such as a MEMTEST tool or a system event log detection tool), and it obtains the corresponding hardware information. It then performs pre-analysis and judgment, pre-statistics, pre-screening, scanning, and measurement on the obtained hardware information, collects the test results, and selects the valid fault information (including information that triggers subsequent system exceptions) for detailed recording and storage. In this way, when a system exception occurs in this phase, the server is guaranteed to have obtained and recorded more detailed hardware fault information before the exception occurred. In this phase, the recorded and stored fault information includes but is not limited to: CPU errors and alarms, CBO (Caching Agent) errors and alarms, QPI (QuickPath Interconnect) errors and alarms, IIO (Integrated I/O) port errors and alarms, HA (Home Agent) errors and alarms, IMC (Integrated Memory Controller) errors and alarms, PCU (Power Control Unit) errors and alarms, power supply and voltage errors and alarms, and memory errors and alarms (including errors and alarms for the memory module itself, memory channels, memory population, memory voltage, memory incompatibility, and configuration).
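The "obtain hardware information, then select the faulty entries" step of the initialization phase can be sketched as below. The `HardwareInfo` structure and its `status` values are hypothetical; a real BIOS would derive them from CPU, chipset, and memory test results rather than from prepared objects.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HardwareInfo:
    component: str    # e.g. "CPU", "memory", "chipset", "power"
    status: str       # hypothetical values: "ok", "warning", "error"
    detail: str = ""  # free-form description of what was observed


def filter_faults(infos: List[HardwareInfo]) -> List[HardwareInfo]:
    """Keep only the entries that report a warning or an error."""
    return [info for info in infos if info.status != "ok"]


def count_by_component(faults: List[HardwareInfo]) -> Dict[str, int]:
    """Pre-statistics: number of faulty entries per component type."""
    counts: Dict[str, int] = {}
    for fault in faults:
        counts[fault.component] = counts.get(fault.component, 0) + 1
    return counts
```

The filtered list is what would be handed to the storage step, while the per-component counts correspond to the pre-statistics mentioned in the text.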
In this embodiment, the startup phase also includes a device enumeration phase. The flowchart of hardware fault detection in this phase is shown in Fig. 3 and specifically includes the following steps:
S301: the basic input output system (BIOS) device starts device enumeration.
S302: the BIOS device detects and obtains the current information of the devices.
Further, the BIOS device obtains the status information and resource information of each piece of hardware on the server and identifies from it the fault information of the faulty software and hardware. In this phase, the fault information includes but is not limited to: device access errors (including illegal memory and I/O requests), third-party firmware (option ROM) not executed (including insufficient space or incorrect format), and devices that are damaged and unavailable.
Specifically, in this embodiment, when the server issues probe tasks to bus interface (PCIE, Peripheral Component Interconnect Express) peripherals and calculates resource requirements, the BIOS device, according to the detection mechanism, identifies the third-party firmware (option ROM) identifier, vendor information, device class information, and capacity defined by the industry specification, checks hardware status indications (such as link status and bandwidth information), identifies from this information the fault information of faulty hardware, and stores it.
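As a rough illustration of the enumeration-phase checks, the sketch below inspects link status, negotiated link width, and option ROM size for one device. All field names and fault strings are hypothetical, and the comparison against an expected link width is an assumed way of interpreting the "bandwidth information" mentioned above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnumeratedDevice:
    name: str
    link_up: bool          # link status reported by the device
    negotiated_width: int  # lanes actually negotiated
    expected_width: int    # lanes the slot is expected to provide
    option_rom_size: int   # bytes; 0 when the device has no option ROM


def enumeration_faults(dev: EnumeratedDevice, rom_space_left: int) -> List[str]:
    """Return human-readable fault strings for one enumerated device."""
    faults: List[str] = []
    if not dev.link_up:
        faults.append("link down")
    elif dev.negotiated_width < dev.expected_width:
        faults.append("degraded link width")
    # An option ROM that does not fit in the remaining space cannot be executed.
    if dev.option_rom_size > rom_space_left:
        faults.append("option ROM not executed: insufficient space")
    return faults
```

An empty list means the device passed these particular checks; it does not rule out the other fault types listed for this phase.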
In this embodiment, the working phases further include at least one of an operating system pre-boot phase and an operating system service running phase. Figs. 4 and 5 are the flowcharts of hardware fault detection in the operating system pre-boot phase and the operating system service running phase, respectively.
As shown in Fig. 4, hardware fault detection and analysis in the operating system pre-boot phase includes the following steps:
S401: the basic input output system (BIOS) device pre-detects the hardware devices outside the server band that are about to be booted.
S402: obtaining the current hardware information of the hardware devices.
S403: selecting, from the current hardware information, the fault information of the hardware devices that have failed.
In this embodiment, the hardware devices outside the server band include but are not limited to: hard disks, the server network port, and device boot attributes. The fault information includes but is not limited to: no bootable device, hard disk (or USB flash drive) damage (including corruption of the MBR partition), PXE network boot failure (including port information and a failed network ping), and an abnormal ME (Management Engine) working state. Preferably, when the BIOS device performs fault detection on disk partitions in this phase, it actively initiates a detection signal, obtains the MBR (Master Boot Record) data of the hard disk (USB flash drive), analyzes the boot flag, end mark, and error message data area, and judges, according to the hardware detection mechanism provided by the server, whether the hard disk (USB flash drive) is bootable or damaged. It also determines the communication link state and working mode between the server and the host by issuing a self-test command, checks whether the network is connected through DHCP (Dynamic Host Configuration Protocol) communication, and enumerates the board's boot devices to check whether a bootable device exists.
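The MBR analysis described above (boot flag and end mark) can be illustrated with the classic 512-byte MBR layout: four 16-byte partition entries starting at offset 446, a boot flag of 0x80 in the first byte of an active entry, and the bytes 0x55 0xAA at offset 510. The function below is a simplified sketch; its name, its result dictionary, and the rule that exactly one entry must be active are assumptions of this example, and it does not examine the error message data area or handle GPT disks.

```python
from typing import Dict, List

MBR_SIZE = 512
PARTITION_TABLE_OFFSET = 446
PARTITION_ENTRY_SIZE = 16
SIGNATURE_OFFSET = 510


def check_mbr(sector: bytes) -> Dict[str, object]:
    """Inspect the boot flags and end signature of a raw MBR sector."""
    if len(sector) < MBR_SIZE:
        return {"bootable": False, "problems": ["sector shorter than 512 bytes"]}
    problems: List[str] = []
    # End mark: the last two bytes of a valid MBR are 0x55 0xAA.
    if sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] != b"\x55\xaa":
        problems.append("missing 0x55AA end signature")
    active = 0
    for index in range(4):
        flag = sector[PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE]
        if flag == 0x80:
            active += 1      # entry marked active (bootable)
        elif flag != 0x00:
            problems.append(f"partition {index}: invalid boot flag 0x{flag:02x}")
    if active == 0:
        problems.append("no active partition")
    elif active > 1:
        problems.append("more than one active partition")
    return {"bootable": not problems, "problems": problems}
```

Each string in `problems` could be recorded as a disk-damage fault for this phase.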
As shown in Fig. 5, hardware fault detection and analysis in the operating system service running phase includes the following steps:
S501: judging whether a hardware interrupt signal of the server has arrived.
S502: if so, the basic input output system (BIOS) device detects the hardware related to the operating system.
S503: obtaining the fault information of the hardware.
When it is determined in the above fault detection and analysis that the hardware interrupt signal has arrived, the BIOS device performs fault detection on the hardware related to the running service, analyzes, classifies, and counts the detected hardware fault information, and then stores it. The fault information detected in this phase includes but is not limited to: CPU errors and alarms, CBO errors and alarms, QPI errors and alarms, VT-d errors and alarms, IIO port errors and alarms, memory errors and alarms, PCIE errors and alarms, PCU errors and alarms, and Ubox (Utility Box) errors and alarms. Preferably, for hardware fault detection in this phase, the BIOS device enables the MCA (Machine Check Architecture) function and the AER (Advanced Error Reporting) enhanced error logging function, turns on the switch of the error detection bank (Machine Check Error Bank) corresponding to each component, and registers the fault identification and classification function and the error handling hook function of each component. When an MCE (Machine-Check Exception) occurs, the hardware pulls the error status pin low and generates a system management interrupt (SMI). The BIOS device then obtains control, reads the error status registers of the CPU and the bridge chip through the hardware fault identification and classification function, obtains the specific information of the error detection bank (Machine Check Error Bank), and parses it in detail according to the chip manual to separate out and interpret the specific hardware error information.
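Parsing a machine-check bank status value can be sketched as follows. The bit positions used here (VAL 63, OVER 62, UC 61, EN 60, MISCV 59, ADDRV 58, PCC 57, model-specific code in bits 31:16, MCA error code in bits 15:0) follow the architectural IA32_MCi_STATUS layout on Intel processors; the function names and the three severity labels are hypothetical, and a real handler would read the register in SMM rather than receive it as an integer.

```python
from typing import Dict, Union


def decode_mc_status(status: int) -> Dict[str, Union[bool, int]]:
    """Split a 64-bit machine-check bank status value into named fields."""
    def bit(position: int) -> bool:
        return bool((status >> position) & 1)

    return {
        "valid": bit(63),            # VAL: the bank holds a logged error
        "overflow": bit(62),         # OVER: an earlier error was overwritten
        "uncorrected": bit(61),      # UC: hardware did not correct the error
        "enabled": bit(60),          # EN: reporting was enabled for this error
        "misc_valid": bit(59),       # MISCV: the MISC register has extra data
        "addr_valid": bit(58),       # ADDRV: the ADDR register holds an address
        "context_corrupt": bit(57),  # PCC: processor context may be corrupted
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


def classify(decoded: Dict[str, Union[bool, int]]) -> str:
    """Map decoded fields to a hypothetical severity label."""
    if not decoded["valid"]:
        return "none"
    if decoded["uncorrected"]:
        return "fatal" if decoded["context_corrupt"] else "uncorrected"
    return "corrected"
```

The meaning of `mca_error_code` and `model_specific_code` still has to be looked up in the chip manual, which is the detailed parsing step the description refers to.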
Embodiment two:
This embodiment provides a basic input output system (BIOS) device. It should be understood that the BIOS device can be arranged in any server to perform hardware fault detection on the server in any working phase. As shown in Fig. 6, the BIOS device 60 includes:
A fault information detection trigger module 61, configured to detect that the server has entered the startup phase;
A fault information detection module 62, configured to start performing fault detection on the hardware of the server in each working phase, the working phases including the startup phase;
A fault information storage module 63, configured to store the hardware fault information obtained by detection and analysis.
In this embodiment, in the startup phase of the server, the fault information detection module 62, according to the hardware detection mechanism provided by the server, pre-detects at least one of the CPU, memory, chipset, and power supply of the server to obtain current hardware information, selects the information of faulty hardware from the hardware information, and analyzes it to obtain the corresponding hardware fault information.
In the device enumeration phase of the server, the fault information detection module 62 obtains the status information and resource information of each piece of hardware on the server and identifies from it the fault information of the hardware that has failed.
In the operating system pre-boot phase of the server, the fault information detection module 62 pre-detects the hardware devices outside the server band that are about to be booted, obtains the current hardware information of the hardware devices, and selects from the current hardware information the fault information of the hardware devices that have failed.
In the operating system service running phase of the server, the fault information detection module 62 performing hardware fault detection on the server includes: the BIOS device judges whether a hardware interrupt signal of the server has arrived and, if so, detects the hardware related to the operating system and obtains the fault information of that hardware.
In this embodiment, the device further includes a storage setup module 64, configured to allocate, on the serial flash memory of the server, a fault storage area for storing the hardware fault information before the fault information storage module stores the fault information.
The present invention further provides a server, which includes the BIOS device described above.
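The cooperation of the trigger, detection, and storage modules of Fig. 6 might be wired together as in the sketch below. The class names mirror the module names in this embodiment, but the callable-based interfaces, the phase names used as dictionary keys, and the string representation of a fault are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A phase detector returns the fault descriptions found in that phase.
PhaseDetector = Callable[[], List[str]]


class FaultInfoStorageModule:
    """Stand-in for module 63: keeps (phase, fault) pairs in order."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, str]] = []

    def store(self, phase: str, faults: List[str]) -> None:
        self.records.extend((phase, fault) for fault in faults)


class FaultInfoDetectionModule:
    """Stand-in for module 62: one detector per working phase."""

    def __init__(self, detectors: Dict[str, PhaseDetector]) -> None:
        self.detectors = detectors

    def detect(self, phase: str) -> List[str]:
        return self.detectors[phase]()


class BiosDevice:
    """Combines the trigger (module 61), detection, and storage modules."""

    def __init__(self, startup_entered: Callable[[], bool],
                 detection: FaultInfoDetectionModule,
                 storage: FaultInfoStorageModule) -> None:
        self.startup_entered = startup_entered
        self.detection = detection
        self.storage = storage

    def run(self) -> bool:
        """Run detection in every phase once startup has been detected."""
        if not self.startup_entered():
            return False
        for phase in self.detection.detectors:
            self.storage.store(phase, self.detection.detect(phase))
        return True
```

Registering detectors per phase keeps the trigger and storage logic unchanged when a further working phase is added.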
The technical solution provided by the present invention can be widely applied to devices such as computers and network communication equipment. By having the basic input output system device perform fault detection on the hardware throughout the whole running cycle of the server, the solution can help prevent the server from failing during operation and improves the stability and reliability of the server.
The above is a further detailed description of the present invention in combination with specific embodiments, and the specific implementation of the present invention shall not be considered limited to these descriptions. A person of ordinary skill in the art to which the present invention belongs can make several simple deductions or substitutions without departing from the concept of the present invention, and all of them shall be considered to fall within the protection scope of the present invention.
Claims (10)
1. a kind of server hardware fault detection method, it is characterised in that include:
The basic input output system device of server detects the server and enters startup stage;
The basic input output system device starts to carry out hardware fault in each working stage to the server
Detection, the working stage includes the startup stage;
The basic input output system device will detect that the hardware fault information for obtaining is stored.
2. server failure detection method as claimed in claim 1, it is characterised in that the startup
Stage includes initial phase, and the basic input output system device is in the initial phase to the clothes
Business device carries out hardware failure detection to be included:
The hardware detection mechanism that the basic input output system device is provided according to the server is to the clothes
At least one of the business CPU of device, internal memory, chipset and power supply carry out the pre-detection of hardware and obtain current
Hardware information, faulty hardware information is filtered out from the hardware information is analyzed process and obtain accordingly
Hardware fault information.
3. server hardware fault detection method as claimed in claim 2, it is characterised in that described
Startup stage also includes the device enumeration stage, and the basic input output system device is in the device enumeration rank
Section carries out hardware failure detection to the server to be included:
The basic input output system device obtains the status information and resource letter of each hardware on the server
Breath, and therefrom recognize the fault message of the hardware for breaking down.
4. the server hardware fault detection method as described in any one of claim 1-3, its feature exists
In the startup stage is cold-start phase or thermal starting stage.
5. the server hardware fault detection method as described in any one of claim 1-3, its feature exists
In, the working stage also include operating system pre-boot phase and in the operating system service operation stage extremely
It is few one.
6. server hardware fault detection method as claimed in claim 5, it is characterised in that described
When working stage includes operating system pre-boot phase, the basic input output system device is in the operation
System pre-boot phase carries out hardware failure detection to the server to be included:
The basic input output system device is to the hardware device outside the server band that will be booted up
Carry out pre-detection;
Obtain the Current hardware information of the hardware device;
The fault message of the hardware device for breaking down is filtered out from the Current hardware information;
When the working stage includes the operating system service operation stage, the basic input output system device
Carrying out hardware failure detection to the server in the operating system service operation stage includes:It is described basic
Input-output system device judges whether the hardware interrupt of the server arrives, if so, the then base
This input-output system device is detected to the related hardware of the operating system;Obtain the event of the hardware
Barrier information.
7. The server hardware fault detection method according to any one of claims 1-3, characterized in that, before the basic input output system device stores the fault information obtained by detection, the method further comprises allocating, on the serial flash memory of the server, a fault storage area for storing the hardware fault information.
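Allocating a dedicated fault storage area before any record is written can be sketched as below. The sketch models the serial flash as a plain `bytearray` and uses a 2-byte little-endian length prefix per record; the `FaultRegion` class, the record framing, and the offsets are assumptions for illustration, since the claim does not specify a layout.

```python
class FaultRegion:
    """A reserved window inside a (simulated) serial flash image."""

    def __init__(self, flash: bytearray, offset: int, size: int):
        # Allocation step: refuse a region that does not fit in the flash.
        if offset < 0 or size <= 0 or offset + size > len(flash):
            raise ValueError("region does not fit in flash")
        self.flash, self.offset, self.size = flash, offset, size
        self.used = 0  # bytes consumed inside the region

    def append(self, record: bytes) -> bool:
        """Store one record; return False when the region is full."""
        framed = len(record).to_bytes(2, "little") + record
        if self.used + len(framed) > self.size:
            return False
        start = self.offset + self.used
        self.flash[start:start + len(framed)] = framed
        self.used += len(framed)
        return True

    def records(self):
        """Read back every stored record in order."""
        out, pos = [], 0
        while pos < self.used:
            base = self.offset + pos
            n = int.from_bytes(self.flash[base:base + 2], "little")
            out.append(bytes(self.flash[base + 2:base + 2 + n]))
            pos += 2 + n
        return out


flash = bytearray(64)                         # simulated flash image
region = FaultRegion(flash, offset=32, size=16)
region.append(b"DIMM1:ECC")
print(region.records())
```

Keeping the region bounds in one object means a full log is rejected rather than overwriting other flash contents, such as the firmware image itself.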
8. A basic input output system device, characterized by comprising:
a fault information detection trigger module, configured to detect whether a server enters a startup stage;
a fault information detection module, configured to start performing hardware fault detection on the server in each working stage when the fault information detection trigger module detects that the server enters the startup stage, the working stages including the startup stage;
a fault information storage module, configured to store the hardware fault information obtained by the fault information detection module.
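The three modules of claim 8 can be wired together as in the following sketch. All class names, the event strings `"cold_start"` and `"warm_start"`, and the per-stage probe callbacks are hypothetical; the patent describes the modules functionally and does not prescribe this structure.

```python
class TriggerModule:
    """Detects whether the server has entered a startup stage."""

    def entered_startup(self, event: str) -> bool:
        return event in ("cold_start", "warm_start")


class DetectionModule:
    """Runs the fault probe registered for a given working stage."""

    def __init__(self, probes):
        self.probes = probes  # stage name -> callable returning a list of faults

    def run(self, stage: str):
        probe = self.probes.get(stage)
        return list(probe()) if probe else []


class StorageModule:
    """Keeps every detected fault together with the stage it came from."""

    def __init__(self):
        self.log = []

    def store(self, stage: str, faults):
        self.log.extend((stage, fault) for fault in faults)


class BiosDevice:
    def __init__(self, probes):
        self.trigger = TriggerModule()
        self.detection = DetectionModule(probes)
        self.storage = StorageModule()

    def handle(self, event: str, stages):
        """Start per-stage detection only once startup has been detected."""
        if not self.trigger.entered_startup(event):
            return 0
        before = len(self.storage.log)
        for stage in stages:
            self.storage.store(stage, self.detection.run(stage))
        return len(self.storage.log) - before


bios = BiosDevice({"startup": lambda: ["DIMM1:ECC"], "os_runtime": lambda: []})
bios.handle("cold_start", ["startup", "os_preboot", "os_runtime"])
print(bios.storage.log)
```

Storing the stage alongside each fault keeps the records in one place while still letting an operator see when a fault was observed.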
9. The basic input output system device according to claim 8, characterized by further comprising a storage setup module, configured to allocate, on the serial flash memory of the server, a fault storage area for storing the hardware fault information before the fault information storage module stores the hardware fault information.
10. A server, characterized by comprising the basic input output system device according to claim 8 or 9.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510673005.5A CN106598790A (en) | 2015-10-16 | 2015-10-16 | Server hardware failure detection method, apparatus of server, and server |
PCT/CN2016/100618 WO2017063505A1 (en) | 2015-10-16 | 2016-09-28 | Method for detecting hardware fault of server, apparatus thereof, and server |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510673005.5A CN106598790A (en) | 2015-10-16 | 2015-10-16 | Server hardware failure detection method, apparatus of server, and server |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106598790A true CN106598790A (en) | 2017-04-26 |
Family
ID=58517771
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510673005.5A Pending CN106598790A (en) | 2015-10-16 | 2015-10-16 | Server hardware failure detection method, apparatus of server, and server |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106598790A (en) |
WO (1) | WO2017063505A1 (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291584A (en) * | 2017-06-27 | 2017-10-24 | 郑州云海信息技术有限公司 | A kind of chassis failure detection method and system |
CN109117299A (en) * | 2017-06-23 | 2019-01-01 | 佛山市顺德区顺达电脑厂有限公司 | The error detecting device and its debugging method of server |
CN109426606A (en) * | 2017-08-23 | 2019-03-05 | 东软集团股份有限公司 | Kernel failure diagnosis information processing method, device, storage medium and electronic equipment |
CN109697144A (en) * | 2018-11-22 | 2019-04-30 | 合肥联宝信息技术有限公司 | The hard disk detection method and electronic equipment of a kind of electronic equipment |
CN109783283A (en) * | 2018-12-11 | 2019-05-21 | 中国长城科技集团股份有限公司 | A kind of processing method, device and the terminal device of hardware detection information |
CN109918257A (en) * | 2017-12-12 | 2019-06-21 | 杭州海康威视数字技术股份有限公司 | A kind of hard disk abnormality eliminating method and device |
CN111722954A (en) * | 2020-06-30 | 2020-09-29 | 曙光信息产业(北京)有限公司 | Server abnormity positioning method and device, storage medium and server |
CN111767184A (en) * | 2020-09-01 | 2020-10-13 | 苏州浪潮智能科技有限公司 | Fault diagnosis method and device, electronic equipment and storage medium |
CN112148576A (en) * | 2020-09-28 | 2020-12-29 | 北京基调网络股份有限公司 | Application performance monitoring method and system and storage medium |
CN113064747A (en) * | 2021-03-26 | 2021-07-02 | 山东英信计算机技术有限公司 | Fault positioning method, system and device in server starting process |
CN113190278A (en) * | 2021-03-18 | 2021-07-30 | 山东英信计算机技术有限公司 | Multi-scenario fault processing method, system and medium |
CN115047322A (en) * | 2022-08-17 | 2022-09-13 | 中诚华隆计算机技术有限公司 | Method and system for identifying fault chip of intelligent medical equipment |
WO2022262525A1 (en) * | 2021-06-18 | 2022-12-22 | 华为技术有限公司 | Fault handling method and apparatus, device, and system |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110187994A (en) * | 2019-05-28 | 2019-08-30 | 北京星网锐捷网络技术有限公司 | A kind of failure separation method, equipment and fault isolation system |
CN110737560B (en) * | 2019-10-22 | 2023-10-20 | 北京百度网讯科技有限公司 | Service state detection method and device, electronic equipment and medium |
CN113220407B (en) * | 2020-02-04 | 2023-09-26 | 北京京东振世信息技术有限公司 | Fault exercise method and device |
CN113590413B (en) * | 2021-06-29 | 2024-05-10 | 浪潮商用机器有限公司 | UNIX server, and UNIX server fault early warning method and device |
CN114389971B (en) * | 2022-03-23 | 2022-12-23 | 苏州浪潮智能科技有限公司 | Intelligent monitoring fine adjustment method, device, equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1308843A2 (en) * | 2001-11-02 | 2003-05-07 | Siemens Aktiengesellschaft | Method for displaying error messages on a microcomputer |
CN102369513A (en) * | 2011-08-31 | 2012-03-07 | 华为技术有限公司 | Method for improving stability of computer system and computer system |
CN103166773A (en) * | 2011-12-09 | 2013-06-19 | 国家电网公司 | Method and system for monitoring operation state of server |
CN103713981A (en) * | 2013-12-31 | 2014-04-09 | 国网山东省电力公司 | Database server performance detection and early warning method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8892719B2 (en) * | 2007-08-30 | 2014-11-18 | Alpha Technical Corporation | Method and apparatus for monitoring network servers |
JP5678717B2 (en) * | 2011-02-24 | 2015-03-04 | 富士通株式会社 | Monitoring device, monitoring system, and monitoring method |
- 2015-10-16: CN application CN201510673005.5A, published as CN106598790A, status Pending
- 2016-09-28: WO application PCT/CN2016/100618, published as WO2017063505A1, status Application Filing
Non-Patent Citations (1)
Title |
---|
王修智: "《电子信息技术》", 30 April 2007 * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109117299B (en) * | 2017-06-23 | 2022-04-05 | 佛山市顺德区顺达电脑厂有限公司 | Error detecting device and method for server |
CN109117299A (en) * | 2017-06-23 | 2019-01-01 | 佛山市顺德区顺达电脑厂有限公司 | The error detecting device and its debugging method of server |
CN107291584A (en) * | 2017-06-27 | 2017-10-24 | 郑州云海信息技术有限公司 | A kind of chassis failure detection method and system |
CN109426606A (en) * | 2017-08-23 | 2019-03-05 | 东软集团股份有限公司 | Kernel failure diagnosis information processing method, device, storage medium and electronic equipment |
CN109918257A (en) * | 2017-12-12 | 2019-06-21 | 杭州海康威视数字技术股份有限公司 | A kind of hard disk abnormality eliminating method and device |
CN109918257B (en) * | 2017-12-12 | 2022-11-04 | 杭州海康威视数字技术股份有限公司 | Hard disk exception handling method and device |
CN109697144A (en) * | 2018-11-22 | 2019-04-30 | 合肥联宝信息技术有限公司 | The hard disk detection method and electronic equipment of a kind of electronic equipment |
CN109783283A (en) * | 2018-12-11 | 2019-05-21 | 中国长城科技集团股份有限公司 | A kind of processing method, device and the terminal device of hardware detection information |
CN111722954A (en) * | 2020-06-30 | 2020-09-29 | 曙光信息产业(北京)有限公司 | Server abnormity positioning method and device, storage medium and server |
CN111767184A (en) * | 2020-09-01 | 2020-10-13 | 苏州浪潮智能科技有限公司 | Fault diagnosis method and device, electronic equipment and storage medium |
CN112148576A (en) * | 2020-09-28 | 2020-12-29 | 北京基调网络股份有限公司 | Application performance monitoring method and system and storage medium |
CN113190278A (en) * | 2021-03-18 | 2021-07-30 | 山东英信计算机技术有限公司 | Multi-scenario fault processing method, system and medium |
WO2022198972A1 (en) * | 2021-03-26 | 2022-09-29 | 山东英信计算机技术有限公司 | Method, system and apparatus for fault positioning in starting process of server |
CN113064747A (en) * | 2021-03-26 | 2021-07-02 | 山东英信计算机技术有限公司 | Fault positioning method, system and device in server starting process |
WO2022262525A1 (en) * | 2021-06-18 | 2022-12-22 | 华为技术有限公司 | Fault handling method and apparatus, device, and system |
CN115047322A (en) * | 2022-08-17 | 2022-09-13 | 中诚华隆计算机技术有限公司 | Method and system for identifying fault chip of intelligent medical equipment |
Also Published As
Publication number | Publication date |
---|---|
WO2017063505A1 (en) | 2017-04-20 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106598790A (en) | Server hardware failure detection method, apparatus of server, and server | |
US11360842B2 (en) | Fault processing method, related apparatus, and computer | |
US8843785B2 (en) | Collecting debug data in a secure chip implementation | |
US20100262863A1 (en) | Method and device for the administration of computers | |
CN104639380A (en) | Server monitoring method | |
CN106776282A (en) | The abnormality eliminating method and device of a kind of bios program | |
CN106789306A (en) | Restoration methods and system are collected in communication equipment software fault detect | |
CN113806127B (en) | Server log collection method, device and readable storage medium | |
CN104734904B (en) | The automatic test approach and system of bypass equipment | |
CN116126772A (en) | UART serial port management system and method applied to ARM server | |
CN103995759B (en) | High-availability computer system failure handling method and device based on core internal-external synergy | |
CN106610878A (en) | Fault debugging method for dual-controller system | |
CN115599617A (en) | Bus detection method and device, server and electronic equipment | |
CN103605593B (en) | The fault diagnosis of heterogeneous system, restoration methods and device | |
CN107179911A (en) | A kind of method and apparatus for restarting management engine | |
CN113076210A (en) | Server fault diagnosis result notification method, system, terminal and storage medium | |
CN113742113A (en) | Embedded system health management method, equipment and storage medium | |
CN113867994B (en) | Cabinet VPD information processing method and device, storage equipment and readable storage medium | |
CN105160259B (en) | A kind of virtualization vulnerability mining system and method based on fuzz testing | |
CN109284218A (en) | A kind of method and device thereof of detection service device operation troubles | |
JPH1188471A (en) | Test method and test equipment | |
JP7367495B2 (en) | Information processing equipment and communication cable log information collection method | |
KR102526368B1 (en) | Server management system supporting multi-vendor | |
CN116489001A (en) | Switch fault diagnosis and recovery method and device, switch and storage medium | |
JP7183841B2 (en) | Electronic controller | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||