CN109597726A - Method for active detection of mainboard faults - Google Patents

Publication number
CN109597726A
CN109597726A CN201811276682.3A CN201811276682A CN109597726A CN 109597726 A CN109597726 A CN 109597726A CN 201811276682 A CN201811276682 A CN 201811276682A CN 109597726 A CN109597726 A CN 109597726A
Authority
CN
China
Prior art keywords
main board
active detecting
equipment
detection
board failure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811276682.3A
Other languages
Chinese (zh)
Inventor
马晓光
陈亮甫
张武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Chaoyue CNC Electronics Co Ltd
Original Assignee
Shandong Chaoyue CNC Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Chaoyue CNC Electronics Co Ltd
Priority to CN201811276682.3A
Publication of CN109597726A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a method for active detection of mainboard faults, comprising the following steps: a host sends a command to obtain device information from the PCI bus; the driver information corresponding to each device is read; and whether the corresponding PCI device is faulty is judged from the driver information obtained. The method further comprises: if the driver information obtained indicates that the corresponding PCI device is in a normal state, detecting the network device. Detection of the network device comprises the following steps: the mii-tool tool checks the connection status of the network device; and the result returned by mii-tool is used to determine whether the current network link is in a normal state.

Description

Method for active detection of mainboard faults
Technical field
The present invention relates to the technical field of computer information monitoring, and in particular to a method for active detection of mainboard faults.
Background
As the degree of localization in the information field continues to rise, domestic processors and operating systems are increasingly favored by government and military-industry enterprises. The Feiteng 1500A processor is a new-generation processor independently developed by China's National University of Defense Technology. Compared with other domestic processors it offers higher performance and greater stability, and more and more domestic servers and desktop computers use it. The operating system compatible with the Feiteng processor is the Galaxy Kylin operating system, which is an open-source operating system. For government and military-industry enterprises, the reliability of the operating system and its supporting hardware environment is particularly important, so a method for active detection of mainboard faults is needed to check each module of the mainboard.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides a method for active detection of mainboard faults that solves the above technical problem.
The technical solution of the present invention is as follows:
A method for active detection of mainboard faults, comprising the following steps:
a host sends a command to obtain device information from the PCI bus;
the driver information corresponding to each device is read;
whether the corresponding PCI device is faulty is judged from the driver information obtained.
Preferably, the method further comprises:
if the driver information obtained indicates that the corresponding PCI device is in a normal state, detecting the network device.
Preferably, detection of the network device comprises the following steps:
the mii-tool tool checks the connection status of the network device;
the result returned by mii-tool is used to determine whether the current network link is in a normal state.
Preferably, the method further comprises:
detecting the CAN device.
Preferably, detecting the CAN device comprises the following steps:
checking for the /dev/ttyUBU0 device node;
if the /dev/ttyUBU0 device node exists, performing a data send/receive test;
if the specified transmitted data is received correctly, the CAN device is judged to be operating normally; otherwise a communication failure is judged.
Preferably, the step of judging from the driver information obtained whether the corresponding PCI device is faulty further comprises: if the driver information obtained indicates that the corresponding PCI device is in an abnormal state, writing a fault code.
Preferably, the step of using the result returned by mii-tool to determine whether the current network link is in a normal state comprises:
if the current network link is determined to be in a normal state, detection ends; otherwise, a fault code is written.
Preferably, in the step in which the CAN device is judged to be operating normally if the specified transmitted data is received correctly and a communication failure is judged otherwise, a fault code is written if a communication failure is judged.
Preferably, the method further comprises:
predefining error codes before device detection is carried out.
Preferably, predefining error codes comprises:
setting the error code as 4 bytes of data, with the fault bytes ordered from low to high and each fault type using two bits.
The monitoring results are represented using the predefined error code. The error code is 4 bytes of data, with the fault bytes ordered from low to high and each fault type using two bits: 00 indicates normal, 01 indicates a fault, and 11 indicates that the device does not support that type of fault detection.
By analysing the fault code, faults in the hardware environment are actively located and an alarm is raised.
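The analysis of a fault code described above can be sketched as follows. This is only an illustration under stated assumptions: the source fixes the 2-bit values (00, 01, 11) but does not say which fault type occupies which bit pair, so the slot order in FAULT_SLOTS and all function names are hypothetical. The value 10 is not defined in the source and is reported here as "undefined".

```python
# Hypothetical assignment of fault types to 2-bit slots, lowest bits first.
# The source does not list the slot order; this is an assumption.
FAULT_SLOTS = ("pci", "network", "can")

# 2-bit values as defined in the text: 00 normal, 01 fault, 11 unsupported.
STATUS_NAMES = {0b00: "normal", 0b01: "fault", 0b11: "unsupported"}


def decode_error_code(code: int) -> dict:
    """Return the status of each fault type stored in a 4-byte error code."""
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("error code must fit in 4 bytes")
    result = {}
    for slot, name in enumerate(FAULT_SLOTS):
        bits = (code >> (2 * slot)) & 0b11
        # 0b10 is not defined in the source, so it is reported as "undefined".
        result[name] = STATUS_NAMES.get(bits, "undefined")
    return result


def faulty_types(code: int) -> list:
    """List the fault types whose slot reads 01, for use when raising an alarm."""
    decoded = decode_error_code(code)
    return [name for name, status in decoded.items() if status == "fault"]
```

A caller could raise an alarm for every name returned by faulty_types, while slots marked "unsupported" are skipped.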
As can be seen from the above technical solution, the present invention has the following advantages: the internal modules of a system or device can be monitored in real time, and when the system fails a response is made immediately and an alarm is raised. The entire fault diagnosis module can exist as an independent module; its implementation does not depend on external devices and relies only on the program itself to monitor the board status. Real-time monitoring of the information of each mainboard module is achieved on a domestic operating system, ensuring healthy and stable operation of the system, and when the hardware environment fails the fault can be actively located and an alarm raised.
In addition, the design principle of the present invention is reliable and its structure is simple, giving it very broad application prospects.
It can be seen that, compared with the prior art, the present invention has prominent substantive features and represents notable progress, and the beneficial effects of its implementation are also obvious.
Brief description of the drawings
Fig. 1 is a flowchart of PCI device detection in the method for active detection of mainboard faults;
Fig. 2 is a flowchart of network device detection;
Fig. 3 is a flowchart of CAN device detection.
Detailed description of the embodiments
The present invention provides a method for active detection of mainboard faults that can monitor the internal modules of a system or device in real time, respond immediately when the system fails, and raise an alarm. The entire fault diagnosis module can exist as an independent module; its implementation does not depend on external devices and relies only on the program itself to monitor the board status. Real-time monitoring of the information of each mainboard module is achieved on a domestic operating system, ensuring healthy and stable operation of the system, and when the hardware environment fails the fault can be actively located and an alarm raised.
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application without creative effort fall within the scope of protection of the present application.
The present invention can, in software, apply different monitoring means to the different devices and modules on a Feiteng 1500A processor mainboard to carry out active fault detection. For common PCI interface devices such as graphics cards and sound cards, the current state information of the device can be obtained from the PCI bus and the operating status of the device monitored in real time. For service communication modules such as CAN ports, serial ports and network ports, once the device state has been verified as normal, their communication capability must also be monitored to ensure that the module is in its best working condition. The monitoring results are reported in the form of uniformly defined fault codes, through the corresponding fault diagnosis object code or Ethernet packets.
Embodiment one
As shown in Figs. 1 to 3, this embodiment of the present invention provides a method for active detection of mainboard faults that checks PCI devices, network devices and CAN communication devices, comprising the following steps:
S1: detect PCI devices;
The host sends a command to obtain device information from the PCI bus and at the same time reads the driver information corresponding to each device. If a device is correctly attached to the PCI bus, its current state can be obtained by this operation, and by judging the returned state it can be confirmed whether the corresponding device is faulty.
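One way step S1 might look on a Linux system is sketched below. The source does not name the command or interface it uses, so reading the sysfs directory /sys/bus/pci/devices and treating a missing `driver` link as abnormal are assumptions made for illustration; the function names and the list of expected device addresses are hypothetical.

```python
import os


def scan_pci_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return {bus address: bound driver name or None} for each listed device."""
    devices = {}
    for address in sorted(os.listdir(sysfs_root)):
        driver_link = os.path.join(sysfs_root, address, "driver")
        if os.path.islink(driver_link):
            # The link target ends with the name of the bound driver.
            devices[address] = os.path.basename(os.readlink(driver_link))
        else:
            devices[address] = None
    return devices


def find_pci_faults(expected, sysfs_root="/sys/bus/pci/devices"):
    """Compare the scan with a hypothetical list of expected bus addresses."""
    found = scan_pci_devices(sysfs_root)
    faults = {}
    for address in expected:
        if address not in found:
            faults[address] = "missing from bus"
        elif found[address] is None:
            faults[address] = "no driver bound"
    return faults
```

An empty dictionary from find_pci_faults would correspond to the "normal" branch; any entry would lead to a fault code being written.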
S2: detect network devices;
In this step, besides verifying the presence of the device in the manner described in S1, its communication link must also be checked. The mii-tool tool can check the connection status of the network card device under the system, and from the result returned by mii-tool it can be determined whether the current network link is in a normal state.
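A minimal sketch of step S2 follows. It assumes the usual mii-tool line format, in which a working link ends with "link ok" and a down link prints "no link"; the exact wording can vary between versions, and the function names, the 5-second timeout and the choice to treat a missing tool as a fault are all illustrative assumptions.

```python
import subprocess


def link_is_up(mii_tool_output: str) -> bool:
    """Interpret one line of mii-tool output for a single interface."""
    text = mii_tool_output.strip().lower()
    # Assumed format: "...link ok" when the link is up, "no link" otherwise.
    return text.endswith("link ok")


def check_network_link(interface: str) -> bool:
    """Run mii-tool for one interface and report whether the link is normal."""
    try:
        completed = subprocess.run(
            ["mii-tool", interface],
            capture_output=True, text=True, timeout=5, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # tool missing or hung: treated here as a link fault
    if completed.returncode != 0:
        return False
    return link_is_up(completed.stdout)
```

Keeping the parsing in link_is_up separate from the subprocess call makes the decision logic easy to check without network hardware.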
S3: detect CAN communication devices;
The device state is verified according to the way the device is attached to the mainboard. For example, if the device is attached to the system via USB-to-UART and then UART-to-CAN bus conversion, it will create the /dev/ttyUBU0 device node under the system, and the device state can be confirmed by checking the state of that node. A send/receive data test is then carried out on the selected CAN channel: if the specified transmitted data is received correctly, the device is judged to be operating normally; otherwise a communication failure is judged.
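The node check in step S3 could be sketched as below. The path /dev/ttyUBU0 is taken from the text as written; the function names, the returned strings and the idea of passing the send/receive test in as a callable are assumptions for illustration only.

```python
import os
import stat


def can_node_status(path="/dev/ttyUBU0"):
    """Classify the device node as 'ok', 'missing' or 'not a character device'."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if not stat.S_ISCHR(mode):
        return "not a character device"
    return "ok"


def check_can_device(path, transfer_test):
    """Check the node first, then run the caller-supplied send/receive test."""
    status = can_node_status(path)
    if status != "ok":
        return status
    # transfer_test is a hypothetical callable returning True on success.
    return "ok" if transfer_test() else "communication failure"
```

Running the node check first means the send/receive test is only attempted when the device node actually exists, matching the order given in the text.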
S4: locate the fault position by analysing the detection results;
The results are represented using the predefined error code. The error code is 4 bytes of data, with the fault bytes ordered from low to high and each fault type using two bits: 00 indicates normal, 01 indicates a fault, and 11 indicates that the device does not support that type of fault detection.
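Writing a result into the 4-byte error code might look like the sketch below. The 2-bit values come from the text; the mapping of fault types to bit positions in SLOT and the little-endian byte order used for serialisation are assumptions, since the source only says "from low to high".

```python
# 2-bit status values as defined in the text.
NORMAL, FAULT, UNSUPPORTED = 0b00, 0b01, 0b11

# Hypothetical slot order; the source does not say which type uses which bits.
SLOT = {"pci": 0, "network": 1, "can": 2}


def set_status(code: int, fault_type: str, status: int) -> int:
    """Return a copy of the 32-bit code with one 2-bit slot overwritten."""
    shift = 2 * SLOT[fault_type]
    cleared = code & ~(0b11 << shift) & 0xFFFFFFFF
    return cleared | (status << shift)


def to_bytes(code: int) -> bytes:
    """Serialise the code as 4 bytes, lowest byte first (an assumption)."""
    return code.to_bytes(4, "little")
```

With 2 bits per fault type, a 4-byte code has room for 16 fault types, so the three used here leave most slots free.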
Embodiment two
This embodiment of the present invention provides a method for active detection of mainboard faults, comprising the following steps:
S1: detect PCI devices;
The host sends a command to obtain device information from the PCI bus and at the same time reads the driver information corresponding to each device. If a device is correctly attached to the PCI bus, its current state can be obtained by this operation, and by judging the returned state it can be confirmed whether the corresponding device is faulty.
S2: detect network devices;
In this step, besides verifying the presence of the device in the manner described in S1, its communication link must also be checked. The mii-tool tool can check the connection status of the network card device under the system, and from the result returned by mii-tool it can be determined whether the current network link is in a normal state.
S3: detect CAN communication devices;
The device state is verified according to the way the device is attached to the mainboard. For example, if the device is attached to the system via USB-to-UART and then UART-to-CAN bus conversion, it will create the /dev/ttyUBU0 device node under the system, and the device state can be confirmed by checking the state of that node. A send/receive data test is then carried out on the selected CAN channel: if the specified transmitted data is received correctly, the device is judged to be operating normally; otherwise a communication failure is judged.
In this step, the specific steps of the send/receive data test comprise:
S31: after obtaining the test data information command, the test equipment constructs a test code from the test data information;
S32: the transmitting end of the test equipment sends the test code to the receiving end over the CAN channel;
S33: the receiving end receives the test code sent by the transmitting end;
S34: the test code received by the receiving end is compared and checked against the preset test code, and the check is recorded, to judge whether communication is normal; if the test code check is abnormal, a communication failure is judged.
The send/receive data test process further comprises:
if the transmitting end does not receive a reply from the receiving end within the set time, a communication failure of the device's CAN channel is judged.
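Steps S32 to S34 and the timeout rule can be sketched as follows. The real CAN channel is replaced here by a caller-supplied send function and a standard-library queue standing in for the receiving end, so the sketch only shows the decision logic; the function name, the result strings and the 1-second default are hypothetical.

```python
import queue


def run_transfer_test(send, receive_queue, test_code: bytes, timeout_s: float = 1.0) -> str:
    """Send test_code, wait for the reply, and compare it with the preset code."""
    send(test_code)  # S32: transmit the test code over the channel
    try:
        # S33: wait for the receiving end, but only for the set time.
        received = receive_queue.get(timeout=timeout_s)
    except queue.Empty:
        return "timeout"  # no reply within the set time: channel failure
    # S34: compare what arrived with the preset test code.
    return "ok" if received == test_code else "mismatch"
```

For a loopback check, the same queue can serve both ends, for example run_transfer_test(q.put, q, b"\x55\xaa") with q = queue.Queue(). Both "timeout" and "mismatch" would be treated as a communication failure.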
S4: locate the fault position by analysing the detection results;
The results are represented using the predefined error code. The error code is 4 bytes of data, with the fault bytes ordered from low to high and each fault type using two bits: 00 indicates normal, 01 indicates a fault, and 11 indicates that the device does not support that type of fault detection.
The terms "first", "second", "third", "fourth" and the like (if present) in the description, the claims and the above drawings are used to distinguish similar objects and not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described here. In addition, the terms "comprise" and "have" and any variations thereof are intended to cover non-exclusive inclusion.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be realized in other embodiments without departing from the spirit or scope of the invention. The present invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for active detection of mainboard faults, characterized by comprising the following steps:
a host sends a command to obtain device information from the PCI bus;
the driver information corresponding to each device is read;
whether the corresponding PCI device is faulty is judged from the driver information obtained.
2. The method for active detection of mainboard faults according to claim 1, characterized in that the method further comprises:
if the driver information obtained indicates that the corresponding PCI device is in a normal state, detecting the network device.
3. The method for active detection of mainboard faults according to claim 2, characterized in that detection of the network device comprises the following steps:
the mii-tool tool checks the connection status of the network device;
the result returned by mii-tool is used to determine whether the current network link is in a normal state.
4. The method for active detection of mainboard faults according to claim 1, characterized in that the method further comprises:
detecting the CAN device.
5. The method for active detection of mainboard faults according to claim 4, characterized in that detecting the CAN device comprises the following steps:
checking for the /dev/ttyUBU0 device node;
if the /dev/ttyUBU0 device node exists, performing a data send/receive test;
if the specified transmitted data is received correctly, the CAN device is judged to be operating normally; otherwise a communication failure is judged.
6. The method for active detection of mainboard faults according to claim 5, characterized in that
the step of judging from the driver information obtained whether the corresponding PCI device is faulty further comprises: if the driver information obtained indicates that the corresponding PCI device is in an abnormal state, writing a fault code.
7. The method for active detection of mainboard faults according to claim 6, characterized in that
the step of using the result returned by mii-tool to determine whether the current network link is in a normal state comprises:
if the current network link is determined to be in a normal state, detection ends; otherwise, a fault code is written.
8. The method for active detection of mainboard faults according to claim 7, characterized in that, in the step in which the CAN device is judged to be operating normally if the specified transmitted data is received correctly and a communication failure is judged otherwise, a fault code is written if a communication failure is judged.
9. The method for active detection of mainboard faults according to claim 4, characterized in that the method further comprises:
predefining error codes before device detection is carried out.
10. The method for active detection of mainboard faults according to claim 9, characterized in that
predefining error codes comprises:
setting the error code as 4 bytes of data, with the fault bytes ordered from low to high and each fault type using two bits.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811276682.3A CN109597726A (en) 2018-10-30 2018-10-30 Method for active detection of mainboard faults

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811276682.3A CN109597726A (en) 2018-10-30 2018-10-30 Method for active detection of mainboard faults

Publications (1)

Publication Number Publication Date
CN109597726A (en) 2019-04-09

Family

ID=65958216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811276682.3A Method for active detection of mainboard faults (Pending; published as CN109597726A (en)) 2018-10-30 2018-10-30

Country Status (1)

Country Link
CN (1) CN109597726A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1838011A (en) * 2006-04-04 2006-09-27 西安电子科技大学 Intelligent management apparatus and management method for distributed control network based on CAN bus
CN103001808A (en) * 2012-12-24 2013-03-27 上海斐讯数据通信技术有限公司 Switch for detecting port faults and achieving method
CN204288502U (en) * 2014-12-23 2015-04-22 中国电子科技集团公司第二十二研究所 CAN data collector and system
US9052995B2 (en) * 2013-04-26 2015-06-09 Netapp, Inc. Systems and methods providing mount catalogs for rapid volume mount
CN105446857A (en) * 2015-11-16 2016-03-30 山东超越数控电子有限公司 Fault diagnosis method and system
CN106130761A (en) * 2016-06-22 2016-11-16 北京百度网讯科技有限公司 The recognition methods of the failed network device of data center and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190409