CN102567550A - Method and device for collecting data of emergency event in operating system (OS) - Google Patents
- Publication number
- CN102567550A (application number CN2011104557862A)
- Authority
- CN
- China
- Prior art keywords
- capsule
- packet
- emergency event
- data
- operating system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Debugging And Monitoring (AREA)
Abstract
The invention provides a method and a device for collecting data of an emergency event (OS panic) in an operating system (OS). The method comprises the following steps: starting the operating system; when an OS emergency event occurs, checking whether a capsule data packet exists; if a capsule data packet is found, checking whether the packet contains data of the OS emergency event; if such data is found, determining whether the capsule data packet is permanently valid for the OS; if the capsule data packet is found to be permanently valid for the OS, listing the packet in the system table; and resetting the system.
Description
Technical field
The present invention relates generally to the computer field and, more specifically, to a method and a device for collecting data of an emergency event in an operating system (OS).
Background art
In the prior art, large-scale cluster data centers and computing centers usually require every node to be configured and maintained. The nodes are deployed in a dispersed fashion, and data center or machine room administrators must carry out maintenance work that involves configuring the BIOS of each node. According to data from the Internet Data Center (IDC), the failure rate caused by OS emergency events (OS panics) on servers deployed across industries is as high as 10%.
However, with a failure rate this high, faults often cannot be accurately located and resolved, because the basic on-site data from the moment of the fault is missing. A mechanism for collecting fault-related data is therefore urgently needed.
Summary of the invention
To solve the above problem, the invention provides a method for collecting data of an emergency event in an operating system (OS), comprising the following steps: starting the operating system; when an OS emergency event occurs, checking whether a capsule data packet exists; if a capsule data packet is found, checking whether the packet contains data of the OS emergency event; if such data is found, determining whether the capsule data packet is permanently valid for the OS; if the capsule data packet is found to be permanently valid for the OS, listing the packet in the system table; and resetting the system.
Before the step of starting the operating system, the method further comprises: processing the data of the previous emergency event, and displaying the analysis information related to that data for fault diagnosis.
When the OS emergency event occurs, the OS may build the capsule by calling the capsule-related service defined in the Unified Extensible Firmware Interface (UEFI) specification, and send the capsule to the firmware.
The capsule-related service that builds the capsule encapsulates the data into the capsule, and the firmware records the capsule in a storage medium.
The storage medium is a non-volatile storage medium.
In addition, a device for collecting data of an emergency event in an operating system (OS) is also provided, comprising: a start module, used to start the operating system; a first inspection module, used to check, when an OS emergency event occurs, whether a capsule data packet exists; a second detection module, used to check, when a capsule data packet is found, whether the packet contains data of the OS emergency event; a determination module, used to determine, when such data is found, whether the capsule data packet is permanently valid for the OS; an acquisition module, used to list the capsule data packet in the system table when the packet is found to be permanently valid for the OS; and a reset module, used to reset the system.
When the OS emergency event occurs, the OS may build the capsule by calling the capsule-related service defined in the Unified Extensible Firmware Interface (UEFI) specification, and send the capsule to the firmware.
Description of drawings
The present invention can be better understood from the following detailed description when read with the accompanying drawings. It should be emphasized that, in accordance with standard practice in the industry, the various parts are not drawn to scale. In fact, the dimensions of the various parts may be arbitrarily increased or reduced for clarity of discussion.
Fig. 1 shows a basic flowchart of the OS emergency event data processing function under the OS according to an exemplary embodiment of the present invention;
Fig. 2 shows the collection mechanism for OS panic data according to an exemplary embodiment of the present invention; and
Fig. 3 shows the basic hardware layout according to an exemplary embodiment of the present invention.
Detailed description of the embodiments
The following description provides many different embodiments, or examples, for implementing different parts of the invention. Specific examples of elements and arrangements are described below to simplify the present invention. These are, of course, merely examples and are not intended to be limiting. Moreover, in the description below, forming a first part on a second part may include embodiments in which the first and second parts are formed in direct contact, and may also include embodiments in which additional parts are interposed between the first and second parts, so that the first and second parts are not in direct contact. For simplicity and clarity, the various parts may be arbitrarily drawn at different scales.
Under a traditional BIOS, when the system fails, the error and some related information can only be printed to the screen by calling the INT 10h video service. In contrast to the traditional boot method, and alongside other differing mechanisms such as secure boot, the UEFI specification defines the interface through which the operating system and the platform firmware communicate. This structure contains platform-related information, together with the boot services and resident (runtime) services that can be called from the OS. These interfaces and services provide a mechanism by which the operating system can encapsulate data into a capsule and pass it to the platform firmware.
The general idea of the technical solution of the present invention is as follows:
The scheme works at the system BIOS level on the UEFI framework. When the OS of a node panics, the OS panic data is packed into a data packet. The packet is either sent to a remote management console over a network protocol, or the data is passed to the system BIOS through a resident service under the operating system. This collects the OS panic data of the node and provides the basic data needed to locate the problem.
The technical solution of the present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 shows a basic flowchart of the OS emergency event data processing function under the OS according to an exemplary embodiment of the present invention.
Each node is in the state in which the OS has started and is running (140). After an OS panic occurs, the corresponding handler process of the system (135) handles it. The handler process (135) checks whether a capsule data packet exists. If no capsule data packet is found, the handler process exits directly. If a capsule data packet is found, the handler checks whether the packet contains OS panic data, and applies specific processing (for example, marking) to packets in the format that carries OS panic data.
If these data must be recognizable under the OS at this point and are to be processed afterwards, the capsule packet is marked as valid for the OS (138) and the capsule is listed in the system table (141). After the panic data has been marked and processed, the system is reset (132). At this point the collection of the OS panic data is complete. After the system reset, the collected panic data capsule is analyzed and displayed for fault diagnosis and analysis.
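The decision flow of Fig. 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Capsule`, `System` and `handle_panic` are hypothetical, the system table is modeled as a plain list, and exiting without a reset when the capsule carries no panic data is an assumption (the text only states the exit for the case where no capsule is found).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Capsule:
    """Hypothetical stand-in for a capsule data packet."""
    payload: bytes
    has_panic_data: bool = False
    persistent: bool = False  # "permanently valid for the OS" in the text


@dataclass
class System:
    """Hypothetical model: the system table is just a list of capsules."""
    system_table: List[Capsule] = field(default_factory=list)
    reset_count: int = 0

    def reset(self) -> None:
        self.reset_count += 1


def handle_panic(system: System, capsule: Optional[Capsule]) -> str:
    """Follow the checks of Fig. 1 and report which branch was taken."""
    if capsule is None:
        return "no-capsule"        # handler exits directly
    if not capsule.has_panic_data:
        return "no-panic-data"     # assumption: nothing to collect, no reset
    if capsule.persistent:
        system.system_table.append(capsule)  # list capsule in the system table
    system.reset()                 # reset once marking and processing are done
    return "collected" if capsule.persistent else "reset-only"
```

Returning a branch label rather than raising keeps each path of the flowchart easy to check in isolation.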
Further, Fig. 2 shows the collection mechanism for OS panic data according to an exemplary embodiment of the present invention. That is, Fig. 2 highlights the key steps of collecting OS panic data:
While the OS is running normally (208), a system exception occurs and an OS panic event is generated. The OS then builds a capsule (as defined in the UEFI specification) by calling the capsule-related service and sends the capsule to the firmware; the service that builds the capsule encapsulates the data into the capsule. After the firmware is notified, the data is recorded in a non-volatile medium (for example, flash memory) through UpdateCapsule(), so that the panic data is still available after the system reboots.
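To make the encapsulation step concrete, the sketch below packs panic data behind a header laid out like the UEFI `EFI_CAPSULE_HEADER` (a 16-byte GUID followed by three 32-bit fields: header size, flags and total capsule size). The flag values are the ones defined in the UEFI specification; the GUID and the function names are hypothetical, and the code only builds and parses the byte layout. It does not call any firmware service.

```python
import struct
import uuid

# Capsule flag bits as defined in the UEFI specification.
CAPSULE_FLAGS_PERSIST_ACROSS_RESET = 0x00010000
CAPSULE_FLAGS_POPULATE_SYSTEM_TABLE = 0x00020000
CAPSULE_FLAGS_INITIATE_RESET = 0x00040000

# Hypothetical GUID identifying "OS panic data" capsules (not from the source).
PANIC_CAPSULE_GUID = uuid.UUID("12345678-1234-5678-1234-567812345678")

# GUID (16 bytes), HeaderSize, Flags, CapsuleImageSize; little-endian, unpadded.
HEADER_FORMAT = "<16sIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def build_panic_capsule(
    panic_data: bytes,
    flags: int = CAPSULE_FLAGS_PERSIST_ACROSS_RESET
    | CAPSULE_FLAGS_POPULATE_SYSTEM_TABLE,
) -> bytes:
    """Prefix panic_data with a capsule header; the size field includes the header."""
    header = struct.pack(
        HEADER_FORMAT,
        PANIC_CAPSULE_GUID.bytes_le,  # EFI_GUID uses the mixed-endian layout
        HEADER_SIZE,
        flags,
        HEADER_SIZE + len(panic_data),
    )
    return header + panic_data


def parse_capsule(blob: bytes):
    """Return (guid, flags, payload) recovered from a capsule image."""
    guid_bytes, header_size, flags, image_size = struct.unpack_from(HEADER_FORMAT, blob)
    return uuid.UUID(bytes_le=guid_bytes), flags, blob[header_size:image_size]
```

The default flags request both persistence across reset and publication in the system table, which corresponds to the "permanently valid" and "listed in the system table" conditions described in the text.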
After the panic data collection is complete, whether the system needs to be reset is determined by the system design requirements. If a reset is needed, the system jumps to the reset vector (132) and performs the system reset. After the reset, the operating system checks the system table to decide whether the data of the last panic is available. If the data is available, the normally running OS reads the data of the last panic, analyzes it, and displays the resulting information for fault diagnosis.
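The post-reset check can be illustrated with a small lookup. In this sketch the system table is modeled as a list of (GUID, data) pairs, loosely mirroring how a configuration table is searched by GUID; the GUID value, the function names and the report format are all assumptions made for illustration.

```python
import uuid
from typing import List, Optional, Tuple

# Hypothetical GUID under which the panic capsule was published.
PANIC_CAPSULE_GUID = uuid.UUID("12345678-1234-5678-1234-567812345678")

# Assumed model of the system table: a list of (GUID, raw data) entries.
ConfigTable = List[Tuple[uuid.UUID, bytes]]


def find_last_panic(table: ConfigTable) -> Optional[bytes]:
    """Return the panic data listed in the table, or None if absent."""
    for guid, data in table:
        if guid == PANIC_CAPSULE_GUID:
            return data
    return None


def diagnose(table: ConfigTable) -> str:
    """Produce a one-line report for display after the reboot."""
    data = find_last_panic(table)
    if data is None:
        return "no panic data from previous boot"
    text = data.decode("utf-8", errors="replace")
    return f"previous panic ({len(data)} bytes): {text}"
```

For example, `diagnose([(PANIC_CAPSULE_GUID, b"NULL deref")])` yields a report line containing the payload, while an empty table yields the "no panic data" message.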
In addition, Fig. 3 shows the basic hardware layout according to exemplary embodiment of the present invention.
The microcontroller (110) accesses the non-volatile storage medium (flash memory) over the SPI bus. The OS panic data is stored on the SPI flash memory in a defined format, so the microcontroller (110) can obtain the OS panic data out of band (OOB). The microcontroller is attached to the south bridge (ICH) over the PCIe bus, and the microcontroller and the system processor communicate over buses such as SMBus.
After the system processor hangs, the microcontroller detects the hang through the heartbeat signal, accesses the flash memory to obtain the capsule data, and sends the OS panic data to the management node through the NIC. After the management node receives the data, it records the data in a database for subsequent fault analysis and location.
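The out-of-band path can be sketched as a heartbeat watchdog. This is a simulation under stated assumptions: `HeartbeatMonitor`, the timeout value, and the `read_flash` and `send` callbacks are hypothetical placeholders for the microcontroller firmware, the SPI flash access and the NIC transfer, and time is passed in explicitly so the logic can be exercised without real hardware.

```python
from typing import Callable, Optional


class HeartbeatMonitor:
    """Hypothetical model of the microcontroller's hang detection."""

    def __init__(
        self,
        timeout_s: float,
        read_flash: Callable[[], Optional[bytes]],
        send: Callable[[bytes], None],
    ) -> None:
        self.timeout_s = timeout_s
        self.read_flash = read_flash  # stands in for the SPI flash read
        self.send = send              # stands in for the NIC transfer
        self.last_beat = 0.0
        self.reported = False

    def beat(self, now: float) -> None:
        """Record a heartbeat from the system processor."""
        self.last_beat = now
        self.reported = False

    def poll(self, now: float) -> bool:
        """Forward panic data once per hang; return True if data was sent."""
        if self.reported or now - self.last_beat < self.timeout_s:
            return False
        data = self.read_flash()
        if data is not None:
            self.send(data)
        self.reported = True  # avoid resending for the same hang
        return data is not None
```

Tracking `reported` means the management node receives the data once per hang, and a fresh heartbeat re-arms the monitor.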
In general, the invention provides a method for collecting data of an emergency event in an operating system (OS), comprising the following steps: starting the operating system; when an OS emergency event occurs, checking whether a capsule data packet exists; if a capsule data packet is found, checking whether the packet contains data of the OS emergency event; if such data is found, determining whether the capsule data packet is permanently valid for the OS; if the capsule data packet is found to be permanently valid for the OS, listing the packet in the system table; and resetting the system.
Preferably, before the step of starting the operating system, the method further comprises: processing the data of the previous emergency event, and displaying the analysis information related to that data for fault diagnosis.
Preferably, when the OS emergency event occurs, the OS may build the capsule by calling the capsule-related service defined in the Unified Extensible Firmware Interface (UEFI) specification, and send the capsule to the firmware.
Preferably, the capsule-related service that builds the capsule encapsulates the data into the capsule, and the firmware records the capsule in a storage medium.
Preferably, the storage medium is a non-volatile storage medium.
In addition, the present invention also provides a device for collecting data of an emergency event in an operating system (OS), comprising: a start module, used to start the operating system; a first inspection module, used to check, when an OS emergency event occurs, whether a capsule data packet exists; a second detection module, used to check, when a capsule data packet is found, whether the packet contains data of the OS emergency event; a determination module, used to determine, when such data is found, whether the capsule data packet is permanently valid for the OS; an acquisition module, used to list the capsule data packet in the system table when the packet is found to be permanently valid for the OS; and a reset module, used to reset the system.
Preferably, when the OS emergency event occurs, the OS may build the capsule by calling the capsule-related service defined in the Unified Extensible Firmware Interface (UEFI) specification, and send the capsule to the firmware.
The parts of several embodiments have been discussed above so that those of ordinary skill in the art can better understand the various aspects of the present invention. Those skilled in the art will appreciate that they can readily use the present invention as a basis for designing or modifying other processes and structures that achieve the same purposes and/or the same advantages as the embodiments introduced here. Those of ordinary skill in the art should also understand that such equivalent constructions do not depart from the spirit and scope of the present invention, and that various changes, substitutions and alterations can be made without departing from the spirit and scope of the present invention.
Claims (7)
1. A method for collecting data of an emergency event in an operating system (OS), characterized by comprising the following steps:
starting the operating system;
when an OS emergency event occurs, checking whether a capsule data packet exists;
if a capsule data packet is found, checking whether the packet contains data of said OS emergency event;
if data of said OS emergency event is found, determining whether the capsule data packet is permanently valid for said OS;
if said capsule data packet is found to be permanently valid for said OS, listing said capsule data packet in the system table; and
resetting the system.
2. The method according to claim 1, characterized in that, before the step of starting said operating system, said method further comprises:
processing the data of the previous emergency event, and displaying the analysis information related to said data for fault diagnosis.
3. The method according to claim 1, characterized in that, when said OS emergency event occurs, said OS may build said capsule by calling the capsule-related service defined in the Unified Extensible Firmware Interface (UEFI) specification, and send said capsule to the firmware.
4. The method according to claim 3, characterized in that said capsule-related service that builds said capsule encapsulates said data into said capsule, and said firmware records said capsule in a storage medium.
5. The method according to claim 4, characterized in that said storage medium is a non-volatile storage medium.
6. A device for collecting data of an emergency event in an operating system (OS), characterized by comprising:
a start module, used to start the operating system;
a first inspection module, used to check, when an OS emergency event occurs, whether a capsule data packet exists;
a second detection module, used to check, when a capsule data packet is found, whether the packet contains data of said OS emergency event;
a determination module, used to determine, when data of said OS emergency event is found, whether the capsule data packet is permanently valid for said OS;
an acquisition module, used to list said capsule data packet in the system table when said capsule data packet is found to be permanently valid for said OS; and
a reset module, used to reset the system.
7. The device according to claim 6, characterized in that, when said OS emergency event occurs, said OS may build said capsule by calling the capsule-related service defined in the Unified Extensible Firmware Interface (UEFI) specification, and send said capsule to the firmware.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011104557862A CN102567550A (en) | 2011-12-31 | 2011-12-31 | Method and device for collecting data of emergency event in operating system (OS) |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011104557862A CN102567550A (en) | 2011-12-31 | 2011-12-31 | Method and device for collecting data of emergency event in operating system (OS) |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102567550A true CN102567550A (en) | 2012-07-11 |
Family
ID=46412948
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2011104557862A Pending CN102567550A (en) | 2011-12-31 | 2011-12-31 | Method and device for collecting data of emergency event in operating system (OS) |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102567550A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103207797A (en) * | 2013-03-15 | 2013-07-17 | 南京工业大学 | Capsule type custom-made updating method based on unified extensible firmware interface firmware system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101025709A (en) * | 2006-02-22 | 2007-08-29 | 联想(北京)有限公司 | System and method for obtaining fault in-situ information for computer operating system |
US20090006827A1 (en) * | 2007-06-26 | 2009-01-01 | Rothman Michael A | Firmware Processing for Operating System Panic Data |
US20090327679A1 (en) * | 2008-04-23 | 2009-12-31 | Huang David H | Os-mediated launch of os-independent application |
US20100082932A1 (en) * | 2008-09-30 | 2010-04-01 | Rothman Michael A | Hardware and file system agnostic mechanism for achieving capsule support |
CN102147763A (en) * | 2010-02-05 | 2011-08-10 | 中国长城计算机深圳股份有限公司 | Method, system and computer for recording weblog |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103207797A (en) * | 2013-03-15 | 2013-07-17 | 南京工业大学 | Capsule type custom-made updating method based on unified extensible firmware interface firmware system |
CN103207797B (en) * | 2013-03-15 | 2013-11-27 | 南京工业大学 | Capsule type custom-made updating method based on unified extensible firmware interface firmware system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105938450B (en) | The method and system that automatic debugging information is collected | |
CN106648958B (en) | Basic input output system replys management system and its method and program product | |
US8069371B2 (en) | Method and system for remotely debugging a hung or crashed computing system | |
CN102419803B (en) | Method, system and device for searching and killing computer virus | |
WO2016197737A1 (en) | Self-check processing method, apparatus and system | |
US20170212815A1 (en) | Virtualization substrate management device, virtualization substrate management system, virtualization substrate management method, and recording medium for recording virtualization substrate management program | |
CN116204933B (en) | Method for isolating PCIe network card based on jailhouse under ARM64 architecture | |
WO2012155707A1 (en) | Preventing data loss during reboot and logical storage resource management device | |
CN104216771A (en) | Restarting method and device for software program | |
CN106997313B (en) | Signal processing method and system of application program and terminal equipment | |
CN116340053A (en) | Log processing method, device, computer equipment and medium for system crash | |
US11226755B1 (en) | Core dump in a storage device | |
CN110851334A (en) | Flow statistical method, electronic device, system and medium | |
CN106227540A (en) | Obtain the methods, devices and systems of displaying information on screen | |
JP6337437B2 (en) | Information processing apparatus, information processing system, and program | |
CN102567550A (en) | Method and device for collecting data of emergency event in operating system (OS) | |
CN114765051A (en) | Memory test method and device, readable storage medium and electronic equipment | |
CN104618191B (en) | Communication fault detection method and device between a kind of host and naked memory block | |
US20240046720A1 (en) | Vehicle-mounted information processing apparatus and vehicle-mounted information processing method | |
CN113064750B (en) | Tracking method, device and medium for BIOS log information | |
CN106484523B (en) | A kind of managing hardware device method and device thereof | |
CN114070755B (en) | Virtual machine network flow determination method and device, electronic equipment and storage medium | |
CN109344032A (en) | A kind of monitoring method and device | |
JP5163180B2 (en) | Device controller | |
CN107168815A (en) | A kind of method for collecting hardware error message |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20120711 |