CN113434324A - Abnormal information acquisition method, system, device and storage medium - Google Patents

Publication number
CN113434324A
Authority
CN
China
Prior art keywords
slave
machine
host
abnormal information
shared memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110728525.7A
Other languages
Chinese (zh)
Inventor
赵欣
毛弘毅
伍晓宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Keda Technology Co Ltd
Original Assignee
Suzhou Keda Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Keda Technology Co Ltd filed Critical Suzhou Keda Technology Co Ltd
Priority to CN202110728525.7A priority Critical patent/CN113434324A/en
Publication of CN113434324A publication Critical patent/CN113434324A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282 Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides an abnormal information acquisition method, a system, equipment and a storage medium, wherein the method comprises the following steps: when the slave machine is abnormal, the slave machine acquires abnormal information through a hook function in a kernel program; the slave machine writes the abnormal information into a shared memory of the host machine and the slave machine; and the host is configured to read abnormal information from the shared memory corresponding to the slave after receiving the interrupt signal sent by the slave. By adopting the invention, when the slave machine has a fault, the abnormal information can be obtained through the hook function in the kernel program, then the abnormal information is written into the shared memory, the interrupt signal is sent to the host machine, and the host machine can read the abnormal information from the shared memory after receiving the interrupt signal, thereby solving the problem that the abnormal information can not be obtained and recorded when the kernel has a fault in the prior art.

Description

Abnormal information acquisition method, system, device and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, a system, a device, and a storage medium for acquiring abnormal information.
Background
PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard. It provides high-speed, serial, point-to-point, dual-channel, high-bandwidth transmission: each connected device is allocated its own channel bandwidth rather than sharing bus bandwidth. The standard mainly supports functions such as active power management, error reporting, end-to-end reliable transmission, hot plug, and quality of service monitoring. After a kernel error (kernel panic) occurs in a slave, the exception information can generally be obtained only from the console of the slave and not through any other channel.
When a kernel error occurs in the slave, the kernel cannot determine what caused the error, so the storage device cannot be operated and the exception information cannot be saved. To capture the kernel error information of the slave, the console of the slave must therefore be observed in real time. Moreover, the slave restarts once a kernel error occurs, so even when the console is being observed, the exception information cannot be captured if the restart goes unnoticed.
Disclosure of Invention
The invention aims to provide an abnormal information acquisition method, system, equipment and storage medium, aiming at solving the problem that abnormal information cannot be acquired and recorded when a kernel is wrong in the prior art.
The embodiment of the invention provides an abnormal information acquisition method, which comprises the following steps:
when the slave machine is abnormal, the slave machine acquires abnormal information through a hook function in a kernel program;
the slave machine writes the abnormal information into a shared memory of the host machine and the slave machine;
and the host is configured to read abnormal information from the shared memory corresponding to the slave after receiving the interrupt signal sent by the slave.
By adopting the abnormal information acquisition method, when the slave machine has a fault, the abnormal information can be acquired through the hook function in the kernel program, then the abnormal information is written into the shared memory, the interrupt signal is sent to the host machine, and the host machine can read the abnormal information from the shared memory after receiving the interrupt signal, thereby solving the problem that the abnormal information cannot be acquired and recorded when the kernel is wrong in the prior art.
In some embodiments, before the slave writes the exception information into the shared memory of the master and the slave, the method further includes the following steps:
the host machine distributes a section of memory for the slave machine on the local machine as a shared memory, and writes the physical address of the memory into the register of the slave machine;
the slave machine acquires the physical address of the shared memory allocated by the host machine through a register of the slave machine, and establishes mapping between the shared memory allocated by the host machine and a local machine space; or,
the slave machine distributes a section of memory in the local machine as a shared memory and establishes mapping between the shared memory and the read address of the host machine.
In some embodiments, before the slave sends the interrupt signal to the master, the method further includes the following steps:
the master assigns an MSI interrupt number to the slave.
In some embodiments, after the slave sends the interrupt signal to the master, the method further includes the following steps:
the master machine receives an interrupt signal sent by the slave machine;
the host reads abnormal information from the shared memory corresponding to the slave;
and the host writes the read exception information into an exception log of the host.
In some embodiments, after the master receives the interrupt signal sent by the slave, the method further includes the following steps:
the host computer analyzes the type of the received interrupt signal;
and if the type of the interrupt signal is a kernel error type, the host reads abnormal information from the shared memory corresponding to the slave.
In some embodiments, the host writes the read exception information into a native exception log, including the steps of:
the host adds the identifier of the slave to the abnormal information;
and the host initiates a log collection process according to the kernel error level of the host side, and writes the abnormal information into an abnormal log of the host.
In some embodiments, when a slave machine is abnormal, before the slave machine acquires the abnormal information through a hook function in a kernel program, the method further includes the following steps:
the slave machine registers a hook function in an operating system, and the hook function is configured to acquire the abnormal information of the slave machine when the slave machine is abnormal.
The embodiment of the invention also provides an abnormal information acquisition system, which is applied to the abnormal information acquisition method, wherein the system comprises a host and a slave, and the host and the slave are used for executing the following steps:
when the slave machine is abnormal, the slave machine acquires abnormal information through a hook function in a kernel program;
the slave machine writes the abnormal information into a shared memory of the host machine and the slave machine;
the slave computer sends an interrupt signal to the host computer;
and after receiving the interrupt signal sent by the slave, the host reads the abnormal information from the shared memory corresponding to the slave.
By adopting the abnormal information acquisition system, when the slave machine fails, the slave machine can acquire the abnormal information through the hook function in the kernel program, then write the abnormal information into the shared memory, and send the interrupt signal to the host machine, and the host machine can read the abnormal information from the shared memory after receiving the interrupt signal, so that the problem that the abnormal information cannot be acquired and recorded when the kernel fails in the prior art is solved.
An embodiment of the present invention further provides an abnormal information acquiring apparatus, including:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the exception information acquisition method via execution of the executable instructions.
By adopting the abnormal information acquisition equipment provided by the invention, the processor executes the abnormal information acquisition method when executing the executable instruction, so that the beneficial effect of the abnormal information acquisition method can be obtained.
The embodiment of the present invention further provides a computer-readable storage medium, which is used for storing a program, and when the program is executed by a processor, the steps of the abnormal information acquiring method are implemented.
By adopting the computer-readable storage medium provided by the present invention, the stored program realizes the steps of the abnormality information acquisition method when being executed, thereby obtaining the beneficial effects of the abnormality information acquisition method.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings.
Fig. 1 is a flowchart illustrating reporting of exception information from a slave device in an exception information acquisition method according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating a host computer acquiring exception information in an exception information acquiring method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an abnormal information acquiring system according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an abnormality information acquisition apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a computer storage medium according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar structures, and thus their repetitive description will be omitted.
As shown in fig. 1, in an embodiment, the present invention provides an exception information acquiring method, which may be used to acquire exception information of a PCIe slave when an exception occurs, where the method includes the following steps:
S100: when the slave machine is abnormal, the slave machine acquires abnormal information through a hook function in a kernel program;
S200: the slave machine writes the abnormal information into a shared memory of the host machine and the slave machine;
S300: the slave sends an interrupt signal to the host, and the host is configured to read exception information from the shared memory corresponding to the slave after receiving the interrupt signal sent by the slave, wherein the exception information includes a current state of an operating system of the slave and some other exception parameters needing to be recorded.
With this abnormal information acquisition method, when the slave fails, the abnormal information is acquired through the hook function in the kernel program in step S100, written into the shared memory in step S200, and announced to the host by the interrupt signal sent in step S300. After receiving the interrupt signal, the host can read the abnormal information from the shared memory, which solves the prior-art problem that abnormal information cannot be acquired and recorded when a kernel error occurs.
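The slave-side sequence S100 to S300 can be sketched as a small user-space simulation. This is only an illustration under stated assumptions: the record layout (a magic word and a length, followed by the text), the names `MAGIC`, `write_exception` and `on_slave_exception`, and the callback standing in for the interrupt are all hypothetical and do not appear in the source.

```python
import struct

# Hypothetical record layout (not from the source): magic word, payload length, payload.
MAGIC = 0x45584350
HEADER = struct.Struct("<II")


def collect_exception_info() -> str:
    """Stand-in for the hook function of S100; a real hook would gather kernel state."""
    return "kernel panic: example reason; os_state=halting"


def write_exception(shared: bytearray, info: str) -> None:
    """S200: copy the exception information into the shared memory region."""
    payload = info.encode("utf-8")
    room = len(shared) - HEADER.size
    payload = payload[:room]  # truncate rather than write past the region
    HEADER.pack_into(shared, 0, MAGIC, len(payload))
    shared[HEADER.size:HEADER.size + len(payload)] = payload


def on_slave_exception(shared: bytearray, send_interrupt) -> None:
    info = collect_exception_info()  # S100: acquire the information via the hook
    write_exception(shared, info)    # S200: write it into shared memory
    send_interrupt()                 # S300: notify the host


shared_memory = bytearray(4096)
interrupts = []
on_slave_exception(shared_memory, lambda: interrupts.append("irq"))
```

Writing the record before raising the interrupt matters in this sketch: the host only reads after it has been notified, so it never observes a half-written record.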
In this embodiment, the step S200: before the slave writes the abnormal information into the shared memory of the host and the slave, the method further comprises the following steps:
the host allocates a section of local memory to the slave as the shared memory, and writes the physical address of that memory into a register of the slave. The size of the memory can be chosen as needed, for example 4 MB or 5 MB. The host and the slaves can be in a one-to-one or one-to-many relationship; when one host corresponds to a plurality of slaves, the memory allocated by the host to each slave can be the same size or different sizes;
the slave acquires the physical address of the shared memory allocated by the host through the register of the slave, and establishes a mapping between the shared memory allocated by the host and a local space. Here, the memory allocated by the host refers to a memory space on the host side, and the local space refers to a local space on the slave side. The slave maps the shared memory allocated by the host to the local space by using the PCIe BAR outbound setting; this mapping is an outward mapping established by the slave.
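The address handoff described in these two steps can be illustrated with a toy model. It is a sketch only: `HostMemory`, `Slave`, `shm_addr_register`, and the base address are hypothetical names and values, and a Python `memoryview` merely stands in for the outbound mapping that real hardware would provide.

```python
class HostMemory:
    """Toy model of host RAM: regions keyed by a made-up physical address."""

    def __init__(self, base=0x80000000):
        self._next = base
        self.regions = {}

    def allocate(self, size):
        addr = self._next
        self.regions[addr] = bytearray(size)
        self._next += size
        return addr


class Slave:
    def __init__(self):
        self.shm_addr_register = 0  # hypothetical register written by the host
        self.window = None          # slave-local view of the host region

    def map_shared_memory(self, host):
        # Stand-in for the outbound mapping: resolve the address found in the
        # register to a writable view of the host's region.
        self.window = memoryview(host.regions[self.shm_addr_register])


host = HostMemory()
slaves = [Slave(), Slave()]
# One host, two slaves, with differently sized regions (4 MB and 5 MB).
for slave, size in zip(slaves, (4 * 1024 * 1024, 5 * 1024 * 1024)):
    slave.shm_addr_register = host.allocate(size)  # host writes the address
    slave.map_shared_memory(host)                  # slave reads it and maps

slaves[0].window[:5] = b"panic"  # a slave write is visible in host memory
```

Because each slave receives its own region, a write by one slave cannot overwrite the record of another.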
In the above embodiments, the shared memory is on the host side. In another embodiment, the shared memory may be on the slave side, which is more beneficial to reduce the storage burden on the master side when the number of slaves is large. Specifically, the step S200: before the slave writes the abnormal information into the shared memory of the host and the slave, the method further includes: the slave machine distributes a section of memory in the local machine as a shared memory, and establishes mapping between the shared memory and the read address of the host machine, wherein the mapping is the inward mapping established by the slave machine.
In this embodiment, the step S300: before the slave sends an interrupt signal to the master, the method further comprises the following steps:
the host allocates an MSI (Message Signaled Interrupt) interrupt number to the slave, and a mapping relation between the MSI interrupt number and the slave is established.
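A mapping of this kind lets the host work out which slave raised a given interrupt. The following is a minimal sketch; the class `MsiAllocator`, the vector range, and the slave identifiers are hypothetical and chosen only for illustration.

```python
class MsiAllocator:
    """Hands out interrupt numbers and remembers which slave owns each one."""

    def __init__(self, first_vector=0, count=32):
        # The range is an arbitrary assumption for this sketch.
        self._free = list(range(first_vector, first_vector + count))
        self.vector_to_slave = {}

    def assign(self, slave_id):
        if not self._free:
            raise RuntimeError("no free interrupt numbers left")
        vector = self._free.pop(0)
        self.vector_to_slave[vector] = slave_id
        return vector

    def slave_for(self, vector):
        """Return the slave mapped to this interrupt number, or None."""
        return self.vector_to_slave.get(vector)


allocator = MsiAllocator(first_vector=40, count=4)
vec_a = allocator.assign("slave-a")
vec_b = allocator.assign("slave-b")
```

When an interrupt with number `vec_b` arrives, `allocator.slave_for(vec_b)` identifies the slave whose shared memory should be read.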
As shown in fig. 2, in this embodiment, the step S300: after the slave sends an interrupt signal to the master, the method further comprises the following steps:
S400: the master machine receives an interrupt signal sent by the slave machine;
S500: the host reads abnormal information from the shared memory corresponding to the slave;
S600: and the host writes the read exception information into an exception log of the host.
Therefore, when the slave fails, the host receives the interrupt signal sent by the slave in step S400 and learns from it that the slave is abnormal. In step S500 the host reads the slave's abnormal information from the shared memory allocated locally to that slave, and in step S600 it writes that information into the host's exception log. Afterwards, checking the log system of the host is enough to determine whether the slave was abnormal and to analyze the cause of the abnormality from the recorded information.
In this embodiment, the slave abnormality is, for example, a kernel error (kernel panic) on the slave. In other alternative embodiments, the abnormal information acquisition method of the present invention is not limited to kernel errors and may also be applied to other types of slave abnormalities: when the slave detects another type of abnormality, acquires the abnormal information, and initiates an interrupt, the host can likewise obtain the abnormal information of the slave.
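The host-side steps S400 to S600 can be sketched in the same simulated style. The record layout (magic word and length before the text), the `Host` class, and the in-memory list used as the exception log are assumptions made for this illustration and are not specified in the source.

```python
import struct

MAGIC = 0x45584350              # hypothetical marker for a valid record
HEADER = struct.Struct("<II")   # assumed layout: magic, payload length


class Host:
    def __init__(self):
        self.shared = {}         # slave id -> shared memory region
        self.exception_log = []  # stand-in for the host's exception log

    def attach(self, slave_id, size=4096):
        self.shared[slave_id] = bytearray(size)
        return self.shared[slave_id]

    def on_interrupt(self, slave_id):
        """S400: invoked when an interrupt from slave_id is received."""
        region = self.shared[slave_id]
        magic, length = HEADER.unpack_from(region, 0)
        if magic != MAGIC:
            return None  # nothing valid was written
        length = min(length, len(region) - HEADER.size)  # guard a corrupt length
        raw = bytes(region[HEADER.size:HEADER.size + length])
        info = raw.decode("utf-8", "replace")            # S500: read the record
        self.exception_log.append(info)                  # S600: write to the log
        return info


def slave_report(region, text):
    """Slave side of the simulation: write one record into the region."""
    data = text.encode("utf-8")[:len(region) - HEADER.size]
    HEADER.pack_into(region, 0, MAGIC, len(data))
    region[HEADER.size:HEADER.size + len(data)] = data


host = Host()
region = host.attach("slave-0")
slave_report(region, "kernel panic: example reason")
host.on_interrupt("slave-0")
```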
In this embodiment, it may also be defined that only a slave exception of a kernel error type requires the master to obtain an exception log, specifically, in this embodiment, the step S400: after the host receives the interrupt signal sent by the slave, the method also comprises the following steps:
the host computer analyzes the type of the received interrupt signal;
and if the type of the interrupt signal is a kernel error type, the host reads abnormal information from the shared memory corresponding to the slave.
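The type check above amounts to a simple dispatch on the interrupt type. The source does not say how the type is encoded, so the enumeration `InterruptType` and the function `handle_interrupt` below are hypothetical; the sketch only shows the control flow.

```python
from enum import Enum


class InterruptType(Enum):
    KERNEL_ERROR = 1  # the slave reports a kernel error
    OTHER = 2         # any other reason for interrupting the host


def handle_interrupt(irq_type, read_shared_memory):
    """Read the shared memory only for interrupts of the kernel error type."""
    if irq_type is InterruptType.KERNEL_ERROR:
        return read_shared_memory()
    return None


reads = []


def fake_read():
    reads.append("read")
    return "kernel panic: example reason"


first = handle_interrupt(InterruptType.OTHER, fake_read)
second = handle_interrupt(InterruptType.KERNEL_ERROR, fake_read)
```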
In this embodiment, the step S600: the host writes the read exception information into an exception log of the host, and the method comprises the following steps:
the host adds the identifier of the slave to the abnormal information, and then can determine which slave is abnormal when the abnormal information is checked at the host side;
the host starts a log collection process (syslog/rsyslog) at the kernel error (kernel) level on the host side and stores the abnormal information in the local log. The abnormal information of the slave can then be obtained later by looking up the log of the host, which shows whether the slave was abnormal and, if so, allows the cause of the abnormality at that time to be analyzed from the abnormal information.
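Tagging the record with the slave identifier before logging can be sketched with Python's standard `logging` module. This is an illustration only: the tag format, the logger name, and the `ListHandler` class (which keeps lines in memory so the example runs anywhere) are assumptions. On a real host, a handler that forwards to syslog, such as `logging.handlers.SysLogHandler`, could be attached instead.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted log lines in memory instead of sending them to syslog."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


def tag_with_slave(slave_id, info):
    """Prefix the exception information with the identifier of its slave."""
    return f"[slave={slave_id}] {info}"


logger = logging.getLogger("slave_exceptions")  # hypothetical logger name
logger.setLevel(logging.ERROR)
logger.propagate = False  # keep the example's output out of the root logger
handler = ListHandler()
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

logger.error(tag_with_slave("pcie-slave-1", "kernel panic: example reason"))
```

With several slaves writing to one host log, the prefix is what allows a reader of the log to tell which slave failed.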
In this embodiment, the step S100: when the slave machine is abnormal, before the slave machine acquires the abnormal information through a hook function in the kernel program, the method further comprises the following steps:
the slave machine registers a hook function in an operating system, and the hook function is configured to acquire the abnormal information of the slave machine when the slave machine is abnormal.
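Registering a hook in advance and invoking it when an abnormality occurs follows the pattern of a notifier chain. The source does not name the registration mechanism, so the sketch below is a user-space simulation with hypothetical names (`NotifierChain`, `capture_exception_info`); on Linux, the kernel's panic notifier chain is one mechanism of this general kind.

```python
class NotifierChain:
    """Ordered list of hooks that are all invoked when an event occurs."""

    def __init__(self):
        self._hooks = []

    def register(self, hook, priority=0):
        self._hooks.append((priority, hook))
        # Higher priority runs first; the sort is stable for equal priorities.
        self._hooks.sort(key=lambda item: -item[0])

    def call(self, event):
        return [hook(event) for _, hook in self._hooks]


captured = []


def capture_exception_info(event):
    """Hook registered by the slave: record what is known about the failure."""
    info = {"reason": event, "os_state": "panicking"}  # illustrative fields
    captured.append(info)
    return info


panic_chain = NotifierChain()
panic_chain.register(capture_exception_info)  # done once, before any failure

results = panic_chain.call("example panic reason")  # the abnormality occurs
```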
As shown in fig. 3, an embodiment of the present invention further provides an abnormal information acquiring system, which is applied to the abnormal information acquiring method, where the system includes a master M100 and a slave M200, and the master M100 and the slave M200 are configured to execute the following steps:
when the slave M200 is abnormal, the slave M200 acquires abnormal information through a hook function in a kernel program;
the slave M200 writes the abnormal information into the shared memory of the master M100 and the slave M200;
the slave M200 sends an interrupt signal to the master M100;
after receiving the interrupt signal sent by the slave M200, the master M100 reads the abnormal information from the shared memory corresponding to the slave M200.
By adopting the abnormal information acquisition system, when the slave machine fails, the slave machine can acquire the abnormal information through the hook function in the kernel program, then write the abnormal information into the shared memory, and send the interrupt signal to the host machine, and the host machine can read the abnormal information from the shared memory after receiving the interrupt signal, so that the problem that the abnormal information cannot be acquired and recorded when the kernel fails in the prior art is solved.
Further, in this embodiment, after the master M100 reads the exception information from the shared memory corresponding to the slave M200, the master M100 also writes the read exception information into the local exception log. Specifically, the master M100 adds the identifier of the slave M200 to the exception information, so that the abnormal slave can be identified when the exception information is checked on the master side. The master M100 then starts a log collection process (syslog/rsyslog) at the kernel error level on the master side and stores the exception information in the local log. The exception information of the slave M200 can then be obtained later by looking up the log of the master M100, which shows whether the slave M200 was abnormal and, if so, allows the cause of the abnormality at that time to be analyzed from the exception information.
In this embodiment, the master M100 is further configured to allocate a segment of shared memory to the slave M200 locally, and to write the physical address of the shared memory into a register of the slave M200. The size of the shared memory may be chosen as needed, for example 4 MB or 5 MB. The master M100 and the slaves M200 may be in a one-to-one or one-to-many relationship; when one master corresponds to a plurality of slaves M200, the memory allocated by the master to each slave M200 may be the same size or different sizes.
The slave M200 is further configured to obtain, through a register of the slave M200, the physical address of the shared memory allocated by the master M100, and to establish a mapping between the shared memory allocated by the master M100 and a local space. Here, the shared memory allocated by the master M100 refers to a memory space on the master side, and the local space refers to a local space on the slave side. The slave M200 maps the shared memory allocated by the master M100 to the local space by using the PCIe BAR outbound setting; this mapping is an outward mapping established by the slave.
In another embodiment, the shared memory may be located on the slave side, which is more beneficial to reduce the storage burden on the master side when the number of slaves is large. Specifically, the slave M200 is further configured to allocate a segment of memory as a shared memory in the local computer, and establish a mapping between the shared memory and the read address of the master, where the mapping is an inward mapping established by the slave.
The embodiment of the invention also provides abnormal information acquisition equipment, which comprises a processor; a memory having stored therein executable instructions of the processor; wherein the processor is configured to perform the steps of the exception information acquisition method via execution of the executable instructions.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 4. The electronic device 600 shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 4, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 that connects the various system components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
The storage unit stores program code executable by the processing unit 610 to cause the processing unit 610 to perform steps according to various exemplary embodiments of the present invention described in the above-mentioned abnormal information acquisition method section of the present specification. For example, the processing unit 610 may perform the steps as shown in fig. 1.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The memory unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 via the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
By adopting the abnormal information acquisition equipment provided by the invention, the processor executes the abnormal information acquisition method when executing the executable instruction, so that the beneficial effect of the abnormal information acquisition method can be obtained.
The embodiment of the present invention further provides a computer-readable storage medium, which is used for storing a program, and when the program is executed by a processor, the steps of the abnormal information acquiring method are implemented. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the present invention described in the above-mentioned abnormal information acquisition method section of this specification, when the program product is run on the terminal device.
Referring to fig. 5, a program product 800 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or cluster. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
By adopting the computer-readable storage medium provided by the present invention, the stored program realizes the steps of the abnormality information acquisition method when being executed, thereby obtaining the beneficial effects of the abnormality information acquisition method.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (10)

1. An abnormal information acquisition method is characterized by comprising the following steps:
when the slave machine is abnormal, the slave machine acquires abnormal information through a hook function in a kernel program;
the slave machine writes the abnormal information into a shared memory of the host machine and the slave machine;
and the host is configured to read abnormal information from the shared memory corresponding to the slave after receiving the interrupt signal sent by the slave.
2. The method according to claim 1, wherein before the slave writes the abnormal information into the shared memory of the host and the slave, the method further comprises:
the host machine distributes a section of memory for the slave machine on the local machine as a shared memory, and writes the physical address of the memory into the register of the slave machine;
the slave machine acquires the physical address of the shared memory allocated by the host machine through a register of the slave machine, and establishes mapping between the shared memory allocated by the host machine and a local machine space; or,
the slave machine distributes a section of memory in the local machine as a shared memory and establishes mapping between the shared memory and the read address of the host machine.
3. The method according to claim 1, wherein before the slave sends an interrupt signal to the host, the method further comprises:
the host assigns an MSI interrupt number to the slave.
4. The method according to claim 1, wherein after the slave sends an interrupt signal to the host, the method further comprises:
the host receives the interrupt signal sent by the slave;
the host reads abnormal information from the shared memory corresponding to the slave;
and the host writes the read exception information into an exception log of the host.
5. The method according to claim 4, wherein after the host receives the interrupt signal sent by the slave, the method further comprises:
the host computer analyzes the type of the received interrupt signal;
and if the type of the interrupt signal is a kernel error type, the host reads abnormal information from the shared memory corresponding to the slave.
6. The method according to claim 5, wherein the host writing the read abnormal information into the exception log of the host comprises:
the host adds the identifier of the slave to the abnormal information;
and the host initiates a log collection process according to the kernel error level of the host side, and writes the abnormal information into an abnormal log of the host.
7. The method according to claim 1, wherein before the slave acquires the abnormal information through the hook function in the kernel program when the slave is abnormal, the method further comprises:
the slave machine registers a hook function in an operating system, and the hook function is configured to acquire the abnormal information of the slave machine when the slave machine is abnormal.
8. An abnormality information acquisition system applied to the abnormality information acquisition method according to any one of claims 1 to 7, the system comprising a host and a slave, the host and the slave being configured to execute the steps of:
when the slave machine is abnormal, the slave machine acquires abnormal information through a hook function in a kernel program;
the slave machine writes the abnormal information into a shared memory of the host machine and the slave machine;
the slave computer sends an interrupt signal to the host computer;
and after receiving the interrupt signal sent by the slave, the host reads the abnormal information from the shared memory corresponding to the slave.
9. An abnormality information acquisition apparatus characterized by comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the abnormality information acquisition method of any one of claims 1-7 via execution of the executable instructions.
10. A computer-readable storage medium storing a program, wherein the program, when executed by a processor, implements the steps of the abnormality information acquisition method according to any one of claims 1 to 7.
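The flow recited in claims 1 and 4 to 7 can be illustrated with a minimal user-space sketch. It is not the implementation described in the patent: a `bytearray` stands in for the shared memory, a direct method call stands in for the MSI interrupt, and a Python list stands in for the host exception log. The class names `Host` and `Slave`, the length-prefixed record layout, and the `KERNEL_ERROR` type code are all assumptions made for illustration.

```python
import struct

# Hypothetical record layout: 4-byte little-endian length, then UTF-8 text.
HEADER = struct.Struct("<I")
KERNEL_ERROR = 1  # hypothetical interrupt type code


class Host:
    def __init__(self):
        self.shared = {}  # slave id -> bytearray standing in for shared memory
        self.log = []     # stands in for the host exception log

    def allocate(self, slave_id, size=256):
        """Allocate a per-slave region (first alternative of claim 2)."""
        self.shared[slave_id] = bytearray(size)
        return self.shared[slave_id]

    def on_interrupt(self, slave_id, irq_type):
        """Check the interrupt type, then read, tag, and log (claims 4 to 6)."""
        if irq_type != KERNEL_ERROR:
            return
        region = self.shared[slave_id]
        (length,) = HEADER.unpack_from(region, 0)
        raw = bytes(region[HEADER.size:HEADER.size + length])
        text = raw.decode("utf-8", errors="replace")
        # Add the slave identifier before writing to the log (claim 6).
        self.log.append(f"[slave {slave_id}] {text}")


class Slave:
    def __init__(self, slave_id, host):
        self.slave_id = slave_id
        self.host = host
        self.region = host.allocate(slave_id)
        self.hooks = []

    def register_hook(self, hook):
        """Register a hook that collects abnormal information (claim 7)."""
        self.hooks.append(hook)

    def on_abnormal(self, reason):
        """Run the hooks, write to shared memory, then signal the host."""
        info = "; ".join(hook(reason) for hook in self.hooks)
        # Truncate so the record always fits in the region.
        payload = info.encode("utf-8")[: len(self.region) - HEADER.size]
        HEADER.pack_into(self.region, 0, len(payload))
        self.region[HEADER.size:HEADER.size + len(payload)] = payload
        # A direct call stands in for the interrupt signal.
        self.host.on_interrupt(self.slave_id, KERNEL_ERROR)


host = Host()
slave = Slave(3, host)
slave.register_hook(lambda reason: f"kernel error: {reason}")
slave.on_abnormal("null pointer dereference")
print(host.log[0])
```

Running the sketch prints `[slave 3] kernel error: null pointer dereference`. In this simplified model the slave writes the record before signalling, so the host never reads a partially written region; a real system would additionally need memory barriers or a comparable ordering guarantee between the write and the interrupt.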
CN202110728525.7A 2021-06-29 2021-06-29 Abnormal information acquisition method, system, device and storage medium Pending CN113434324A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110728525.7A CN113434324A (en) 2021-06-29 2021-06-29 Abnormal information acquisition method, system, device and storage medium

Publications (1)

Publication Number Publication Date
CN113434324A true CN113434324A (en) 2021-09-24

Family

ID=77757698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110728525.7A Pending CN113434324A (en) 2021-06-29 2021-06-29 Abnormal information acquisition method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN113434324A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115309617A (en) * 2022-08-08 2022-11-08 科东(广州)软件科技有限公司 Desktop display method of background operating system information, heterogeneous system and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0512235A (en) * 1991-07-08 1993-01-22 Tokyo Electric Co Ltd Electronic equipment
JP2006338445A (en) * 2005-06-03 2006-12-14 Matsushita Electric Ind Co Ltd Abnormality information storage apparatus
CN102469474A (en) * 2010-11-15 2012-05-23 中兴通讯股份有限公司 Method and device for processing abnormal information of communication equipment
CN105204977A (en) * 2014-06-30 2015-12-30 中兴通讯股份有限公司 System exception capturing method, main system, shadow system and intelligent equipment
CN111274059A (en) * 2020-01-21 2020-06-12 浙江大华技术股份有限公司 Software exception handling method and device for slave equipment
CN111694684A (en) * 2019-03-15 2020-09-22 百度在线网络技术(北京)有限公司 Abnormal construction method and device of storage equipment, electronic equipment and storage medium

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination