WO2024016864A1 - Processor, method for obtaining information, single board, and network device - Google Patents
- Publication number
- WO2024016864A1 (application PCT/CN2023/098211)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reset
- processor
- register
- control module
- module
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0721—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
- G06F11/0754—Error or fault detection not based on redundancy by exceeding limits
- G06F11/0757—Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C29/00—Checking stores for correct operation ; Subsequent repair; Testing stores during standby or offline operation
- G11C29/52—Protection of memory contents; Detection of errors in memory contents
Definitions
- The present application relates to the field of computer technology, and in particular to a processor, a method for obtaining information, a single board, and a network device.
- Normal operation of the processor is an important factor in whether a computer system can perform computation and control normally. Therefore, when the processor runs abnormally, relevant information about the processor must be obtained in order to analyze the cause of the abnormal operation.
- This application provides a processor, a method for obtaining information, a single board, and a network device, which are used to obtain relevant information about the processor so that the cause of abnormal operation of the processor can be analyzed.
- In a first aspect, a processor is provided. The processor includes a control module, a first register, and a cache.
- The processor is communicatively connected to a storage medium that is not lost upon reset, that is, a storage medium whose contents survive a reset of the processor.
- The control module is configured to obtain a reset indication generated by abnormal operation of the processor; to obtain relevant information of the processor based on the reset indication, where the relevant information includes at least one of register information of the first register or data stored in the cache; and to store the relevant information in the storage medium that is not lost upon reset.
- The control module inside the processor can obtain relevant information about the processor based on the reset indication.
- The processor therefore does not need to rely on external modules to obtain relevant information about the processor, nor does it need to rely on the operating system (OS) to respond to interrupt signals and execute interrupt response programs.
- As a result, the way in which the processor obtains relevant information is more reliable.
- When the control module obtains both the register information of the first register and the data stored in the cache, the relevant information obtained is relatively comprehensive. Therefore, when the cause of abnormal operation of the processor is analyzed based on the relevant information, the analyzed cause is relatively accurate.
- In addition, because the control module can be a hardware module, it can quickly obtain the relevant information of the processor after obtaining the reset indication, so the relevant information is obtained more efficiently.
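The flow described for the first aspect can be sketched in software as follows. This is only an illustrative model, not the hardware design of the application: the class names `ControlModule` and `PersistentStore`, the method `on_reset_indication`, and the sample register values are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentStore:
    """Hypothetical stand-in for the storage medium that is not lost upon reset."""
    records: list = field(default_factory=list)


@dataclass
class ControlModule:
    """Hypothetical software model of the control module."""
    first_register: dict   # register information held by the first register
    cache: bytes           # data currently stored in the cache
    store: PersistentStore

    def on_reset_indication(self) -> dict:
        """Collect the relevant information and store it in the persistent medium."""
        info = {
            # Copies are taken so that later changes do not alter the record.
            "registers": dict(self.first_register),
            "cache": bytes(self.cache),
        }
        self.store.records.append(info)
        return info


# Made-up values, for illustration only.
store = PersistentStore()
control = ControlModule({"PC": 0x80001234, "SP": 0x20000FF0}, b"\xde\xad\xbe\xef", store)
snapshot = control.on_reset_indication()
```

In this sketch the information is gathered by the control module itself, without an interrupt handler or an external module, which mirrors the reliability argument made above.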
- The processor further includes a second register.
- The second register is a register used when the processor is running. The first register is used to record the register information of the second register and is a register whose contents are not lost during reset. The reset indication is used to instruct the processor to reset.
- The control module is configured to instruct the first register, based on the reset indication, to stop recording the register information of the second register, and to obtain the register information of the first register after the processor has been reset based on the reset indication.
- The second register includes at least one of a program counter (PC), a stack pointer (SP), a frame pointer (FP), a control register (CR), or a link register (LR).
- The types of second register are varied and flexible, so the processor can retain register information from multiple types of registers.
- The processor has a reset pin, which is used to generate the reset indication and transmit it to the control module. Providing a reset pin for this purpose makes it easier for the processor to generate the reset indication.
- The control module is communicatively connected to a reset module, and the reset module is configured to send the reset indication to the control module.
- The control module can thus also obtain the reset indication by receiving it from the reset module.
- The ways in which the processor can obtain the reset indication are therefore relatively flexible.
- The storage medium that is not lost upon reset includes at least one of: a memory inside the processor that is not lost upon reset, a non-volatile storage medium inside the processor, a memory outside the processor that is not lost upon reset, or a non-volatile storage medium outside the processor.
- The type of storage medium that is not lost upon reset is relatively flexible. Therefore, the architecture of the processor can be more flexible and diverse.
- The control module can store relevant information about the processor in the storage medium that is not lost upon reset.
- The control module is also configured to obtain the relevant information from the storage medium that is not lost upon reset after the processor is reset, and to generate a running exception record based on the relevant information. The running exception record makes it possible to find the cause of the abnormal operation of the processor. Moreover, when the relevant information includes both the register information of the first register and the data stored in the cache, the cause obtained by analysis is relatively accurate.
- In a second aspect, a method for obtaining information is provided.
- The method is applied to a processor.
- The processor includes a control module, a first register, and a cache.
- The processor is communicatively connected to a storage medium that is not lost upon reset.
- The method includes: the control module obtains a reset indication generated by abnormal operation of the processor; obtains relevant information of the processor based on the reset indication, where the relevant information includes at least one of register information of the first register or data stored in the cache; and stores the relevant information in the storage medium that is not lost upon reset.
- The processor further includes a second register.
- The second register is a register used when the processor is running.
- The first register is used to record the register information of the second register.
- The first register is a register whose contents are not lost upon reset.
- The reset indication is used to instruct the processor to reset. Obtaining the relevant information of the processor based on the reset indication includes: instructing the first register, based on the reset indication, to stop recording the register information of the second register; and, after the processor has been reset based on the reset indication, obtaining the register information of the first register.
- The second register includes at least one of PC, SP, FP, CR, or LR.
- The processor has a reset pin. The method further includes: the processor generates the reset indication through the reset pin, and the reset pin transmits the reset indication to the control module. Obtaining the reset indication includes: receiving the reset indication transmitted by the reset pin.
- The control module is communicatively connected to a reset module, and the reset module is configured to send the reset indication to the control module. Obtaining the reset indication includes: receiving the reset indication sent by the reset module.
- The storage medium that is not lost upon reset includes at least one of: a memory inside the processor that is not lost upon reset, a non-volatile storage medium inside the processor, a memory outside the processor that is not lost upon reset, or a non-volatile storage medium outside the processor.
- The method further includes: after the processor is reset, the control module obtains the relevant information from the storage medium that is not lost upon reset, and generates a running exception record based on the relevant information.
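The last step of the method, generating a running exception record from the saved information, might look like the sketch below. The application does not specify a record format, so the text layout, the function name `build_exception_record`, and the sample values are assumptions made purely for illustration.

```python
def build_exception_record(info: dict, cache_preview: int = 8) -> str:
    """Format saved register information and cache data as a readable record.

    `info` is assumed to be a dict with a "registers" mapping (name -> integer)
    and a "cache" bytes object, as read back from the persistent medium.
    """
    lines = ["running exception record"]
    for name, value in sorted(info.get("registers", {}).items()):
        lines.append(f"  {name} = 0x{value:08X}")
    cache = info.get("cache", b"")
    lines.append(f"  cache bytes = {len(cache)}")
    if cache:
        # Only the first few bytes are shown to keep the record short.
        lines.append(f"  cache head  = {cache[:cache_preview].hex()}")
    return "\n".join(lines)


# Made-up data standing in for what was stored before the reset.
saved = {"registers": {"PC": 0x80001234, "SP": 0x20000FF0}, "cache": bytes(range(16))}
record = build_exception_record(saved)
print(record)
```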
- In a third aspect, a single board is provided. The single board includes the processor according to any implementation of the first aspect, and a storage medium that is communicatively connected to the processor and is not lost upon reset.
- The single board further includes a reset module.
- The reset module is communicatively connected to the control module in the processor.
- The reset module is configured to send the reset indication to the control module.
- In a fourth aspect, a network device is provided. The network device includes at least one processor according to any implementation of the first aspect, and a storage medium that is communicatively connected to the processor and is not lost upon reset.
- The network device further includes a reset module.
- The reset module is communicatively connected to the control module in the processor.
- The reset module is configured to send the reset indication to the control module.
- In a fifth aspect, a network device is provided, which includes at least one single board according to any implementation of the third aspect.
- In a sixth aspect, a chip is provided. The chip includes at least one processor according to any implementation of the first aspect, and a storage medium that is communicatively connected to the processor and is not lost upon reset.
- The chip further includes a reset module, which is communicatively connected to the control module in the processor and is configured to send the reset indication to the control module.
- The chip also includes an input interface, an output interface, and a memory.
- The memory includes the above-mentioned storage medium that is not lost upon reset.
- The input interface, the output interface, the processor, and the memory are connected through internal connection paths.
- Figure 1 is a schematic structural diagram of a processor provided by an embodiment of the present application.
- Figure 2 is a schematic structural diagram of another processor provided by an embodiment of the present application.
- Figure 3 is a schematic structural diagram of another processor provided by an embodiment of the present application.
- Figure 4 is a flow chart of a method for obtaining information provided by an embodiment of the present application.
- Figure 5 is a schematic diagram of a first register recording register information of a second register provided by an embodiment of the present application.
- Figure 6 is a schematic diagram of a process for obtaining relevant information provided by an embodiment of the present application.
- Figure 7 is a schematic structural diagram of a network device provided by an embodiment of the present application.
- Figure 8 is a schematic structural diagram of another network device provided by an embodiment of the present application.
- Figure 9 is a schematic structural diagram of another network device provided by an embodiment of the present application.
- Figure 10 is a schematic structural diagram of yet another network device provided by an embodiment of the present application.
- Processors are used in computer systems in various forms. For example, processors with multiple cores are widely used in devices such as servers and terminals.
- A computer system may also use a processor system composed of multiple processors. The processor system provides stronger computing performance, data processing performance, artificial intelligence inference performance, artificial intelligence training performance, and so on.
- A processor hang can refer to a situation in which the processor cannot execute a program normally because of interrupt nesting, a runaway program, or a program stuck in an infinite loop (deadloop).
- The information related to the processor may include information related to the moment when the processor runs abnormally.
- The processor-related information can also be called on-site information about the running abnormality. Some running abnormalities cause the processor to restart, which flushes the on-site information. In that case, approaches such as generating logs or alarm information, or obtaining error records stored in memory, cannot be used to analyze the specific cause of the abnormal operation, because the on-site information is no longer available.
- Therefore, when a processor runs abnormally, how to effectively obtain relevant information about the processor in order to analyze the specific cause of the abnormal operation is an urgent problem to be solved.
- Related technology 1 proposes a method of obtaining processor-related information based on a watchdog chip.
- The watchdog chip is connected to an input/output (I/O) pin of the processor, and the processor periodically sends the watchdog chip, through the I/O pin, an input signal that toggles between high and low levels. If the processor is running abnormally, it cannot send the input signal to the watchdog chip. In this case, the watchdog chip sends a reset signal to the controller. After receiving the reset signal, the controller first sends an interrupt signal to the processor. The OS running on the processor responds to the interrupt signal and runs an interrupt response program to collect the register information related to the abnormal operation of the processor. A period of time after sending the interrupt signal, the controller sends a reset signal to the processor, triggering the processor, such as a central processing unit (CPU), to reset.
- Related technology 2 provides a method of obtaining processor-related information based on a coprocessor.
- The processor is connected to a built-in or external coprocessor through a bus.
- When the coprocessor senses that the processor is running abnormally, the coprocessor actively accesses the processor through the bus to obtain the register information related to the abnormal operation of the processor.
- Related technology 1 relies on reliable operation of the watchdog chip and the controller, and on the OS running on the processor being able to respond to interrupt signals. If the watchdog chip or the controller operates with low reliability, or an abnormality occurs when the processor executes the interrupt response program, the reliability of related technology 1 is low.
- Related technology 2 relies on a reliable connection between the coprocessor and the processor. When the reliability of the bus connection is low, the reliability of related technology 2 is low. Furthermore, related technology 1 and related technology 2 obtain only register information, so the information available for analyzing the cause of abnormal operation of the processor is relatively limited, and the accuracy of the analyzed cause is low.
- FIG. 1 is a schematic structural diagram of a processor provided by an embodiment of the present application.
- The processor includes a control module 101, a first register 102, and a cache 103.
- The control module 101, the first register 102, and the cache 103 are communicatively connected, for example through communication wiring in the processor.
- The control module 101 is configured to obtain a reset indication, which is an indication generated by abnormal operation of the processor.
- The control module 101 is also configured to obtain relevant information of the processor based on the reset indication, where the relevant information includes at least one of register information of the first register 102 or data stored in the cache 103.
- The processor is communicatively connected to a storage medium that is not lost upon reset, and the control module 101 is also configured to store the relevant information of the processor in the storage medium that is not lost upon reset.
- The situations in which the control module 101 obtains the reset indication include, but are not limited to, situation one and situation two.
- Situation one: the processor has a reset pin, which is used to generate the reset indication and transmit it to the control module 101.
- The control module 101 can obtain the reset indication by receiving the reset indication transmitted by the reset pin.
- The reset pin may be a hard reset pin or a soft reset pin, which is not limited in the embodiments of the present application.
- The reset pin can be used to generate the reset indication based on a low-level signal.
- The processor is installed on a single board. When the single board senses that the processor is running abnormally, the single board pulls the reset pin low, causing the processor to generate the reset indication through the reset pin based on the low-level signal. With the reset pin, the processor can respond quickly to the low-level signal, which improves the efficiency of generating the reset indication.
- The single board may be any circuit component that includes a processor.
- For example, the single board may be a circuit component including a processor, a resistor, and a capacitor.
- The embodiments of the present application do not limit the way in which the single board senses abnormal operation of the processor.
- The single board is also equipped with a watchdog chip, which is used to detect abnormal operation of the processor.
- The processor periodically sends input signals to the watchdog chip. When the processor runs abnormally, it stops sending input signals to the watchdog chip.
- When the watchdog chip does not receive the input signal, it generates an output signal, which indicates that the processor is running abnormally.
- Situation two: the control module 101 is communicatively connected to a reset module, and the reset module is used to send the reset indication to the control module 101.
- The control module 101 can obtain the reset indication by receiving the reset indication sent by the reset module.
- The reset module can be internal to the processor or external to the processor. In the embodiments of the present application, the location of the reset module is relatively flexible.
- The reset module is any control logic or control circuit inside the processor, or any control logic or control circuit external to the processor.
- The control logic may also be a software program executed on a software module in the processor. Regardless of whether the reset module is located inside or outside the processor, the reset module can be used to generate the reset indication based on a first signal and send the reset indication to the control module 101.
- The reset module is communicatively connected with the watchdog chip, and the watchdog chip is also communicatively connected with the processor.
- The processor periodically sends an input signal to the watchdog chip at intervals of a first duration.
- The first duration can be set based on experience or actual needs, which is not limited in the embodiments of the present application.
- When the processor runs abnormally, it stops sending input signals to the watchdog chip, or the interval between two input signals sent by the processor becomes greater than the first duration.
- If the watchdog chip does not receive the next input signal within the first duration after receiving an input signal, it generates an output signal and sends the output signal to the reset module as the first signal, so that the reset module generates the reset indication based on the first signal and sends the reset indication to the control module 101.
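The timeout behaviour of the watchdog chip and its hand-off to the reset module can be modelled roughly as follows. This is a simplified sketch with explicit timestamps instead of real hardware timers; the class names `Watchdog` and `ResetModule`, the function `poll`, and the 1.0-second first duration are assumptions chosen only for the example.

```python
class Watchdog:
    """Hypothetical model of the watchdog chip."""

    def __init__(self, first_duration: float):
        self.first_duration = first_duration
        self.last_input = None  # time of the most recent input signal

    def feed(self, now: float) -> None:
        """Record an input signal from the processor at time `now`."""
        self.last_input = now

    def expired(self, now: float) -> bool:
        """True if no input signal arrived within the first duration."""
        if self.last_input is None:
            return False
        return now - self.last_input > self.first_duration


class ResetModule:
    """Hypothetical model of the reset module; counts indications sent."""

    def __init__(self):
        self.indications_sent = 0

    def on_first_signal(self) -> None:
        # In the described design this would send the reset indication
        # to the control module; here it is only counted.
        self.indications_sent += 1


def poll(watchdog: Watchdog, reset_module: ResetModule, now: float) -> bool:
    """Forward the watchdog output to the reset module as the first signal."""
    if watchdog.expired(now):
        reset_module.on_first_signal()
        return True
    return False


wd = Watchdog(first_duration=1.0)
rm = ResetModule()
wd.feed(0.0)
on_time = poll(wd, rm, 0.5)   # input signal still fresh
wd.feed(0.9)
timed_out = poll(wd, rm, 2.5)  # 1.6 s since the last input signal
```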
- The processor further includes a second register, and the second register is a register used when the processor is running.
- The first register 102 is used to record the register information of the second register.
- The first register 102 is a register whose contents are not lost during reset.
- The reset indication obtained by the control module 101 is used to instruct the processor to reset.
- The control module 101 is used to instruct the first register 102, based on the reset indication, to stop recording the register information of the second register, and to obtain the register information of the first register 102 after the processor has been reset based on the reset indication, so as to obtain the relevant information of the processor.
- The embodiments of the present application do not limit the number of first registers 102 or second registers.
- For example, the processor includes multiple first registers 102, and each first register 102 is used to record the register information of one or more second registers.
- The first register 102 can be used to record the register information of the second register in real time. That is to say, while the processor is running a program, every time the register information of the second register changes, the first register 102 corresponding to that second register records the changed register information.
- For example, the second register is the PC, and the register information of the PC includes the PC pointer. While the processor is running a program, each time the PC pointer points to a new instruction, the first register 102 corresponding to the PC records the changed PC pointer.
- The function of the first register 102 to record the register information of the second register in real time may be called the real-time backup recording function of the first register 102.
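The real-time backup recording function, together with the instruction to stop recording, can be illustrated with a small model. The class name `ShadowRegister` and its methods are hypothetical and are not terms used by the application; the PC values are made up.

```python
class ShadowRegister:
    """Hypothetical model of a first register that mirrors a second register."""

    def __init__(self):
        self.value = None
        self.recording = True

    def observe(self, new_value: int) -> None:
        """Called whenever the second register changes."""
        if self.recording:
            self.value = new_value

    def stop_recording(self) -> None:
        """Called by the control module when it obtains the reset indication."""
        self.recording = False


pc_backup = ShadowRegister()
for pc in (0x1000, 0x1004, 0x1008):
    pc_backup.observe(pc)       # real-time backup while the program runs

pc_backup.stop_recording()      # reset indication obtained
pc_backup.observe(0x0000)       # the reset changes the PC, but the backup is frozen
```

Freezing the backup before the reset takes effect is what lets the value from the moment of the abnormality survive to be read afterwards.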
- The reset indication may be used to instruct the processor to reset after a reference time period.
- The reference time period can be set based on experience or timing requirements. For example, when the register information of the first register is obtained as the relevant information of the processor, the reference time period is greater than or equal to the time required to instruct the first register 102 to stop recording the register information of the second register. When the data stored in the cache 103 is obtained as the relevant information of the processor, the reference time period is greater than or equal to the time required for the control module 101 to obtain the data stored in the cache 103.
- The reference time period can also be greater than or equal to a second duration, where the second duration is the sum of the time required for the control module 101 to obtain the data stored in the cache 103 and the time required for the control module 101 to store that data in the storage medium that is not lost upon reset.
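The constraints on the reference time period can be combined into a single lower bound, as in the sketch below. The function name `min_reference_period_ns` and all timing values are assumptions for illustration; for the cache case the sketch uses the stricter second-duration bound (read time plus store time).

```python
def min_reference_period_ns(stop_recording_ns: int,
                            cache_read_ns: int,
                            store_write_ns: int,
                            want_registers: bool = True,
                            want_cache: bool = True) -> int:
    """Smallest reference time period that satisfies the applicable bounds."""
    bounds = []
    if want_registers:
        # Time needed to instruct the first register to stop recording.
        bounds.append(stop_recording_ns)
    if want_cache:
        # Second duration: read the cache data, then store it persistently.
        bounds.append(cache_read_ns + store_write_ns)
    return max(bounds, default=0)


# Made-up timings in nanoseconds.
both = min_reference_period_ns(50, 2000, 3000)
registers_only = min_reference_period_ns(50, 2000, 3000, want_cache=False)
```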
- The second register in the processor includes at least one of PC, SP, FP, CR, or LR.
- The register information of the PC includes, but is not limited to, at least one PC pointer.
- The at least one PC pointer may include the PC pointer of the currently running target instruction, the PC pointers of the A instructions immediately preceding the target instruction, and the PC pointers of the B instructions immediately following the target instruction, where A and B are both positive integers.
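As an illustration of the A-before and B-after window around the target instruction, the sketch below selects the corresponding PC pointers from a list. It assumes, for the sake of the example, that the PC pointers are available as a list in instruction order; the function name `pc_window` and the addresses are made up.

```python
def pc_window(pointers: list, target_index: int, a: int, b: int) -> dict:
    """Return the target PC pointer with up to `a` preceding and `b` following ones."""
    start = max(0, target_index - a)
    end = min(len(pointers), target_index + b + 1)
    return {
        "previous": pointers[start:target_index],
        "target": pointers[target_index],
        "next": pointers[target_index + 1:end],
    }


# Made-up PC pointers, 4 bytes apart.
pointers = [0x1000, 0x1004, 0x1008, 0x100C, 0x1010, 0x1014]
window = pc_window(pointers, target_index=3, a=2, b=1)
```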
- The register information of the SP includes, but is not limited to, the call stack.
- The register information of the FP includes, but is not limited to, the call frame.
- The register information of the CR includes, but is not limited to, the system control flags that control the operating mode and state of the processor, and the system control flags that cause page faults.
- The register information of the LR includes, but is not limited to, the difference between the PC value at the time the running exception occurred and a reference value.
- The reference value can be set based on experience or actual needs.
- The embodiments of the present application do not limit the size of the reference value.
- The first register 102 used to record the register information of the PC can be called a backup program counter (backup PC, BPC); the first register 102 used to record the register information of the SP can be called a backup stack pointer (backup SP, BSP); the first register 102 used to record the register information of the FP can be called a backup frame pointer (backup FP, BFP); the first register 102 used to record the register information of the CR can be called a backup control register (backup CR, BCR); and the first register 102 used to record the register information of the LR can be called a backup link register (backup LR, BLR).
- The first register 102 may have multiple names. For example, if a first register 102 is used to record both the register information of the PC and the register information of the SP, that first register 102 can be called either the BPC or the BSP.
- The second registers mentioned above are intended to illustrate the types of registers used when the processor is running.
- Other types of registers used when the processor is running can also serve as second registers.
- In that case, the processor further includes first registers 102 corresponding to those other types of second registers, which record their register information.
- the storage medium that is communicatively connected to the processor and is not lost by reset includes, but is not limited to, the memory that is not lost by reset inside the processor, the non-volatile storage medium inside the processor, the memory that is not lost by reset outside the processor, or At least one of the non-volatile storage media external to the processor.
- the storage medium that is not lost upon reset is the storage medium 104 that is not lost upon reset and is located inside the processor.
- the storage medium 104 that is not lost upon reset is communicatively connected with other modules in the processor through the communication wiring inside the processor, thereby realizing the communication connection with the processor.
- the storage medium 104 that is not lost upon reset is communicatively connected to the control module 101, the first register 102 and the cache 103 through the communication wiring inside the processor.
- the storage medium 104 that is not lost upon reset includes, but is not limited to, memory that is not lost upon reset and non-volatile storage media.
- Memory that is not lost after reset includes but is not limited to static random-access memory (SRAM).
- Non-volatile storage media include but are not limited to at least one of double data rate synchronous dynamic random-access memory (DDR SDRAM), a flash card, a secure digital (SD) memory card, a serial advanced technology attachment (SATA) card, or a universal serial bus (USB) card.
- the storage medium that is not lost upon reset is the storage medium 105 that is not lost upon reset and is located outside the processor.
- the storage medium 105 that is not lost upon reset includes, but is not limited to, a memory that is not lost upon reset and a non-volatile storage medium.
- the principle of memory that is not lost after reset is the same as the memory that is not lost after reset explained above, and will not be described again here.
- Non-volatile storage media includes, but is not limited to, at least one of DDR SDRAM external to the processor, flash memory card, SD card, SATA card, or Universal Serial Bus card.
- the storage medium 105 that is not lost during reset may also include storage media in other processors or single boards.
- the processor is communicatively connected to other processors or single boards through channels such as Ethernet, so that the processor is communicatively connected to the storage media in those processors or single boards. In this way, the storage medium 105 that is not lost upon reset can include the storage media in other processors or single boards.
- Storage media in other processors or single boards include but are not limited to at least one of volatile storage media or non-volatile storage media.
- control module 101 is also configured to obtain relevant information of the processor from a storage medium that is not lost during reset after the processor is reset, and generate a running exception record based on the relevant information.
- the control module 101 is also configured to obtain the register information of the first register 102 from the storage medium after the processor is reset, and to generate a running exception record based on the register information of the first register 102.
- the control module 101 is also configured to obtain the data from the storage medium after the processor is reset, and generate a running exception record based on the data.
- the control module 101 may include multiple control sub-modules, and the functions of the control module 101 are implemented by multiple control sub-modules.
- the control module 101 includes a first control sub-module, a second control sub-module and a third control sub-module.
- the first control submodule is configured to obtain the register information of the first register 102 based on the reset instruction, and store the register information of the first register 102 in a storage medium that is not lost after reset.
- the second control submodule is used to obtain the data stored in the cache 103 based on the reset instruction, and store the data stored in the cache 103 in a storage medium that is not lost after reset.
- the third control submodule is used to obtain processor-related information from a storage medium that is not lost upon reset, and generate a running exception record based on the related information.
- the first control sub-module and the second control sub-module are both hardware modules, and the third control sub-module is a software module.
- the first control sub-module and the second control sub-module may be implemented in the same hardware module, or may be implemented in different hardware modules respectively, which is not limited in the embodiments of the present application.
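The division of work among the three control sub-modules can be sketched as follows. This is a minimal software model under stated assumptions: the storage medium that is not lost upon reset is modeled as a plain `dict`, and the function names (`first_submodule`, `second_submodule`, `third_submodule`, `handle_reset`) and the record layout are hypothetical, not taken from the application.

```python
def first_submodule(first_registers: dict, store: dict) -> None:
    """Hardware role: save backup-register contents to the persistent store."""
    store["registers"] = dict(first_registers)


def second_submodule(cache: bytes, store: dict) -> None:
    """Hardware role: save the cache contents to the persistent store."""
    store["cache"] = bytes(cache)


def handle_reset(first_registers: dict, cache: bytes, store: dict) -> None:
    """Run both capture steps in response to a reset indication."""
    first_submodule(first_registers, store)
    second_submodule(cache, store)


def third_submodule(store: dict) -> dict:
    """Software role: after reset, build a running exception record."""
    return {
        "registers": store.get("registers", {}),
        "cache": store.get("cache", b""),
    }
```

Because the store outlives the reset in this model, `third_submodule` can run after restart and still see what the first two sub-modules captured.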
- the control module 101 inside the processor can obtain relevant information of the processor based on the reset instruction.
- the processor does not need to rely on external modules to obtain processor-related information, nor does it need to rely on the OS to respond to interrupt signals to execute interrupt response programs.
- the method by which the processor obtains relevant information is more reliable.
- when the control module 101 obtains both the register information of the first register 102 and the data stored in the cache 103, the relevant information obtained is relatively comprehensive. Therefore, when the cause of abnormal operation of the processor is analyzed based on the relevant information, the accuracy of the analyzed cause is relatively high.
- when the control module is a hardware module, it can quickly obtain the relevant information of the processor after obtaining the reset instruction, so that the efficiency of obtaining the relevant information is relatively high.
- the processor provided by the embodiment of the present application may be a CPU, and the processor may also include other modules besides the above-mentioned modules.
- Figure 3 is a schematic structural diagram of another processor provided by an embodiment of the present application. As shown in Figure 3, the CPU includes at least one CPU core, and the control module 101, the first register 102 and the second register are included in each CPU core.
- the cache 103 is composed of multiple levels of cache, as shown in Figure 3.
- the multiple levels of cache include level (L) 1 cache, L2 cache, L3 cache and L4 cache.
- the L1 cache and L2 cache are within each CPU core, and the L3 cache and L4 cache are outside each CPU core.
- the processor may also include at least one of: an internal cache (buffer) or SRAM, a hardware acceleration engine for implementing hardware acceleration, a DDR controller for controlling the DDR SDRAM, a non-volatile memory interface for communicating with the non-volatile storage medium, a logic interface for communicating with control logic, or low-speed I/O or general-purpose I/O for communicating with the watchdog chip.
- Various modules in the processor can be connected by communication wiring in the processor. Each module in the CPU core is connected through communication wiring in the CPU core, and each module in the CPU is connected through communication wiring in the CPU.
- the embodiment of the present application also provides a method for obtaining information.
- the method is applied to the processor shown in the above embodiment. As shown in Figure 4, the method includes but is not limited to S401-S403.
- the control module obtains a reset instruction, which is an instruction generated by abnormal operation of the processor.
- the processor has a reset pin
- the method further includes: the processor generates a reset indication through the reset pin, and the reset pin transmits the reset indication to the control module.
- obtaining the reset indication includes: receiving the reset indication.
- the reset pin provided by the processor may be a hard reset pin or a soft reset pin.
- a reset indication can be generated based on the reset pin.
- the single board can sense that the processor is running abnormally.
- when the single board senses that the processor is running abnormally, the single board lowers the level of the reset pin, so that a low-level signal is input to the reset pin.
- the control module is communicatively connected with the reset module, and the reset module is used to send a reset instruction to the control module.
- obtaining the reset indication includes: receiving the reset indication sent by the reset module.
- the reset module may be located inside the processor or outside the processor.
- the reset module can also communicate with the watchdog chip, and the watchdog chip can also communicate with the processor.
- the processor periodically sends an input signal to the watchdog chip according to a first duration. In the event that the processor runs abnormally, the processor will stop sending input signals to the watchdog chip or the time interval between two input signals sent by the processor will be greater than the first duration.
- if the watchdog chip does not receive the next input signal within the first duration after receiving an input signal, it generates an output signal and sends the output signal to the reset module as the first signal.
- the reset module generates a reset instruction based on the first signal and sends the reset instruction to the control module.
- the control module receives the reset instruction sent by the reset module to obtain the reset instruction.
- the embodiment of the present application does not limit the location of the watchdog chip.
- the watchdog chip can be installed in the processor or outside the processor.
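The watchdog behavior described above (periodic input signals, an output signal when the gap exceeds the first duration, and a reset module that turns that signal into a reset instruction) can be modeled in a few lines. This is a hedged sketch: the `Watchdog` class, its `kick` and `expired` methods, the `reset_module` function, and the use of explicit timestamps in arbitrary time units are all assumptions made for illustration.

```python
class Watchdog:
    """Toy model of a watchdog chip fed by periodic input signals."""

    def __init__(self, first_duration: float):
        self.first_duration = first_duration
        self.last_input = None  # time of the most recent input signal

    def kick(self, now: float) -> None:
        """Record an input signal from the processor at time `now`."""
        self.last_input = now

    def expired(self, now: float) -> bool:
        """True when no input arrived within first_duration of the last one."""
        if self.last_input is None:
            return False
        return now - self.last_input > self.first_duration


def reset_module(first_signal: bool):
    """Turn the watchdog's output (first signal) into a reset instruction."""
    return "RESET" if first_signal else None
```

A typical use is to call `kick` each time the processor sends its input signal and to poll `expired` with the current time; only a gap longer than the first duration produces a reset instruction.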
- the control module obtains relevant information of the processor based on the reset indication, where the relevant information includes at least one of register information of the first register or data stored in the cache.
- after receiving the reset indication, the control module directly performs the operation of obtaining the data stored in the cache based on the reset indication. After the control module obtains the data stored in the cache, the processor is reset.
- the reset indication may also be used to instruct the processor to reset after a reference period of time.
- the reference time period is greater than or equal to the time required for the control module to obtain the data stored in the cache.
- the reference time period can also be greater than or equal to a second duration.
- the second duration is the sum of the duration required by the control module to obtain the data stored in the cache and the duration required by the control module to store the cached data in a storage medium that is not lost upon reset.
- the processor also includes a second register.
- the second register is a register used when the processor is running.
- the first register is used to record the register information of the second register.
- the first register is a register that is not lost upon reset, and the reset indication is used to instruct the processor to reset.
- obtaining relevant information of the processor includes: based on the reset indication, instructing the first register to stop recording the register information of the second register; and after the processor is reset based on the reset indication, acquiring the register information of the first register.
- the second register has the same principle as the second register in the above embodiment, and will not be described again here.
- the reset indication may also be used to instruct the processor to reset after a reference period of time. In the case where the register information of the first register is obtained as the related information of the processor, the reference time period is greater than or equal to the time required to instruct the first register to stop recording the register information of the second register.
- the control module can obtain not only the register information of the first register, but also the data stored in the cache. For example, based on the reset instruction, the control module obtains the data stored in the cache and instructs the first register to stop recording the register information of the second register, and, after the processor is reset based on the reset instruction, obtains the register information of the first register. In this case, if the reset indication is used to instruct the processor to reset after a reference time period, the reference time period is greater than or equal to the time required for the control module to obtain the data stored in the cache, and is also greater than or equal to the time required to instruct the first register to stop recording the register information of the second register. The reference time period may also be greater than or equal to the second duration, and greater than or equal to the time required to instruct the first register to stop recording the register information of the second register.
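The lower bound on the reference time period follows directly from the two "greater than or equal to" conditions: it is the larger of the cache-side duration and the stop-recording duration. The helper below is a minimal sketch; the function name `min_reference_period`, its parameters, and the unit (arbitrary ticks) are hypothetical.

```python
def min_reference_period(cache_read: int, stop_recording: int,
                         cache_store: int = 0,
                         include_store: bool = False) -> int:
    """Smallest reference period satisfying both conditions.

    cache_read:     time for the control module to obtain the cached data
    stop_recording: time to instruct the first register to stop recording
    cache_store:    time to write the cached data to persistent storage
    include_store:  if True, bound by the second duration (read + store)
    """
    cache_bound = cache_read + cache_store if include_store else cache_read
    return max(cache_bound, stop_recording)
```

For instance, with a cache read of 3 ticks, a store of 2 ticks and a stop-recording time of 1 tick, the bound is 3 ticks without the store and 5 ticks when the second duration is used.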
- the control module stores the relevant information of the processor in a storage medium that is not lost after reset.
- the storage medium that is not lost after reset can be located inside the processor or outside the processor.
- the storage medium that is not lost after reset includes but is not limited to at least one of: memory inside the processor that is not lost upon reset, a non-volatile storage medium inside the processor, memory outside the processor that is not lost upon reset, or a non-volatile storage medium outside the processor.
- the storage medium that is not lost upon reset is located inside the processor, that is, it includes memory inside the processor that is not lost upon reset or a non-volatile storage medium inside the processor.
- the control module stores the relevant information of the processor to a storage medium that is not lost after reset through the communication wiring inside the processor.
- the control module stores the relevant information of the processor to the storage medium that is not lost upon reset through the communication connection between the processor and the storage medium that is not lost upon reset.
- the method further includes: after the processor is reset, the control module obtains relevant information of the processor from a storage medium that is not lost after reset, and generates a running exception record based on the relevant information.
- the control module obtains the register information of the first register and the data stored in the cache from the storage medium that is not lost after reset, and generates a running exception record based on the obtained register information and data.
- Running exception records can include acquired register information and data. That is to say, in the case where the first register includes BPC, BSP, BFP, BLR and BCR, the running exception record may include the acquired data, BPC register information, BSP register information, BFP register information, and BLR register information and BCR register information.
- the register information of BPC includes but is not limited to at least one PC pointer
- the register information of BSP includes but is not limited to the call stack
- the register information of BFP includes but is not limited to the call frame
- the register information of BLR includes but is not limited to the PC when a running exception occurs.
- the register information of the BCR includes but is not limited to at least one of the system control flags that control the operating mode and status of the processor, the linear address that causes a page fault, or the physical memory base address of the page directory table.
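One way to picture the running exception record is as a structure with one field per kind of captured information. The sketch below is an assumption-laden illustration: the class `RunningExceptionRecord`, its field names, the `build_record` helper, and the layout of the `stored` dictionary are hypothetical and are not defined by the application.

```python
from dataclasses import dataclass, asdict


@dataclass
class RunningExceptionRecord:
    pc_pointers: list     # from BPC: at least one PC pointer
    call_stack: list      # from BSP
    call_frame: list      # from BFP
    pc_at_exception: int  # from BLR
    control_info: dict    # from BCR: flags, fault address, page directory base
    cache_data: bytes     # data captured from the cache


def build_record(stored: dict) -> RunningExceptionRecord:
    """Assemble a record from information read back after reset."""
    regs = stored["registers"]
    return RunningExceptionRecord(
        pc_pointers=list(regs["BPC"]),
        call_stack=list(regs["BSP"]),
        call_frame=list(regs["BFP"]),
        pc_at_exception=regs["BLR"],
        control_info=dict(regs["BCR"]),
        cache_data=bytes(stored["cache"]),
    )
```

Calling `asdict` on the result gives a plain dictionary that could be serialized to a log, which is one possible form for a stored record.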
- the control module in the processor can obtain the relevant information of the processor based on the obtained reset instruction. Therefore, there is no need to rely on a module external to the processor to obtain the relevant information of the processor, and there is no need to rely on the OS to respond to an interrupt signal and execute an interrupt response program.
- the method of obtaining the relevant information is more reliable.
- this method can obtain relatively comprehensive relevant information. Therefore, when the cause of abnormal operation of the processor is analyzed based on the relevant information, the accuracy of the analyzed cause is relatively high.
- when the control module is a hardware module, it can quickly obtain the relevant information of the processor after obtaining the reset instruction, so that the efficiency of obtaining the relevant information is relatively high.
- An embodiment of the present application also provides a single board, which includes any of the above-mentioned processors, and a storage medium that is communicatively connected to the processor and is not lost upon reset.
- the single board includes an instruction module configured to send a first instruction to a control module of the processor. The first instruction is used to instruct the control module to instruct the first register to start recording the register information of the second register.
- Figure 5 shows a schematic diagram of a first register recording register information of a second register.
- the control module includes a plurality of fourth control sub-modules, and each fourth control sub-module is used to instruct one first register to start recording the register information of the second register corresponding to that first register.
- the instruction module of the single board is used to synchronously send the first instruction to the plurality of fourth control sub-modules, so that the plurality of fourth control sub-modules can synchronously instruct the plurality of first registers to start recording the register information of the second registers corresponding to the plurality of first registers.
- as shown in Figure 5, the plurality of first registers include BPC, BCR, BSP, BLR and BFP, where BPC corresponds to PC, BCR corresponds to CR, BSP corresponds to SP, BLR corresponds to LR, and BFP corresponds to FP.
- the first registers shown in Figure 5 are intended to illustrate how the register information of the second register is recorded, and are not used to limit the types of first registers or the number of first registers of each type.
- the number of first registers of the same type may be one or more.
- the single board also includes a storage module, which is used to store running exception records. For example, after the control module of the processor obtains the running exception record, it can transmit the running exception record to the storage module of the single board, and the storage module of the single board stores the running exception record.
- the storage module of the single board is at least one of a memory external to the processor that is not lost upon reset or a non-volatile storage medium external to the processor.
- the single board may also include the reset module in the above embodiment, and the reset module will not be described again here.
- Figure 6 is a schematic diagram of a process for obtaining relevant information provided by an embodiment of the present application. As shown in Figure 6, the method includes but is not limited to S601-S607.
- the startup module of the single board obtains the startup instruction.
- the startup module may be a startup interface of a single board, and the startup interface is used to receive startup instructions.
- Startup instructions include but are not limited to cold start instructions and hot start instructions.
- the cold start instruction refers to the startup instruction received by the startup interface when the board is powered off.
- for example, when the board is powered on from the powered-off state, the startup interface receives the cold start instruction.
- the hot start instruction refers to the startup instruction received by the startup interface when the board is not powered off.
- when the watchdog chip provided on the board senses that the processor is running abnormally, the watchdog chip generates an output signal, and the output signal serves as the hot start indication. For details on how the watchdog chip senses abnormal operation of the processor, refer to the description in the above embodiment; they are not repeated here.
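- in response to obtaining the cold start instruction, S602 is executed.
- in response to obtaining the hot start instruction, S603 is executed.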
- the instruction module of the single board sends the first instruction to the control module of the processor.
- in response to obtaining the cold start indication, the instruction module sends the first instruction to the control module.
- the first instruction is used to instruct the control module to instruct the first register to start recording the register information of the second register, so that the control module can instruct the first register to start recording the register information of the second register based on the first instruction.
- S603: the reset module or control circuit of the single board transmits a reset instruction to the control module.
- the watchdog chip provided on the board is also used as the reset module in the above embodiment. That is to say, the watchdog chip is communicatively connected with the control module. After the watchdog chip generates a hot start instruction, it sends a reset instruction to the control module based on the hot start instruction.
- the watchdog chip is only used to sense abnormal operation of the processor and generate a hot start indication when it senses abnormal operation of the processor.
- the single board also has a control circuit, which is used to lower the level of the reset pin of the processor based on the hot start indication, so that the processor generates a reset indication through the reset pin based on the low-level signal, and the reset pin transmits the reset indication to the control module.
- since the reset indication is obtained based on the hot start indication, the reset indication may also be called a hot reset indication, and the reset performed by the processor based on the hot reset indication may be called a processor hot reset.
- the control module obtains relevant information of the processor based on the reset instruction.
- S604 has the same principles as the above-mentioned S402 and will not be described again here.
- the control module stores the relevant information of the processor in a storage medium that is not lost after reset.
- the instruction module sends the first instruction to the control module.
- the control module instructs the first register to start recording the register information of the second register.
- the single board starts. Board startup means that each module and component of the board starts running.
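The cold-start and hot-start paths of the process shown in Figure 6 differ only in whether information is captured before recording is (re)started. The sketch below lists the actions in order for each path. It is an interpretation of the text rather than a definitive reading: the function `startup_flow` and the action names are hypothetical, and the S601 to S607 step numbers are deliberately not attached to individual actions.

```python
def startup_flow(kind: str) -> list:
    """Return the ordered actions for a 'cold' or 'hot' start indication."""
    if kind == "cold":
        # Cold start: nothing to capture, go straight to the first instruction.
        steps = ["send_first_instruction"]
    elif kind == "hot":
        # Hot start: capture and persist processor information first.
        steps = [
            "send_reset_indication",
            "obtain_relevant_information",
            "store_relevant_information",
            "send_first_instruction",
        ]
    else:
        raise ValueError(f"unknown start indication: {kind!r}")
    # Both paths end by starting to record and then starting the board.
    steps += ["start_recording", "board_start"]
    return steps
```

Both paths end with the first register recording again and the board starting, so a later abnormal run can be captured in the same way.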
- the single board provided by the embodiment of the present application includes any of the above processors.
- the control module in the processor can obtain relevant information of the processor based on the obtained reset indication. Therefore, there is no need to rely on the external module of the processor to obtain the relevant information of the processor, and there is no need to rely on the OS to respond to the interrupt signal to execute the interrupt response program.
- the method of obtaining the relevant information is more reliable.
- the relevant information obtained is relatively comprehensive. Therefore, when the cause of abnormal operation of the processor is analyzed based on the relevant information, the accuracy of the analyzed cause is relatively high.
- when the control module is a hardware module, after the control module obtains the reset instruction, it can quickly obtain the relevant information of the processor, thereby obtaining the relevant information more efficiently. Furthermore, since there is no need to set up a coprocessor for obtaining processor-related information, the design complexity of the single board is low and the cost is also low.
- An embodiment of the present application also provides a network device, which includes at least one processor in the above embodiment, and a storage medium that is communicatively connected to the processor and is not lost in reset.
- the network device further includes a reset module.
- the reset module is communicatively connected to the control module in the processor.
- the reset module is configured to send a reset instruction to the control module. Since the processor provided by the embodiment of the present application can also be provided on a single board, the embodiment of the present application also provides a network device, which includes at least one single board in the above embodiment.
- the network device can be a box-type device, which refers to a network device that only includes one of the above-mentioned single boards.
- the network device can also be a frame-type device.
- a frame-type device refers to a network device that includes a main control board and at least one of the above-mentioned single boards.
- the main control board and at least one single board are connected through an inter-board management channel.
- the main control board is used to control the at least one single board in the network device.
- the process of obtaining information includes but is not limited to the following situation A and situation B.
- the network device is a box-type device.
- FIG. 7 is a schematic structural diagram of a network device provided by an embodiment of the present application.
- the network device includes a single board. That is, the network device is a box-type device.
- the processor included in the single board may be a microcontroller unit (MCU) or CPU.
- the MCU has a built-in CPU core, memory that is volatile upon reset, and a storage medium that is not lost upon reset.
- the processor is a CPU
- the CPU has a built-in CPU core.
- the CPU may also have built-in at least one of memory that is volatile upon reset or a storage medium that is not lost upon reset.
- at least one of the memory that is volatile upon reset or the storage medium that is not lost upon reset can also be located outside the CPU.
- storage media that are not lost upon reset include memory that is not lost upon reset and non-volatile storage media.
- the CPU core, memory that is volatile upon reset, and storage media that is not lost upon reset are all hardware modules.
- the CPU core includes cache, PC, CR, SP, BPC, BCR, BSP, and control modules 1 to 4.
- the control module 1 is used to instruct the BPC to start recording the register information of the PC or stop recording the register information of the PC, obtain the register information of the BPC based on the reset instruction, and store the register information of the BPC in a storage medium that is not lost after reset.
- the control module 2 is used to instruct the BCR to start recording the register information of the CR or stop recording the register information of the CR, obtain the register information of the BCR based on the reset instruction, and store the register information of the BCR in a storage medium that is not lost after reset.
- the control module 3 is used to instruct the BSP to start recording the register information of the SP or stop recording the register information of the SP, obtain the register information of the BSP based on the reset instruction, and store the register information of the BSP in a storage medium that is not lost after reset.
- the control module 4 is configured to obtain the data stored in the cache based on the reset instruction, and store the data stored in the cache into a storage medium that is not lost after reset.
- the startup module, instruction module, watchdog chip, reset module and control circuit included in the single board are not shown in Figure 7.
- the numbers and types of the first registers, second registers and control modules mentioned above are only for illustration and are not used to limit the number and types of the first registers, second registers and control modules included in the network device.
- the process of obtaining information includes the following S701 to S707.
- the startup module of the board obtains the cold start instruction.
- the instruction module of the single board sends the first instruction to the control module of the processor.
- in response to receiving the cold start instruction, S702 is executed.
- S702 has the same principle as the above-mentioned S602 and will not be described again here.
- the first instruction is used to instruct the control module 1 to instruct the BPC to start recording the register information of the PC, is also used to instruct the control module 2 to instruct the BCR to start recording the register information of the CR, and is also used to instruct the control module 3 to instruct the BSP to start recording the register information of the SP.
- the control module instructs the first register to start recording the register information of the second register based on the first instruction.
- the control module sends a trigger signal to the first register based on the first instruction, where the trigger signal is used to instruct the first register to start recording the register information of the second register.
- This application does not limit the form of the trigger signal.
- the control module 1 sends a trigger signal to the BPC based on the first instruction.
- the trigger signal is used to instruct the BPC to start recording the register information of the PC.
- the control module 2 sends a trigger signal to the BCR based on the first instruction.
- the trigger signal is used to instruct the BCR to start recording the register information of the CR.
- the control module 3 sends a trigger signal to the BSP based on the first instruction.
- the trigger signal is used to instruct the BSP to start recording the register information of the SP.
- S704 has the same principles as the above-mentioned S607 and will not be described again here.
- the watchdog chip in the board senses that the CPU is running abnormally, and the reset module or control circuit of the board transmits a reset instruction to the control module.
- S705 has the same principles as the above-mentioned S603 and will not be described again here.
- the control module obtains relevant information of the processor based on the reset instruction.
- the control module 4 obtains data stored in the cache based on the reset indication.
- the control module 1 instructs the BPC to stop recording the register information of the PC based on the reset instruction;
- the control module 2 instructs the BCR to stop recording the register information of the CR based on the reset instruction;
- the control module 3 instructs the BSP to stop recording the register information of the SP based on the reset instruction.
- the processor resets based on the reset indication. After the processor is reset based on the reset instruction, the control module 1 obtains the register information of the BPC, the control module 2 obtains the register information of the BCR, and the control module 3 obtains the register information of the BSP.
- the control module stores the relevant information of the processor in a storage medium that is not lost after reset.
- the principle of S707 is the same as that of the above-mentioned S403 and S605.
- the control module 1 stores the register information of the BPC in a storage medium that is not lost after reset
- the control module 2 stores the register information of the BCR in a storage medium that is not lost after reset
- the control module 3 stores the register information of the BSP in a storage medium that is not lost upon reset
- the control module 4 stores the data stored in the cache into a storage medium that is not lost upon reset.
- when the control module stores the relevant information in a storage medium that is not lost after reset, it can first store the relevant information in memory that is volatile upon reset, and then store the relevant information in the storage medium that is not lost after reset.
- after the control module stores the relevant information of the processor in a storage medium that is not lost upon reset, the control module also obtains the relevant information stored in the storage medium and generates a running exception record based on the relevant information. For example, the control module obtains at least one PC pointer, the call stack and the physical memory base address of the page directory table stored in the storage medium, and generates a running exception record based on the obtained PC pointer, call stack and base address. For example, after the control module generates the running exception record, it also stores the running exception record in a storage medium that is not lost after reset.
- the network device is a frame device.
- FIG. 8 is a schematic structural diagram of another network device provided by an embodiment of the present application.
- the network device includes a main control board and a single board, that is, the network device is a frame-type device.
- the main control board includes a CPU and a non-volatile storage medium.
- the non-volatile storage medium can be used to store logs.
- the single board includes a CPU and a storage medium that is not lost upon reset. The storage medium that is not lost upon reset includes a memory that is not lost upon reset and a non-volatile storage medium. The locations of the CPU, the memory that is not lost upon reset, and the non-volatile storage medium follow the same principles as the corresponding content for FIG. 7 above, and will not be described again here.
- the CPU includes a cache, a first register, a second register and a control module.
- the startup module, indication module, watchdog chip, reset module and control circuit included in the single board are not shown in Figure 8.
- the process of obtaining information includes the following S801 to S807.
- the startup module of the board obtains the cold start instruction.
- S801 has the same principle as the above-mentioned S601 and S701 for obtaining cold start instructions.
- the main control board powers on the startup module of the single board through the inter-board management channel, thereby transmitting the cold start instruction to the startup module of the single board, so that the startup module obtains the cold start instruction.
- the indication module of the single board sends the first instruction to the control module of the processor.
- S802 is executed.
- the cold start instruction is also used to instruct the indication module of the single board to send the first instruction to the control module of the processor. That is to say, the main control board instructs the indication module of the single board, through the inter-board management channel, to send the first instruction to the control module of the processor.
- the indication module can also automatically send the first instruction to the control module based on the received cold start instruction. The ways of triggering the indication module to send the first instruction to the control module are relatively flexible.
- the control module instructs the first register to start recording the register information of the second register based on the first instruction.
- S803 has the same principles as the above-mentioned S703 and will not be described again here.
- S804 has the same principles as the above-mentioned S607 and S704, and will not be described again here.
- the watchdog chip on the single board senses that the CPU is running abnormally, and the reset module or control circuit of the single board transmits a reset indication to the control module.
- S805 has the same principles as the above-mentioned S603 and S705, and will not be described again here.
- the control module obtains relevant information of the processor based on the reset indication.
- S806 has the same principles as the above-mentioned S402, S604 and S706, and will not be described again here.
- the control module stores the relevant information of the processor in a storage medium that is not lost after reset.
- the control module stores the register information of the first register in the storage medium of the single board that is not lost upon reset, and also stores the data stored in the cache in that storage medium.
- after the control module stores the relevant information of the processor in a storage medium that is not lost upon reset, the control module also obtains the relevant information stored in the storage medium and generates a running exception record based on the relevant information. After the control module generates the running exception record, it can also store the running exception record in a storage medium that is not lost after reset. For example, for the network device shown in Figure 8, after the control module generates the running exception record, it stores the running exception record in the non-volatile storage medium of the main control board through the inter-board management channel. In the non-volatile storage medium, running exception records can be stored in the form of logs.
- the network device includes any of the above processors.
- the control module in the processor can obtain relevant information of the processor based on the obtained reset indication. Therefore, there is no need to rely on the external module of the processor to obtain the relevant information of the processor, and there is no need to rely on the OS to respond to the interrupt signal to execute the interrupt response program.
- the method of obtaining the relevant information is more reliable.
- the relevant information obtained is relatively comprehensive. Therefore, when the cause of abnormal operation of the processor is analyzed based on the relevant information, the accuracy of the analyzed cause is relatively high.
- when the control module is a hardware module, it can quickly obtain the relevant information of the processor after obtaining the reset indication, so the relevant information is obtained more efficiently. Furthermore, since there is no need to provide a coprocessor for obtaining the relevant information of the processor, the single board provided by the embodiments of the present application has lower design complexity and lower cost. Therefore, the design complexity and cost of a network device that includes the single board are also low.
- FIG. 9 is a schematic structural diagram of yet another network device provided by an embodiment of the present application.
- the network device 2000 shown in FIG. 9 is configured with the processor shown in any of the above-mentioned FIGS. 1-3, and the processor is used to execute the method of obtaining information shown in the above-mentioned FIG. 4.
- the network device 2000 is, for example, a switch, a router, a server, a terminal, etc., and the network device 2000 can be implemented by a general bus architecture.
- the network device 2000 includes at least one processor 2001, a memory 2003, and at least one communication interface 2004.
- the processor 2001 may be the processor shown in any of the above-mentioned Figures 1-3.
- the processor 2001 is, for example, a CPU, a digital signal processor (DSP), a network processor (NP), a graphics processing unit (GPU), a neural-network processing unit (NPU), a data processing unit (DPU), a microprocessor, or one or more integrated circuits used to implement the solution of the present application.
- the processor 2001 includes an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof.
- a PLD is, for example, a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
- the processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and so on.
- the network device 2000 also includes a bus.
- Buses are used to transfer information between components of network device 2000.
- the bus can be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
- the bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in Figure 9, but it does not mean that there is only one bus or one type of bus.
- the memory 2003 is, for example, a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, or an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, without limitation.
- the memory 2003 exists independently, for example, and is connected to the processor 2001 through a bus.
- the memory 2003 may also be integrated with the processor 2001.
- the communication interface 2004 uses any device such as a transceiver for communicating with other devices or a communication network.
- the communication network may be Ethernet, a radio access network (RAN) or a wireless local area network (WLAN), etc.
- the communication interface 2004 may include a wired communication interface and may also include a wireless communication interface.
- the communication interface 2004 can be an Ethernet interface, a fast Ethernet (FE) interface, a gigabit Ethernet (GE) interface, an asynchronous transfer mode (ATM) interface, a WLAN interface, a cellular network communication interface, or a combination thereof.
- the Ethernet interface can be an optical interface, an electrical interface, or a combination thereof.
- the communication interface 2004 can be used for the network device 2000 to communicate with other devices.
- the processor 2001 may include one or more CPUs, such as CPU0 and CPU1 as shown in FIG. 9 .
- Each of these processors may be a single-CPU processor or a multi-CPU processor.
- a processor here may refer to one or more devices, circuits, and/or processing cores for processing data (eg, computer program instructions).
- the network device 2000 may include multiple processors, such as the processor 2001 and the processor 2005 shown in FIG. 9 .
- processors can be a single-core processor (single-CPU) or a multi-core processor (multi-CPU).
- a processor here may refer to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
- the network device 2000 may also include an output device and an input device.
- the output device communicates with the processor 2001 and can display information in a variety of ways.
- the output device may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, etc.
- Input devices communicate with processor 2001 and can receive user input in a variety of ways.
- the input device may be a mouse, a keyboard, a touch screen device or a sensing device, etc.
- the memory 2003 is used to store the program code 2010 for executing the solution of the present application
- the processor 2001 can execute the program code 2010 stored in the memory 2003.
- Program code 2010 may include one or more software modules.
- the processor 2001 itself can also store program codes or instructions for executing the solution of the present application.
- Each step of the method of obtaining information shown in Figure 4 is completed through an integrated logic circuit of the processor or instructions in the form of software.
- the steps of the methods disclosed in conjunction with the embodiments of the present application can be directly implemented by a hardware processor for execution, or can be executed by a combination of hardware and software modules in the processor.
- the software module can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other mature storage media in this field.
- the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware. To avoid repetition, the details will not be described here.
- FIG. 10 is a schematic structural diagram of yet another network device provided by an embodiment of the present application.
- the network device includes a processor shown in any one of Figures 1-3, which is used to execute each step of the method of obtaining information shown in Figure 4.
- the network device is, for example, a server or a terminal.
- the network device may vary greatly due to different configurations or performance, and may include one or more processors 1001 and one or more memories 1002, wherein at least one computer program is stored in the one or more memories 1002, and the at least one computer program is loaded and executed by the one or more processors 1001.
- the network device can also have components such as wired or wireless network interfaces, keyboards, and input and output interfaces to facilitate input and output.
- the network device can also include other components for realizing device functions, which will not be described again here.
- An embodiment of the present application also provides a communication device, which includes: a transceiver, a memory, and a processor.
- the transceiver, memory and processor communicate with each other through internal connection paths.
- the memory is used to store instructions
- the processor is used to execute instructions stored in the memory to control the transceiver to receive signals and control the transceiver to send signals.
- Memory stores instructions that cause the processor to perform methods to obtain information.
- processor may be a CPU, or other general-purpose processor, DSP, ASIC, FPGA or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
- a general-purpose processor can be a microprocessor or any conventional processor, etc. It is worth noting that the processor may be a processor that supports advanced RISC machines (ARM) architecture.
- the above-mentioned memory may include a read-only memory and a random access memory, and provide instructions and data to the processor.
- Memory may also include non-volatile random access memory.
- the memory may also store device type information.
- the memory may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory can be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM) or flash memory.
- Volatile memory may be random access memory (RAM), which is used as an external cache. By way of illustration, but not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct rambus RAM (DR RAM).
- An embodiment of the present application also provides a chip, which includes at least one processor in the above embodiment, and a storage medium that is communicatively connected to the processor and is not lost upon reset.
- the chip also includes a reset module, which is communicatively connected with the control module in the processor, and the reset module is used to send a reset instruction to the control module.
- the chip also includes: an input interface, an output interface, and a memory.
- the memory includes the above-mentioned storage medium that is not lost upon reset.
- the input interface, the output interface, the processor, and the memory are connected through internal connection paths.
- a computer program or computer program product includes one or more computer instructions.
- the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless means.
- Computer-readable storage media can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or other integrated media that contains one or more available media.
- the available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., digital video discs (DVD)), or semiconductor media (e.g., solid state disks (SSD)), etc.
- Computer program codes for implementing the methods of embodiments of the present application may be written in one or more programming languages. These computer program codes may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data forwarding device, so that when executed by the computer or other programmable data forwarding device, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
- the program code may execute entirely on the computer, partly on the computer, as a stand-alone software package, partly on the computer and partly on a remote computer or entirely on the remote computer or server.
- the computer program code or related data may be carried by any appropriate carrier, so that a device, apparatus or processor can perform the various processes and operations described above.
- carriers include signals, computer-readable media, and the like.
- signals may include electrical, optical, radio, acoustic, or other forms of propagated signals, such as carrier waves, infrared signals, and the like.
- the disclosed processor, single board, device and method can be implemented in other ways.
- the device embodiments described above are only illustrative, and the division of modules is only a division by logical function. In actual implementation, there may be other divisions. For example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the coupling or direct coupling or communication connection between each other shown or discussed may be indirect coupling or communication connection through some interfaces, devices or modules, or may be electrical, mechanical or other forms of connection.
- the modules described as separate components may or may not be physically separated.
- the components shown as modules may or may not be physical modules, that is, they may be located in one place, or they may be distributed to multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the embodiments of the present application.
- each functional module in each embodiment of the present application can be integrated into one processing module, or each module can exist physically alone, or two or more modules can be integrated into one module.
- the above integrated modules can be implemented in the form of hardware or software function modules.
- the words "first", "second" and the like are used to distinguish between identical or similar items that have basically the same role and function. It should be understood that there is no logical or sequential dependency among "first", "second" and "nth", and no limit on number or execution order. It should also be understood that, although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another. For example, a first register may be referred to as a second register, and similarly, a second register may be referred to as a first register, without departing from the scope of various described examples.
- the size of the sequence number of each process does not mean the order of execution.
- the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
- determining B based on A does not mean determining B only based on A, and B can also be determined based on A and/or other information.
- references throughout this specification to "one embodiment," "an embodiment," and "a possible implementation" mean that specific features, structures, or characteristics related to the embodiment or implementation are included in at least one embodiment of the present application. Therefore, "in one embodiment" or "in an embodiment" or "a possible implementation" appearing in various places throughout this specification do not necessarily refer to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
Abstract
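A processor, a method for obtaining information, a single board, and a network device, relating to the field of computer technology. The processor includes a control module (101), a first register (102) and a cache (103), and the processor is communicatively connected to a storage medium (104, 105) that is not lost upon reset. The control module (101) is configured to obtain a reset indication, the reset indication being an indication generated when the processor runs abnormally (S401); it is further configured to obtain relevant information of the processor based on the reset indication, the relevant information including at least one of register information of the first register (102) or data stored in the cache (103) (S402); and it is further configured to store the relevant information in the storage medium (104, 105) that is not lost upon reset (S403). When the processor runs abnormally, the control module (101) inside the processor can obtain the relevant information of the processor based on the reset indication, so the way of obtaining the relevant information is highly reliable. Moreover, the control module (101) can obtain relatively comprehensive relevant information, so when the cause of the abnormal operation of the processor is analyzed based on the relevant information, the accuracy of the analyzed cause is relatively high.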
Description
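This application claims priority to Chinese patent application No. 202210847565.8, filed on July 19, 2022 and entitled "A method and computer system for locating a processor hang problem", the entire contents of which are incorporated herein by reference. This application also claims priority to Chinese patent application No. 202211175885.X, filed on September 26, 2022 and entitled "Processor, method for obtaining information, single board and network device", the entire contents of which are incorporated herein by reference.
This application relates to the field of computer technology, and in particular to a processor, a method for obtaining information, a single board and a network device.
The processor is the module that performs computation and control in a computer system, and whether it runs normally is an important factor in whether the computer system can perform computation and control normally. Therefore, when the processor runs abnormally, the relevant information of the processor needs to be obtained in order to analyze the cause of the abnormal operation.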
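Summary
This application proposes a processor, a method for obtaining information, a single board and a network device, which are used to obtain relevant information of a processor in order to analyze the cause of abnormal operation of the processor.
In a first aspect, a processor is provided. The processor includes a control module, a first register and a cache, and is communicatively connected to a storage medium that is not lost upon reset. The control module is configured to obtain a reset indication generated when the processor runs abnormally; obtain relevant information of the processor based on the reset indication, the relevant information including at least one of register information of the first register or data stored in the cache; and store the relevant information in the storage medium that is not lost upon reset.
When the processor runs abnormally, the control module inside the processor can obtain the relevant information of the processor based on the reset indication. The processor does not need to rely on an external module to obtain its relevant information, nor on the operating system (OS) responding to an interrupt signal to execute an interrupt response program, so the way the processor obtains the relevant information is highly reliable. Moreover, when the control module obtains both the register information of the first register and the data stored in the cache, the relevant information obtained is relatively comprehensive, so when the cause of the abnormal operation is analyzed based on the relevant information, the accuracy of the analyzed cause is relatively high. Furthermore, since the control module can be a hardware module, it can quickly obtain the relevant information of the processor after obtaining the reset indication, so the relevant information is obtained efficiently.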
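In a possible implementation, the processor further includes a second register. The second register is a register used while the processor is running, the first register is used to record the register information of the second register, the first register is a register that is not lost upon reset, and the reset indication is used to instruct the processor to reset. The control module is configured to instruct, based on the reset indication, the first register to stop recording the register information of the second register, and to obtain the register information of the first register after the processor is reset based on the reset indication. Because the first register, which is not lost upon reset, records the register information of the second register, that information is retained in the first register after the processor is reset, so the cause of the abnormal operation can be analyzed based on the obtained register information of the first register. Using the first register to retain the register information of the second register is a highly reliable way of retaining it.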
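In a possible implementation, the second register includes at least one of a program counter (PC), a stack pointer (SP), a frame pointer (FP), a control register (CR) or a link register (LR). The types of the second register are rich and flexible, and the processor can retain the register information of multiple types of registers.
In a possible implementation, the processor has a reset pin, which is used to generate the reset indication and transmit the reset indication to the control module. Providing a reset pin for generating the reset indication makes it simple for the processor to generate the reset indication.
In a possible implementation, the control module is communicatively connected to a reset module, and the reset module is used to send the reset indication to the control module. The control module can thus also obtain the reset indication by receiving it from the reset module, so the ways in which the processor obtains the reset indication are flexible.
In a possible implementation, the storage medium that is not lost upon reset includes at least one of: a memory inside the processor that is not lost upon reset, a non-volatile storage medium inside the processor, a memory outside the processor that is not lost upon reset, or a non-volatile storage medium outside the processor. The types of the storage medium that is not lost upon reset are flexible, so the architecture of the processor can be flexible and varied, as long as the control module can store the relevant information of the processor in the storage medium that is not lost upon reset.
In a possible implementation, the control module is further configured to, after the processor is reset, obtain the relevant information from the storage medium that is not lost upon reset and generate a running exception record based on the relevant information. By generating the running exception record, the cause of the abnormal operation of the processor can be obtained. Moreover, when the relevant information includes both the register information of the first register and the data stored in the cache, the accuracy of the analyzed cause is relatively high.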
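In a second aspect, a method for obtaining information is provided. The method is applied to a processor that includes a control module, a first register and a cache and that is communicatively connected to a storage medium that is not lost upon reset. The method includes: the control module obtains a reset indication generated when the processor runs abnormally; obtains relevant information of the processor based on the reset indication, the relevant information including at least one of register information of the first register or data stored in the cache; and stores the relevant information in the storage medium that is not lost upon reset.
In a possible implementation, the processor further includes a second register, the second register is a register used while the processor is running, the first register is used to record the register information of the second register, the first register is a register that is not lost upon reset, and the reset indication is used to instruct the processor to reset. Obtaining the relevant information of the processor based on the reset indication includes: instructing, based on the reset indication, the first register to stop recording the register information of the second register; and obtaining the register information of the first register after the processor is reset based on the reset indication.
In a possible implementation, the second register includes at least one of a PC, an SP, an FP, a CR or an LR.
In a possible implementation, the processor has a reset pin, and the method further includes: the processor generates the reset indication through the reset pin, and the reset pin transmits the reset indication to the control module. Obtaining the reset indication includes: receiving the reset indication.
In a possible implementation, the control module is communicatively connected to a reset module, and the reset module is used to send the reset indication to the control module. Obtaining the reset indication includes: receiving the reset indication sent by the reset module.
In a possible implementation, the storage medium that is not lost upon reset includes at least one of: a memory inside the processor that is not lost upon reset, a non-volatile storage medium inside the processor, a memory outside the processor that is not lost upon reset, or a non-volatile storage medium outside the processor.
In a possible implementation, the method further includes: after the processor is reset, the control module obtains the relevant information from the storage medium that is not lost upon reset and generates a running exception record based on the relevant information.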
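In a third aspect, a single board is provided. The single board includes the processor of any implementation of the first aspect, and a storage medium that is communicatively connected to the processor and is not lost upon reset.
In a possible implementation, the single board further includes a reset module, the reset module is communicatively connected to the control module in the processor, and the reset module is used to send the reset indication to the control module.
In a fourth aspect, a network device is provided. The network device includes at least one processor of any implementation of the first aspect, and a storage medium that is communicatively connected to the processor and is not lost upon reset.
In a possible implementation, the network device further includes a reset module, the reset module is communicatively connected to the control module in the processor, and the reset module is used to send the reset indication to the control module.
In a fifth aspect, a network device is provided. The network device includes at least one single board of any implementation of the third aspect.
In a sixth aspect, a chip is provided. The chip includes at least one processor of any implementation of the first aspect, and a storage medium that is communicatively connected to the processor and is not lost upon reset.
In a possible implementation, the chip further includes a reset module, the reset module is communicatively connected to the control module in the processor, and the reset module is used to send the reset indication to the control module.
In a possible implementation, the chip further includes an input interface, an output interface and a memory. The memory includes the above storage medium that is not lost upon reset, and the input interface, the output interface, the processor and the memory are connected through internal connection paths.
It should be understood that, for the beneficial effects achieved by the technical solutions of the second to sixth aspects of this application and their corresponding possible implementations, reference may be made to the technical effects of the first aspect and its corresponding possible implementations, which are not repeated here.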
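FIG. 1 is a schematic structural diagram of a processor provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of another processor provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of yet another processor provided by an embodiment of the present application;
FIG. 4 is a flowchart of a method for obtaining information provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a first register recording the register information of a second register, provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a process of obtaining relevant information provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a network device provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another network device provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of yet another network device provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of still another network device provided by an embodiment of the present application.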
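The terms used in the embodiments section of this application are only used to explain the embodiments of this application and are not intended to limit this application. The embodiments of this application are described below with reference to the drawings.
With the development of computer technology, processors are used in computer systems in many forms. For example, processors with multiple cores are widely used in devices such as servers and terminals. As another example, a processor system composed of multiple processors is used in a computer system to provide stronger computing performance, data processing performance, artificial intelligence inference performance, artificial intelligence training performance, and so on.
When a processor runs abnormally, for example when the processor hangs, the relevant information of the processor needs to be obtained in order to analyze the cause of the abnormal operation. A processor hang may refer to a situation in which the processor cannot execute programs normally because of nested interrupts, a runaway program (program fleet), or a program deadlock (deadloop). The relevant information of the processor may include relevant information at the moment the abnormal operation occurs. When the processor runs abnormally, the relevant information of the processor may also be called the on-site information of the abnormal operation. Some abnormal operations cause the processor to restart, which flushes the on-site information; means such as generating logs or alarm information and obtaining error records stored in memory then cannot determine the specific cause of the abnormal operation, because the on-site information cannot be obtained. Therefore, how to effectively obtain the relevant information of the processor when it runs abnormally, so that the specific cause can be analyzed, is a problem that urgently needs to be solved.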
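Related art 1 proposes a way of obtaining the relevant information of a processor based on a watchdog chip. In related art 1, the watchdog chip is connected to an input/output (I/O) pin of the processor, and the processor periodically sends an input signal that toggles between high and low levels to the watchdog chip through the I/O pin. If the processor runs abnormally, it cannot send the input signal to the watchdog chip, and in this case the watchdog chip sends a reset signal to a controller. After receiving the reset signal, the controller first sends an interrupt signal to the processor, and the OS running on the processor responds to the interrupt signal by running an interrupt response program to collect the register information that caused the abnormal operation. Some time after sending the interrupt signal, the controller sends a reset signal to the processor, triggering the processor, for example a central processing unit (CPU), to reset.
Related art 2 provides a way of obtaining the relevant information of a processor based on a coprocessor. In related art 2, the processor is connected to a built-in or external coprocessor through a bus. When the coprocessor senses that the processor is running abnormally, the coprocessor actively accesses the processor through the bus to obtain the register information that caused the abnormal operation.
Related art 1 depends on the reliable operation of the watchdog chip and the controller and on the OS running on the processor being able to respond to the interrupt signal. If the watchdog chip or the controller is unreliable, or an exception occurs while the processor executes the interrupt response program, the reliability of related art 1 is low. Related art 2 depends on a reliable connection between the coprocessor and the processor, so its reliability is low when the bus connection is unreliable. Furthermore, both related art 1 and related art 2 obtain only register information, so the information available for analyzing the cause of the abnormal operation is limited and the accuracy of the analyzed cause is low.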
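An embodiment of the present application provides a processor that obtains relatively comprehensive relevant information when it runs abnormally, so that the analyzed cause is relatively accurate. FIG. 1 is a schematic structural diagram of a processor provided by an embodiment of the present application. As shown in FIG. 1, the processor includes a control module 101, a first register 102 and a cache 103, which are communicatively connected. For example, as shown in FIG. 1, the control module 101, the first register 102 and the cache 103 are communicatively connected through communication wiring in the processor. The control module 101 is configured to obtain a reset indication, which is an indication generated when the processor runs abnormally. The control module 101 is further configured to obtain relevant information of the processor based on the reset indication, the relevant information including at least one of register information of the first register 102 or data stored in the cache 103. Illustratively, the processor is communicatively connected to a storage medium that is not lost upon reset, and the control module 101 is further configured to store the relevant information of the processor in that storage medium.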
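Depending on the structure of the processor, the ways in which the control module 101 obtains the reset indication include, but are not limited to, case 1 and case 2.
Case 1: the processor has a reset pin, which is used to generate the reset indication and transmit the reset indication to the control module 101.
In case 1, the control module 101 can obtain the reset indication by receiving the reset indication transmitted by the reset pin.
The reset pin may be a hard reset pin or a soft reset pin, which is not limited in the embodiments of the present application. The reset pin may be used to generate the reset indication based on a low-level signal. For example, the processor is arranged on a single board; when the single board senses that the processor is running abnormally, the single board lowers the level of the reset pin, so that the processor generates the reset indication through the reset pin based on the low-level signal. With the reset pin, the processor can respond quickly to the low-level signal, which improves the efficiency of generating the reset indication. In the embodiments of the present application, the single board may be any circuit assembly that includes a processor, for example a circuit assembly including a processor, resistors and capacitors. The way in which the single board senses abnormal operation of the processor is not limited in the embodiments of the present application. For example, a watchdog chip is also arranged on the single board to sense abnormal operation of the processor. The processor periodically sends an input signal to the watchdog chip. When the processor runs abnormally, it stops sending the input signal to the watchdog chip, and the watchdog chip generates an output signal based on not receiving the input signal; the output signal indicates that the processor is running abnormally.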
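Case 2: the control module 101 is communicatively connected to a reset module, and the reset module is used to send the reset indication to the control module 101.
In case 2, the control module 101 can obtain the reset indication by receiving the reset indication sent by the reset module. The reset module may be located inside or outside the processor; in the embodiments of the present application, its location is flexible. Illustratively, the reset module is either control logic or a control circuit inside the processor, or either control logic or a control circuit outside the processor. The control logic may also be a software program executed on a software module in the processor. Whether the reset module is inside or outside the processor, it can be used to generate the reset indication based on a first signal and send the reset indication to the control module 101. For example, the reset module is communicatively connected to a watchdog chip, and the watchdog chip is also communicatively connected to the processor. The processor periodically sends an input signal to the watchdog chip at intervals of a first duration; the first duration can be set according to experience or actual needs, which is not limited in the embodiments of the present application. When the processor runs abnormally, it stops sending the input signal to the watchdog chip, or the interval between two input signals it sends becomes greater than the first duration. When the watchdog chip does not receive the next input signal within the first duration after receiving an input signal, it generates an output signal and sends the output signal to the reset module as the first signal, so that the reset module generates the reset indication based on the first signal and sends the reset indication to the control module 101.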
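The watchdog timing in case 2 can be sketched in a few lines of Python. This is only an illustrative model, not the hardware described in the application: the class name `Watchdog`, the function `reset_module`, the string `"RESET_INDICATION"` and the time values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Watchdog:
    """Toy model of the watchdog chip (hypothetical names throughout)."""
    first_duration: float      # expected interval between input signals
    last_feed: float = 0.0     # time of the most recent input signal

    def feed(self, now: float) -> None:
        # The processor sends its periodic input signal.
        self.last_feed = now

    def poll(self, now: float) -> bool:
        # True means the output ("first") signal is asserted: no input
        # signal arrived within first_duration of the previous one.
        return now - self.last_feed > self.first_duration


def reset_module(first_signal: bool) -> Optional[str]:
    # The reset module turns the first signal into a reset indication.
    return "RESET_INDICATION" if first_signal else None


wd = Watchdog(first_duration=1.0)
wd.feed(0.0)
wd.feed(1.0)
healthy = reset_module(wd.poll(1.5))  # last input signal was 0.5 ago
hung = reset_module(wd.poll(2.5))     # no input signal for 1.5
```

Here `healthy` stays `None` because the processor fed the watchdog in time, while `hung` carries the reset indication that would be delivered to the control module.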
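In a possible implementation, the processor further includes a second register, which is a register used while the processor is running. The first register 102 is used to record the register information of the second register, the first register 102 is a register that is not lost upon reset, and the reset indication obtained by the control module 101 is used to instruct the processor to reset. In this case, the control module 101 is configured to instruct, based on the reset indication, the first register 102 to stop recording the register information of the second register, and to obtain the register information of the first register 102 after the processor is reset based on the reset indication, thereby obtaining the relevant information of the processor. The numbers of first registers 102 and second registers are not limited in the embodiments of the present application. When the processor includes multiple second registers, it also includes multiple first registers 102, and one first register 102 is used to record the register information of one or more second registers. In the embodiments of the present application, the first register 102 can record the register information of the second register in real time; that is, while the processor runs a program, each time the register information of a second register changes, the first register 102 corresponding to that second register records the changed register information. For example, the second register is a PC, and the register information of the PC includes a PC pointer. While the processor runs a program, each time the PC pointer points to a new instruction, the first register 102 corresponding to the PC records the changed PC pointer. The function by which the first register 102 records the register information of the second register in real time may be called the real-time backup recording function of the first register 102.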
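The real-time backup behaviour can be modelled as follows. This is a minimal software sketch under stated assumptions: the class `ShadowedRegister`, its method names and the addresses are hypothetical, and a real first register would be a hardware register rather than a Python attribute.

```python
class ShadowedRegister:
    """A working (second) register mirrored by a reset-retained (first) register."""

    def __init__(self) -> None:
        self.value = 0         # second register: cleared by reset
        self.shadow = 0        # first register: survives reset
        self.recording = True  # real-time backup recording enabled

    def write(self, value: int) -> None:
        self.value = value
        if self.recording:
            self.shadow = value  # mirror every change while recording

    def stop_recording(self) -> None:
        # Triggered by the control module on a reset indication.
        self.recording = False

    def reset(self) -> None:
        # Only the working register is lost; the shadow is untouched.
        self.value = 0


pc = ShadowedRegister()
for address in (0x1000, 0x1004, 0x1008):
    pc.write(address)   # normal execution, mirrored in real time
pc.stop_recording()     # reset indication received
pc.reset()              # processor resets
pc.write(0x40)          # post-reset activity is not recorded
saved_pc = pc.shadow    # last PC pointer before the reset
```

Stopping the recording before the reset is what keeps post-reset writes from overwriting the value the control module later reads back.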
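Illustratively, whether the reset indication is generated through the reset pin or through the reset module, it can be used to instruct the processor to reset after a reference time period. The reference time period can be set according to experience or actual needs. For example, when the register information of the first register is obtained as the relevant information of the processor, the reference time period is greater than or equal to the time required to instruct the first register 102 to stop recording the register information of the second register. When the data stored in the cache 103 is obtained as the relevant information of the processor, the reference time period is greater than or equal to the time required for the control module 101 to obtain the data stored in the cache 103. Illustratively, when the data stored in the cache 103 is obtained as the relevant information of the processor, the reference time period is greater than or equal to a second duration, where the second duration is the sum of the time required for the control module 101 to obtain the data stored in the cache 103 and the time required for the control module 101 to store that data in the storage medium that is not lost upon reset. Setting a reference time period greater than or equal to the second duration ensures that the control module 101 stores the data from the cache 103 in the storage medium that is not lost upon reset, so that the obtained data is stored reliably.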
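The lower bounds on the reference time period can be written as a small helper. The function name, parameter names and the example durations below are assumptions for illustration only; the sketch uses the stricter "second duration" bound for the cache case.

```python
def min_reference_period(stop_recording_s: float,
                         cache_fetch_s: float,
                         cache_store_s: float,
                         capture_registers: bool = True,
                         capture_cache: bool = True) -> float:
    """Smallest reference time period that satisfies every applicable bound."""
    bounds = [0.0]
    if capture_registers:
        # Time needed to tell the first register to stop recording.
        bounds.append(stop_recording_s)
    if capture_cache:
        # "Second duration": fetch the cache data, then store it in the
        # storage medium that is not lost upon reset.
        bounds.append(cache_fetch_s + cache_store_s)
    return max(bounds)


both = min_reference_period(0.001, 0.25, 0.5)
registers_only = min_reference_period(0.001, 0.25, 0.5, capture_cache=False)
```

With these made-up figures, capturing the cache dominates the bound, which is why the delay before reset must be sized for the slowest capture path that is enabled.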
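In a possible implementation, the second register in the processor includes at least one of a PC, an SP, an FP, a CR or an LR. The register information of the PC includes, but is not limited to, at least one PC pointer. The at least one PC pointer may include the PC pointer of the currently running target instruction, the PC pointers of the A instructions immediately preceding the target instruction, and the PC pointers of the B instructions immediately following the target instruction, where A and B are both positive integers. The register information of the SP includes, but is not limited to, the call stack. The register information of the FP includes, but is not limited to, the call frame. The register information of the CR includes, but is not limited to, at least one of system control flags that control the operating mode and state of the processor, the linear address that caused a page fault, or the page-directory base address (the physical memory base address of the page directory table). The register information of the LR includes, but is not limited to, the difference between the value of the PC when the abnormal operation occurred and a reference value. The reference value can be set according to experience or actual needs, and the embodiments of the present application do not limit its size.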
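The selection of PC pointers around the target instruction can be illustrated with a slice over an instruction address list. This is a hedged sketch: the function `pc_window`, the flat list of addresses and the 4-byte instruction spacing are assumptions, and the application does not say how the A preceding and B following pointers are actually gathered.

```python
from typing import List


def pc_window(pcs: List[int], target_index: int, a: int, b: int) -> List[int]:
    """PC pointers of the A instructions before the target, the target
    itself, and the B instructions after it (clipped at the list ends)."""
    start = max(0, target_index - a)
    end = min(len(pcs), target_index + b + 1)
    return pcs[start:end]


# Hypothetical program with eight 4-byte instructions starting at 0x1000.
pcs = [0x1000 + 4 * i for i in range(8)]
window = pc_window(pcs, target_index=4, a=2, b=1)
```

For the target instruction at 0x1010, the window holds the two preceding pointers, the target, and one following pointer.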
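Corresponding to the types of second register above, a first register 102 used to record the register information of the PC may be called a backup program counter (backup PC, BPC); a first register 102 used to record the register information of the SP may be called a backup stack pointer (backup SP, BSP); a first register 102 used to record the register information of the FP may be called a backup frame pointer (backup FP, BFP); a first register 102 used to record the register information of the CR may be called a backup control register (backup CR, BCR); and a first register 102 used to record the register information of the LR may be called a backup link register (backup LR, BLR). When one first register 102 is used to record the register information of multiple kinds of second registers, that first register 102 can have multiple names. For example, if a first register 102 records both the register information of the PC and the register information of the SP, it may be called either a BPC or a BSP.
The second registers listed above are intended to illustrate the types of registers used while the processor is running. Other types of registers used while the processor is running can also serve as second registers. In that case, the processor also includes first registers 102 corresponding to those other types of second registers, to record their register information.
Illustratively, the storage medium that is communicatively connected to the processor and is not lost upon reset includes, but is not limited to, at least one of: a memory inside the processor that is not lost upon reset, a non-volatile storage medium inside the processor, a memory outside the processor that is not lost upon reset, or a non-volatile storage medium outside the processor.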
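In a possible implementation, as in the processor structure shown in FIG. 1, the storage medium that is not lost upon reset is a storage medium 104 located inside the processor. The storage medium 104 is communicatively connected to the other modules in the processor through communication wiring inside the processor, and is thereby communicatively connected to the processor. As shown in FIG. 1, the storage medium 104 is communicatively connected to the control module 101, the first register 102 and the cache 103 through communication wiring inside the processor. The storage medium 104 includes, but is not limited to, a memory that is not lost upon reset and a non-volatile storage medium. The memory that is not lost upon reset includes, but is not limited to, static random-access memory (SRAM). The non-volatile storage medium includes, but is not limited to, at least one of double data rate synchronous dynamic random-access memory (DDR SDRAM), a flash card, a secure digital memory (SD) card, a serial advanced technology attachment (SATA) card, or a universal serial bus (USB) card. SATA is an industry-standard serial hardware drive interface.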
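In another possible implementation, as in the processor structure shown in FIG. 2, the storage medium that is not lost upon reset is a storage medium 105 located outside the processor. The storage medium 105 includes, but is not limited to, a memory that is not lost upon reset and a non-volatile storage medium. The memory that is not lost upon reset follows the same principle as the memory that is not lost upon reset described above, and is not described again here. The non-volatile storage medium includes, but is not limited to, at least one of DDR SDRAM, a flash card, an SD card, a SATA card, or a USB card outside the processor. The storage medium 105 may also include storage media in other processors or single boards. For example, the processor is communicatively connected to other processors or single boards through a channel such as Ethernet, so that the processor is communicatively connected to the storage media in those processors or single boards, and the storage medium 105 thereby includes those storage media. The storage media in other processors or single boards include, but are not limited to, at least one of volatile storage media or non-volatile storage media.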
示例性地,控制模块101还用于在处理器复位后,从复位不丢失的存储介质中获取处理器的相关信息,基于该相关信息生成运行异常记录。例如,在复位不丢失的存储介质中存储有第一寄存器102的寄存器信息的情况下,控制模块101还用于在处理器复位后,从该存储介质中获取第一寄存器102的寄存器信息,基于第一寄存器102的寄存器信息生成运行异常记录。在该存储介质中存储有高速缓存103存储的数据的情况下,控制模块101还用于在处理器复位后,从该存储介质中获取该数据,基于该数据生成运行异常记录。
上述均以控制模块101为一个整体的模块进行了说明,本申请实施例中,控制模块101可以包括多个控制子模块,控制模块101的功能由多个控制子模块实现。例如,控制模块101包括第一控制子模块、第二控制子模块和第三控制子模块。第一控制子模块用于基于复位指示获取第一寄存器102的寄存器信息,将第一寄存器102的寄存器信息存储至复位不丢失的存储介质中。第二控制子模块用于基于复位指示获取高速缓存103存储的数据,将高速缓存103存储的数据存储至复位不丢失的存储介质中。第三控制子模块用于执行从复位不丢失的存储介质中获取处理器的相关信息,基于该相关信息生成运行异常记录。示例性地,第一控制子模块和第二控制子模块均为硬件模块,第三控制子模块为软件模块。第一控制子模块和第二控制子模块可以在同一个硬件模块中实现,也可以在不同的硬件模块中分别实现,本申请实施例对此不加以限定。
在本申请实施例提供的处理器运行异常时,处理器内部的控制模块101能够基于复位指示获取处理器的相关信息。处理器无需依赖外部模块获取处理器的相关信息,也无需依赖于OS响应中断信号执行中断响应程序,处理器获取相关信息的方式的可靠性较高。并且,在控制模块101获取第一寄存器102的寄存器信息以及高速缓存103存储的数据的情况下,获取到的相关信息较为全面。从而在基于该相关信息分析处理器运行异常的原因的情况下,分析得到的原因的准确性较高。再有,在控制模块为硬件模块的情况下,控制模块获取到复位指示后,能够快速获取处理器的相关信息,从而获取相关信息的效率较高。
本申请实施例提供的处理器可以为CPU,且处理器还可以包括上述模块以外的其他模块。
图3是本申请实施例提供的又一种处理器的结构示意图,如图3所示,CPU包括至少一个CPU内核,控制模块101、第一寄存器102和第二寄存器包括在各个CPU内核中。高速缓存103由多种级别的高速缓存共同组成,如图3所示,多种级别的高速缓存包括级别(level,L)1高速缓存、L2高速缓存、L3高速缓存和L4高速缓存。L1高速缓存和L2高速缓存处于各个CPU内核中,L3高速缓存和L4高速缓存处于各个CPU内核之外。处理器还可以包括内部缓存(buffer)或SRAM中的至少一种、用于实现硬件加速的硬加速引擎、用于控制DDR SDRAM的DDR控制器、用于与非易失存储介质通信连接的非易失存储接口、用于与控制逻辑通信连接的逻辑接口、以及用于与看门狗芯片通信连接的低速I/O或者通用I/O。处理器中的各个模块可以通过处理器中的通信布线连接。CPU内核中的各个模块通过CPU内核中的通信布线通信连接,CPU中的各个模块通过CPU中的通信布线通信连接。
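The embodiments of this application further provide a method for obtaining information. The method is applied to the processor shown in the foregoing embodiments. As shown in Figure 4, the method includes but is not limited to S401 to S403.
S401: The control module obtains a reset indication, where the reset indication is generated when the processor runs abnormally.
In a possible implementation, the processor has a reset pin, and the method further includes: the processor generates the reset indication through the reset pin, and the reset pin transmits the reset indication to the control module. In this case, obtaining the reset indication includes receiving the reset indication. As described for the reset pin in the foregoing embodiments, the reset pin of the processor may be a hard reset pin or a soft reset pin. In either case, the reset indication can be generated based on the reset pin when the pin receives a low-level signal. For example, when the processor is mounted on a board, the board can detect that the processor is running abnormally; when it does, the board pulls the level of the reset pin low, thereby inputting a low-level signal to the reset pin.
In another possible implementation, the control module is communicatively connected to a reset module, and the reset module is configured to send the reset indication to the control module. In this case, obtaining the reset indication includes receiving the reset indication sent by the reset module. As described for the reset module in the foregoing embodiments, the reset module may be located inside or outside the processor. The reset module may also be communicatively connected to a watchdog chip, which is in turn communicatively connected to the processor. For example, the processor periodically sends an input signal to the watchdog chip at intervals of a first duration. When the processor runs abnormally, it either stops sending input signals to the watchdog chip, or the interval between two input signals it sends becomes longer than the first duration. When the watchdog chip does not receive the next input signal within the first duration after receiving an input signal, it generates an output signal and sends that output signal to the reset module as a first signal. The reset module generates the reset indication based on the first signal and sends it to the control module. The control module receives the reset indication sent by the reset module, and thereby obtains the reset indication. The location of the watchdog chip is not limited in the embodiments of this application: it may be placed inside or outside the processor.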
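The watchdog behaviour described above (an output signal is generated when no input signal arrives within the first duration after the previous one) can be sketched as follows. The function name `first_timeout` and the use of plain numbers for time are assumptions made for illustration; a real watchdog chip is a hardware timer.

```python
def first_timeout(feed_times, first_duration, now):
    """Return the time at which the watchdog would fire, or None if it has not fired by `now`.

    feed_times: ascending times at which the processor sent an input signal.
    """
    for i, fed_at in enumerate(feed_times):
        deadline = fed_at + first_duration
        next_feed = feed_times[i + 1] if i + 1 < len(feed_times) else None
        if next_feed is not None and next_feed <= deadline:
            continue  # the next input signal arrived in time
        if now > deadline:
            return deadline  # no input signal within the first duration
        return None  # still inside the window after the last input signal
    return None  # never fed, so no window was ever started


print(first_timeout([0, 10, 20], first_duration=10, now=25))
print(first_timeout([0, 10, 25, 35], first_duration=10, now=40))
```

The first call models a processor that is still feeding the watchdog on schedule; the second models a late input signal, which corresponds to the case where the interval between two input signals exceeds the first duration.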
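S402: The control module obtains related information of the processor based on the reset indication, where the related information includes at least one of the register information of the first register or the data stored in the cache.
For example, after receiving the reset indication, the control module directly performs the operation of obtaining the data stored in the cache based on the reset indication. The processor resets after the control module has obtained the data stored in the cache. The reset indication may also be used to instruct the processor to reset after a reference time period. When the data stored in the cache is obtained as the related information of the processor, the reference time period is greater than or equal to the time the control module needs to obtain the data stored in the cache. The reference time period may also be greater than or equal to a second duration, where the second duration is the sum of the time the control module needs to obtain the data stored in the cache and the time it needs to store that data in the reset-persistent storage medium. Setting the reference time period to at least the second duration ensures that the control module can store the cached data in the reset-persistent storage medium, and therefore that the obtained data is stored reliably.
For example, the processor further includes a second register, which is a register used by the processor at run time. The first register is configured to record the register information of the second register, the first register is a reset-persistent register, and the reset indication is used to instruct the processor to reset. Obtaining the related information of the processor based on the reset indication includes: instructing, based on the reset indication, the first register to stop recording the register information of the second register; and obtaining the register information of the first register after the processor has reset based on the reset indication. The second register works on the same principle as the second register described in the foregoing embodiments, and details are not repeated here. The reset indication may also be used to instruct the processor to reset after a reference time period. When the register information of the first register is obtained as the related information of the processor, the reference time period is greater than or equal to the time needed to instruct the first register to stop recording the register information of the second register.
In the embodiments of this application, the control module may obtain both the register information of the first register and the data stored in the cache. For example, based on the reset indication, the control module obtains the data stored in the cache and instructs the first register to stop recording the register information of the second register; after the processor has reset based on the reset indication, the control module obtains the register information of the first register. In this case, if the reset indication is used to instruct the processor to reset after a reference time period, the reference time period is greater than or equal to the time the control module needs to obtain the data stored in the cache, and is also greater than or equal to the time needed to instruct the first register to stop recording the register information of the second register. Alternatively, the reference time period may be greater than or equal to the second duration and also greater than or equal to the time needed to instruct the first register to stop recording the register information of the second register.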
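The lower bound on the reference time period in the combined case can be written as a one-line calculation. The sketch below is illustrative only: the function and parameter names are hypothetical, and the durations are treated as plain numbers in a common unit.

```python
def min_reference_period(fetch_cache, store_cache, stop_recording, include_store=True):
    """Smallest reference time period that satisfies the constraints in the text.

    fetch_cache:    time for the control module to obtain the cached data
    store_cache:    time to store that data in the reset-persistent storage medium
    stop_recording: time to instruct the first register to stop recording
    include_store:  if True, use the second duration (fetch + store) as the cache bound
    """
    cache_bound = fetch_cache + (store_cache if include_store else 0)
    return max(cache_bound, stop_recording)


# Example with durations in microseconds (values are made up for illustration).
print(min_reference_period(fetch_cache=40, store_cache=60, stop_recording=5))
```

Using the maximum rather than the sum reflects the wording of the text, which requires the reference time period to be at least each of the individual bounds rather than at least their total.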
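S403: The control module stores the related information of the processor in the reset-persistent storage medium.
The reset-persistent storage medium may be located inside or outside the processor. It includes but is not limited to at least one of reset-persistent memory inside the processor, a non-volatile storage medium inside the processor, reset-persistent memory outside the processor, or a non-volatile storage medium outside the processor. For example, when the reset-persistent storage medium is located inside the processor, that is, when it includes at least one of reset-persistent memory inside the processor or a non-volatile storage medium inside the processor, the control module stores the related information in the reset-persistent storage medium through communication wiring inside the processor. When the reset-persistent storage medium is located outside the processor, that is, when it includes reset-persistent memory outside the processor or a non-volatile storage medium outside the processor, the control module stores the related information in the reset-persistent storage medium through the communication connection between the processor and that storage medium.
In a possible implementation, the method further includes: after the processor resets, the control module obtains the related information of the processor from the reset-persistent storage medium and generates an abnormal-operation record based on it. For example, the control module obtains the register information of the first register and the data stored in the cache from the reset-persistent storage medium, and generates the abnormal-operation record based on the obtained register information and data. The abnormal-operation record may include the obtained register information and data. That is, when the first registers include the BPC, BSP, BFP, BLR, and BCR, the abnormal-operation record may include the obtained data and the register information of the BPC, BSP, BFP, BLR, and BCR. The register information of the BPC includes but is not limited to at least one PC pointer; that of the BSP includes but is not limited to a call stack; that of the BFP includes but is not limited to a call frame; that of the BLR includes but is not limited to the difference between the value of the PC when the abnormal operation occurred and a reference value; and that of the BCR includes but is not limited to at least one of system control flags that control the operating mode and state of the processor, the linear address that caused a page fault, or the page-directory base address.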
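One way to picture the abnormal-operation record is as a structure assembled from whatever was saved before the reset. The sketch below assumes, purely for illustration, that the reset-persistent storage medium is exposed as a dictionary keyed by backup-register name plus a `cache` entry; the application does not define a record format, and `build_abnormal_record` is a hypothetical name.

```python
import json

BACKUP_FIELDS = ("BPC", "BSP", "BFP", "BLR", "BCR")


def build_abnormal_record(stored):
    """Assemble a record from the entries found in reset-persistent storage."""
    record = {}
    registers = {name: stored[name] for name in BACKUP_FIELDS if name in stored}
    if registers:
        record["registers"] = registers
    if "cache" in stored:
        record["cache"] = stored["cache"]
    if not record:
        raise ValueError("no related information was stored before the reset")
    return record


stored = {
    "BPC": [0x104, 0x108],                      # recorded PC pointers
    "BCR": {"page_fault_address": 0xDEAD0000},  # one of the CR items named in the text
    "cache": {"0x80": "01ff"},                  # cached data, hex-encoded for the example
}
print(json.dumps(build_abnormal_record(stored), indent=2, sort_keys=True))
```

Because the related information may contain only register information, only cached data, or both, the sketch includes each part only when it is present.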
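In the method provided in the embodiments of this application, when the processor runs abnormally, the control module in the processor can obtain the related information of the processor based on the obtained reset indication. The method therefore does not depend on a module outside the processor to obtain the related information, nor on the OS responding to an interrupt signal and executing an interrupt response routine, so this way of obtaining the related information is highly reliable. In addition, when both the register information of the first register and the data stored in the cache are obtained, the method obtains fairly comprehensive related information, so the cause of the abnormal operation derived from analyzing that information is more accurate. Furthermore, when the control module is a hardware module, it can obtain the related information quickly after obtaining the reset indication, so the related information is obtained efficiently.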
本申请实施例还提供了一种单板,该单板包括上述任一的处理器,以及与该处理器通信连接的复位不丢失的存储介质。示例性地,单板包括指示模块,该指示模块用于向处理器的控制模块发送第一指令,第一指令用于指示控制模块指示第一寄存器开始记录第二寄存器的寄存器信息。图5示出了一种第一寄存器记录第二寄存器的寄存器信息的示意图,参见图5,控制模块包括多个第四控制子模块,一个第四控制子模块用于指示一个第一寄存器开始记录该第一寄存器对应的第二寄存器的寄存器信息。单板的指示模块用于同步地向多个第四控制子模块发送第一指令,从而,多个第四控制子模块可以同步指示多个第一寄存器开始记录多个第一寄存器对应的第二寄存器的寄存器信息。如图5所示,多个第一寄存器包括BPC、BCR、BSP、BLR和BFP,其中,BPC对应PC,BCR对应CR,BSP对应SP,BLR对应LR,BFP对应FP。需要说明的是,图5中示出的第一寄存器旨在对如何实现记录第二寄存器的寄存器信息进行说明,并不用于对第一寄存器的类型以及各个类型的第一寄存器的数量进行限定,同一类型的第一寄存器的数量可以为一个或者多个。
在一种可能的实现方式中,单板还包括存储模块,该存储模块用于存储运行异常记录。例如,处理器的控制模块获取运行异常记录之后,可以将运行异常记录传输至单板的存储模块,单板的存储模块存储运行异常记录。示例性地,单板的存储模块为处理器外部的复位不丢失的内存或处理器外部的非易失存储介质中的至少一种。单板还可以包括上述实施例中的复位模块,此处不再对复位模块进行赘述。
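Next, the process of obtaining the related information of the processor is described, using a board that includes one such processor as an example. Figure 6 is a schematic diagram of a process for obtaining related information according to an embodiment of this application. As shown in Figure 6, the method includes but is not limited to S601 to S607.
S601: The startup module of the board obtains a startup indication.
The startup module may be a startup interface of the board, which is configured to receive the startup indication. The startup indication includes but is not limited to a cold-start indication and a warm-start indication. A cold-start indication is a startup indication received by the startup interface while the board is powered off; for example, the startup interface is connected to power while the board is powered off and thereby receives the cold-start indication. A warm-start indication is a startup indication received by the startup interface while the board remains powered; for example, when the watchdog chip on the board detects that the processor is running abnormally, the watchdog chip generates an output signal, and that output signal serves as the warm-start indication. For how the watchdog chip detects that the processor is running abnormally, see the description in the foregoing embodiments; details are not repeated here. In response to obtaining a cold-start indication, S602 is performed. In response to obtaining a warm-start indication, S603 is performed.
S602: The indication module of the board sends a first instruction to the control module of the processor.
For example, in response to the cold-start indication being obtained, the indication module sends the first instruction to the control module. The first instruction is used to instruct the control module to instruct the first register to start recording the register information of the second register, so that the control module can do so based on the first instruction.
S603: The reset module or the control circuit of the board transmits a reset indication to the control module.
In a possible implementation, the watchdog chip on the board also serves as the reset module of the foregoing embodiments. That is, the watchdog chip is communicatively connected to the control module, and after generating the warm-start indication, the watchdog chip sends the reset indication to the control module based on the warm-start indication. In another possible implementation, the watchdog chip is used only to detect that the processor is running abnormally, and generates the warm-start indication when it detects this. The board also has a control circuit, which is configured to pull the level of the processor's reset pin low based on the warm-start indication, so that the processor generates the reset indication through the reset pin based on the low-level signal, and the reset pin transmits the reset indication to the control module.
In the embodiments of this application, because the reset indication is derived from the warm-start indication, the reset indication may also be called a warm-reset indication, and the reset that the processor performs based on it may be called a warm reset of the processor.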
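S604: The control module obtains the related information of the processor based on the reset indication.
S604 works on the same principle as S402 above, and details are not repeated here.
S605: The control module stores the related information of the processor in the reset-persistent storage medium.
S605 works on the same principle as S403 above, and details are not repeated here.
S606: The indication module sends the first instruction to the control module.
S606 works on the same principle as S602, and details are not repeated here.
S607: The board starts.
For example, regardless of whether the startup indication is a cold-start indication or a warm-start indication, after S602 is performed (cold start) or S603 to S606 are performed (warm start), the control module instructs the first register to start recording the register information of the second register. After that, the board starts. Board startup means that all modules and components of the board start running.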
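The branching between the cold-start and warm-start paths in S601 to S607 can be summarised as a small dispatch function. This is a reading aid rather than an implementation: the function name and the string labels `"cold"` and `"warm"` are assumptions introduced here.

```python
def startup_steps(indication):
    """Return the ordered step labels taken for a given startup indication."""
    if indication == "cold":
        # Cold start: arm the backup registers, then start the board.
        return ["S601", "S602", "S607"]
    if indication == "warm":
        # Warm start: capture and store the related information first,
        # then re-arm the backup registers and start the board.
        return ["S601", "S603", "S604", "S605", "S606", "S607"]
    raise ValueError(f"unknown startup indication: {indication!r}")


for kind in ("cold", "warm"):
    print(kind, "->", " ".join(startup_steps(kind)))
```

Both paths end by instructing the first register to start recording (S602 or S606) before S607, which matches the statement that the board starts only after recording has been enabled.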
本申请实施例提供的单板包括上述任一的处理器,在处理器运行异常时,处理器中的控制模块能够基于获取到的复位指示获取处理器的相关信息。从而,无需依赖处理器的外部模块获取处理器的相关信息,也无需依赖于OS响应中断信号执行中断响应程序,获取相关信息的方式的可靠性较高。并且,在获取第一寄存器的寄存器信息以及高速缓存存储的数据的情况下,获取到的相关信息较为全面。从而在基于该相关信息分析处理器运行异常的原因的情况下,分析得到的原因的准确性较高。在控制模块为硬件模块的情况下,控制模块获取到复位指示后,能够快速获取处理器的相关信息,从而获取相关信息的效率较高。再有,由于无需设置用于获取处理器的相关信息的协处理器,单板的设计复杂度较低,成本也较低。
本申请实施例还提供了一种网络设备,该网络设备包括至少一个上述实施例中的处理器,以及与处理器通信连接的复位不丢失的存储介质。示例性地,网络设备还包括复位模块,复位模块与处理器中的控制模块通信连接,复位模块用于向控制模块发送复位指示。由于本申请实施例提供的处理器还可以设置在单板上,本申请实施例还提供了一种网络设备,该网络设备包括至少一个上述实施例中的单板。网络设备可以为盒式设备,盒式设备是指仅包括一块上述单板的网络设备。网络设备还可以为框式设备,框式设备是指包括主控板和至少一块上述单板的网络设备,主控板与至少一个单板通过板间管理通道连接,主控板用于控制网络设备中的至少一个单板。
根据网络设备的不同情况,获取信息的过程包括但不限于如下情况A和情况B。
情况A,网络设备为盒式设备。
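Figure 7 is a schematic structural diagram of a network device according to an embodiment of this application. The network device includes one board, that is, it is a box-type device. The processor on the board may be a microcontroller unit (MCU) or a CPU. When the processor is an MCU, the MCU has a built-in CPU core, reset-volatile memory, and a reset-persistent storage medium. When the processor is a CPU, the CPU has a built-in CPU core, and may also have built in at least one of reset-volatile memory and a reset-persistent storage medium. Alternatively, at least one of the reset-volatile memory and the reset-persistent storage medium may be located outside the CPU. As shown in Figure 7, the reset-persistent storage medium includes reset-persistent memory and a non-volatile storage medium. The CPU core, the reset-volatile memory, and the reset-persistent storage medium are all hardware modules.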
如图7所示,CPU内核包括高速缓存、PC、CR、SP、BPC、BCR、BSP以及控制模块1至控制模块4。控制模块1用于指示BPC开始记录PC的寄存器信息或停止记录PC的寄存器信息,基于复位指示获取BPC的寄存器信息,将BPC的寄存器信息存储至复位不丢失的存储介质中。控制模块2用于指示BCR开始记录CR的寄存器信息或停止记录CR的寄存器信息,基于复位指示获取BCR的寄存器信息,将BCR的寄存器信息存储至复位不丢失的存储介质中。控制模块3用于指示BSP开始记录SP的寄存器信息或停止记录SP的寄存器信息,基于复位指示获取BSP的寄存器信息,将BSP的寄存器信息存储至复位不丢失的存储介质中。控制模块4用于基于复位指示获取高速缓存存储的数据,将高速缓存存储的数据存储至复位不丢失的存储介质中。单板包括的启动模块、指示模块、看门狗芯片、复位模块和控制电路未在图7中示出。
上述第一寄存器、第二寄存器和控制模块的数量以及类型仅用于举例说明,并不用于对网络设备包括的第一寄存器、第二寄存器和控制模块的数量以及类型进行限定。
示例性地,对于情况A,获取信息的过程包括如下S701至S707。
S701,单板的启动模块获取冷启动指示。
S701与上述S601中获取冷启动指示的相关内容原理相同,此处不再赘述。
S702,单板的指示模块向处理器的控制模块发送第一指令。
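S702 is performed in response to receiving the cold-start indication. S702 works on the same principle as S602 above, and details are not repeated here. For example, when the first registers include the BPC, BCR, and BSP and the second registers include the PC, CR, and SP, the first instruction is used to instruct control module 1 to instruct the BPC to start recording the register information of the PC, to instruct control module 2 to instruct the BCR to start recording the register information of the CR, and to instruct control module 3 to instruct the BSP to start recording the register information of the SP.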
S703,控制模块基于第一指令指示第一寄存器开始记录第二寄存器的寄存器信息。
示例性地,控制模块基于第一指令向第一寄存器发送触发信号,该触发信号用于指示第一寄存器开始记录第二寄存器的寄存器信息。本申请不对触发信号的形式进行限定。对于图7示出的网络设备,控制模块1基于第一指令向BPC发送触发信号,该触发信号用于指示BPC开始记录PC的寄存器信息。控制模块2基于第一指令向BCR发送触发信号,该触发信号用于指示BCR开始记录CR的寄存器信息。控制模块3基于第一指令向BSP发送触发信号,该触发信号用于指示BSP开始记录SP的寄存器信息。
S704,单板启动。
S704与上述S607的相关内容原理相同,此处不再赘述。
S705,单板中的看门狗芯片感知到CPU运行异常,单板的复位模块或者控制电路向控制模块传输复位指示。
S705与上述S603的相关内容原理相同,此处不再赘述。
S706,控制模块基于复位指示获取处理器的相关信息。
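For example, S706 works on the same principle as S402 and S604 above. For the network device shown in Figure 7, control module 4 obtains the data stored in the cache based on the reset indication. Based on the reset indication, control module 1 instructs the BPC to stop recording the register information of the PC, control module 2 instructs the BCR to stop recording the register information of the CR, and control module 3 instructs the BSP to stop recording the register information of the SP. The processor then resets based on the reset indication. After the reset, control module 1 obtains the register information of the BPC, control module 2 obtains that of the BCR, and control module 3 obtains that of the BSP.
S707: The control module stores the related information of the processor in the reset-persistent storage medium.
For example, S707 works on the same principle as S403 and S605 above. For the network device shown in Figure 7, control module 1 stores the register information of the BPC in the reset-persistent storage medium, control module 2 stores that of the BCR, control module 3 stores that of the BSP, and control module 4 stores the data stored in the cache.
Whatever the kind of related information, when storing it in the reset-persistent storage medium the control module may first store it in the reset-volatile memory and then transfer it from the reset-volatile memory to the reset-persistent storage medium.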
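The ordering in S706 and S707 for the Figure 7 device (read the cache before the reset, freeze the backup registers, reset, then read the backups) can be traced with a toy simulation. All names below are hypothetical, the registers and cache are modelled as dictionaries, and the optional staging through reset-volatile memory is deliberately left out.

```python
PAIRS = {"BPC": "PC", "BCR": "CR", "BSP": "SP"}  # first register -> second register


def mirror(live, backups):
    """While recording is enabled, each backup register tracks its second register."""
    for backup_name, live_name in PAIRS.items():
        backups[backup_name] = live[live_name]


def handle_reset(live, backups, cache, persistent):
    # Control module 4 saves the cached data before the reset.
    persistent["cache"] = dict(cache)
    # Control modules 1-3 stop recording; in this sketch that simply means
    # mirror() is not called again, so the backups stay frozen.
    for name in live:          # the reset clears the run-time registers...
        live[name] = 0
    cache.clear()              # ...and the cache
    # After the reset, control modules 1-3 read the backups and store them.
    for backup_name in PAIRS:
        persistent[backup_name] = backups[backup_name]
    return persistent


live = {"PC": 0x4000, "CR": 0x11, "SP": 0x7FF0}
backups, cache, persistent = {}, {0x80: b"\x01"}, {}
mirror(live, backups)
handle_reset(live, backups, cache, persistent)
print(persistent)
```

The backups survive the reset in this model because nothing overwrites them once recording stops, which is the role the text assigns to reset-persistent first registers.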
在一种可能的实现方式中,控制模块将处理器的相关信息存储在复位不丢失的存储介质中之后,控制模块还获取该存储介质中存储的相关信息,基于该相关信息生成运行异常记录。例如,控制模块从该存储介质中获取存储的至少一个PC指针、调用栈以及页目录表物理内存基地址,基于获取到的至少一个PC指针、调用栈以及页目录表物理内存基地址生成运行异常记录。示例性地,控制模块生成运行异常记录之后,将运行异常记录也存储至复位不丢失的存储介质中。
情况B,网络设备为框式设备。
图8是本申请实施例提供的另一种网络设备的结构示意图,该网络设备包括主控板和一个单板,也即该网络设备为框式设备。参见图8,主控板包括CPU和非易失存储介质,该非易失存储介质可以用于存储日志。单板包括CPU和复位不丢失的存储介质。复位不丢失的存储介质包括复位不丢失的内存和非易失存储介质。其中,CPU、复位不丢失的内存和非易失存储介质的位置与上述图7中的相关内容原理相同,此处不再赘述。CPU和复位不丢失的存储介质均为硬件模块。如图8所示,CPU包括高速缓存、第一寄存器、第二寄存器和控制模块。单板包括的启动模块、指示模块、看门狗芯片、复位模块和控制电路未在图8中示出。
示例性地,对于情况B,获取信息的过程包括如下S801至S807。
S801,单板的启动模块获取冷启动指示。
S801与上述S601和S701中获取冷启动指示的相关内容原理相同。示例性地,对于图8示出的网络设备,主控板通过板间管理通道向单板的启动模块通电,实现向单板的启动模块传输冷启动指示,从而单板的启动模块能够获取冷启动指示。
S802,单板的指示模块向处理器的控制模块发送第一指令。
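S802 is performed in response to receiving the cold-start indication. For example, for the network device shown in Figure 8, the cold-start indication is also used to instruct the indication module of the board to send the first instruction to the control module of the processor. That is, the main control board instructs the indication module of the board, through the inter-board management channel, to send the first instruction to the control module of the processor. Alternatively, the indication module may send the first instruction to the control module automatically upon receiving the cold-start indication. The way of triggering the indication module to send the first instruction to the control module is therefore flexible.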
S803,控制模块基于第一指令指示第一寄存器开始记录第二寄存器的寄存器信息。
S803与上述S703的相关内容原理相同,此处不再赘述。
S804,单板启动。
S804与上述S607和S704的相关内容原理相同,此处不再赘述。
S805,单板中的看门狗芯片感知到CPU运行异常,单板的复位模块或者控制电路向控制模块传输复位指示。
S805与上述S603和S705的相关内容原理相同,此处不再赘述。
S806,控制模块基于复位指示获取处理器的相关信息。
S806与上述S402、S604和S706的相关内容原理相同,此处不再赘述。
S807,控制模块将处理器的相关信息存储至复位不丢失的存储介质中。
示例性地,S807与上述S403和S605的相关内容原理相同。例如,对于图8示出的网络设备,控制模块将第一寄存器的寄存器信息存储至单板的复位不丢失的存储介质中,控制模块还将高速缓存存储的数据存储至单板的复位不丢失的存储介质中。
在一种可能的实现方式中,控制模块将处理器的相关信息存储在复位不丢失的存储介质中之后,控制模块还获取该存储介质中存储的相关信息,基于该相关信息生成运行异常记录。示例性地,控制模块生成运行异常记录之后,将运行异常记录也存储至复位不丢失的存储介质中。例如,对于图8示出的网络设备,控制模块生成运行异常记录之后,通过板间管理通道将运行异常记录存储至主控板的非易失存储介质中。在该非易失存储介质中,运行异常记录可以以日志的形式存储。
本申请实施例提供的网络设备包括上述任一的处理器,在处理器运行异常时,处理器中的控制模块能够基于获取到的复位指示获取处理器的相关信息。从而,无需依赖处理器的外部模块获取处理器的相关信息,也无需依赖于OS响应中断信号执行中断响应程序,获取相关信息的方式的可靠性较高。并且,在获取第一寄存器的寄存器信息以及高速缓存存储的数据的情况下,获取到的相关信息较为全面。从而在基于该相关信息分析处理器运行异常的原因的情况下,分析得到的原因的准确性较高。在控制模块为硬件模块的情况下,控制模块获取到复位指示后,能够快速获取处理器的相关信息,从而获取相关信息的效率较高。再有,由于无需设置用于获取处理器的相关信息的协处理器,本申请实施例提供的单板的设计复杂度较低,成本也较低。从而,包括该单板的网络设备的设计复杂度较低,成本也较低。
参见图9,图9是本申请实施例提供的又一种网络设备的结构示意图。图9所示的网络设备2000配置有上述图1-3任一所示的处理器,处理器用于执行上述图4所示的获取信息的方法。网络设备2000例如是交换机、路由器、服务器或终端等,网络设备2000可以由一般性的总线体系结构来实现。
如图9所示,网络设备2000包括至少一个处理器2001、存储器2003以及至少一个通信接口2004。处理器2001可以是上述图1-3任一所示的处理器。
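The processor 2001 is, for example, a CPU, a digital signal processor (DSP), a network processor (NP), a graphics processing unit (GPU), a neural-network processing unit (NPU), a data processing unit (DPU), a microprocessor, or one or more integrated circuits for implementing the solutions of this application. For example, the processor 2001 includes an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The PLD is, for example, a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The processor can implement or execute the various logical blocks, modules, and circuits described in connection with the disclosure of the embodiments of this application. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.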
可选的,网络设备2000还包括总线。总线用于在网络设备2000的各组件之间传送信息。总线可以是外设部件互连标准(peripheral component interconnect,简称PCI)总线或扩展工业标准结构(extended industry standard architecture,简称EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图9中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
存储器2003例如是只读存储器(read-only memory,ROM)或可存储静态信息和指令的其它类型的静态存储设备,又如是随机存取存储器(random access memory,RAM)或者可存储信息和指令的其它类型的动态存储设备,又如是电可擦可编程只读存储器(electrically erasable programmable read-only Memory,EEPROM)、只读光盘(compact disc read-only memory,CD-ROM)或其它光盘存储、光碟存储(包括压缩光碟、激光碟、光碟、数字通用光碟、蓝光光碟等)、磁盘存储介质或者其它磁存储设备,或者是能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其它介质,但不限于此。存储器2003例如是独立存在,并通过总线与处理器2001相连接。存储器2003也可以和处理器2001集成在一起。
通信接口2004使用任何收发器一类的装置,用于与其它设备或通信网络通信,通信网络可以为以太网、无线接入网(RAN)或无线局域网(wireless local area networks,WLAN)等。通信接口2004可以包括有线通信接口,还可以包括无线通信接口。具体的,通信接口2004可以为以太(Ethernet)接口、快速以太(fast Ethernet,FE)接口、千兆以太(gigabit Ethernet,GE)接口,异步传输模式(asynchronous transfer mode,ATM)接口,WLAN接口,蜂窝网络通信接口或其组合。以太网接口可以是光接口,电接口或其组合。在本申请实施例中,通信接口2004可以用于网络设备2000与其他设备进行通信。
在具体实现中,作为一种实施例,处理器2001可以包括一个或多个CPU,如图9中所示的CPU0和CPU1。这些处理器中的每一个可以是一个单核(single-CPU)处理器,也可以是一个多核(multi-CPU)处理器。这里的处理器可以指一个或多个设备、电路、和/或用于处理数据(例如计算机程序指令)的处理核。
在具体实现中,作为一种实施例,网络设备2000可以包括多个处理器,如图9中所示的处理器2001和处理器2005。这些处理器中的每一个可以是一个单核处理器(single-CPU),也可以是一个多核处理器(multi-CPU)。这里的处理器可以指一个或多个设备、电路、和/或用于处理数据(如计算机程序指令)的处理核。
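In a specific implementation, as an embodiment, the network device 2000 may further include an output device and an input device. The output device communicates with the processor 2001 and can display information in a variety of ways. For example, the output device may be a liquid crystal display (LCD), a light-emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device communicates with the processor 2001 and can receive user input in a variety of ways. For example, the input device may be a mouse, a keyboard, a touchscreen device, or a sensing device.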
在一些实施例中,存储器2003用于存储执行本申请方案的程序代码2010,处理器2001可以执行存储器2003中存储的程序代码2010。程序代码2010中可以包括一个或多个软件模块。可选地,处理器2001自身也可以存储执行本申请方案的程序代码或指令。
其中,图4所示的获取信息的方法的各步骤通过处理器的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤,为避免重复,这里不再详细描述。
示例性地,图10是本申请实施例提供的再一种网络设备的结构示意图。该网络设备包括图1-3任一所示的处理器,该处理器用于执行上述图4所示的获取信息的方法的各个步骤。示例性地,网络设备例如是服务器或者终端,网络设备可因配置或性能不同而产生比较大的差异,可以包括一个或多个处理器1001和一个或多个存储器1002,其中,一个或多个存储器1002中存储有至少一条计算机程序,至少一条计算机程序由一个或多个处理器1001加载并执行。当然,网络设备还可以具有有线或无线网络接口、键盘以及输入输出接口等部件,以便进行输入输出,该网络设备还可以包括其他用于实现设备功能的部件,在此不做赘述。
本申请实施例还提供了一种通信装置,该装置包括:收发器、存储器和处理器。其中,收发器、存储器和处理器通过内部连接通路互相通信,存储器用于存储指令,处理器用于执行存储器存储的指令,以控制收发器接收信号,并控制收发器发送信号,并且当处理器执行存储器存储的指令时,使得处理器执行获取信息的方法。
应理解的是,上述处理器可以是CPU,还可以是其他通用处理器、DSP、ASIC、FPGA或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者是任何常规的处理器等。值得说明的是,处理器可以是支持进阶精简指令集机器(advanced RISC machines,ARM)架构的处理器。
进一步地,在一种可选的实施例中,上述存储器可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。存储器还可以包括非易失性随机存取存储器。例如,存储器还可以存储设备类型的信息。
The memory may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, for example static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
本申请实施例还提供一种芯片,该芯片包括至少一个上述实施例中的处理器,以及与处理器通信连接的复位不丢失的存储介质。示例性地,该芯片还包括复位模块,复位模块与处理器中的控制模块通信连接,复位模块用于向控制模块发送复位指示。示例性地,该芯片还包括:输入接口、输出接口和存储器,存储器包括上述复位不丢失的存储介质,输入接口、输出接口、处理器以及存储器之间通过内部连接通路相连。
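The foregoing method embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the implementation may be in whole or in part in the form of a computer program or a computer program product. The computer program or computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wired means (for example coaxial cable, optical fiber, or digital subscriber line) or wireless means (for example infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example a floppy disk, hard disk, or magnetic tape), an optical medium (for example a digital versatile disc (DVD)), or a semiconductor medium (for example a solid-state disk (SSD)).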
为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各实施例的步骤及组成。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。本领域普通技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
用于实现本申请实施例的方法的计算机程序代码可以用一种或多种编程语言编写。这些计算机程序代码可以提供给通用计算机、专用计算机或其他可编程的数据转发装置的处理器,使得程序代码在被计算机或其他可编程的数据转发装置执行的时候,引起在流程图和/或框图中规定的功能/操作被实施。程序代码可以完全在计算机上、部分在计算机上、作为独立的软件包、部分在计算机上且部分在远程计算机上或完全在远程计算机或服务器上执行。
在本申请实施例的上下文中,计算机程序代码或者相关数据可以由任意适当载体承载,以使得设备、装置或者处理器能够执行上文描述的各种处理和操作。载体的示例包括信号、计算机可读介质等等。信号的示例可以包括电、光、无线电、声音或其它形式的传播信号,诸如载波、红外信号等。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述的处理器、单板、设备和模块的具体工作过程,可以参见前述方法实施例中的对应过程,在此不再赘述。
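In the several embodiments provided in this application, it should be understood that the disclosed processor, board, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into modules is only a division by logical function; in an actual implementation there may be other divisions, for example several modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or modules, and may also be electrical, mechanical, or other forms of connection.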
该作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理模块,即可以位于一个地方,或者也可以分布到多个网络模块上。可以根据实际的需要选择其中的部分或者全部模块来实现本申请实施例方案的目的。
另外,在本申请各个实施例中的各功能模块可以集成在一个处理模块中,也可以是各个模块单独物理存在,也可以是两个或两个以上模块集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
本申请中术语“第一”、“第二”等字样用于对作用和功能基本相同的相同项或相似项进行区分,应理解,“第一”、“第二”、“第n”之间不具有逻辑或时序上的依赖关系,也不对数量和执行顺序进行限定。还应理解,尽管以下描述使用术语第一、第二等来描述各种元素,但这些元素不应受术语的限制。这些术语只是用于将一元素与另一元素区别分开。例如,在不脱离各种所述示例的范围的情况下,第一寄存器可以被称为第二寄存器,并且类似地,第二寄存器可以被称为第一寄存器。
还应理解,在本申请的各个实施例中,各个过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
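In this application, the term "at least one" means one or more, and the term "a plurality of" means two or more; for example, a plurality of first registers means two or more first registers. The terms "system" and "network" are often used interchangeably herein.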
应理解,在本文中对各种所述示例的描述中所使用的术语只是为了描述特定示例,而并非旨在进行限制。如在对各种所述示例的描述和所附权利要求书中所使用的那样,单数形式“一个(“a”,“an”)”和“该”旨在也包括复数形式,除非上下文另外明确地指示。
还应理解,术语“包括”(也称“includes”、“including”、“comprises”和/或“comprising”)当在本说明书中使用时指定存在所陈述的特征、整数、步骤、操作、元素、和/或部件,但是并不排除存在或添加一个或多个其他特征、整数、步骤、操作、元素、部件、和/或其分组。
还应理解,根据上下文,短语“若确定...”或“若检测到[所陈述的条件或事件]”可被解释为意指“在确定...时”或“响应于确定...”或“在检测到[所陈述的条件或事件]时”或“响应于检测到[所陈述的条件或事件]”。
应理解,根据A确定B并不意味着仅仅根据A确定B,还可以根据A和/或其它信息确定B。
还应理解,说明书通篇中提到的“一个实施例”、“一实施例”、“一种可能的实现方式”意味着与实施例或实现方式有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”、“一种可能的实现方式”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。
Claims (21)
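1. A processor, wherein the processor comprises a control module, a first register, and a cache, and the processor is communicatively connected to a reset-persistent storage medium; the control module is configured to obtain a reset indication, the reset indication being generated when the processor runs abnormally; the control module is further configured to obtain related information of the processor based on the reset indication, the related information comprising at least one of register information of the first register or data stored in the cache; and the control module is further configured to store the related information in the reset-persistent storage medium.
2. The processor according to claim 1, wherein the processor further comprises a second register, the second register is a register used by the processor at run time, the first register is configured to record register information of the second register, the first register is a reset-persistent register, and the reset indication is used to instruct the processor to reset; and the control module is configured to instruct, based on the reset indication, the first register to stop recording the register information of the second register, and to obtain the register information of the first register after the processor resets based on the reset indication.
3. The processor according to claim 2, wherein the second register comprises at least one of a program counter (PC), a stack pointer (SP), a frame pointer (FP), a control register (CR), or a link register (LR).
4. The processor according to any one of claims 1 to 3, wherein the processor has a reset pin, and the reset pin is configured to generate the reset indication and transmit the reset indication to the control module.
5. The processor according to any one of claims 1 to 3, wherein the control module is communicatively connected to a reset module, and the reset module is configured to send the reset indication to the control module.
6. The processor according to any one of claims 1 to 5, wherein the reset-persistent storage medium comprises at least one of reset-persistent memory inside the processor, a non-volatile storage medium inside the processor, reset-persistent memory outside the processor, or a non-volatile storage medium outside the processor.
7. The processor according to any one of claims 1 to 6, wherein the control module is further configured to, after the processor resets, obtain the related information from the reset-persistent storage medium and generate an abnormal-operation record based on the related information.
8. A method for obtaining information, wherein the method is applied to a processor, the processor comprises a control module, a first register, and a cache, and the processor is communicatively connected to a reset-persistent storage medium, the method comprising: obtaining, by the control module, a reset indication, the reset indication being generated when the processor runs abnormally; obtaining, by the control module based on the reset indication, related information of the processor, the related information comprising at least one of register information of the first register or data stored in the cache; and storing, by the control module, the related information in the reset-persistent storage medium.
9. The method according to claim 8, wherein the processor further comprises a second register, the second register is a register used by the processor at run time, the first register is configured to record register information of the second register, the first register is a reset-persistent register, and the reset indication is used to instruct the processor to reset; and obtaining the related information of the processor based on the reset indication comprises: instructing, based on the reset indication, the first register to stop recording the register information of the second register; and obtaining the register information of the first register after the processor resets based on the reset indication.
10. The method according to claim 9, wherein the second register comprises at least one of a program counter (PC), a stack pointer (SP), a frame pointer (FP), a control register (CR), or a link register (LR).
11. The method according to any one of claims 8 to 10, wherein the processor has a reset pin, and the method further comprises: generating, by the processor, the reset indication through the reset pin; and transmitting, by the reset pin, the reset indication to the control module; and obtaining the reset indication comprises receiving the reset indication.
12. The method according to any one of claims 8 to 10, wherein the control module is communicatively connected to a reset module, and the reset module is configured to send the reset indication to the control module; and obtaining the reset indication comprises receiving the reset indication sent by the reset module.
13. The method according to any one of claims 8 to 12, wherein the reset-persistent storage medium comprises at least one of reset-persistent memory inside the processor, a non-volatile storage medium inside the processor, reset-persistent memory outside the processor, or a non-volatile storage medium outside the processor.
14. The method according to any one of claims 8 to 13, wherein the method further comprises: after the processor resets, obtaining, by the control module, the related information from the reset-persistent storage medium, and generating an abnormal-operation record based on the related information.
15. A board, wherein the board comprises the processor according to any one of claims 1 to 7 and a reset-persistent storage medium communicatively connected to the processor.
16. The board according to claim 15, wherein the board further comprises a reset module, the reset module is communicatively connected to the control module in the processor, and the reset module is configured to send a reset indication to the control module.
17. A network device, wherein the network device comprises at least one processor according to any one of claims 1 to 7 and a reset-persistent storage medium communicatively connected to the processor.
18. The network device according to claim 17, wherein the network device further comprises a reset module, the reset module is communicatively connected to the control module in the processor, and the reset module is configured to send a reset indication to the control module.
19. A network device, wherein the network device comprises at least one board according to claim 15 or 16.
20. A chip, wherein the chip comprises at least one processor according to any one of claims 1 to 7 and a reset-persistent storage medium communicatively connected to the processor.
21. The chip according to claim 20, wherein the chip further comprises a reset module, the reset module is communicatively connected to the control module in the processor, and the reset module is configured to send a reset indication to the control module.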
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210847565 | 2022-07-19 | ||
CN202210847565.8 | 2022-07-19 | ||
CN202211175885.XA CN117453439A (zh) | 2022-07-19 | 2022-09-26 | 处理器、获取信息的方法、单板及网络设备 |
CN202211175885.X | 2022-09-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024016864A1 true WO2024016864A1 (zh) | 2024-01-25 |
Family
ID=89593461
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/098211 WO2024016864A1 (zh) | 2022-07-19 | 2023-06-05 | 处理器、获取信息的方法、单板及网络设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117453439A (zh) |
WO (1) | WO2024016864A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118519817A (zh) * | 2024-07-19 | 2024-08-20 | 浙江大华技术股份有限公司 | 一种算力cpu的异常数据获取方法、装置和计算机设备 |
- 2022-09-26: CN application CN202211175885.XA (patent/CN117453439A/zh), active, pending
- 2023-06-05: WO application PCT/CN2023/098211 (patent/WO2024016864A1/zh), status unknown
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030120968A1 (en) * | 2001-08-31 | 2003-06-26 | Bull Hn Information Systems Inc. | Preserving dump capability after a fault-on-fault or related type failure in a fault tolerant computer system |
CN101122865A (zh) * | 2007-04-26 | 2008-02-13 | 晶天电子(深圳)有限公司 | 一种使用相变存储器的计算机主板快速挂起和恢复装置 |
US20130145137A1 (en) * | 2011-12-02 | 2013-06-06 | Qualcomm Incorporated | Methods and Apparatus for Saving Conditions Prior to a Reset for Post Reset Evaluation |
US20180341537A1 (en) * | 2017-05-26 | 2018-11-29 | Intel Corporation | Disambiguation of error logging during system reset |
CN111208893A (zh) * | 2020-01-13 | 2020-05-29 | 深圳震有科技股份有限公司 | 一种cpu复位的控制方法、系统及存储介质 |
CN113835923A (zh) * | 2020-06-24 | 2021-12-24 | 华为技术有限公司 | 一种复位系统、数据处理系统以及相关设备 |
WO2021259351A1 (zh) * | 2020-06-24 | 2021-12-30 | 华为技术有限公司 | 一种复位系统、数据处理系统以及相关设备 |
CN113448421A (zh) * | 2021-05-27 | 2021-09-28 | 山东英信计算机技术有限公司 | 一种设备掉电管理方法和装置 |
Also Published As
Publication number | Publication date |
---|---|
CN117453439A (zh) | 2024-01-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23841931 Country of ref document: EP Kind code of ref document: A1 |