US20210149760A1 - Method and apparatus to identify a problem area in an information handling system based on latencies - Google Patents
Method and apparatus to identify a problem area in an information handling system based on latencies
- Publication number
- US20210149760A1 (application US16/685,303)
- Authority
- US
- United States
- Prior art keywords
- ihs
- client
- threshold values
- information
- latency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
- G06F11/0754—Error or fault detection not based on redundancy by exceeding limits
- G06F11/0757—Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/86—Event-based monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- the present disclosure generally relates to information handling systems, and more particularly relates to identifying a problem area in an information handling system based on latencies.
- An information handling system generally processes, compiles, stores, or communicates information or data for business, personal, or other purposes.
- Technology and information handling needs and requirements can vary between different applications.
- information handling systems can also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information can be processed, stored, or communicated.
- the variations in information handling systems allow information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications.
- information handling systems can include a variety of hardware and software resources that can be configured to process, store, and communicate information and can include one or more computer systems, graphics interface systems, data storage systems, networking systems, and mobile communication systems.
- Information handling systems can also implement various virtualized architectures. Data and voice communications among information handling systems may be via networks that are wired, wireless, or some combination.
- An information handling system may obtain timing information for processing among layers of a first client-side information handling system, and compare the timing information to threshold values to provide a comparison.
- the information handling system may use the comparison to identify an area of the first client-side information handling system in which a problem exists, and initiate remedial action directed to the problem.
- FIG. 1 is a block diagram illustrating an information handling system according to an embodiment of the present disclosure.
- FIG. 2 is a block diagram illustrating layers of an information handling system according to an embodiment of the present disclosure.
- FIG. 3 is a block diagram illustrating latencies between layers of an information handling system according to an embodiment of the present disclosure.
- FIG. 4 is a flow diagram illustrating a method according to an embodiment of the present disclosure.
- FIG. 1 shows a generalized embodiment of information handling system 100 .
- information handling system 100 can include any instrumentality or aggregate of instrumentalities operable to compute, classify, process, transmit, receive, retrieve, originate, switch, store, display, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, entertainment, or other purposes.
- information handling system 100 can be a personal computer, a laptop computer, a smart phone, a tablet device or other consumer electronic device, a network server, a network storage device, a switch router or other network communication device, or any other suitable device and may vary in size, shape, performance, functionality, and price.
- information handling system 100 can include processing resources for executing machine-executable code, such as a central processing unit (CPU), a programmable logic array (PLA), an embedded device such as a System-on-a-Chip (SoC), or other control logic hardware.
- Information handling system 100 can also include one or more computer-readable media for storing machine-executable code, such as software or data.
- Additional components of information handling system 100 can include one or more storage devices that can store machine-executable code, one or more communications ports for communicating with external devices, and various input and output (I/O) devices, such as a keyboard, a mouse, and a video display.
- Information handling system 100 can also include one or more buses operable to transmit information between the various hardware components.
- Information handling system 100 can include devices or modules that embody one or more of the devices or modules described above, and operates to perform one or more of the methods described above.
- Information handling system 100 includes processors 102 and 104, a chipset 110, a memory 120, a graphics adapter 130, a basic input and output system/extensible firmware interface (BIOS/EFI) module 140, a disk controller 150, a disk emulator 160, an input/output (I/O) interface 170, and a network interface 180.
- Processor 102 is connected to chipset 110 via processor interface 106, and processor 104 is connected to chipset 110 via processor interface 108.
- Memory 120 is connected to chipset 110 via a memory bus 122 .
- Graphics adapter 130 is connected to chipset 110 via a graphics interface 132 , and provides a video display output 136 to a video display 134 .
- information handling system 100 includes separate memories that are dedicated to each of processors 102 and 104 via separate memory interfaces.
- An example of memory 120 includes random access memory (RAM) such as static RAM (SRAM), dynamic RAM (DRAM), non-volatile RAM (NV-RAM), or the like, read only memory (ROM), another type of memory, or a combination thereof.
- BIOS/EFI module 140 , disk controller 150 , and I/O interface 170 are connected to chipset 110 via an I/O channel 112 .
- I/O channel 112 includes a Peripheral Component Interconnect (PCI) interface, a PCI-Extended (PCI-X) interface, a high-speed PCI-Express (PCIe) interface, another industry standard or proprietary communication interface, or a combination thereof.
- Chipset 110 can also include one or more other I/O interfaces, including an Industry Standard Architecture (ISA) interface, a Small Computer System Interface (SCSI) interface, an Inter-Integrated Circuit (I2C) interface, a Serial Peripheral Interface (SPI), a Universal Serial Bus (USB), another interface, or a combination thereof.
- BIOS/EFI module 140 includes BIOS/EFI code that operates to detect resources within information handling system 100, to provide drivers for the resources, to initialize the resources, and to access the resources.
- Disk controller 150 includes a disk interface 152 that connects the disk controller to a hard disk drive (HDD) 154, to an optical disk drive (ODD) 156, and to disk emulator 160.
- disk interface 152 includes an Integrated Drive Electronics (IDE) interface, an Advanced Technology Attachment (ATA) such as a parallel ATA (PATA) interface or a serial ATA (SATA) interface, a SCSI interface, a USB interface, a proprietary interface, or a combination thereof.
- Disk emulator 160 permits a solid-state drive 164 to be connected to information handling system 100 via an external interface 162 .
- An example of external interface 162 includes a USB interface, an IEEE 1394 (FireWire) interface, a proprietary interface, or a combination thereof.
- solid-state drive 164 can be disposed within information handling system 100 .
- I/O interface 170 includes a peripheral interface 172 that connects the I/O interface to an add-on resource 174 and to network interface 180 .
- Peripheral interface 172 can be the same type of interface as I/O channel 112 , or can be a different type of interface.
- I/O interface 170 extends the capacity of I/O channel 112 when peripheral interface 172 and the I/O channel are of the same type, and the I/O interface translates information from a format suitable to the I/O channel to a format suitable to peripheral interface 172 when they are of a different type.
- Add-on resource 174 can include a data storage system, an additional graphics interface, a network interface card (NIC), a sound/video processing card, another add-on resource, or a combination thereof.
- Add-on resource 174 can be on a main circuit board, on a separate circuit board or add-in card disposed within information handling system 100, a device that is external to the information handling system, or a combination thereof.
- Network interface 180 represents a NIC disposed within information handling system 100 , on a main circuit board of the information handling system, integrated onto another component such as chipset 110 , in another suitable location, or a combination thereof.
- Network interface 180 includes network channels 182 and 184 that provide interfaces to devices that are external to information handling system 100.
- network channels 182 and 184 are of a different type than peripheral interface 172, and network interface 180 translates information from a format suitable to the peripheral interface to a format suitable to external devices.
- An example of network channels 182 and 184 includes InfiniBand channels, Fibre Channel channels, Gigabit Ethernet channels, proprietary channel architectures, or a combination thereof.
- Network channels 182 and 184 can be connected to external network resources (not illustrated).
- the network resource can include another information handling system, a data storage system, another network, a grid management system, another suitable resource, or a combination thereof.
- BMC 190 is connected by a management interface 192 to a plurality of system components, such as processor 102 , processor 104 , memory 120 , chipset 110 , graphics adapter 130 , I/O interface 170 , disk controller 150 , NVRAM module 140 , TPM 176 , network interface 180 , and add-on resource 174 .
- BMC 190 is connected to an external management interface 194 for platform management by an external IHS.
- FIG. 2 shows layers of an information handling system according to an embodiment of the present disclosure.
- Information handling system environment 200 includes user 204 and information handling system 205 .
- Information handling system 205 includes hardware 201 , operating system 202 , and software application 203 .
- a request 211 can be passed from user 204 to software application 203 . Consequently, a request 212 can be passed from software application 203 to operating system (OS) 202 . Consequently, a request 213 can be passed from OS 202 to hardware 201 .
- a response 214 can be passed from hardware 201 to OS 202 . Consequently, a response 215 can be passed from OS 202 to software application 203 . Consequently, a response 216 can be passed from software application 203 to user 204 .
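- The request and response flow of FIG. 2 can be pictured as a chain of nested calls. The following is a minimal sketch rather than the disclosed implementation; the function names and the `trace` list are hypothetical and exist only to show the ordering of requests 211 to 213 and responses 214 to 216.

```python
from typing import List


def hardware(request: str, trace: List[str]) -> str:
    # Hardware 201 handles request 213 and produces response 214.
    trace.append("214: hardware -> OS")
    return f"result({request})"


def operating_system(request: str, trace: List[str]) -> str:
    # OS 202 forwards the request as 213 and relays the response as 215.
    trace.append("213: OS -> hardware")
    result = hardware(request, trace)
    trace.append("215: OS -> application")
    return result


def software_application(request: str, trace: List[str]) -> str:
    # Software application 203 forwards the user's request as 212
    # and returns the final response 216 to the user.
    trace.append("212: application -> OS")
    result = operating_system(request, trace)
    trace.append("216: application -> user")
    return result


trace: List[str] = ["211: user -> application"]
answer = software_application("open file", trace)
```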
- FIG. 3 shows latencies between layers of an information handling system according to an embodiment of the present disclosure.
- a latency 301 having a time period T1 exists for a request to pass from software application 203 to OS 202.
- a latency 302 having a time period T2 exists for a request to pass from OS 202 to hardware 201.
- a latency 303 having a time period T3 exists for a response to pass from hardware 201 to OS 202.
- a latency 304 having a time period T4 exists for a response to pass from OS 202 to software application 203.
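- The four latencies of FIG. 3 can be held in one small record per round trip. This is an illustrative sketch only: the class name, the field names, and the choice of seconds as the unit are assumptions, not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerLatencies:
    """Latencies 301 to 304 for one round trip, in seconds (assumed unit)."""

    t1: float  # latency 301: request, software application 203 -> OS 202
    t2: float  # latency 302: request, OS 202 -> hardware 201
    t3: float  # latency 303: response, hardware 201 -> OS 202
    t4: float  # latency 304: response, OS 202 -> software application 203

    def total(self) -> float:
        # Sum of the four inter-layer latencies.
        return self.t1 + self.t2 + self.t3 + self.t4

    def largest(self) -> str:
        # Name of the hop with the longest latency (first one wins on ties).
        hops = {"t1": self.t1, "t2": self.t2, "t3": self.t3, "t4": self.t4}
        return max(hops, key=hops.get)


# Made-up sample values, used only to exercise the record.
sample = LayerLatencies(t1=0.002, t2=0.010, t3=0.040, t4=0.003)
```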
- FIG. 4 shows a method 400 that begins at block 401 and continues to block 402 .
- At block 402, a client-event-based collection is performed.
- Information as to events occurring on a client-side information handling system is collected.
- error messages for errors occurring on the client-side information handling system can be collected.
- method 400 continues to block 403 .
- At block 403, timing information, such as latency information, for operations between two or more layers of the client-side information handling system is collected from the client-side information handling system.
- method 400 continues to block 404 .
- At block 404, obtained values are compared with a user-specific threshold value. From block 404, method 400 continues to decision block 405.
- a decision is made as to whether or not a difference is found between the obtained values and the user-specific threshold value (e.g., whether or not the user-specific threshold value has been exceeded). If not, method 400 continues to block 408 . If so, method 400 continues to block 406 . At block 406 , a comparison of the TUSER(X) values of different layers of the client-side information handling system is performed. From block 406 , method 400 continues to decision block 407 . At decision block 407 , based on the comparison of block 406 , a decision is made as to whether or not execution of software on the client-side information handling system has failed. If so, method 400 continues to block 409 . If not, method 400 continues to block 408 .
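- The branching among blocks 404 to 407 can be sketched as a single function that returns the number of the next block. This is a hedged illustration: the text does not state how decision block 407 concludes that software execution has failed, so that test is supplied by the caller as the hypothetical `software_failed` predicate, and the dictionary layout is likewise an assumption.

```python
from typing import Callable, Dict


def next_block(
    obtained: Dict[str, float],
    thresholds: Dict[str, float],
    software_failed: Callable[[Dict[str, float]], bool],
) -> int:
    # Blocks 404/405: find obtained values that exceed their threshold.
    exceeded = {
        layer: value
        for layer, value in obtained.items()
        if value > thresholds[layer]
    }
    if not exceeded:
        return 408  # no threshold exceeded

    # Blocks 406/407: the per-layer values that exceeded their thresholds
    # are handed to the caller-supplied failure test (an assumption here).
    return 409 if software_failed(exceeded) else 408
```

- For example, `next_block({"t1": 1.0}, {"t1": 2.0}, lambda e: True)` takes the "not exceeded" branch and returns 408 regardless of the predicate.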
- old behavior e.g., behavior as indicated by values obtained prior to any indication of failure
- method 400 continues to block 409 .
- At block 409, a root cause of failure of execution of software on the client-side information handling system is detected in a server-side (e.g., backend) information handling system.
- method 400 continues to block 410 .
- At block 410, a solution is proposed and machine learning (ML) training is started. Through ML, relationships among data elements such as the obtained values, the detected root cause, and the proposed solution can be learned to improve future performance of the system when it faces similar (or different) problems.
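- The idea of learning which solution goes with which root cause can be illustrated with a deliberately simple stand-in. The sketch below uses a frequency table, not an ML model, and is not the technique of the disclosure; the class name, method names, and example strings are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Optional


class SolutionHistory:
    """Counts which solutions were used for each root cause."""

    def __init__(self) -> None:
        self._seen: DefaultDict[str, Counter] = defaultdict(Counter)

    def record(self, root_cause: str, solution: str) -> None:
        # Remember one more use of this solution for this root cause.
        self._seen[root_cause][solution] += 1

    def propose(self, root_cause: str) -> Optional[str]:
        # Return the most frequently recorded solution, or None if the
        # root cause has never been seen.
        counts = self._seen.get(root_cause)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


history = SolutionHistory()
history.record("driver timeout", "update driver")
history.record("driver timeout", "update driver")
history.record("driver timeout", "reboot")
```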
- method 400 continues to block 411 .
- At block 411, the results are passed to a resolution entity.
- method 400 continues to block 412 .
- the resolution entity analyzes the data and solves the problem.
- method 400 continues to block 413 , where method 400 ends.
- one of the time-consuming tasks is to determine the area in which a problem lies and to forward it to a relevant solution mechanism. If the area in which a problem lies is incorrectly determined, the problem can be dispatched to the wrong solution mechanism for a different type of problem, and solution of the problem can be delayed and complicated, which can result in increasing the turnaround time for providing an appropriate solution to the problem.
- a method and apparatus are provided to identify the area of an IHS within which a problem lies.
- the area may be selected from a group consisting of the hardware of the IHS, the operating system of the IHS, and a software application of the IHS.
- determination of the area in which the problem lies can be followed by selective automatic dispatch of the problem to a particular problem resolution entity among a plurality of problem resolution entities.
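- Selective dispatch by area amounts to a lookup from the identified area to a resolution entity. The following is a minimal sketch under stated assumptions: the three areas come from the text, while the queue names and the `dispatch` function are hypothetical placeholders.

```python
from enum import Enum
from typing import Dict, Tuple


class Area(Enum):
    HARDWARE = "hardware"
    OPERATING_SYSTEM = "operating system"
    SOFTWARE_APPLICATION = "software application"


# Hypothetical resolution entities, one per area.
RESOLUTION_ENTITIES: Dict[Area, str] = {
    Area.HARDWARE: "hardware-support-queue",
    Area.OPERATING_SYSTEM: "os-support-queue",
    Area.SOFTWARE_APPLICATION: "application-support-queue",
}


def dispatch(area: Area, problem_id: str) -> Tuple[str, str]:
    # Pair the problem with the entity responsible for the identified area.
    return RESOLUTION_ENTITIES[area], problem_id
```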
- a method and apparatus are provided for a client-side IHS to gather information indicative of a location of a problem on the client-side IHS.
- information regarding the flow of a user's request and the response of the IHS to the request can be collected.
- the user's request may, for example, be a request made via a software application.
- the software application may process the request and initiate action to be taken by the OS.
- the OS may, in turn, initiate action to be taken in a hardware-level component of the IHS.
- the hardware-level component may return a result to the OS.
- the OS may, in turn, return the result to the software application.
- the software application may cause action observable to the user.
- the time gaps between the different layers for a single user's request and its response can be measured and collected, for example, by a system management application operating on the IHS.
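- One way a system management application might measure these time gaps is to timestamp each layer boundary and subtract neighboring timestamps. The sketch below is an assumption-laden illustration: the boundary names and the `RequestTimer` class are hypothetical, any processing time inside a layer is folded into the adjacent gap, and the clock is injectable so the logic can be checked without real timing.

```python
import time
from typing import Callable, Dict

# Hypothetical boundary names, in the order a round trip crosses them.
BOUNDARIES = (
    "app_sent",
    "os_received",
    "hw_received",
    "os_got_response",
    "app_got_response",
)


class RequestTimer:
    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self._clock = clock
        self._marks: Dict[str, float] = {}

    def mark(self, boundary: str) -> None:
        # Record the current clock reading for one layer boundary.
        if boundary not in BOUNDARIES:
            raise ValueError(f"unknown boundary: {boundary}")
        self._marks[boundary] = self._clock()

    def gaps(self) -> Dict[str, float]:
        # Differences between neighboring boundaries; raises KeyError if a
        # boundary was never marked.
        m = self._marks
        return {
            "t1": m["os_received"] - m["app_sent"],
            "t2": m["hw_received"] - m["os_received"],
            "t3": m["os_got_response"] - m["hw_received"],
            "t4": m["app_got_response"] - m["os_got_response"],
        }
```

- In real use the default `time.perf_counter` clock would be kept; a fake clock is useful only for testing.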
- the timing data can be used to establish default threshold values of at least one combination of at least two of a time t1 for a request to be sent from a software application to the OS, a time t2 for a request to be sent from the OS to the hardware of the IHS, a time t3 for a reply to be sent from the hardware of the IHS to the OS, and a time t4 for a reply to be sent from the OS to the software application.
- the default threshold values can be specific to a particular user's IHS, generalized for a plurality of users' IHSs, or generalized for a plurality of users' IHSs but adjusted to tailor them to a particular user IHS.
- the default threshold values can be promulgated in conjunction with a system management application.
- the default threshold values can change, for example, being modified according to a particular user's way of using that particular user's IHS. Thus, user-specific threshold values can be utilized.
- a system management application can collect measurements of the time periods between different layers for a combination of at least two requests or replies.
- the system management application can upload the measurements to a backend server for storage, analysis, and determination of default threshold values.
- the collected values can be analyzed with respect to an operational status of a client-side IHS from which they are obtained.
- the client-side IHS may have a fully operational status, during which a first set of measurements is collected, but then descend into a degraded performance status, during which a second set of measurements is collected.
- a relationship of the second set of measurements to the first set of measurements can be used to determine the default threshold values.
- Such measurements and determinations can be made with respect to a particular user's IHS or to a plurality of users' IHSs.
- the plurality of data sets can be considered as part of an overall data set collected from different IHSs to analyze and determine the generic default threshold values applicable to a plurality of IHSs.
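- The text does not say which relationship between the two sets of measurements yields a threshold. As one hedged possibility only, the sketch below places the threshold midway between the mean fully operational latency and the mean degraded latency; the midpoint rule and the function name are assumptions, not the disclosed method.

```python
from statistics import mean
from typing import Sequence


def default_threshold(
    operational: Sequence[float], degraded: Sequence[float]
) -> float:
    # Mean latency while fully operational and while degraded.
    healthy = mean(operational)
    slow = mean(degraded)
    if slow <= healthy:
        raise ValueError("degraded measurements are not slower than operational ones")
    # ASSUMPTION: use the midpoint between the two means as the threshold.
    return (healthy + slow) / 2
```

- The same function could be applied to measurements pooled from many IHSs to obtain a generic default value, or to one IHS to obtain a value specific to it.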
- the generic default threshold values can be adjusted or replaced to assure applicability to a particular IHS.
- Because the generic default threshold values are obtained (e.g., averaged) from a large number of measurements of a large number of IHSs, there may be outlier IHSs for which adjusted values may provide better results.
- a determination of whether to use the generic default threshold values or to adjust the generic default threshold values can be made by comparing the measurements obtained from a particular user's IHS to the default threshold values.
- the default threshold values may not be ideal for determining when measurements of the particular user's IHS indicate abnormal operation, and the decision can be made to adjust the generic default threshold values to provide adjusted generic default threshold values tailored to the particular user's IHS.
- the collection of measurement data and the use of generic default threshold values can begin immediately upon implementation of an embodiment, and the adjusted generic default threshold values can continue to improve performance as usage of the particular user's IHS continues.
- the adjusted generic default threshold values can be used as specific threshold values, or specific threshold values can be established via a different route, for example, by evaluating measurement data from the particular user's IHS without dependence on generic default threshold values.
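- The decision to keep or adjust a generic default threshold for a particular IHS can be sketched as a comparison against that IHS's own healthy measurements. This is an illustration under assumptions: the `headroom` factor, the use of the median, and the rule of only raising the threshold for slow-but-healthy outliers are all hypothetical choices.

```python
from statistics import median
from typing import Sequence


def tailor_threshold(
    generic: float, healthy_samples: Sequence[float], headroom: float = 1.5
) -> float:
    # Typical latency of this IHS while it is known to be healthy.
    typical = median(healthy_samples)
    candidate = typical * headroom
    # Keep the generic value when it already leaves the assumed headroom
    # above this IHS's typical latency; otherwise use the tailored value.
    if candidate <= generic:
        return generic
    return candidate
```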
- the adaptive capabilities of at least one embodiment can provide improved results over time.
- a method and apparatus are provided for a server-side IHS to receive information indicative of a location of a problem on the client-side IHS.
- the information may be in the form of measurements of timing of processing of information for different areas of the client-side IHS.
- the information may include the timing information comprising at least two of the t1, t2, t3, and t4 time periods described above.
- the timing information can be obtained during both a fully operational status and a degraded performance status of a client-side IHS.
- the timing information can be received from a single IHS or from a plurality of IHSs. Collected values from a client-side IHS can be used by a server-side (e.g., backend) IHS to guide dispatch of a problem with the client-side IHS for resolution of the problem.
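- On the server side, receiving the timing information could look like parsing and validating an uploaded record. The sketch below is hypothetical throughout: the text does not define a wire format, so the JSON layout and the field names `ihs_id`, `status`, and `timings` are assumptions. Only the "at least two of t1 to t4" check comes from the text.

```python
import json
from typing import Any, Dict

ALLOWED_PERIODS = {"t1", "t2", "t3", "t4"}


def parse_upload(payload: str) -> Dict[str, Any]:
    record = json.loads(payload)
    # Keep only the recognized time periods and coerce them to float.
    timings = {
        name: float(value)
        for name, value in record.get("timings", {}).items()
        if name in ALLOWED_PERIODS
    }
    if len(timings) < 2:
        raise ValueError("at least two of t1, t2, t3, and t4 are required")
    return {
        "ihs_id": record["ihs_id"],
        "status": record["status"],  # e.g. "operational" or "degraded"
        "timings": timings,
    }
```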
- the following are collected values which can be used, as may be obtained from measurements made on a client-side IHS:
- TUSER(X) can have values for X as follows:
- an area of a problem within an IHS can be localized quickly and efficiently without having to perform traditional responsive remote access on the IHS to begin a process of finding the area of the problem within the IHS.
- Implementation of at least one embodiment can greatly reduce the time needed to solve a problem with an IHS by identifying the specific area of the IHS where the problem exists, which eliminates the need to examine areas of the IHS where the problem does not exist.
- the area in which the problem exists can be identified.
- the layer (e.g., software application, OS, or hardware) of the IHS at which the problem exists can be identified.
- the type of problem can, in at least some cases, be identified.
- the determination of the type of the problem can be correlated with the identified area of the problem, and correlation between the type and the area can provide confirmation (e.g., cross-confirmation) of those determinations.
- a lack of correlation between the type and the area can be used to cause further measurement, further analysis, indication of the lack of correlation, the like, or combinations thereof.
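- The cross-confirmation between problem type and problem area can be sketched as a table lookup. The problem types listed here and their expected areas are invented for illustration and do not come from the text; only the idea of confirming or flagging a lack of correlation does.

```python
from typing import Dict

# Hypothetical problem types and the area each is expected to implicate.
EXPECTED_AREA: Dict[str, str] = {
    "disk read error": "hardware",
    "driver fault": "operating system",
    "unhandled exception": "software application",
}


def cross_check(problem_type: str, identified_area: str) -> str:
    expected = EXPECTED_AREA.get(problem_type)
    if expected is None:
        return "unknown type"
    if expected == identified_area:
        return "confirmed"
    # A mismatch could trigger further measurement or analysis.
    return "lack of correlation"
```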
- a method comprises obtaining timing information for processing among layers of a first client-side information handling system (IHS); comparing the timing information to threshold values to provide a comparison; using the comparison to identify an area of the first client-side IHS in which a problem exists in the first client-side IHS; and initiating remedial action directed to the problem in the area of the first client-side IHS.
- the area is selected from a group consisting of a software application, an operating system (OS), and a hardware component of the first client-side IHS.
- the timing information comprises a first time period representing a latency of a first request from the software application to the OS and a second time period representing a latency of a second request from the OS to the hardware component.
- the timing information comprises a third time period representing a latency of a first response from the hardware component to the OS and a fourth time period representing a latency of a second response from the OS to the software application.
- the threshold values are generic default threshold values generated from processing of information regarding timing obtained from a plurality of client-side IHSs.
- the threshold values are user-specific threshold values generated by modifying generic default threshold values generated from the processing of information regarding timing obtained from a plurality of client-side IHSs, wherein the modifying comprises adjusting the generic default threshold values based on specific information of the first client-side IHS.
- the initiating remedial action directed to the problem in the area of the first client-side IHS is performed using machine learning (ML).
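- The comparing and identifying steps of the method can be sketched as follows. The text does not specify how a slow hop maps to an area, so the mapping below (each hop is attributed to the layer that sends the message) is an assumption, as are the function name and the rule of picking the hop that most exceeds its threshold. Thresholds are assumed to be positive.

```python
from typing import Dict, Optional

# ASSUMPTION: attribute each hop to the layer that sends the message.
HOP_TO_AREA: Dict[str, str] = {
    "t1": "software application",  # request: application -> OS
    "t2": "operating system",      # request: OS -> hardware
    "t3": "hardware component",    # response: hardware -> OS
    "t4": "operating system",      # response: OS -> application
}


def identify_area(
    timings: Dict[str, float],
    thresholds: Dict[str, float],
    hop_to_area: Dict[str, str] = HOP_TO_AREA,
) -> Optional[str]:
    worst_hop = None
    worst_ratio = 1.0  # a hop counts only if it exceeds its threshold
    for hop, value in timings.items():
        ratio = value / thresholds[hop]
        if ratio > worst_ratio:
            worst_hop, worst_ratio = hop, ratio
    # None means no threshold was exceeded, so no area is identified.
    return None if worst_hop is None else hop_to_area[worst_hop]
```

- Because `timings` is a dictionary, it can carry any subset of the four time periods, in line with the "at least two of" wording used elsewhere in the text.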
- an information handling system comprises a memory; and a processor, the processor configured to obtain timing information for processing among layers of a first client-side information handling system (IHS), to compare the timing information to threshold values to provide a comparison, to use the comparison to identify an area of the first client-side IHS in which a problem exists in the first client-side IHS, and to initiate remedial action directed to the problem in the area of the first client-side IHS.
- the area is selected from a group consisting of a software application, an operating system (OS), and a hardware component of the first client-side IHS.
- the timing information comprises a first time period representing a latency of a first request from the software application to the OS and a second time period representing a latency of a second request from the OS to the hardware component.
- the timing information comprises a third time period representing a latency of a first response from the hardware component to the OS and a fourth time period representing a latency of a second response from the OS to the software application.
- the threshold values are generic default threshold values generated from processing of information regarding timing obtained from a plurality of client-side IHSs.
- the threshold values are user-specific threshold values generated by modifying generic default threshold values generated from the processing of information regarding timing obtained from a plurality of client-side IHSs, wherein the modifying comprises adjusting the generic default threshold values based on specific information of the first client-side IHS.
- the initiating remedial action directed to the problem in the area of the first client-side IHS is performed using machine learning (ML).
- a method comprises obtaining timing information for processing among layers of a first client-side information handling system (IHS), the timing information comprising at least two of a first time period representing a latency of a first request from a software application of the client-side IHS to an operating system (OS) of the client-side IHS, a second time period representing a latency of a second request from the OS to a hardware component of the client-side IHS, a third time period representing a latency of a first response from the hardware component to the OS, and a fourth time period representing a latency of a second response from the OS to the software application; comparing the timing information to threshold values to provide a comparison; using the comparison to identify an area of the first client-side IHS in which a problem exists in the first client-side IHS; and initiating remedial action directed to the problem in the area of the first client-side IHS.
- the area is selected from a group consisting of the software application, the operating system (OS), and the hardware component of the first client-side IHS.
- the threshold values are generic default threshold values generated from processing of information regarding timing obtained from a plurality of client-side IHSs.
- the threshold values are user-specific threshold values generated by modifying generic default threshold values generated from the processing of information regarding timing obtained from a plurality of client-side IHSs, wherein the modifying comprises adjusting the generic default threshold values based on specific information of the first client-side IHS.
- the initiating remedial action directed to the problem in the area of the first client-side IHS is performed using machine learning (ML).
- the generic default threshold values are downloaded to the client-side IHS.
- an information handling system device may be hardware such as, for example, an integrated circuit (such as an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a structured ASIC, or a device embedded on a larger chip), a card (such as a Peripheral Component Interconnect (PCI) card, a PCI-express card, a Personal Computer Memory Card International Association (PCMCIA) card, or other such expansion card), or a system (such as a motherboard, a system-on-a-chip (SoC), or a stand-alone device).
- the methods described herein may be implemented by software programs executable by a computer system.
- implementations can include distributed processing, component/object distributed processing, and parallel processing.
- virtual computer system processing can be constructed to implement one or more of the methods or functionality as described herein.
- the present disclosure contemplates a computer-readable medium that includes instructions or receives and executes instructions responsive to a propagated signal, so that a device connected to a network can communicate voice, video, or data over the network. Further, the instructions may be transmitted or received over the network via the network interface device.
- While the computer-readable medium is shown to be a single medium, the term “computer-readable medium” includes a single medium or multiple media, such as a centralized or distributed database, and/or associated caches and servers that store one or more sets of instructions.
- the term “computer-readable medium” shall also include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor or that cause a computer system to perform any one or more of the methods or operations disclosed herein.
- the computer-readable medium can include a solid-state memory such as a memory card or other package that houses one or more non-volatile read-only memories.
- the computer-readable medium can be a random access memory or other volatile re-writable memory.
- the computer-readable medium can include a magneto-optical or optical medium, such as a disk or tapes or other storage device to store information received via carrier wave signals such as a signal communicated over a transmission medium.
- a digital file attachment to an e-mail or other self-contained information archive or set of archives may be considered a distribution medium that is equivalent to a tangible storage medium. Accordingly, the disclosure is considered to include any one or more of a computer-readable medium or a distribution medium and other equivalents and successor media, in which data or instructions may be stored.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
- The present disclosure generally relates to information handling systems, and more particularly relates to identifying a problem area in an information handling system based on latencies.
- As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option is an information handling system. An information handling system generally processes, compiles, stores, or communicates information or data for business, personal, or other purposes. Technology and information handling needs and requirements can vary between different applications. Thus information handling systems can also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information can be processed, stored, or communicated. The variations in information handling systems allow information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems can include a variety of hardware and software resources that can be configured to process, store, and communicate information and can include one or more computer systems, graphics interface systems, data storage systems, networking systems, and mobile communication systems. Information handling systems can also implement various virtualized architectures. Data and voice communications among information handling systems may be via networks that are wired, wireless, or some combination.
- An information handling system may obtain timing information for processing among layers of a first client-side information handling system, and compare the timing information to threshold values to provide a comparison. The information handling system may use the comparison to identify an area of the first client-side information handling system in which a problem exists, and initiate remedial action directed to the problem.
- It will be appreciated that for simplicity and clarity of illustration, elements illustrated in the Figures are not necessarily drawn to scale. For example, the dimensions of some elements may be exaggerated relative to other elements. Embodiments incorporating teachings of the present disclosure are shown and described with respect to the drawings herein, in which:
- FIG. 1 is a block diagram illustrating an information handling system according to an embodiment of the present disclosure;
- FIG. 2 is a block diagram illustrating layers of an information handling system according to an embodiment of the present disclosure;
- FIG. 3 is a block diagram illustrating latencies between layers of an information handling system according to an embodiment of the present disclosure; and
- FIG. 4 is a flow diagram illustrating a method according to an embodiment of the present disclosure.
- The use of the same reference symbols in different drawings indicates similar or identical items.
- The following description in combination with the Figures is provided to assist in understanding the teachings disclosed herein. The description is focused on specific implementations and embodiments of the teachings, and is provided to assist in describing the teachings. This focus should not be interpreted as a limitation on the scope or applicability of the teachings.
- FIG. 1 shows a generalized embodiment of information handling system 100. For purposes of this disclosure, information handling system 100 can include any instrumentality or aggregate of instrumentalities operable to compute, classify, process, transmit, receive, retrieve, originate, switch, store, display, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, entertainment, or other purposes. For example, information handling system 100 can be a personal computer, a laptop computer, a smart phone, a tablet device or other consumer electronic device, a network server, a network storage device, a switch router or other network communication device, or any other suitable device and may vary in size, shape, performance, functionality, and price. Further, information handling system 100 can include processing resources for executing machine-executable code, such as a central processing unit (CPU), a programmable logic array (PLA), an embedded device such as a System-on-a-Chip (SoC), or other control logic hardware. Information handling system 100 can also include one or more computer-readable media for storing machine-executable code, such as software or data. Additional components of information handling system 100 can include one or more storage devices that can store machine-executable code, one or more communications ports for communicating with external devices, and various input and output (I/O) devices, such as a keyboard, a mouse, and a video display. Information handling system 100 can also include one or more buses operable to transmit information between the various hardware components.
- Information handling system 100 can include devices or modules that embody one or more of the devices or modules described above, and can operate to perform one or more of the methods described above. Information handling system 100 includes processors 102 and 104, a chipset 110, a memory 120, a graphics adapter 130, a basic input and output system/extensible firmware interface (BIOS/EFI) module 140, a disk controller 150, a disk emulator 160, an input/output (I/O) interface 170, and a network interface 180. Processor 102 is connected to chipset 110 via processor interface 106, and processor 104 is connected to chipset 110 via processor interface 108. Memory 120 is connected to chipset 110 via a memory bus 122. Graphics adapter 130 is connected to chipset 110 via a graphics interface 132, and provides a video display output 136 to a video display 134. In a particular embodiment, information handling system 100 includes separate memories that are dedicated to each of processors 102 and 104. Memory 120 includes random access memory (RAM) such as static RAM (SRAM), dynamic RAM (DRAM), non-volatile RAM (NV-RAM), or the like, read only memory (ROM), another type of memory, or a combination thereof.
- BIOS/EFI module 140, disk controller 150, and I/O interface 170 are connected to chipset 110 via an I/O channel 112. An example of I/O channel 112 includes a Peripheral Component Interconnect (PCI) interface, a PCI-Extended (PCI-X) interface, a high-speed PCI-Express (PCIe) interface, another industry standard or proprietary communication interface, or a combination thereof. Chipset 110 can also include one or more other I/O interfaces, including an Industry Standard Architecture (ISA) interface, a Small Computer System Interface (SCSI) interface, an Inter-Integrated Circuit (I2C) interface, a Serial Peripheral Interface (SPI), a Universal Serial Bus (USB), another interface, or a combination thereof. BIOS/EFI module 140 includes BIOS/EFI code operable to detect resources within information handling system 100, to provide drivers for the resources, to initialize the resources, and to access the resources.
- Disk controller 150 includes a disk interface 152 that connects the disk controller to a hard disk drive (HDD) 154, to an optical disk drive (ODD) 156, and to disk emulator 160. An example of disk interface 152 includes an Integrated Drive Electronics (IDE) interface, an Advanced Technology Attachment (ATA) interface such as a parallel ATA (PATA) interface or a serial ATA (SATA) interface, a SCSI interface, a USB interface, a proprietary interface, or a combination thereof. Disk emulator 160 permits a solid-state drive 164 to be connected to information handling system 100 via an external interface 162. An example of external interface 162 includes a USB interface, an IEEE 1394 (FireWire) interface, a proprietary interface, or a combination thereof. Alternatively, solid-state drive 164 can be disposed within information handling system 100.
- I/O interface 170 includes a peripheral interface 172 that connects the I/O interface to an add-on resource 174 and to network interface 180. Peripheral interface 172 can be the same type of interface as I/O channel 112, or can be a different type of interface. As such, I/O interface 170 extends the capacity of I/O channel 112 when peripheral interface 172 and the I/O channel are of the same type, and the I/O interface translates information from a format suitable to the I/O channel to a format suitable to peripheral interface 172 when they are of a different type. Add-on resource 174 can include a data storage system, an additional graphics interface, a network interface card (NIC), a sound/video processing card, another add-on resource, or a combination thereof. Add-on resource 174 can be on a main circuit board, on a separate circuit board or add-in card disposed within information handling system 100, a device that is external to the information handling system, or a combination thereof.
- Network interface 180 represents a NIC disposed within information handling system 100, on a main circuit board of the information handling system, integrated onto another component such as chipset 110, in another suitable location, or a combination thereof. Network interface 180 includes network channels 182 and 184 that provide interfaces to devices that are external to information handling system 100. In a particular embodiment, network channels 182 and 184 are of a different type than peripheral interface 172, and network interface 180 translates information from a format suitable to the peripheral interface to a format suitable to external devices. An example of network channels 182 and 184 includes InfiniBand channels, Fibre Channel channels, Gigabit Ethernet channels, proprietary channel architectures, or a combination thereof. Network channels 182 and 184 can be connected to external network resources (not illustrated). The network resource can include another information handling system, a data storage system, another network, a grid management system, another suitable resource, or a combination thereof.
- BMC 190 is connected by a management interface 192 to a plurality of system components, such as processor 102, processor 104, memory 120, chipset 110, graphics adapter 130, I/O interface 170, disk controller 150, NVRAM module 140, TPM 176, network interface 180, and add-on resource 174. BMC 190 is connected to an external management interface 194 for platform management by an external IHS.
- FIG. 2 shows layers of an information handling system according to an embodiment of the present disclosure. Information handling system environment 200 includes user 204 and information handling system 205. Information handling system 205 includes hardware 201, operating system 202, and software application 203. A request 211 can be passed from user 204 to software application 203. Consequently, a request 212 can be passed from software application 203 to operating system (OS) 202. Consequently, a request 213 can be passed from OS 202 to hardware 201. In reply, a response 214 can be passed from hardware 201 to OS 202. Consequently, a response 215 can be passed from OS 202 to software application 203. Consequently, a response 216 can be passed from software application 203 to user 204.
- FIG. 3 shows latencies between layers of an information handling system according to an embodiment of the present disclosure. A latency 301 having a time period T1 exists for a request to pass from software application 203 to OS 202. A latency 302 having a time period T2 exists for a request to pass from OS 202 to hardware 201. A latency 303 having a time period T3 exists for a response to pass from hardware 201 to OS 202. A latency 304 having a time period T4 exists for a response to pass from OS 202 to software application 203.
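- As a minimal sketch of how the four time periods of FIG. 3 might be derived, the following Python example computes T1 through T4 from timestamps taken at the layer boundaries. The class name, field names, and the choice of five timestamps per request are assumptions made for illustration and do not appear in the disclosure; each period here also includes any processing time spent in the sending layer.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RequestTrace:
    """Timestamps (in milliseconds) for one request and its response.

    All field names are hypothetical; the disclosure does not define them.
    """
    app_sent: float          # application hands the request to the OS
    os_got_request: float    # OS has received the request
    hw_got_request: float    # hardware has received the request
    os_got_response: float   # OS has received the hardware's response
    app_got_response: float  # application has received the OS's response

    def latencies(self) -> Dict[str, float]:
        """Return the four time periods T1..T4 described for FIG. 3."""
        return {
            "T1": self.os_got_request - self.app_sent,           # app -> OS request
            "T2": self.hw_got_request - self.os_got_request,     # OS -> hardware request
            "T3": self.os_got_response - self.hw_got_request,    # hardware -> OS response
            "T4": self.app_got_response - self.os_got_response,  # OS -> app response
        }


# Example trace with made-up timestamps.
trace = RequestTrace(0.0, 2.0, 5.0, 15.0, 18.0)
print(trace.latencies())
```

- In this sketch a single clock is assumed for all five timestamps, so the differences are directly comparable.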
- FIG. 4 shows a method 400 that begins at block 401 and continues to block 402. At block 402, a client-event-based collection is performed. Information as to events occurring on a client-side information handling system is collected. As an example, error messages for errors occurring on the client-side information handling system can be collected. From block 402, method 400 continues to block 403. At block 403, timing information, such as latency information, for operations between two or more layers of the client-side information handling system is collected from the client-side information handling system. From block 403, method 400 continues to block 404. At block 404, obtained values are compared with a user-specific threshold value. From block 404, method 400 continues to decision block 405. At decision block 405, a decision is made as to whether or not a difference is found between the obtained values and the user-specific threshold value (e.g., whether or not the user-specific threshold value has been exceeded). If not, method 400 continues to block 408. If so, method 400 continues to block 406. At block 406, a comparison of the TUSER(X) values of different layers of the client-side information handling system is performed. From block 406, method 400 continues to decision block 407. At decision block 407, based on the comparison of block 406, a decision is made as to whether or not execution of software on the client-side information handling system has failed. If so, method 400 continues to block 409. If not, method 400 continues to block 408. At block 408, old behavior (e.g., behavior as indicated by values obtained prior to any indication of failure) is analyzed using machine learning (ML). From block 408, method 400 continues to block 409. At block 409, a root cause of failure of execution of software on the client-side information handling system is detected in a server-side (e.g., backend) information handling system. From block 409, method 400 continues to block 410. At block 410, based on the detected root cause, a solution is proposed and ML training is started. Through this training, relationships among such data elements as the obtained values, the detected root cause, and the proposed solution can be learned to improve future performance of the system when faced with similar (or different) problems. From block 410, method 400 continues to block 411. At block 411, the results are passed to a resolution entity. From block 411, method 400 continues to block 412. At block 412, the resolution entity analyzes the data and solves the problem. From block 412, method 400 continues to block 413, where method 400 ends.
- In diagnosing problems with technological systems, for example, information handling systems, one of the time-consuming tasks is to determine the area in which a problem lies and to forward the problem to a relevant solution mechanism. If the area in which a problem lies is incorrectly determined, the problem can be dispatched to the wrong solution mechanism, one intended for a different type of problem. Solution of the problem can then be delayed and complicated, which increases the turnaround time for providing an appropriate solution.
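- The branching of method 400 in FIG. 4 can be traced with a short sketch. The Python function below returns the block numbers visited for a given set of obtained values; the function name and parameters are hypothetical, and the outcome of decision block 407 is passed in as a boolean because the disclosure does not specify how that decision is computed.

```python
from typing import Dict, List


def method_400_path(obtained: Dict[str, float],
                    user_threshold: float,
                    software_failed: bool) -> List[int]:
    """Return the FIG. 4 block numbers visited, in order (illustrative only)."""
    path = [401, 402, 403, 404, 405]
    # Decision block 405: has any obtained value exceeded the threshold?
    if any(value > user_threshold for value in obtained.values()):
        path += [406, 407]
        # Decision block 407: on failure go straight to 409, otherwise via 408.
        if not software_failed:
            path.append(408)
    else:
        # Threshold not exceeded: analyze old behavior at block 408.
        path.append(408)
    # Blocks 409-413: root cause, proposed solution, hand-off, resolution, end.
    path += [409, 410, 411, 412, 413]
    return path
```

- For example, calling method_400_path({"t1": 15.0}, 10.0, True) follows the branch that skips block 408, while the same call with a threshold of 20.0 passes through block 408 without visiting blocks 406 and 407.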
- Much of the time spent addressing technological problems goes to finding the area of the problem, for example, whether the problem is a hardware problem, an operating system (OS) problem, or an application software problem. Any delay in determining the area of the problem leads to increased turnaround time to find the root cause.
- In accordance with at least one embodiment, a method and apparatus are provided to identify the area of an IHS within which a problem lies. In accordance with at least one embodiment, the area may be selected from a group consisting of the hardware of the IHS, the operating system of the IHS, and a software application of the IHS. In accordance with at least one embodiment, determination of the area in which the problem lies can be followed by selective automatic dispatch of the problem to a particular problem resolution entity among a plurality of problem resolution entities.
- In accordance with at least one embodiment, a method and apparatus are provided for a client-side IHS to gather information indicative of a location of a problem on the client-side IHS. On a client-side IHS, information regarding the flow of a user's request and the response of the IHS to the request can be collected. The user's request may, for example, be a request made via a software application. The software application may process the request and initiate action to be taken by the OS. The OS may, in turn, initiate action to be taken in a hardware-level component of the IHS. The hardware-level component may return a result to the OS. The OS may, in turn, return the result to the software application. The software application may cause action observable to the user. As such a sequence progresses, periods of time elapse from one stage to the next. The time gaps between the different layers for a single user's request and its response can be measured and collected, for example, by a system management application operating on the IHS. As the timing data are measured and collected during operation, they can be used to establish default threshold values of at least one combination of at least two of a time t1 for a request to be sent from a software application to the OS, a time t2 for a request to be sent from the OS to the hardware of the IHS, a time t3 for a reply to be sent from the hardware of the IHS to the OS, and a time t4 for a reply to be sent from the OS to the software application. The default threshold values can be specific to a particular user's IHS, generalized for a plurality of users' IHSs, or generalized for a plurality of users' IHSs but adjusted to tailor them to a particular user IHS. The default threshold values can be promulgated in conjunction with a system management application. The default threshold values can change, for example, being modified according to a particular user's way of using that particular user's IHS. 
Thus, user-specific threshold values can be utilized.
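- One possible way to establish default threshold values from timing data collected during operation is sketched below in Python. The rule used here (mean plus k standard deviations of samples from normal operation) is an assumption chosen for illustration; the disclosure does not prescribe a particular statistic, and the function and parameter names are hypothetical.

```python
import statistics
from typing import Dict, List


def default_thresholds(normal_samples: Dict[str, List[float]],
                       k: float = 3.0) -> Dict[str, float]:
    """Derive one threshold per time period (e.g. 't1'..'t4').

    Assumed rule: threshold = mean + k * population standard deviation
    of latencies observed while the IHS was operating normally.
    """
    return {
        name: statistics.fmean(values) + k * statistics.pstdev(values)
        for name, values in normal_samples.items()
    }


# Made-up latency samples in milliseconds.
thresholds = default_thresholds({"t1": [2.0, 2.0, 2.0], "t2": [1.0, 3.0]})
print(thresholds)
```

- Recomputing these values as new samples arrive is one way the thresholds could follow a particular user's way of using the IHS.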
- In accordance with at least one embodiment, a system management application can collect measurements of the time periods between different layers for a combination of at least two requests or replies. The system management application can upload the measurements to a backend server for storage, analysis, and determination of default threshold values. For example, the collected values can be analyzed with respect to an operational status of a client-side IHS from which they are obtained. For example, the client-side IHS may have a fully operational status, from which a first set of measurements are collected, but then descend into a degraded performance status, from which a second set of measurements are collected. A relationship of the second set of measurements to the first set of measurements can be used to determine the default threshold values. Such measurements and determinations can be made with respect to a particular user's IHS or to a plurality of users' IHSs. In the latter case, the plurality of data sets can be considered as part of an overall data set collected from different IHSs to analyze and determine the generic default threshold values applicable to a plurality of IHSs.
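- The relationship between the two sets of measurements could be used in many ways. As one hedged illustration, the Python sketch below places a threshold midway between the median latency observed in the fully operational status and the median observed in the degraded performance status. The midpoint rule and all names are assumptions, not the method of the disclosure.

```python
import statistics
from typing import List


def threshold_from_statuses(operational: List[float],
                            degraded: List[float]) -> float:
    """Place a threshold between typical healthy and typical degraded latency."""
    healthy = statistics.median(operational)
    unhealthy = statistics.median(degraded)
    if unhealthy <= healthy:
        # The midpoint rule only makes sense if degradation raises latency.
        raise ValueError("degraded latencies are not higher than operational ones")
    return (healthy + unhealthy) / 2.0
```

- With operational samples of 2, 3, and 4 ms and degraded samples of 10, 12, and 14 ms, this rule would give a threshold of 7.5 ms.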
- In accordance with at least one embodiment, the generic default threshold values can be adjusted or replaced to assure applicability to a particular IHS. As the generic default threshold values are obtained (e.g., averaged) from a large number of measurements of a large number of IHSs, there may be outlier IHSs for which adjusted values may provide better results. A determination of whether to use the generic default threshold values or to adjust the generic default threshold values can be made by comparing the measurements obtained from a particular user's IHS to the default threshold values. If the comparison indicates the particular user's IHS has measurements during normal operation that are inconsistent with the measurements during normal operation of a plurality of IHSs from which the default threshold values are obtained, it can be concluded that the default threshold values may not be ideal for determining when measurements of the particular user's IHS indicate abnormal operation, and the decision can be made to adjust the generic default threshold values to provide adjusted generic default threshold values tailored to the particular user's IHS. The collection of measurement data and the use of generic default threshold values can begin immediately upon implementation of an embodiment, and the adjusted generic default threshold values can continue to improve performance as usage of the particular user's IHS continues. The adjusted generic default threshold values can be used as specific threshold values, or specific threshold values can be established via a different route, for example, by evaluating measurement data from the particular user's IHS without dependence on generic default threshold values. The adaptive capabilities of at least one embodiment can provide improved results over time.
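- The decision of whether to keep or adjust a generic default threshold value can be sketched as follows. In this Python example, the generic value is kept when the normal-operation mean of the particular IHS lies within a tolerance of the mean of the plurality of IHSs, and is scaled proportionally otherwise. The tolerance, the proportional scaling, and the names are assumptions for illustration only.

```python
import statistics
from typing import List


def tailor_threshold(generic_threshold: float,
                     fleet_normal_mean: float,
                     ihs_normal_samples: List[float],
                     tolerance: float = 0.25) -> float:
    """Return the generic threshold, or a scaled one for an outlier IHS."""
    ratio = statistics.fmean(ihs_normal_samples) / fleet_normal_mean
    if abs(ratio - 1.0) <= tolerance:
        # This IHS behaves like the fleet: keep the generic default.
        return generic_threshold
    # Outlier IHS: scale the generic default by how far its normal differs.
    return generic_threshold * ratio
```

- An IHS whose normal latency is twice the fleet mean would, under this assumed rule, receive a threshold twice the generic default.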
- In accordance with at least one embodiment, a method and apparatus are provided for a server-side IHS to receive information indicative of a location of a problem on the client-side IHS. As an example, the information may be in the form of measurements of timing of processing of information for different areas of the client-side IHS. For example, the information may include the timing information comprising at least two of the t1, t2, t3, and t4 time periods described above. The timing information can be obtained during both a fully operational status and a degraded performance status of a client-side IHS. The timing information can be received from a single IHS or from a plurality of IHSs. Collected values from a client-side IHS can be used by a server-side (e.g., backend) IHS to guide dispatch of a problem with the client-side IHS for resolution of the problem.
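- A server-side sketch of using collected values to guide dispatch is shown below in Python. It picks the area whose latency most exceeds its threshold and looks up a resolution entity for that area. The routing table entries, the relative-excess rule, and the function names are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Optional

# Hypothetical mapping from problem area to a resolution entity.
RESOLUTION_ENTITIES = {
    "application": "application-support",
    "os": "os-support",
    "hardware": "hardware-support",
}


def identify_area(latencies: Dict[str, float],
                  thresholds: Dict[str, float]) -> Optional[str]:
    """Return the area whose latency most exceeds its threshold, or None."""
    excess = {
        area: latencies[area] / thresholds[area]
        for area in latencies
        if latencies[area] > thresholds[area]
    }
    if not excess:
        return None  # nothing exceeded its threshold
    return max(excess, key=excess.get)


def dispatch(latencies: Dict[str, float],
             thresholds: Dict[str, float]) -> Optional[str]:
    """Return the resolution entity for the identified area, if any."""
    area = identify_area(latencies, thresholds)
    return RESOLUTION_ENTITIES[area] if area is not None else None
```

- Returning None when no threshold is exceeded leaves room for other handling, such as the analysis of earlier behavior described for block 408.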
- In accordance with at least one embodiment, the following are collected values which can be used, as may be obtained from measurements made on a client-side IHS:
- $T_{\mathrm{USER}}(\mathrm{current}) = \{t_2, t_3, t_4\}$: the user's current behavior
- $T_{\mathrm{NORMAL}}$: the user-specific threshold value
- Therefore, the likelihood percentage is calculated as follows:
- Probability (%) that the issue is with the app: $\frac{T_{\mathrm{USER}}(1) - T_{\mathrm{NORMAL}}}{T_{\mathrm{USER}}(1)} \times 100$
- Probability (%) that the issue is with the OS: $\frac{T_{\mathrm{USER}}(2) - T_{\mathrm{NORMAL}}}{T_{\mathrm{USER}}(2)} \times 100$
- Probability (%) that the issue is with the h/w: $\frac{T_{\mathrm{USER}}(3) - T_{\mathrm{NORMAL}}}{T_{\mathrm{USER}}(3)} \times 100$
- where $T_{\mathrm{USER}}(X)$ can have values for X as follows:
- X=1: time lapse between Application and OS layer
- X=2: time lapse between OS and hardware layer
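- The likelihood calculation above can be sketched in Python as follows. The function applies the stated expression to each layer's current value and ranks the areas by the result. The function names and the dictionary keys are assumptions for illustration; note that the expression as given yields a negative value when the current value is below the threshold, and this sketch does not alter that.

```python
from typing import Dict, List, Tuple


def likelihood_percent(t_user: float, t_normal: float) -> float:
    """Compute [(TUSER(X) - TNORMAL) / TUSER(X)] * 100 as given in the text."""
    if t_user <= 0:
        raise ValueError("t_user must be positive")
    return (t_user - t_normal) / t_user * 100.0


def rank_areas(t_user: Dict[str, float],
               t_normal: float) -> List[Tuple[str, float]]:
    """Rank areas from most to least likely to contain the problem."""
    scores = {area: likelihood_percent(value, t_normal)
              for area, value in t_user.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up current values (milliseconds) against a threshold of 10 ms.
ranking = rank_areas({"app": 10.0, "os": 40.0, "h/w": 20.0}, t_normal=10.0)
print(ranking)
```

- With these made-up numbers the OS layer ranks first, since its current value exceeds the threshold by the largest fraction.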
- In accordance with at least one embodiment, an area of a problem within an IHS can be localized quickly and efficiently without having to perform traditional responsive remote access on the IHS to begin a process of finding the area of the problem within the IHS. Implementation of at least one embodiment can greatly reduce the time needed to solve a problem with an IHS by eliminating a need to examine areas of the IHS where the problem does not exist but by instead identifying the specific area of the IHS where the problem does exist.
- In accordance with at least one embodiment, at least two aspects of problem solving can be provided to expedite and simplify the solution of a problem in the IHS. Firstly, the area in which the problem exists can be identified. For example, the layer (e.g., software application, OS, or hardware) of the IHS at which the problem exists can be identified. Secondly, as the particular nature of the problem may have characteristics that yield recognizable effects on the timing measurements, the type of problem can, in at least some cases, be identified. The determination of the type of the problem can be correlated with the identified area of the problem, and correlation between the type and the area can provide confirmation (e.g., cross-confirmation) of those determinations. A lack of correlation between the type and the area can be used to cause further measurement, further analysis, indication of the lack of correlation, the like, or combinations thereof.
- In accordance with at least one embodiment, a method comprises obtaining timing information for processing among layers of a first client-side information handling system (IHS); comparing the timing information to threshold values to provide a comparison; using the comparison to identify an area of the first client-side IHS in which a problem exists in the first client-side IHS; and initiating remedial action directed to the problem in the area of the first client-side IHS. In accordance with at least one embodiment, the area is selected from a group consisting of a software application, an operating system (OS), and a hardware component of the first client-side IHS. In accordance with at least one embodiment, the timing information comprises a first time period representing a latency of a first request from the software application to the OS and a second time period representing a latency of a second request from the OS to the hardware component. In accordance with at least one embodiment, the timing information comprises a third time period representing a latency of a first response from the hardware component to the OS and a fourth time period representing a latency of a second response from the OS to the software application. In accordance with at least one embodiment, the threshold values are generic default threshold values generated from processing of information regarding timing obtained from a plurality of client-side IHSs. In accordance with at least one embodiment, the threshold values are user-specific threshold values generated by modifying generic default threshold values generated from the processing of information regarding timing obtained from a plurality of client-side IHSs, wherein the modifying comprises adjusting the generic default threshold values based on specific information of the first client-side IHS. 
In accordance with at least one embodiment, the initiating remedial action directed to the problem in the area of the first client-side IHS is performed using machine learning (ML).
- In accordance with at least one embodiment, an information handling system (IHS) comprises a memory; and a processor, the processor configured to obtain timing information for processing among layers of a first client-side information handling system (IHS), to compare the timing information to threshold values to provide a comparison, to use the comparison to identify an area of the first client-side IHS in which a problem exists in the first client-side IHS, and to initiate remedial action directed to the problem in the area of the first client-side IHS. In accordance with at least one embodiment, the area is selected from a group consisting of a software application, an operating system (OS), and a hardware component of the first client-side IHS. In accordance with at least one embodiment, the timing information comprises a first time period representing a latency of a first request from the software application to the OS and a second time period representing a latency of a second request from the OS to the hardware component. In accordance with at least one embodiment, the timing information comprises a third time period representing a latency of a first response from the hardware component to the OS and a fourth time period representing a latency of a second response from the OS to the software application. In accordance with at least one embodiment, the threshold values are generic default threshold values generated from processing of information regarding timing obtained from a plurality of client-side IHSs. In accordance with at least one embodiment, the threshold values are user-specific threshold values generated by modifying generic default threshold values generated from the processing of information regarding timing obtained from a plurality of client-side IHSs, wherein the modifying comprises adjusting the generic default threshold values based on specific information of the first client-side IHS. 
In accordance with at least one embodiment, the initiating remedial action directed to the problem in the area of the first client-side IHS is performed using machine learning (ML).
- In accordance with at least one embodiment, a method comprises obtaining timing information for processing among layers of a first client-side information handling system (IHS), the timing information comprising at least two of a first time period representing a latency of a first request from a software application of the client-side IHS to an operating system (OS) of the client-side IHS, a second time period representing a latency of a second request from the OS to a hardware component of the client-side IHS, a third time period representing a latency of a first response from the hardware component to the OS, and a fourth time period representing a latency of a second response from the OS to the software application; comparing the timing information to threshold values to provide a comparison; using the comparison to identify an area of the first client-side IHS in which a problem exists in the first client-side IHS; and initiating remedial action directed to the problem in the area of the first client-side IHS. In accordance with at least one embodiment, the area is selected from a group consisting of the software application, the operating system (OS), and the hardware component of the first client-side IHS. In accordance with at least one embodiment, the threshold values are generic default threshold values generated from processing of information regarding timing obtained from a plurality of client-side IHSs. In accordance with at least one embodiment, the threshold values are user-specific threshold values generated by modifying generic default threshold values generated from the processing of information regarding timing obtained from a plurality of client-side IHSs, wherein the modifying comprises adjusting the generic default threshold values based on specific information of the first client-side IHS. 
In accordance with at least one embodiment, the initiating remedial action directed to the problem in the area of the first client-side IHS is performed using machine learning (ML). In accordance with at least one embodiment, the generic default threshold values are downloaded to the client-side IHS.
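The comparison step of the method above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the segment names, the mapping from a slow segment to a problem area, the use of a 95th-percentile cut-off for the generic defaults, and the single `scale` factor standing in for "specific information of the first client-side IHS" are all hypothetical choices. The ML-driven remedial action is left out.

```python
from statistics import quantiles
from typing import Dict, List, Optional

# The four time periods named in the text; the key names are illustrative.
SEGMENTS = ("app_to_os", "os_to_hw", "hw_to_os", "os_to_app")

# Assumed attribution of a slow segment to an area of the IHS.
# The source does not prescribe this mapping.
SEGMENT_AREA = {
    "app_to_os": "software application",
    "os_to_hw": "operating system",
    "hw_to_os": "hardware component",
    "os_to_app": "operating system",
}


def generic_thresholds(fleet: Dict[str, List[float]]) -> Dict[str, float]:
    """Derive default thresholds from timing gathered on many client-side IHSs.

    Assumption: the threshold is the 95th percentile of each segment's samples.
    """
    return {
        seg: quantiles(samples, n=20, method="inclusive")[-1]
        for seg, samples in fleet.items()
    }


def user_thresholds(generic: Dict[str, float], scale: float) -> Dict[str, float]:
    """Adjust the generic defaults for one IHS (hypothetical uniform scaling)."""
    return {seg: value * scale for seg, value in generic.items()}


def identify_area(timing: Dict[str, float],
                  thresholds: Dict[str, float]) -> Optional[str]:
    """Return the area with the worst threshold violation, or None if none."""
    if len(timing) < 2:
        raise ValueError("at least two of the four time periods are required")
    worst_seg, worst_ratio = None, 1.0
    for seg, latency in timing.items():
        ratio = latency / thresholds[seg]  # > 1.0 means the threshold is exceeded
        if ratio > worst_ratio:
            worst_seg, worst_ratio = seg, ratio
    return SEGMENT_AREA[worst_seg] if worst_seg else None


# Usage: synthetic fleet data in milliseconds, then one client's measurements.
fleet = {seg: [float(ms) for ms in range(1, 102)] for seg in SEGMENTS}
defaults = generic_thresholds(fleet)
mine = user_thresholds(defaults, scale=1.5)
timing = {"app_to_os": 20.0, "os_to_hw": 30.0, "hw_to_os": 400.0, "os_to_app": 25.0}
print(identify_area(timing, mine))
```

Picking the segment with the largest latency-to-threshold ratio is one simple way to single out an area when several thresholds are exceeded at once; other tie-breaking rules are equally compatible with the text.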
- When referred to as a “device,” a “module,” a “unit,” a “controller,” or the like, the embodiments described herein can be configured as hardware. For example, a portion of an information handling system device may be hardware such as an integrated circuit (such as an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a structured ASIC, or a device embedded on a larger chip), a card (such as a Peripheral Component Interconnect (PCI) card, a PCI Express card, a Personal Computer Memory Card International Association (PCMCIA) card, or other such expansion card), or a system (such as a motherboard, a system-on-a-chip (SoC), or a stand-alone device).
- In accordance with various embodiments of the present disclosure, the methods described herein may be implemented by software programs executable by a computer system. Further, in an exemplary, non-limiting embodiment, implementations can include distributed processing, component/object distributed processing, and parallel processing. Alternatively, virtual computer system processing can be constructed to implement one or more of the methods or functionality as described herein.
- The present disclosure contemplates a computer-readable medium that includes instructions, or receives and executes instructions responsive to a propagated signal, so that a device connected to a network can communicate voice, video, or data over the network. Further, the instructions may be transmitted or received over the network via the network interface device.
- While the computer-readable medium is shown to be a single medium, the term “computer-readable medium” includes a single medium or multiple media, such as a centralized or distributed database, and/or associated caches and servers that store one or more sets of instructions. The term “computer-readable medium” shall also include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor or that cause a computer system to perform any one or more of the methods or operations disclosed herein.
- In a particular non-limiting, exemplary embodiment, the computer-readable medium can include a solid-state memory such as a memory card or other package that houses one or more non-volatile read-only memories.
- Further, the computer-readable medium can be a random access memory or other volatile re-writable memory. Additionally, the computer-readable medium can include a magneto-optical or optical medium, such as a disk, tape, or other storage device, to store information received via carrier wave signals such as a signal communicated over a transmission medium. A digital file attachment to an e-mail or other self-contained information archive or set of archives may be considered a distribution medium that is equivalent to a tangible storage medium. Accordingly, the disclosure is considered to include any one or more of a computer-readable medium or a distribution medium and other equivalents and successor media, in which data or instructions may be stored.
- Although only a few exemplary embodiments have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of the embodiments of the present disclosure. Accordingly, all such modifications are intended to be included within the scope of the embodiments of the present disclosure as defined in the following claims. In the claims, means-plus-function clauses are intended to cover the structures described herein as performing the recited function and not only structural equivalents, but also equivalent structures.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/685,303 US11334421B2 (en) | 2019-11-15 | 2019-11-15 | Method and apparatus to identify a problem area in an information handling system based on latencies |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/685,303 US11334421B2 (en) | 2019-11-15 | 2019-11-15 | Method and apparatus to identify a problem area in an information handling system based on latencies |
Publications (2)
Publication Number | Publication Date |
---|---|
US20210149760A1 (en) | 2021-05-20 |
US11334421B2 (en) | 2022-05-17 |
Family
ID=75910004
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/685,303 Active 2040-06-18 US11334421B2 (en) | 2019-11-15 | 2019-11-15 | Method and apparatus to identify a problem area in an information handling system based on latencies |
Country Status (1)
Country | Link |
---|---|
US (1) | US11334421B2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11934302B2 (en) * | 2022-01-05 | 2024-03-19 | Dell Products L.P. | Machine learning method to rediscover failure scenario by comparing customer's server incident logs with internal test case logs |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7725571B1 (en) | 1999-05-24 | 2010-05-25 | Computer Associates Think, Inc. | Method and apparatus for service analysis in service level management (SLM) |
US6886112B2 (en) * | 2002-06-28 | 2005-04-26 | Microsoft Corporation | Recovering from device failure |
US9060451B2 (en) | 2007-02-26 | 2015-06-16 | Google Inc. | Targeted cooling for datacenters |
US8220001B2 (en) * | 2009-02-13 | 2012-07-10 | Oracle International Corporation | Adaptive cluster timer manager |
WO2010113212A1 (en) * | 2009-03-31 | 2010-10-07 | 富士通株式会社 | Memory leak monitoring device and method |
ES2620311T3 (en) * | 2009-11-05 | 2017-06-28 | Amadeus S.A.S. | Method and system to adapt a session expiration period |
US10102491B2 (en) | 2014-05-27 | 2018-10-16 | Genesys Telecommunications Laboratories, Inc. | System and method for bridging online customer experience |
US9798624B2 (en) | 2015-06-23 | 2017-10-24 | Dell Products, L.P. | Automated fault recovery |
US9794158B2 (en) | 2015-09-08 | 2017-10-17 | Uber Technologies, Inc. | System event analyzer and outlier visualization |
EP3388944A1 (en) * | 2017-04-13 | 2018-10-17 | TTTech Computertechnik AG | Method for error detection within an operating system |
EP3690652B1 (en) * | 2017-10-13 | 2023-08-30 | Huawei Technologies Co., Ltd. | Fault processing method for terminal device and terminal device |
US10360012B2 (en) | 2017-11-09 | 2019-07-23 | International Business Machines Corporation | Dynamic selection of deployment configurations of software applications |
US11593583B2 (en) * | 2019-06-28 | 2023-02-28 | Oracle International Corporation | Method and system to implement cluster failure prediction to facilitate split brain resolution |
- 2019-11-15: US application US16/685,303 (patent US11334421B2), status: Active
Also Published As
Publication number | Publication date |
---|---|
US11334421B2 (en) | 2022-05-17 |
Similar Documents
Publication | Title |
---|---|
US10599536B1 (en) | Preventing storage errors using problem signatures |
US10310749B2 (en) | System and method for predicting disk failure |
US7340649B2 (en) | System and method for determining fault isolation in an enterprise computing system |
US10146651B2 (en) | Member replacement in an array of information storage devices |
US10891181B2 (en) | Smart system dump |
US10037238B2 (en) | System and method for encoding exception conditions included at a remediation database |
US8195619B2 (en) | Extent reference count update system and method |
US9697068B2 (en) | Building an intelligent, scalable system dump facility |
US7613861B2 (en) | System and method of obtaining error data within an information handling system |
US11500707B2 (en) | Controller, memory controller, storage device, and method of operating the controller |
US7870441B2 (en) | Determining an underlying cause for errors detected in a data processing system |
US20140143768A1 (en) | Monitoring updates on multiple computing platforms |
US11334421B2 (en) | Method and apparatus to identify a problem area in an information handling system based on latencies |
US10768853B2 (en) | Information handling system with memory flush during shut down |
US9792168B2 (en) | System and method for cloud remediation of a client with a non-bootable storage medium |
US10635554B2 (en) | System and method for BIOS to ensure UCNA errors are available for correlation |
US9411695B2 (en) | Provisioning memory in a memory system for mirroring |
US10817365B2 (en) | Anomaly detection for incremental application deployments |
US20120023379A1 (en) | Storage device, storage system, and control method |
US10534683B2 (en) | Communicating outstanding maintenance tasks to improve disk data integrity |
US20230035666A1 (en) | Anomaly detection in storage systems |
JP6946716B2 (en) | Storage controller, storage control program and storage control method |
US20230409423A1 (en) | Collection of forensic data after a processor freeze |
US11481305B2 (en) | Method and apparatus for detecting a monitoring gap for an information handling system |
CN111831389B (en) | Data processing method, device and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: DELL PRODUCTS, LP, TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SETHI, PARMINDER SINGH;SANTOSH, ABHISHEK;SAXENA, ANSHUL;REEL/FRAME:051021/0777 Effective date: 20191107 |
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
AS | Assignment | Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT, TEXAS Free format text: PATENT SECURITY AGREEMENT (NOTES);ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:052216/0758 Effective date: 20200324 |
AS | Assignment | Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, NORTH CAROLINA Free format text: SECURITY AGREEMENT;ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:052243/0773 Effective date: 20200326 |
AS | Assignment | Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:053546/0001 Effective date: 20200409 |
AS | Assignment | Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT, TEXAS Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC CORPORATION;EMC IP HOLDING COMPANY LLC;REEL/FRAME:053311/0169 Effective date: 20200603 |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
AS | Assignment | Owner name: EMC IP HOLDING COMPANY LLC, TEXAS Free format text: RELEASE OF SECURITY INTEREST AF REEL 052243 FRAME 0773;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058001/0152 Effective date: 20211101 Owner name: DELL PRODUCTS L.P., TEXAS Free format text: RELEASE OF SECURITY INTEREST AF REEL 052243 FRAME 0773;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058001/0152 Effective date: 20211101 |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
AS | Assignment | Owner name: EMC IP HOLDING COMPANY LLC, TEXAS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742 Effective date: 20220329 Owner name: EMC CORPORATION, MASSACHUSETTS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742 Effective date: 20220329 Owner name: DELL PRODUCTS L.P., TEXAS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742 Effective date: 20220329 Owner name: EMC IP HOLDING COMPANY LLC, TEXAS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (052216/0758);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0680 Effective date: 20220329 Owner name: DELL PRODUCTS L.P., TEXAS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (052216/0758);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0680 Effective date: 20220329 |
CC | Certificate of correction | |