EP1381952A2 - Panic message analyzer - Google Patents
- Publication number
- EP1381952A2 (application EP01973104A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- message
- bugs
- customer
- database
- version
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Prevention of errors by analysis, debugging or testing of software
- G06F11/362—Debugging of software
- G06F11/366—Debugging of software using diagnostics
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/2294—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing by remote test
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/324—Display of status information
- G06F11/327—Alarm or error message display
Definitions
- This invention relates to analysis of panic messages from network servers.
- a first known method to enable reporting of a software application error is to provide a pre-public release of a software package to a select group of customers for "beta testing." During this trial period, customers report to the company any problems that they encounter, and the software engineers at the company fix the bugs and provide updated versions of the software to the beta testers, who continue testing with the new version. This process continues for a short testing period until the software is hopefully error free. While this first known method provides reporting of software bugs to a manufacturer, it suffers from several drawbacks. First, it provides no method for automatically reporting the problem to the manufacturer. It relies solely on the beta tester to inform the manufacturer. Second, it provides no automated analysis of a problem identified by a beta tester. That is, it requires an employee at the manufacturer to determine whether the problem has already been reported, has been fixed, or is a new problem. Third, it provides no method for delivery of updated software to a user who is determined to be using older software with an identified and fixed problem.
- a second known method of reporting computer system errors is to rely on the end user to call the manufacturer and report a problem when it occurs.
- the customer is provided a customer support line that they may call to report problems they are having.
- the manufacturer may conclude there is a problem with some portion of a program.
- While this second known method provides reporting of software bugs to a manufacturer it suffers from several drawbacks.
- the customer may decide not to call, as customer support calls tend to involve long waits on hold listening to Muzak and often provide no relief, because the manufacturer has no formal structure in place to coordinate and analyze the calls it receives.
- the customer may not be knowledgeable enough to provide the manufacturer with the necessary information they need to diagnose the problem, or worse, they may misinform the manufacturer as to the origin of the problem.
- the invention includes a system and method for analyzing panic messages from computer systems that have suffered failures.
- a filer server dedicated to file storage and retrieval
- This message is indicative of the problem that caused the filer to crash.
- This message is sent to the manufacturer via a communications network such as the Internet.
- the message also includes other information, such as the user's name, the version of the software, a back trace, and a mini core dump.
- automatic analysis commences to determine if the bug can be identified.
- the panic message is analyzed by comparing it against a database of panic messages that correspond with known bugs. If successful, automated housekeeping occurs which includes updating this instance in a tracking database, delivery of an answer to the customer (including solutions), updating analysis statistics, and additional activities. If unsuccessful the process continues.
- a back trace analyzer analyzes the back trace using an expression algorithm that looks for exact matches on function names and recognized sequences of matches that correspond to known bugs. If successful, automated housekeeping occurs as indicated above. If unsuccessful, the process continues.
- a core script analyzer analyzes a core dump for recognizable patterns of code that correspond to known bugs. If successful, automated housekeeping occurs as indicated above. If unsuccessful, the process continues.
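The three analysis stages summarized above form a cascade in which the first stage to recognize a known bug ends the search. A minimal Python sketch of that control flow follows. The names `Analyzer`, `run_cascade`, the toy stage functions, and the `BUG-n` identifiers are hypothetical illustrations and do not come from the patent; real stages would consult a bugs database rather than hard-coded values.

```python
from typing import Callable, Optional

# Each analyzer takes a report (a plain dict here, which is an assumption) and
# returns a bug identifier if it recognizes a known bug, otherwise None.
Analyzer = Callable[[dict], Optional[str]]


def run_cascade(
    report: dict, stages: list[tuple[str, Analyzer]]
) -> tuple[Optional[str], Optional[str]]:
    """Try each stage in order; return (stage_name, bug_id) of the first hit."""
    for name, analyzer in stages:
        bug_id = analyzer(report)
        if bug_id is not None:
            return name, bug_id
    return None, None  # no stage recognized the failure


# Toy stand-ins for the three analyzers (hypothetical matching rules).
def panic_stage(report: dict) -> Optional[str]:
    return {"disk timeout": "BUG-1"}.get(report.get("panic", ""))


def back_trace_stage(report: dict) -> Optional[str]:
    known = ["read", "cache_get", "panic"]
    return "BUG-2" if report.get("back_trace") == known else None


def core_stage(report: dict) -> Optional[str]:
    return "BUG-3" if b"\xde\xad" in report.get("core", b"") else None


STAGES = [
    ("panic message", panic_stage),
    ("back trace", back_trace_stage),
    ("core script", core_stage),
]
```

Ordering the stages from cheapest to most expensive means the full core dump is only examined when the lighter-weight checks have failed, which mirrors the order given in the text.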
- Figure 1 illustrates a block diagram of a system for a panic message analyzer.
- Figure 2 illustrates a panic message analyzer process in a system for a panic message analyzer.
- Figure 4 illustrates a core dump process in a system for a panic message analyzer.
- Embodiments of the invention can be implemented using general purpose processors or special purpose processors operating under program control, or other circuits, adapted to particular process steps and data structures described herein. Implementation of the process steps and data structures described herein would not require undue experimentation or further investigation.
- filer - This term refers to a file server.
- a file server is a computer and storage device dedicated to data storage and retrieval.
- Core dump - A core dump is the printing, or the copying to a more permanent medium (such as a hard disk), of the contents of random access memory at one moment in time.
- Figure 1 shows a block diagram of a system for a panic message analyzer.
- a system 100 includes a client device 110 associated with a customer, a communications link 120, a communications network 130, a server device 140 associated with a manufacturer, a mass storage 150, a housekeeping database 151, a bugs database 152, and a core dump 160.
- the client device 110 includes a processor, a main memory, and software for executing instructions (not shown, but understood by one skilled in the art). Although the client device 110 and server device 140 are shown as separate devices there is no requirement that they be separate devices.
- the communications link 120 operates to couple the client device 110 to the communications network 130.
- the server device 140 includes a processor, a main memory, software for executing instructions (not shown, but understood by one skilled in the art), and a mass storage 150.
- although the client device 110 and server device 140 are shown as separate devices, there is no requirement that they be separate devices.
- although the server device 140 and mass storage 150 are shown as combined, there is no requirement that they be combined. They could be separate devices.
- the mass storage 150 includes the housekeeping database 151 and bugs database 152.
- the core dump 160 includes a mini core dump 161, a back-trace 162, and a panic message 163.
- FIG. 2 illustrates a panic message analyzer process, indicated by general reference character 200.
- the manual panic message analyzer process 200 initiates at a 'start' terminal 201.
- the panic message analyzer process 200 continues to a 'panic message created' procedure 203 which allows the customer's device to create a panic message 163 prior to failure.
- a 'customer submits panic message' procedure 205 allows the customer to submit the panic message 163 for analysis utilizing the client device 110 to transmit the panic message 163 to the server device 140.
- the customer submits the message via interaction and transfer over an Internet connection, which is well-known in the art. There is, however, no requirement that the panic message 163 be transferred by this method, as long as it is delivered to the manufacturer.
- An 'analyze panic message' procedure 207 allows the panic message 163 to be analyzed by comparing recognized data elements it contains (a panic message includes the address of where a system was last operating, line numbers, text and source code filenames, and other data) against known data elements that correspond to known bugs in the bugs database 152 on the server device 140.
- a 'known bug?' decision procedure 209 determines whether the panic message identifies a known bug. If the 'known bug?' decision procedure 209 determines that the bug is a known bug, the panic message analyzer process 200 continues to a 'solution to customer' procedure 213.
- the 'solution to customer' procedure 213 extracts a solution from the database which is associated with the bug identified by the 'known bug' decision procedure
- the solution provided to the customer can be written instructions detailing how to fix and avoid further occurrences, a copy of a software program to fix the problem, or recommendations for the purchase of additional products from the manufacturer that fix the problem.
- An 'automatic housekeeping' procedure 215 records all relevant information regarding identification/non-identification of the bug, the solution sent to the customer (if any), and statistics relating to these events in the housekeeping database 151. If the panic message analyzer failed to diagnose the problem, the 'automatic housekeeping' procedure leaves the case active (i.e. marked as unresolved).
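The comparison performed by the 'analyze panic message' procedure and the record written by 'automatic housekeeping' can be sketched as follows. This is an illustrative sketch only: the field names, the in-memory dictionary and list standing in for the bugs database 152 and housekeeping database 151, and the sample bug entry are all assumptions rather than details from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PanicSignature:
    """Recognized data elements of a panic message (hypothetical field set)."""
    source_file: str
    line: int
    text: str


# Stand-in for bugs database 152: signature -> (bug id, solution text).
BUGS_DB = {
    PanicSignature("raid.c", 10, "disk timeout"): ("BUG-1", "Upgrade to the fixed release."),
}

# Stand-in for housekeeping database 151: one record per submitted panic.
HOUSEKEEPING: list[dict] = []


def handle_panic(sig: PanicSignature) -> Optional[str]:
    """Exact-match lookup, then record the outcome; returns the solution or None."""
    hit = BUGS_DB.get(sig)
    bug_id, solution = hit if hit else (None, None)
    HOUSEKEEPING.append({
        "signature": sig,
        "bug_id": bug_id,
        "solution": solution,
        # Unresolved cases stay active so that later analysis can pick them up.
        "status": "resolved" if hit else "active",
    })
    return solution
```

Because the lookup key is the full signature, a panic that differs in any one element (for example, the line number) is recorded as active rather than matched.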
- FIG. 3 illustrates an auto support process, indicated by general reference character 300.
- the auto support process 300 initiates at a 'start' terminal 301.
- the auto support process 300 continues to an 'auto support message sent' procedure 303 which allows the client device 110 to automatically send a message to the server device 140 containing a copy of the panic message 163 and mini core dump 161.
- An 'auto support message received' procedure 305 allows the server device 140 to receive the panic message 163 and mini core dump 161 from the client device 110.
- An 'analyze panic message' procedure 307 allows the panic message 163 to be analyzed by comparing recognized data elements it contains (a panic message includes the address of where a system was last operating, line numbers, text and source code filenames, and other data) against known data elements that correspond to known bugs in the bugs database 152 on the server device 140.
- a 'known panic bug?' decision procedure 309 determines whether the panic message identifies a known bug. If the 'known panic bug?' decision procedure 309 determines that the bug is a known bug, the auto support process 300 continues to a 'discard mini core dump' procedure 321.
- An 'extract back-trace' procedure 311 extracts the back-trace 162 from the mini core dump 161.
- An 'analyze back-trace' procedure 313 allows the back-trace 162 to be analyzed using an expression algorithm that looks for exact matches on function names and recognized sequences of function names that correspond to known bugs in the bugs database 152 on the server device 140.
- a 'known back-trace bug?' decision procedure 315 determines whether the back-trace 162 identifies a known bug. If the 'known back-trace bug?' decision procedure 315 determines that the bug is a known bug, the auto support process 300 continues to a "discard mini core dump" procedure 321.
- a 'request core dump' 317 procedure notifies the customer that a core dump
- This notification includes all the instructions necessary to create the core dump 160 and deliver it to the manufacturer.
- the notification would be sent electronically to the customer; however, there is no requirement that notification be accomplished in this manner.
- An 'automatic housekeeping' procedure 319 records all relevant information regarding identification/non-identification of the bug, the solution sent to the customer (if any), and statistics relating to these events in the housekeeping database 151. If the panic message analyzer failed to diagnose the problem, the 'automatic housekeeping' procedure leaves the case active (i.e. marked as unresolved).
- the panic message analyzer would not identify it in version two if the bug now appeared at line 20 due to the exact matching methodology used.
- the back-trace analyzer might identify the bug as it uses a more sophisticated approach, and it would then pass this information to the panic message analyzer.
- the auto support process 300 terminates through an 'end' terminal 325.
- a 'discard mini core dump' procedure 321 causes the mini core dump 161 to be discarded as it is no longer needed due to identification of the bug.
- a 'solution sent to customer' procedure 323 causes a solution to be extracted from the bugs database 152 which is associated with the identified bug.
- the solution provided to the customer varies depending on the bug identified. For example, it can be written instructions detailing how to fix and avoid further occurrences, a copy of a software program to fix the problem, or recommendations for the purchase of additional products from the manufacturer that fix the problem.
- the auto support process 300 continues to an 'automatic housekeeping' procedure 319.
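The version-two example in the FIG. 3 discussion above, in which a bug moves to a different line and defeats exact matching, can be made concrete with a small sketch. The matching rule used here (the known function names must appear in the back-trace in order, though not necessarily adjacent) is one assumed reading of the 'expression algorithm'; the text does not spell out the algorithm, and all function names, file names, and line numbers below are hypothetical.

```python
def exact_panic_match(known: tuple[str, int], observed: tuple[str, int]) -> bool:
    """Exact match on (source file, line number); any shift in the line breaks it."""
    return known == observed


def back_trace_match(pattern: list[str], back_trace: list[str]) -> bool:
    """True if every function name in `pattern` occurs in `back_trace`, in order."""
    frames = iter(back_trace)
    # `name in frames` consumes the iterator up to the match, which enforces order.
    return all(name in frames for name in pattern)


known_panic = ("raid.c", 10)                     # signature recorded for version one
pattern = ["raid_read", "disk_wait", "panic"]    # known call sequence for the same bug

v2_panic = ("raid.c", 20)                        # same bug, shifted line in version two
v2_trace = ["main", "raid_read", "log", "disk_wait", "panic"]
```

With this sample data, `exact_panic_match(known_panic, v2_panic)` fails while `back_trace_match(pattern, v2_trace)` succeeds, because function names and their call order are more stable across versions than line numbers.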
- FIG. 4 illustrates a core dump process, indicated by general reference character 400.
- the core dump process 400 initiates at a 'start' terminal 401.
- the core dump process 400 continues to a 'core arrives from customer' procedure 403 which allows analysis of the core dump 160 to begin.
- the core dump 160 is requested by a 'request core dump' procedure 317 (illustrated in Figure 3) when prior analysis of the panic message 163 and back-trace 162 has failed.
- An 'analyze panic message' procedure 405 allows the panic message 163 to be analyzed by comparing recognized data elements it contains (a panic message includes the address of where a system was last operating, line numbers, text and source code filenames, and other data) against known data elements that correspond to known bugs in the bugs database 152 on the server device 140.
- a 'known panic bug?' decision procedure 407 determines whether the panic message identifies a known bug. If the 'known panic bug?' decision procedure 407 determines that the bug is a known bug, the core dump process 400 continues to a 'store core dump' procedure 423.
- An 'extract back-trace' procedure 409 extracts the back-trace 162 from the core dump 160.
- An 'analyze back-trace' procedure 411 allows the back-trace 162 to be analyzed using an expression algorithm that looks for exact matches on function names and recognized sequences of function names that correspond to known bugs within the bugs database 152.
- a 'known back-trace bug?' decision procedure 413 determines whether the back-trace 162 identifies a known bug. If the 'known back-trace bug?' decision procedure 413 determines that the bug is a known bug, the core dump process 400 continues to a 'store core dump' procedure 423.
- a 'core script analyzer' procedure 415 automatically analyzes the core dump
- a 'known core bug?' decision procedure 417 determines whether core script analysis has identified a known bug. If the 'known core bug?' decision procedure 417 determines it has identified a known core bug, the core dump process 400 continues to a 'store core dump' procedure 423.
- a 'manual core dump analysis' procedure 419 allows the core dump 160 to be analyzed manually by personnel at the manufacturer.
- a 'manual solution sent to customer' procedure 421 allows personnel at the manufacturer to send a solution to the customer based on the manual analysis of the core dump 160.
- the core dump process 400 continues to an 'automatic housekeeping' procedure
- a 'store core dump' procedure 423 allows the mini core dump 161 to be moved to a storage location.
- a 'solution sent to customer' procedure 425 causes a solution to be extracted from the bugs database 152 which is associated with the identified bug.
- the solution provided to the customer varies depending on the bug identified. For example, it can be written instructions detailing how to fix and avoid further occurrences, a copy of a software program to fix the problem, or recommendations for the purchase of additional products from the manufacturer that fix the problem.
- An 'automatic housekeeping' procedure 427 records all relevant information regarding identification/non-identification of the bug, the solution sent to the customer (if any), statistics relating to these events, and any entries necessary to the bugs database 152.
- functionality exists that allows the back-trace analyzer to teach the panic message analyzer about the bug. This allows future instances of the bug to be resolved at an earlier stage.
- functionality exists that allows the core to teach the back-trace analyzer and panic message analyzer about the bug. This allows future instances of the bug to be resolved at an earlier stage.
- the core dump process 400 terminates through an 'end' terminal 429.
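The 'teaching' step described above, in which a later stage hands what it learned to an earlier one, might look like the following. The sketch assumes each analyzer keeps its own table of known signatures and matches by exact lookup; the class, method, and function names are hypothetical and are not taken from the patent.

```python
from typing import Hashable, Optional


class SignatureAnalyzer:
    """Analyzer that recognizes bugs by exact lookup of a hashable signature."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.known: dict[Hashable, str] = {}

    def analyze(self, signature: Hashable) -> Optional[str]:
        return self.known.get(signature)

    def teach(self, signature: Hashable, bug_id: str) -> None:
        """Record a signature so the same failure is caught here next time."""
        self.known[signature] = bug_id


def resolve(
    panic_sig: Hashable,
    trace_sig: Hashable,
    panic_an: SignatureAnalyzer,
    trace_an: SignatureAnalyzer,
) -> Optional[str]:
    """Try the panic stage first; on a back-trace hit, teach the panic stage."""
    bug_id = panic_an.analyze(panic_sig)
    if bug_id is None:
        bug_id = trace_an.analyze(trace_sig)
        if bug_id is not None:
            panic_an.teach(panic_sig, bug_id)
    return bug_id
```

After one pass through `resolve` in which only the back-trace stage matches, a repeat of the same panic signature is answered by the panic stage directly, which is the "resolved at an earlier stage" effect the text describes.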
- the invention has general applicability to various fields of use, not necessarily related to the services described above.
- these fields of use can include one or more of, or some combination of, the following:
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
- Automatic Analysis And Handling Materials Therefor (AREA)
- Stored Programmes (AREA)
- Information Transfer Between Computers (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US65820800A | 2000-09-08 | 2000-09-08 | |
| US658208 | 2000-09-08 | | |
| PCT/US2001/029049 WO2002021281A2 (en) | 2000-09-08 | 2001-09-10 | Panic message analyzer |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP1381952A2 (de) | 2004-01-21 |
Family
ID=24640348
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP01973104A Ceased EP1381952A2 (de) | 2000-09-08 | 2001-09-10 | Panic message analyzer |
Country Status (4)
| Country | Link |
|---|---|
| EP (1) | EP1381952A2 (de) |
| JP (1) | JP4979176B2 (de) |
| CA (1) | CA2420008C (de) |
| WO (1) | WO2002021281A2 (de) |
Families Citing this family (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7174352B2 (en) | 1993-06-03 | 2007-02-06 | Network Appliance, Inc. | File system image transfer |
| US6138126A (en) | 1995-05-31 | 2000-10-24 | Network Appliance, Inc. | Method for allocating files in a file system integrated with a raid disk sub-system |
| US7343529B1 (en) | 2004-04-30 | 2008-03-11 | Network Appliance, Inc. | Automatic error and corrective action reporting system for a network storage appliance |
| EP2232367A4 (de) | 2007-12-12 | 2011-03-09 | Univ Washington | Deterministic multiprocessing |
| WO2009114645A1 (en) * | 2008-03-11 | 2009-09-17 | University Of Washington | Efficient deterministic multiprocessing |
| US8453120B2 (en) | 2010-05-11 | 2013-05-28 | F5 Networks, Inc. | Enhanced reliability using deterministic multiprocessing-based synchronized replication |
| CN109542657A (zh) * | 2018-10-16 | 2019-03-29 | 深圳壹账通智能科技有限公司 | Method and server for handling system exceptions |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5111384A (en) * | 1990-02-16 | 1992-05-05 | Bull Hn Information Systems Inc. | System for performing dump analysis |
| US5293612A (en) * | 1989-05-11 | 1994-03-08 | Tandem Computers Incorporated | Selective dump method and apparatus |
| EP0586767A1 (de) * | 1992-09-11 | 1994-03-16 | International Business Machines Corporation | Selective data capture for software exception conditions |
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH0291735A (ja) * | 1988-09-28 | 1990-03-30 | Tohoku Nippon Denki Software Kk | Remote fault maintenance management system |
| JPH04335449A (ja) * | 1991-05-13 | 1992-11-24 | Nec Corp | Terminal fault information collection method |
| SE470031B (sv) * | 1991-06-20 | 1993-10-25 | Icl Systems Ab | System and method for monitoring and changing the operation of a computer system |
| JPH05334135A (ja) * | 1992-05-28 | 1993-12-17 | Nec Corp | Error information display method at abnormal program termination |
| US5761407A (en) * | 1993-03-15 | 1998-06-02 | International Business Machines Corporation | Message based exception handler |
| JP2701807B2 (ja) * | 1995-09-13 | 1998-01-21 | 日本電気株式会社 | Fault notification device |
| JPH10228395A (ja) * | 1997-02-17 | 1998-08-25 | Sekisui Chem Co Ltd | Abnormality diagnosis device for a control controller |
| US6073255A (en) * | 1997-05-13 | 2000-06-06 | Micron Electronics, Inc. | Method of reading system log |
| JPH1124961A (ja) * | 1997-07-08 | 1999-01-29 | Nippon Denki Joho Service Kk | Computer maintenance system |
| JPH1139259A (ja) * | 1997-07-15 | 1999-02-12 | Casio Comput Co Ltd | Information processing device and recording medium storing a program |
| JP3525410B2 (ja) * | 1998-12-16 | 2004-05-10 | 富士通株式会社 | Fault recovery method and computer-readable program recording medium therefor |
| JP2000181734A (ja) * | 1998-12-16 | 2000-06-30 | Fujitsu Ltd | Method for repairing a program reference area, repair system, program-running-side device, program fault handling device, and computer-readable program recording medium therefor |
-
2001
- 2001-09-10 EP EP01973104A patent/EP1381952A2/de not_active Ceased
- 2001-09-10 WO PCT/US2001/029049 patent/WO2002021281A2/en not_active Ceased
- 2001-09-10 JP JP2002524828A patent/JP4979176B2/ja not_active Expired - Fee Related
- 2001-09-10 CA CA2420008A patent/CA2420008C/en not_active Expired - Fee Related
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5293612A (en) * | 1989-05-11 | 1994-03-08 | Tandem Computers Incorporated | Selective dump method and apparatus |
| US5111384A (en) * | 1990-02-16 | 1992-05-05 | Bull Hn Information Systems Inc. | System for performing dump analysis |
| EP0586767A1 (de) * | 1992-09-11 | 1994-03-16 | International Business Machines Corporation | Selective data capture for software exception conditions |
Non-Patent Citations (1)
| Title |
|---|
| See also references of WO0221281A3 * |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2002021281A3 (en) | 2003-11-06 |
| CA2420008C (en) | 2012-04-03 |
| CA2420008A1 (en) | 2002-03-14 |
| WO2002021281A2 (en) | 2002-03-14 |
| JP4979176B2 (ja) | 2012-07-18 |
| JP2004524596A (ja) | 2004-08-12 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US7475387B2 (en) | Problem determination using system run-time behavior analysis | |
| US7984007B2 (en) | Proactive problem resolution system, method of proactive problem resolution and program product therefor | |
| US7328376B2 (en) | Error reporting to diagnostic engines based on their diagnostic capabilities | |
| US6859893B2 (en) | Service guru system and method for automated proactive and reactive computer system analysis | |
| US8140565B2 (en) | Autonomic information management system (IMS) mainframe database pointer error diagnostic data extraction | |
| US7080287B2 (en) | First failure data capture | |
| US8250563B2 (en) | Distributed autonomic solutions repository | |
| US8244792B2 (en) | Apparatus and method for information recovery quality assessment in a computer system | |
| US7007200B2 (en) | Error analysis fed from a knowledge base | |
| US7305465B2 (en) | Collecting appliance problem information over network and providing remote technical support to deliver appliance fix information to an end user | |
| US20050081118A1 (en) | System and method of generating trouble tickets to document computer failures | |
| US20050022176A1 (en) | Method and apparatus for monitoring compatibility of software combinations | |
| US20040236843A1 (en) | Online diagnosing of computer hardware and software | |
| US20160026547A1 (en) | Generating predictive diagnostics via package update manager | |
| US20070038896A1 (en) | Call-stack pattern matching for problem resolution within software | |
| JPH0644242B2 (ja) | Problem-solving method in a computer system | |
| CN101918922A (zh) | System and method for automated data anomaly correction in a computer network | |
| NZ526097A (en) | Online diagnosing of computer hardware and software from a remote location without requiring human assistance | |
| US6957366B1 (en) | System and method for an interactive web-based data catalog for tracking software bugs | |
| US20060088027A1 (en) | Dynamic log for computer systems of server and services | |
| CN111444101A (zh) | Method and apparatus for automatically creating product test defects | |
| CA2420008C (en) | Panic message analyzer | |
| US20070011541A1 (en) | Methods and systems for identifying intermittent errors in a distributed code development environment | |
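| JP2003345628A (ja) | Method for collecting fault investigation data, system for implementing it, and processing program therefor | |
| CN114371870B (zh) | Code scanning and submission method, and code scanning server, client, and server side | |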
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20030404 |
| | AK | Designated contracting states | Kind code of ref document: A2; Designated state(s): DE FR GB IT NL |
| | RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: NETWORK APPLIANCE, INC. |
| | REG | Reference to a national code | Ref country code: DE; Ref legal event code: R003 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
| | 18R | Application refused | Effective date: 20111206 |