US7925745B2 - Monitoring apparatus, executive program, and information processing system - Google Patents
Monitoring apparatus, executive program, and information processing system
- Publication number
- US7925745B2 (application US12/230,412)
- Authority
- US
- United States
- Prior art keywords
- breakdown
- data
- section
- storage
- information processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0766—Error or fault reporting or storing
- G06F11/0781—Error filtering or prioritizing based on a policy defined by the user or on a policy defined by a hardware/software module, e.g. according to a severity level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0748—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a remote unit communicating with a single-box computer node experiencing an error/fault
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3476—Data logging
Definitions
- the present invention relates to a monitoring apparatus that monitors operations of an information processing apparatus, an executive program that causes a computer to operate as the monitoring apparatus, and an information processing system comprising an information processing apparatus and a monitoring apparatus that monitors operations of the information processing apparatus.
- Conventionally, there is known a server system in which an information processing apparatus having a server function and a monitoring apparatus that constantly observes the state of the information processing apparatus are installed in one case (refer to non-patent document 1, for instance).
- When a breakdown occurs in the information processing apparatus, it notifies the monitoring apparatus installed in the server system that the breakdown has occurred. Upon receipt of the notification, the monitoring apparatus gathers various types of sub-data representing the operating states of the individual sections of the information processing apparatus at the time of the breakdown, for instance by accessing those sections. The monitoring apparatus then stores the two or more sets of sub-data thus gathered in a predetermined storage area of the monitoring apparatus, in the form of state data representing the state of the information processing apparatus at the time the breakdown occurred.
- The information processing apparatus is often designed to continue operating even if a breakdown occurs in it, so that the client-server system can be managed with as little interruption as possible.
- The monitoring apparatus therefore gathers and stores state data from the information processing apparatus whenever a breakdown occurs. The state data stored up to that time is analyzed at regular maintenance and the like. If, for instance, a part is then found to be in a state that causes a significant drop in processing performance or causes the server to go down, the part is repaired or exchanged for a new one.
- FIG. 13 is a view useful for understanding an example of a conventional storage method for the state data of the monitoring apparatus.
- A storage area 800 shown in FIG. 13 is divided into N divisions of equal size, each of which stores one piece of state data.
- The divisions are numbered #1, #2, #3, . . . , #N.
- The state data gathered by the monitoring apparatus whenever a breakdown occurs is stored sequentially, starting from the division with the lowest number. At that time, a serial number is assigned to each piece of state data, like log 1, log 2, log 3, . . . , log N.
- Part (a) of FIG. 13 shows a state in which all divisions are empty.
- Part (b) of FIG. 13 shows a state in which only one division is occupied by a piece of state data.
- Part (c) of FIG. 13 shows a state in which all divisions are occupied, so that N pieces of state data are stored.
- Part (d) of FIG. 13 shows a state in which the (N+1)th piece of state data is stored following the state shown in part (c) of FIG. 13.
- The (N+1)th piece of state data J n+1, gathered after the storage area 800 has reached its maximum storage number of N pieces, overwrites the first piece of state data J 1.
- The (N+2)th piece of state data overwrites the second piece of state data.
- The (N+3)th piece of state data overwrites the third piece of state data.
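- The conventional method of FIG. 13 behaves like a fixed-size circular log. The following is a minimal sketch under that reading; the class name `ConventionalLog` and its methods are hypothetical and do not appear in the patent.

```python
class ConventionalLog:
    """N equal-size divisions; entry N+1 overwrites division #1 (illustrative only)."""

    def __init__(self, n):
        self.divisions = [None] * n  # divisions #1..#N, all empty as in part (a)
        self.count = 0               # serial number of the most recent log

    def store(self, state_data):
        """Store one piece of state data and return the 1-based division number used."""
        self.count += 1
        slot = (self.count - 1) % len(self.divisions)
        self.divisions[slot] = (f"log {self.count}", state_data)
        return slot + 1


log = ConventionalLog(3)
used = [log.store(data) for data in ["a", "b", "c", "d"]]
print(used)              # the fourth entry lands in division #1 again
print(log.divisions[0])  # the first piece of state data is gone
```

With N = 3, the fourth call to `store` reuses division #1, which is exactly the loss of older state data described for part (d) of FIG. 13.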
- Non-patent document 1: Internet URL:http://primerserver.fujitsu.com/primepower/news/article/20170111/on “High trust and solution that PRIMEPOWER (registered trademark) and PRIMECLUSTER (registered trademark) weave high available”, online, Jan. 11, 2005, FUJITSU Ltd., retrieved on Feb. 1, 2006.
- With this conventional storage method, the state data gathered for a breakdown that causes a significant drop in processing performance or causes the server to go down may be lost when it is overwritten by state data gathered afterwards. Such a serious breakdown might then be overlooked at the time of maintenance.
- the present invention provides a monitoring apparatus that monitors operations of an information processing apparatus that notifies occurrence of breakdown which occurs during execution of a predetermined information processing operation, the monitoring apparatus comprising:
- a data deriving section that derives from the information processing apparatus state data representative of an apparatus state of the information processing apparatus, when the data deriving section receives a notification of occurrence of a breakdown by the information processing apparatus;
- a breakdown classification section that classifies the breakdown associated with the notification into a breakdown type corresponding to a seriousness of the breakdown, of two or more breakdown types that are mutually different in seriousness of breakdown;
- a data storage section that stores the state data derived by the data deriving section in a storage area corresponding to the breakdown type classified by the breakdown classification section, of two or more storage areas associated with the two or more breakdown types, respectively.
- The breakdowns that occur in the above-mentioned information processing apparatus differ in seriousness. A breakdown that is high in seriousness may cause a great decrease in processing performance or cause the server to go down, and it needs a detailed failure analysis at the time of maintenance. A breakdown that is low in seriousness involves little substantial damage, and its failure analysis is omitted because the cause is almost clear. Generally, breakdowns that are high in seriousness often occur less frequently than breakdowns that are low in seriousness.
- According to the monitoring apparatus of the present invention, the state data obtained for breakdowns that are low in seriousness, which is large in number and frequently overwritten, is stored in a storage area different from the storage area for the state data obtained for breakdowns that are high in seriousness, for which overwriting should be avoided as much as possible.
- This feature makes it possible to strengthen the protection of the state data obtained for breakdowns of the serious breakdown type.
- As a result, breakdowns of the information processing apparatus can be discovered with greater accuracy from such state data when maintenance and the like are performed later.
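- The effect of keeping one storage area per breakdown type can be sketched as follows. This is an illustration only: the capacities, the dictionary `areas`, and the function `store_state_data` are assumptions, and `collections.deque(maxlen=...)` discards the oldest entry rather than overwriting a numbered division in place, which retains the same set of entries.

```python
from collections import deque

# Hypothetical capacities; the source gives no sizes for the two areas.
areas = {
    "serious": deque(maxlen=4),
    "negligible": deque(maxlen=4),
}


def store_state_data(breakdown_type, state_data):
    """Append to the area for this breakdown type; a full area drops its oldest entry."""
    areas[breakdown_type].append(state_data)


store_state_data("serious", "server down")
for i in range(10):
    # Frequent low-seriousness breakdowns only ever displace each other.
    store_state_data("negligible", f"minor event {i}")

print(list(areas["serious"]))
print(list(areas["negligible"]))
```

However many negligible breakdowns are logged, the single serious entry stays in its own area until that area itself fills up.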
- the data deriving section derives from the information processing apparatus a set of sub-data each representing a component state of two or more components constituting the information processing apparatus respectively, as the state data, and the data storage section stores each of the sub-data constituting the state data in a storage section corresponding to each datum size of the sub-data, of two or more storage sections each associated with data sizes different from one another in the storage area, when the data storage section stores the state data in the storage area.
- According to this arrangement, the storage area can be used effectively in accordance with the data size of the sub-data.
- This feature suppresses the generation of useless unused space in the storage area, so that the storage area can be used to the fullest. Consequently, it is possible to store as many pieces of the above-mentioned sub-data as possible, and thus as many pieces of the above-mentioned state data as possible. As a result, the frequency of overwriting in the storage area is reduced, and the protection of the state data in the storage area is improved further.
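- The saving from size-matched storage sections can be shown with a small calculation. The mix of sub-data sizes below is hypothetical, and `wasted_kb` is an illustrative helper rather than anything named in the patent.

```python
def wasted_kb(sub_data_sizes_kb, slot_kb_for):
    """Total unused space when each piece occupies a slot of size slot_kb_for(size)."""
    return sum(slot_kb_for(size) - size for size in sub_data_sizes_kb)


# Hypothetical mix of sub-data sizes, in kilobytes.
sizes = [8, 4, 2, 1, 0.5, 0.5]

# One slot size for everything: each slot must fit the largest piece (8 KB).
uniform = wasted_kb(sizes, lambda size: 8)

# Size-matched sections: each piece goes to a section whose slots fit it exactly.
matched = wasted_kb(sizes, lambda size: size)

print(uniform, matched)
```

Under these assumed sizes the uniform layout leaves 32 KB unused for six pieces of sub-data, while the size-matched layout leaves none, so the same area holds more pieces before overwriting starts.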
- the data deriving section derives from the information processing apparatus a set of sub-data each representing a component state of two or more components constituting the information processing apparatus respectively, as the state data
- the data storage section stores each of the sub-data constituting the state data in a storage section corresponding to each datum size of the sub-data, of two or more storage sections each associated with data sizes different from one another in the storage area, when the data storage section stores the state data in the storage area
- the monitoring apparatus further comprises an alteration section that alters a maximum storage number of the sub-data in the storage section by altering an area of the storage section to an area according to an operation.
- the alteration section expands an area of a desired storage section, so that the maximum storage number of the sub-data in the storage section is increased.
- This feature makes it possible to reduce the frequency of overwriting in the storage area, so that the protection of the sub-data in the storage area can be improved.
- the present invention provides a computer-readable storage medium storing a monitoring program that is incorporated in a computer to be executed in the computer, the monitoring program causing the computer to monitor operations of an information processing apparatus that notifies occurrence of breakdown which occurs during execution of a predetermined information processing operation, wherein the monitoring program constitutes in the computer:
- a data deriving section that derives from the information processing apparatus state data representative of an apparatus state of the information processing apparatus, when the data deriving section receives a notification of occurrence of a breakdown by the information processing apparatus;
- a breakdown classification section that classifies the breakdown associated with the notification into a breakdown type corresponding to a seriousness of the breakdown, of two or more breakdown types that are mutually different in seriousness of breakdown;
- a data storage section that stores the state data derived from the data deriving section in a storage area corresponding to the breakdown type classified by the breakdown classification section, of two or more storage areas associated with the two or more breakdown types, respectively.
- According to the executive program of the present invention, it is possible to easily implement a monitoring apparatus capable of discovering a breakdown of the information processing apparatus with greater accuracy.
- an information processing system comprising:
- a monitoring apparatus including:
- a data deriving section that derives from the information processing apparatus state data representative of an apparatus state of the information processing apparatus, when the data deriving section receives a notification of occurrence of a breakdown by the information processing apparatus;
- a breakdown classification section that classifies the breakdown associated with the notification into a breakdown type corresponding to a seriousness of the breakdown, of two or more breakdown types that are mutually different in seriousness of breakdown;
- a data storage section that stores the state data derived from the data deriving section in a storage area corresponding to the breakdown type classified by the breakdown classification section, of two or more storage areas associated with the two or more breakdown types, respectively.
- the executive program of the present invention and the information processing system of the present invention include not only the basic aspects, but also various aspects corresponding to the above-mentioned aspects of the monitoring apparatus of the present invention as mentioned above.
- a monitoring apparatus capable of discovering the breakdown of the information processing apparatus with the greater accuracy
- an executive program which causes a computer to operate as such a monitoring apparatus
- an information processing system capable of discovering the breakdown of the information processing apparatus with the greater accuracy
- FIG. 1 is a view showing an example of a client-server system including an embodiment of the present invention.
- FIG. 2 is a typical illustration showing a hardware structure of a server system 100 .
- FIG. 3 is a conceptual view showing ROM 121 b that stores an embodiment of an executive program of the present invention.
- FIG. 4 is a functional block diagram useful for understanding a function of a monitoring apparatus 120 which is implemented when CPU 121 a executes an executive program 500 shown in FIG. 3 .
- FIG. 5 is a flowchart of a main routine in processing of gathering and storage of the state data, which is to be executed by the monitoring apparatus 120 .
- FIG. 6 is a flowchart of a subroutine which performs gathering and storage of the state data.
- FIG. 7 is a flowchart of a subroutine which performs gathering of the sub-data.
- FIG. 8 is a flowchart of a subroutine which performs storage processing for one sub-data to a memory 122 .
- FIG. 9 is a typical illustration useful for understanding an internal structure of the memory 122 .
- FIG. 10 is a typical illustration useful for understanding an internal structure of a storage area for serious breakdown shown in FIG. 9 .
- FIG. 11 is a typical illustration useful for understanding an internal structure of a storage area for negligible breakdown shown in FIG. 9 .
- FIG. 12 is an illustration useful for understanding the situation in which the size of Major 2K part 122 a _ 1 is increased, and the size of Allscan 2K part 122 a _ 9 is decreased by just that much.
- FIG. 13 is a view useful for understanding an example of a conventional storage method for the state data of the monitoring apparatus.
- FIG. 1 is a view showing an example of a client-server system including an embodiment of the present invention.
- a client-server system 10 shown in FIG. 1 is composed of a server system 100 , and two or more client computers 200 , 300 , 400 , . . . .
- the server system 100 operates as a server that manages the client-server system 10 in its entirety, and corresponds to one embodiment of the information processing system referred to in the present invention.
- FIG. 2 is a typical illustration showing a hardware structure of a server system 100 .
- the server system 100 is equipped with an information processing apparatus 110 that executes various management for the client-server system 10 , and substantially serves as a server of the client-server system 10 , and a monitoring apparatus 120 that monitors operations of the information processing apparatus 110 .
- the information processing apparatus 110 corresponds to an embodiment of the information processing apparatus referred to in the present invention.
- the monitoring apparatus 120 corresponds to an embodiment of the monitoring apparatus referred to in the present invention.
- the information processing apparatus 110 has five sorts of boards 111 , 112 , 113 , 114 and 115 .
- the first board 111 is equipped with three sorts of LSI such as a system controller (SC) 111 a , a memory access controller (MAC) 111 b , and a central processing unit (CPU) 111 c.
- the SC 111a is an LSI that mediates data between the CPU 111c and other parts, and implements smooth data transfer.
- the MAC 111 b is LSI that controls read and write operations for data to a memory (not illustrated) of the information processing apparatus 110 .
- the CPU 111 c is LSI that controls the operation of the information processing apparatus 110 in its entirety.
- the second board 112 is equipped with two sorts of LSI such as an I/O controller 112 a , and an I/O bridge 112 b.
- the I/O controller 112 a is LSI that executes the transfer of data between the information processing apparatus 110 and the exterior.
- the I/O bridge 112 b is LSI that applies a mutual conversion between a parallel format and a serial format to a data format of data which is an object of the transfer to be executed by the I/O controller 112 a.
- the third board 113 is equipped with LSI such as a crossbar (XB) 113 a .
- the fourth board 114 is equipped with LSI such as a clock generation element (CLK) 114 a.
- the XB 113a is an LSI that mediates data between the SC 111a and the I/O controller 112a, and implements smooth data transfer.
- the CLK 114 a is LSI that generates a reference clock which is used on a common basis for operation of the information processing apparatus 110 and is applied to individual portions of the information processing apparatus 110 .
- the fifth board 115 is equipped with a breakdown notification circuit 115 a that executes the following breakdown notifications for the monitoring apparatus 120 .
- For the information processing apparatus 110, FIG. 2 chiefly shows the above-mentioned LSIs, which are the objects monitored by the monitoring apparatus 120, and the above-mentioned breakdown notification circuit 115a; other parts and circuits of the information processing apparatus 110 are omitted from the illustration.
- the monitoring apparatus 120 has a processing board 121 , which is equipped with CPU 121 a , ROM 121 b , and RAM 121 c , and a memory 122 .
- the processing board 121 substantially has a function of monitoring the above-mentioned LSI's.
- the memory 122 stores a result of the monitoring by the processing board 121 .
- the ROM 121 b which is loaded on the processing board 121 , stores an executive program of the present invention.
- the CPU 121 a which is loaded on the processing board 121 , operates in accordance with the program stored in the ROM 121 b , so that the monitoring function is implemented.
- FIG. 3 is a conceptual view showing the ROM 121 b that stores an embodiment of an executive program of the present invention.
- An executive program 500 is composed of a data deriving section 510 , a breakdown classification section 520 , a data storage section 530 , and an alteration section 540 .
- The executive program 500 shown in FIG. 3 is loaded into the RAM 121c as appropriate, and the CPU 121a executes the executive program 500 loaded into the RAM 121c.
- In this way, the above-mentioned function of the monitoring apparatus 120 is implemented. Details of the effect of the individual elements of the executive program 500 will be described later.
- FIG. 4 is a functional block diagram useful for understanding a function of the monitoring apparatus 120 which is implemented when CPU 121 a executes the executive program 500 shown in FIG. 3 .
- the function of the monitoring apparatus 120 is composed of functional blocks of a data deriving section 610 , a breakdown classification section 620 , a data storage section 630 , and an alteration section 640 .
- When the CPU 121a of the monitoring apparatus 120 shown in FIG. 2 executes the executive program 500 shown in FIG. 3, the data deriving section 510, the breakdown classification section 520, the data storage section 530, and the alteration section 540, which constitute the executive program 500, construct the data deriving section 610, the breakdown classification section 620, the data storage section 630, and the alteration section 640, respectively, which are shown in FIG. 4.
- The data deriving section 610, the breakdown classification section 620, the data storage section 630, and the alteration section 640 correspond to the data deriving section, the breakdown classification section, the data storage section, and the alteration section, respectively, which are referred to in the present invention.
- The data deriving section 610 derives, from the information processing apparatus 110, the state data representative of the state of the information processing apparatus 110.
- the state data consists of sub-data representative of operational states of individual LSI's each internally generated in the associated LSI.
- the deriving of the state data is performed in such a way that the data deriving section 610 suitably gathers the sub-data from individual LSI's shown in FIG. 2 .
- The breakdown classification section 620 classifies a breakdown that occurs in the information processing apparatus 110 into one of two breakdown types that differ in seriousness: a serious breakdown type, which causes a serious drop in processing performance or causes the server to go down, and a negligible breakdown type, which involves little substantial damage.
- the data storage section 630 stores the state data derived by the data deriving section 610 in storage areas of the memory 122 associated with the breakdown type classified by the breakdown classification section 620 .
- Each storage area is subdivided into two or more storage sections, each corresponding to a different data size.
- two or more sub-data constituting the state data are stored in the storage sections of the storage area, each associated with the data size of the sub-data, respectively.
- The alteration section 640 alters the maximum storage number of sub-data in a desired storage section, among the two or more storage sections, by changing the size of that storage section in response to a user's operation. According to the present embodiment, the size is altered through the user's operation of terminal equipment that is electrically connected to the server system 100 shown in FIG. 1 and FIG. 2.
- FIG. 4 shows a state in which a notebook-type personal computer 700 is connected to the alteration section 640 of the monitoring apparatus 120 via the server system 100.
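- The resizing performed by the alteration section can be pictured as moving area from one storage section to another, as in FIG. 12, where the Major 2K part grows and the Allscan 2K part shrinks by the same amount. The sketch below assumes hypothetical byte sizes and hypothetical function names (`resize`, `max_storage_number`); only the two section names come from the figure description.

```python
def max_storage_number(section):
    """How many pieces of sub-data fit in the section's current area."""
    return section["area"] // section["item_size"]


def resize(sections, grow, shrink, delta_bytes):
    """Move delta_bytes of area from `shrink` to `grow`; the total area is unchanged."""
    if sections[shrink]["area"] < delta_bytes:
        raise ValueError(f"not enough area to take from {shrink}")
    sections[grow]["area"] += delta_bytes
    sections[shrink]["area"] -= delta_bytes


# Hypothetical starting sizes in bytes.
sections = {
    "Major 2K": {"area": 8192, "item_size": 2048},
    "Allscan 2K": {"area": 8192, "item_size": 2048},
}

before = {name: max_storage_number(s) for name, s in sections.items()}
resize(sections, grow="Major 2K", shrink="Allscan 2K", delta_bytes=4096)
after = {name: max_storage_number(s) for name, s in sections.items()}
print(before, after)
```

In this example the maximum storage number of the enlarged section rises from 4 to 6 while that of the reduced section falls from 4 to 2, so the user trades protection of one kind of sub-data for another.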
- FIG. 5 is a flowchart of a main routine in processing of gathering and storage of the state data, which is to be executed by the monitoring apparatus 120 .
- The main routine shown in FIG. 5 starts when the power source of the monitoring apparatus 120 is turned on.
- When the main routine starts, a standby loop begins (step S101).
- While no breakdown is notified, step S102 and step S200 are skipped and the processing returns to step S101 through the loop edge (step S103).
- When the occurrence of a breakdown is notified from the breakdown notification circuit 115a, the notification is received by the monitoring apparatus 120 (step S102).
- In step S200, the sub-routine for gathering and storing the state data is carried out by the data deriving section 610, the breakdown classification section 620, and the data storage section 630.
- When the sub-routine ends, the processing returns to step S101 through the loop edge (step S103).
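- The main routine of FIG. 5 amounts to a polling loop around the S200 sub-routine. The sketch below is one possible reading: the function name `main_routine`, the use of `queue.Queue` for notifications, and the `max_polls` bound (added so the example terminates; the real loop runs while power is on) are all assumptions.

```python
import queue


def main_routine(notifications, gather_and_store, max_polls):
    """Poll for breakdown notifications and run the S200 sub-routine on each one."""
    for _ in range(max_polls):                   # S101: start of the standby loop
        try:
            notice = notifications.get_nowait()  # S102: receive a notification
        except queue.Empty:
            continue                             # nothing notified: skip S102/S200, go to S103
        gather_and_store(notice)                 # S200: gather and store the state data
        # S103: loop edge, back to S101


pending = queue.Queue()
pending.put("CD Error")
handled = []
main_routine(pending, handled.append, max_polls=3)
print(handled)
```

Here `handled.append` stands in for the gathering and storage sub-routine, so the example only shows which notifications reach step S200.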
- Next, the sub-routine for gathering and storing the state data (step S200) will be described.
- FIG. 6 is a flowchart of a subroutine which performs gathering and storage of the state data.
- Each monitored LSI has a function of detecting in detail any abnormality arising inside itself and notifying the breakdown notification circuit 115a of the abnormality.
- The breakdown notification circuit 115a comprehensively analyzes the abnormality notifications originating from the individual LSIs, and judges whether the information processing apparatus 110 is out of order as a device.
- The breakdown notification circuit 115a also recognizes the breakdown state, that is, what type of breakdown has occurred, and records the recognized breakdown state in the memory of the breakdown notification circuit 115a.
- In the sub-routine of FIG. 6, the breakdown state of the notified breakdown is first specified from the content recorded in the memory of the breakdown notification circuit 115a (step S201).
- Next, a serial number (LOG-ID) is assigned to this notification (step S202). This LOG-ID will be described later in detail.
- deriving of the state data from the information processing apparatus 110 is performed as mentioned above in such a way that the data deriving section 610 suitably gathers sub-data from individual LSI.
- For each of the various breakdowns generated in the information processing apparatus 110, it is determined which of the seven kinds of LSI shown in FIG. 2 mainly causes the breakdown.
- Two or more kinds of sub-data which are internally generated in each LSI exist as described later.
- Table 1 is a table where the association among individual LSI, various types of breakdown states caused by individual LSI, and kinds of sub-data to be gathered to indicate the breakdown states is shown as to the breakdown of a serious breakdown type (serious breakdown type) that is high in seriousness.
- Table 2 is a table where the association similar to Table 1 is shown as to the breakdown of a negligible breakdown type (negligible breakdown type) that is low in seriousness.
- Each of the above-mentioned LSIs internally generates and stores six kinds of sub-data that indicate its own operation: Major information J 1, Minor information J 2, Allscan information J 3, History information J 4, Config information J 5, and Analyze information J 6, which are shown in Table 1 and Table 2.
- the six kinds of sub-data shown in Table 1 and Table 2 correspond to the sub-data referred to in the present invention, respectively.
- Major information J 1 is sub-data that indicates the presence of abnormality in the main parts among the two or more minute parts that compose the LSI.
- Minor information J 2 is sub-data that indicates the presence of abnormality in the remaining parts, of relatively low importance, among the two or more minute parts that compose the LSI.
- Allscan information J 3 is sub-data that indicates what processing was being executed by the LSI when the abnormality occurred in the LSI.
- History information J 4 is sub-data that indicates the operation history for a certain period up to the occurrence of the abnormality in the LSI.
- Config information J 5 is sub-data that indicates the high/low state of a prescribed part of the LSI at the time when the abnormality occurred in the LSI.
- Analyze information J 6 is a flag that is set at the time when the abnormality occurs in the LSI.
- kinds of sub-data to be gathered from individual LSI on various types of breakdown states are determined as shown in Table 1 and Table 2.
- In Table 1 and Table 2, for each of the various types of breakdown states, the sub-data that should be gathered is marked “○”, and the sub-data that is not gathered is marked “x”.
- The data deriving section 610 stores these tables in the form of data.
- In step S203, the LSI that may be the main cause of the breakdown, and the kinds of sub-data that should be gathered from that LSI, are specified by referring to the above-mentioned tables.
- the data deriving section 610 of the monitoring apparatus 120 executes the sub-routine to perform gathering of the sub-data based on the specifying result (step S300).
- When the sub-routine ends, the processing returns to the main routine shown in FIG. 5.
- Next, the sub-routine for gathering the sub-data (step S300) will be described.
- FIG. 7 is a flowchart of a subroutine which performs gathering of the sub-data.
- First, a loop that repeats the following gathering processing once for each piece of sub-data that should be gathered is begun (step S301). For instance, for the breakdown state E 1 of “CD Error”, eight pieces of sub-data in total are gathered from three LSIs, so the loop is repeated eight times altogether.
- The set of two or more pieces of sub-data gathered for a certain breakdown state, for instance the eight pieces of sub-data gathered for the breakdown state E 1 of “CD Error”, is treated as the state data representative of the state of the information processing apparatus 110 in which the breakdown of that breakdown state occurred.
- the state data consisting of two or more pieces of sub-data corresponds to an example of the state data referred to in the present invention.
- When a piece of sub-data is gathered (step S302), a sub-routine that stores the gathered sub-data into the memory 122 of the monitoring apparatus 120 is executed (step S400).
- When the piece of sub-data has been stored in the memory 122, the processing returns via the loop end to step S301, and gathering (step S302) and storage (step S400) are executed on the subsequent sub-data in a predetermined order of priority.
- When gathering (step S302) and storage (step S400) have been repeated for the number of pieces of sub-data that should be gathered, the processing returns to the sub-routine shown in FIG. 6.
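- The table lookup and the gathering loop can be sketched together. Only the counts are taken from the text (eight pieces of sub-data from three LSIs for "CD Error"); which LSIs and which kinds of sub-data are involved is a hypothetical assignment, as are the names `GATHER_TABLE` and `gather_sub_data`.

```python
# Stand-in for the stored tables: breakdown state -> (LSI, kind of sub-data) pairs,
# listed in the order of priority. The concrete pairs are hypothetical.
GATHER_TABLE = {
    "CD Error": [
        ("SC", "Major"), ("SC", "Minor"), ("SC", "Allscan"),
        ("MAC", "Major"), ("MAC", "Minor"), ("MAC", "Allscan"),
        ("CPU", "Major"), ("CPU", "Minor"),
    ],
}


def gather_sub_data(breakdown_state, read_lsi, store):
    """Gather and store every piece of sub-data listed for the breakdown state."""
    targets = GATHER_TABLE[breakdown_state]  # result of the table lookup (S203)
    for lsi, kind in targets:                # S301: one pass per piece of sub-data
        sub_data = read_lsi(lsi, kind)       # S302: gather from the LSI
        store(sub_data)                      # S400: store into the memory
    return len(targets)


stored = []
count = gather_sub_data("CD Error", lambda lsi, kind: f"{lsi}:{kind}", stored.append)
print(count, stored)
```

The callables `read_lsi` and `store` are placeholders for the hardware access and the S400 sub-routine, which keeps the loop structure visible without modelling either.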
- Next, the sub-routine that stores one piece of sub-data into the memory 122 (step S400) will be described.
- FIG. 8 is a flowchart of a subroutine which performs processing for storage of one sub-data into the memory 122 .
- In the step S 401, it is confirmed whether the breakdown related to the one piece of sub-data to be stored is of the serious breakdown type or of the negligible breakdown type. Note that this sub-data was gathered because it is needed to present the breakdown state specified in the step S 201 of the flowchart of FIG. 6. That is, the breakdown state to which this sub-data relates is already known at the step S 401. Thus, in the step S 401, it is confirmed whether the already-known breakdown state belongs to Table 1, which corresponds to breakdowns of the serious breakdown type, or to Table 2, which corresponds to breakdowns of the negligible breakdown type.
- In the step S 402, the data size of the one piece of sub-data to be stored is confirmed.
- For each of the above-mentioned six kinds of sub-data gathered from each LSI, the data size is one of the following sizes.
- The Major information J 1 is about 2 kilobytes, about 1 kilobyte, or about 0.5 kilobytes.
- The Minor information J 2 likewise takes one of those three sizes.
- The Allscan information J 3 is about 8 kilobytes, about 4 kilobytes, about 2 kilobytes, about 1 kilobyte, or about 0.5 kilobytes.
- The Config information J 5 is either about 4.4 kilobytes or about 0.7 kilobytes.
- The History information J 4 and the Analyze information J 6 each have essentially a single fixed size.
- In the step S 402, it is confirmed which of the above-mentioned sizes the data size of the one piece of sub-data to be stored corresponds to.
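The size confirmation of the step S 402 can be pictured as a lookup against the nominal sizes listed above. The following sketch is illustrative only: the table `NOMINAL_SIZES_KB`, the function `classify_size`, and the choice of the nearest nominal size are assumptions, since the description only says the sizes are "about" the listed values.

```python
from typing import Dict, List

# Nominal sizes in kilobytes for the kinds of sub-data that come in several sizes.
NOMINAL_SIZES_KB: Dict[str, List[float]] = {
    "Major": [0.5, 1, 2],
    "Minor": [0.5, 1, 2],
    "Allscan": [0.5, 1, 2, 4, 8],
    "Config": [0.7, 4.4],
}


def classify_size(kind: str, measured_kb: float) -> float:
    """Return the nominal size class closest to the measured size.

    Nearest-match is an assumption made for this sketch.
    """
    candidates = NOMINAL_SIZES_KB[kind]
    return min(candidates, key=lambda nominal: abs(nominal - measured_kb))


major_class = classify_size("Major", 1.9)
config_class = classify_size("Config", 4.3)
```

History and Analyze information are left out of the table because each has a single size, so no classification is needed for them.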
- Next, a way of storage to the memory 122 is determined as follows, in accordance with the breakdown type and the data size confirmed up to the step S 402 (step S 403).
- FIG. 9 is a typical illustration useful for understanding an internal structure of the memory 122 .
- the interior of the memory 122 consists of a storage area (a storage area for a serious breakdown) for sub-data (serious breakdown data) related to the breakdown of the serious breakdown type, and a storage area (a storage area for a negligible breakdown) for sub-data (negligible breakdown data) related to the breakdown of the negligible breakdown type.
- the internals of the storage area for a serious breakdown and the storage area for a negligible breakdown are subdivided, respectively as follows.
- FIG. 10 is a typical illustration useful for understanding an internal structure of a storage area for serious breakdown shown in FIG. 9 .
- FIG. 11 is a typical illustration useful for understanding an internal structure of a storage area for negligible breakdown shown in FIG. 9 .
- a storage area 122 a for a serious breakdown is subdivided into two or more storage sections as shown in FIG. 10 .
- As sections for storing Major information, there are prepared Major 2K section 122 a _ 1, Major 1K section 122 a _ 2, and Major 0.5K section 122 a _ 3, which store Major information of sizes of about 2 kilobytes, about 1 kilobyte, and about 0.5 kilobytes, respectively.
- As sections for storing Minor information, there are prepared Minor 2K section 122 a _ 4, Minor 1K section 122 a _ 5, and Minor 0.5K section 122 a _ 6, which store Minor information of sizes of about 2 kilobytes, about 1 kilobyte, and about 0.5 kilobytes, respectively.
- As sections for storing Allscan information, there are prepared Allscan 8K section 122 a _ 7, Allscan 4K section 122 a _ 8, Allscan 2K section 122 a _ 9, Allscan 1K section 122 a _ 10, and Allscan 0.5K section 122 a _ 11, which store Allscan information of sizes of about 8 kilobytes, about 4 kilobytes, about 2 kilobytes, about 1 kilobyte, and about 0.5 kilobytes, respectively.
- As sections for storing Config information, there are prepared Config 4.4K section 122 a _ 13 and Config 0.7K section 122 a _ 14, which store Config information of sizes of about 4.4 kilobytes and about 0.7 kilobytes, respectively.
- As sections for storing History information and Analyze information, there are prepared History section 122 a _ 12 and Analyze section 122 a _ 15, respectively.
- Each of the 15 kinds of storage sections of the above-mentioned storage area 122 a is divided into two or more divisions, each storing one piece of sub-data.
- The size of one division of the Major 2K section 122 a _ 1 is about 2 kilobytes, matching the size of the Major information of about 2 kilobytes to be stored in the division.
- The size of one division of the Allscan 8K section 122 a _ 7 is about 8 kilobytes, matching the size of the Allscan information of about 8 kilobytes to be stored in the division.
- In this way, the size of one division of each storage section depends on the type of the associated storage section.
- The number of divisions that compose each storage section also differs depending on the kind of the storage section.
- The divisions that compose each storage section are numbered #1, #2, #3 . . . #N 1 .
- Sub-data gathered in the monitoring apparatus are stored in the storage section associated with the kind and data size of the sub-data, in ascending order of division number.
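A storage section made of numbered divisions that are filled in ascending order can be sketched as follows. The class `StorageSection` and its members are hypothetical names, and the wrap-around to division #1 once the last division is used is an assumption, inferred from the description's later discussion of overwriting.

```python
from typing import List, Optional


class StorageSection:
    """A fixed number of equally sized divisions, written in ascending order."""

    def __init__(self, name: str, division_count: int) -> None:
        if division_count < 1:
            raise ValueError("a storage section needs at least one division")
        self.name = name
        self.divisions: List[Optional[bytes]] = [None] * division_count
        self._next = 0          # zero-based index of the division written next
        self.overwrites = 0     # how many stored pieces have been overwritten

    def store(self, sub_data: bytes) -> int:
        """Store one piece of sub-data and return its 1-based division number."""
        index = self._next
        if self.divisions[index] is not None:
            self.overwrites += 1            # older sub-data is lost here
        self.divisions[index] = sub_data
        self._next = (index + 1) % len(self.divisions)  # assumed wrap-around
        return index + 1


section = StorageSection("Major 2K", division_count=3)
numbers = [section.store(payload) for payload in (b"a", b"b", b"c", b"d")]
```

With three divisions, the fourth piece of sub-data lands in division #1 again and replaces the oldest entry, which is the overwriting the description seeks to make less frequent.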
- A storage area 122 b for a negligible breakdown stores sub-data gathered in the monitoring apparatus 120 that concerns two types of information, Major information and Analyze information. Accordingly, as seen from FIG. 11, the storage area 122 b is provided with three types of storage sections 122 b _ 1, 122 b _ 2, and 122 b _ 3 for Major information, and one type of storage section 122 b _ 4 for Analyze information. The structure of the individual storage sections is the same as that of the corresponding sections of the storage area 122 a for a serious breakdown shown in FIG. 10, so a redundant explanation is omitted.
- the processing of the step S 403 determines the storage section of the memory 122 in accordance with the breakdown type of the breakdown involved in sub-data to be stored in the memory 122 , and the data size of the sub-data.
- For instance, for Major information gathered for the breakdown state E 1 of “CD Error”, the breakdown type is the serious breakdown type, since “CD Error” is listed in Table 1.
- Accordingly, the Major information is to be stored in the storage area 122 a for a serious breakdown of the memory 122.
- If the data size of the Major information is about 2 kilobytes, for instance, it is decided in accordance with the data size that the Major information is to be stored in the Major 2K section 122 a _ 1 of the storage area 122 a for a serious breakdown.
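The decision of the step S 403 combines the breakdown type with the kind and size of the sub-data. The sketch below is one possible rendering under stated assumptions: `select_section` and the lookup tables are hypothetical names, only “CD Error” is taken from the text as a Table 1 entry, and the Table 2 entry is a placeholder because this passage does not list the contents of Table 2.

```python
from typing import Optional, Tuple

SERIOUS_STATES = {"CD Error"}                          # Table 1 entry named in the text
NEGLIGIBLE_STATES = {"Hypothetical Negligible State"}  # placeholder, not from the patent

# Kinds of sub-data that each storage area has sections for.
SECTION_KINDS = {
    "serious": {"Major", "Minor", "Allscan", "History", "Config", "Analyze"},
    "negligible": {"Major", "Analyze"},
}
SINGLE_SIZE_KINDS = {"History", "Analyze"}


def select_section(state: str, kind: str,
                   size_label: Optional[str] = None) -> Tuple[str, str]:
    """Return (storage area, storage section) for one piece of sub-data."""
    if state in SERIOUS_STATES:
        area = "serious"
    elif state in NEGLIGIBLE_STATES:
        area = "negligible"
    else:
        raise KeyError(f"unknown breakdown state: {state}")
    if kind not in SECTION_KINDS[area]:
        raise ValueError(f"no {kind} section in the {area} area")
    if kind in SINGLE_SIZE_KINDS:
        return area, f"{kind} section"          # one size only, no size label
    if size_label is None:
        raise ValueError(f"{kind} sub-data needs a size label such as '2K'")
    return area, f"{kind} {size_label} section"


chosen = select_section("CD Error", "Major", "2K")
```

For the example in the text, the result names the serious area and the Major 2K section, matching the Major 2K section 122 a _ 1.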
- LOG-ID which is numbered in the step S 202 of the flowchart of FIG. 6 , is given to the sub-data to be stored, and the sub-data provided with LOG-ID is stored in the decided storage section (step S 404 ). Thereafter, the processing returns to the subroutine of FIG. 7 .
- gathering and storage of the sub-data are repeated by the number of sub-data to be gathered.
- The repetition makes it possible, for a given breakdown state such as “CD Error” or “Minor Face”, to gather and store the state data representing the device state of the information processing apparatus 110 at the time when the breakdown of that breakdown state occurs, in the form of a set of two or more pieces of sub-data to which a common LOG-ID is given.
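Because the pieces of sub-data belonging to one breakdown are scattered over several storage sections, the common LOG-ID is what ties them back together. A minimal sketch of that regrouping follows; the record type `StoredSubData` and the function `collect_state_data` are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class StoredSubData(NamedTuple):
    log_id: int     # common identifier given to all sub-data of one breakdown
    kind: str       # e.g. "Major", "Allscan"
    payload: bytes


def collect_state_data(records: List[StoredSubData]) -> Dict[int, List[StoredSubData]]:
    """Group stored sub-data by LOG-ID to rebuild each piece of state data."""
    grouped: Dict[int, List[StoredSubData]] = defaultdict(list)
    for record in records:
        grouped[record.log_id].append(record)
    return dict(grouped)


records = [
    StoredSubData(7, "Major", b"m"),
    StoredSubData(8, "Major", b"x"),
    StoredSubData(7, "Allscan", b"a"),
]
state_data = collect_state_data(records)
```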
- the state data consisting of two or more pieces of sub-data is stored in either one of two storage areas in accordance with whether the breakdown type of the breakdown involved in the state data is a serious breakdown type or a negligible breakdown type.
- This feature makes it possible to avoid the state data obtained on a breakdown of the serious breakdown type being frequently overwritten by the state data obtained on breakdowns of the negligible breakdown type, which occur more frequently. Accordingly, this feature enhances the protection of the state data obtained on a breakdown of the serious breakdown type.
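The benefit of separate storage areas can be illustrated with a small simulation. This is a sketch under assumptions: bounded `collections.deque` objects stand in for storage areas that overwrite their oldest entry when full, and the capacities and event stream are invented for the example.

```python
from collections import deque
from typing import List, Tuple


def surviving_serious(events: List[Tuple[str, int]],
                      shared_capacity: int,
                      serious_capacity: int,
                      negligible_capacity: int) -> Tuple[int, int]:
    """Count serious records left in a shared area versus a dedicated area."""
    shared = deque(maxlen=shared_capacity)          # one area for everything
    serious = deque(maxlen=serious_capacity)        # dedicated serious area
    negligible = deque(maxlen=negligible_capacity)  # dedicated negligible area
    for breakdown_type, log_id in events:
        shared.append((breakdown_type, log_id))
        target = serious if breakdown_type == "serious" else negligible
        target.append((breakdown_type, log_id))
    in_shared = sum(1 for breakdown_type, _ in shared if breakdown_type == "serious")
    return in_shared, len(serious)


# One serious breakdown followed by ten negligible ones.
events = [("serious", 1)] + [("negligible", n) for n in range(2, 12)]
result = surviving_serious(events, shared_capacity=4,
                           serious_capacity=2, negligible_capacity=2)
```

In this run the shared area has lost the serious record to later negligible ones, while the dedicated serious area still holds it, with the same total capacity in both layouts.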
- The two or more pieces of sub-data that constitute one piece of state data are each stored in the storage section of the storage area that matches the data size of the sub-data.
- The above-mentioned storage area is thus used effectively in accordance with the data size of each piece of sub-data, and the generation of wasted, unused space in the storage area is suppressed.
- This feature makes it possible to make the best use of the storage area.
- As many pieces of sub-data as possible can be stored and, consequently, as many pieces of the above-mentioned state data as possible can be stored.
- The frequency of overwriting in the storage area is suppressed. Accordingly, this feature makes it possible to further enhance the protection of the state data in the storage area.
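The space saving from size-matched divisions can be checked with simple arithmetic. The comparison below is an illustration only: it contrasts the size-matched layout with a hypothetical layout in which every division is 8 kilobytes, using the five Allscan sizes named in the description; the helper `wasted_kb` is not part of the patent.

```python
from typing import Callable, List


def wasted_kb(sub_data_sizes_kb: List[float],
              division_size_for: Callable[[float], float]) -> float:
    """Total unused space when each piece occupies one division."""
    return sum(division_size_for(size) - size for size in sub_data_sizes_kb)


allscan_sizes_kb = [8, 4, 2, 1, 0.5]

# Hypothetical uniform layout: every division is as large as the largest sub-data.
uniform_waste = wasted_kb(allscan_sizes_kb, lambda size: 8)

# Size-matched layout: each piece goes to a division of its own size class.
matched_waste = wasted_kb(allscan_sizes_kb, lambda size: size)
```

Storing one piece of each size wastes 24.5 kilobytes in the uniform layout and nothing in the size-matched one, so the same memory can hold more sub-data before overwriting begins.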
- The frequency with which the breakdowns of the breakdown states shown in Table 1 and Table 2 occur may vary owing to the environment in which the server system 100 is used, manufacturing tolerances of LSIs such as the SC 111 a and the CPU 111 c, and the like.
- Such a variation raises the ratio at which a specific kind of sub-data is gathered.
- As a result, the frequency of overwriting of that specific kind of sub-data rises.
- The alteration section 640 has a function of altering the size of a storage section to cope with this.
- As to the alteration of the size of a storage section, because the storage capacity of the memory 122 is constant, a method is adopted in which, when the size of a desired storage section is to be increased, the size of another storage section is decreased accordingly, and the freed amount is allocated to the desired storage section.
- An instruction for such a size alteration is given via the terminal equipment (the notebook personal computer 700 in the example of FIG. 4) connected to the alteration section 640.
- Through an operating screen (not illustrated) displayed on the display screen of the notebook personal computer 700, a user inputs, for a desired storage section, a new size that is larger than the existing size.
- For another storage section, the user inputs a new size that is decreased by the amount by which the size of the desired storage section is increased.
- The notebook personal computer 700 transmits to the alteration section 640 the new size of each of those two storage sections and the size of a single division of each storage section.
- The memory of the notebook personal computer 700 stores in advance the division size of each of the two or more storage sections shown in FIG. 10.
- The notebook personal computer 700 transmits to the alteration section 640 the size of the storage section inputted by the user and the corresponding division size stored therein.
- FIG. 12 is an illustration useful for understanding the situation in which the size of Major 2K section 122 a _ 1 is increased, and the size of Allscan 2K section 122 a _ 9 is decreased by just that much.
- With respect to the Major 2K section 122 a _ 1, suppose that the size before the alteration is S 1, the new size is S 2, and the size of a single division is Sa.
- The alteration of the size causes the number of divisions of the Major 2K section 122 a _ 1 to increase by the increment L 1 expressed by the following equation.
- L 1 = (S 2 − S 1)/Sa
- With respect to the Allscan 2K section 122 a _ 9, suppose that the size before the alteration is S 3, the new size is S 4, and the size of a single division is Sb.
- The alteration of the size causes the number of divisions of the Allscan 2K section 122 a _ 9 to decrease by the amount L 2 expressed by the following equation.
- L 2 = (S 3 − S 4)/Sb = (S 2 − S 1)/Sb
- It is also possible to reduce the size of a storage section and provide a new storage section of the corresponding size.
- For the new storage section, the user needs to input the size of the storage section and, in addition, the size of a single division of the storage section.
- As an example of the information processing system of the present invention, there is disclosed the server system 100, which manages the client-server system in its entirety.
- As an example of the monitoring apparatus of the present invention, there is disclosed the monitoring apparatus 120 that monitors the operation of the information processing apparatus 110 having the management function for the client-server system in the server system 100.
- However, the present invention is not restricted to those embodiments. Any system that comprises an information processing apparatus and a monitoring apparatus observing its operation is acceptable as the information processing system of the present invention, and any apparatus that observes the operation of an information processing apparatus is acceptable as the monitoring apparatus of the present invention.
- As an example of the data deriving section referred to in the present invention, there is described the data deriving section 610 that gathers sub-data by accessing the seven kinds of LSI shown in FIG. 2.
- However, the present invention is not restricted to this embodiment. It is acceptable that the data deriving section referred to in the present invention gathers sub-data by accessing LSIs other than those seven kinds of LSI.
- Likewise, there is described the data deriving section 610 that gathers the six kinds of sub-data shown in Table 1 and Table 2.
- However, the present invention is not restricted to this embodiment. It is acceptable that the data deriving section referred to in the present invention gathers sub-data other than those six kinds of sub-data.
- As an example of the data storage section referred to in the present invention, there is described the data storage section 630 that stores the two or more pieces of sub-data constituting a piece of state data in the storage area matching the breakdown type of the associated breakdown, distributed among storage sections according to the data size of each piece of sub-data.
- However, the present invention is not restricted to this embodiment. It is acceptable that the data storage section referred to in the present invention stores the two or more pieces of sub-data constituting a piece of state data together, as a batch, in one section of the storage area matching the breakdown type of the associated breakdown.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
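L 1=(S2−S1)/Sa
On the other hand, with respect to the Allscan 2K section 122 a _ 9, whose size is decreased by the same amount, the number of divisions decreases by
L 2=(S3−S4)/Sb=(S2−S1)/Sb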
According to the present embodiment, such a size alteration increases the number of divisions of a desired storage section, that is, the maximum number of pieces of sub-data that the storage section can hold. This lowers the frequency of overwriting in the storage section, and thus makes it possible to enhance the protection of the sub-data in the storage section.
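As a worked example of the relations L 1=(S2−S1)/Sa and L 2=(S3−S4)/Sb=(S2−S1)/Sb, the following sketch computes the new size of the shrinking section and the two division counts. The function `plan_resize`, the requirement that the moved amount be a whole number of divisions, and the byte values are assumptions chosen for illustration; they do not appear in the patent.

```python
from typing import Tuple


def plan_resize(s1: int, s2: int, sa: int, s3: int, sb: int) -> Tuple[int, int, int]:
    """Grow one section from s1 to s2 bytes and shrink another, sized s3, to match.

    sa and sb are the division sizes of the growing and shrinking sections.
    Returns (s4, l1, l2): the new size of the shrinking section, the number of
    divisions gained by the first section, and the number lost by the second.
    """
    moved = s2 - s1
    if moved <= 0:
        raise ValueError("the new size must be larger than the old size")
    s4 = s3 - moved                  # total memory capacity stays constant
    if s4 < 0:
        raise ValueError("the other section is too small to give up that much")
    if moved % sa or moved % sb:
        raise ValueError("the moved amount must be a whole number of divisions")
    l1 = (s2 - s1) // sa             # L1 = (S2 - S1) / Sa
    l2 = (s3 - s4) // sb             # L2 = (S3 - S4) / Sb = (S2 - S1) / Sb
    return s4, l1, l2


# Hypothetical figures: both sections use 2048-byte divisions.
s4, l1, l2 = plan_resize(s1=20480, s2=28672, sa=2048, s3=40960, sb=2048)
```

Here 8192 bytes move from one section to the other, so the first section gains four divisions and the second loses four.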
Claims (6)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2006/303724 WO2007099593A1 (en) | 2006-02-28 | 2006-02-28 | Monitor device, monitor program, and information processing system |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2006/303724 Continuation WO2007099593A1 (en) | 2006-02-28 | 2006-02-28 | Monitor device, monitor program, and information processing system |
Publications (2)
Publication Number | Publication Date |
---|---|
US20090013075A1 US20090013075A1 (en) | 2009-01-08 |
US7925745B2 true US7925745B2 (en) | 2011-04-12 |
Family
ID=38458716
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/230,412 Expired - Fee Related US7925745B2 (en) | 2006-02-28 | 2008-08-28 | Monitoring apparatus, executive program, and information processing system |
Country Status (3)
Country | Link |
---|---|
US (1) | US7925745B2 (en) |
JP (1) | JP4478196B2 (en) |
WO (1) | WO2007099593A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8874610B2 (en) * | 2011-12-06 | 2014-10-28 | International Business Machines Corporation | Pattern-based stability analysis of complex data sets |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH02297228A (en) | 1989-05-11 | 1990-12-07 | Fujitsu Ltd | Fault information storing system |
JPH04369067A (en) | 1991-06-18 | 1992-12-21 | Hitachi Ltd | Fault processor |
JPH07162420A (en) | 1993-12-03 | 1995-06-23 | Mitsubishi Electric Corp | Network monitor system |
JPH10283230A (en) | 1997-03-31 | 1998-10-23 | Nec Corp | File data storage device and machine-readable recording medium with program recorded |
US20010011358A1 (en) * | 2000-01-27 | 2001-08-02 | Shinichi Ochiai | Fault handling system and fault handling method |
US20020051050A1 (en) * | 2000-10-30 | 2002-05-02 | Masayuki Hachinoda | Printing apparatus and communication apparatus and information processing apparatus having the same |
JP2002229816A (en) | 2001-01-31 | 2002-08-16 | Fujitsu Ltd | Fault information acquiring system |
JP2002366396A (en) | 2001-06-06 | 2002-12-20 | Nec Corp | System and program for automatically collecting fault analysis information |
JP2003015912A (en) | 2001-06-22 | 2003-01-17 | Internatl Business Mach Corp <Ibm> | System and method for control of fragmentation in message logging |
US20040194107A1 (en) * | 2003-03-27 | 2004-09-30 | Yoshimasa Masuoka | Method for generating policy rules and method for controlling jobs using the policy rules |
US20060031711A1 (en) * | 2004-08-06 | 2006-02-09 | Canon Kabushiki Kaisha | Information processing apparatus and information notification method therefor, and control program |
JP4369067B2 (en) | 2001-02-15 | 2009-11-18 | ヤンマー株式会社 | Engine cylinder block machining method |
2006
- 2006-02-28 JP JP2008502580A patent/JP4478196B2/en not_active Expired - Fee Related
- 2006-02-28 WO PCT/JP2006/303724 patent/WO2007099593A1/en active Application Filing
2008
- 2008-08-28 US US12/230,412 patent/US7925745B2/en not_active Expired - Fee Related
Non-Patent Citations (3)
Title |
---|
International Search Report mailed Jun. 6, 2006, issued in corresponding International Application No. PCT/JP2006/303724. |
Translation of the International Preliminary Report on Patentability, mailed Sep. 12, 2008, issued in corresponding International Application No. PCT/JP2006/303724. |
URL:http://primeserver.fujitsu.com/primepower/news/article/05/0111/. |
Also Published As
Publication number | Publication date |
---|---|
WO2007099593A1 (en) | 2007-09-07 |
JP4478196B2 (en) | 2010-06-09 |
US20090013075A1 (en) | 2009-01-08 |
JPWO2007099593A1 (en) | 2009-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9471457B2 (en) | Predictive alert threshold determination tool | |
US7783744B2 (en) | Facilitating root cause analysis for abnormal behavior of systems in a networked environment | |
US8181161B2 (en) | System for automatically collecting trace detail and history data | |
US8000932B2 (en) | System and method for statistical performance monitoring | |
US8122158B1 (en) | Method for improving I/O performance of host systems by applying future time interval policies when using external storage systems | |
US20100153431A1 (en) | Alert triggered statistics collections | |
US7769562B2 (en) | Method and apparatus for detecting degradation in a remote storage device | |
US7409604B2 (en) | Determination of related failure events in a multi-node system | |
US10168921B1 (en) | Systems and methods for storing time-series data | |
US20020178396A1 (en) | Systems and methods for providing automated diagnostic services for a cluster computer system | |
WO2003073203A2 (en) | System and method for analyzing input/output activity on local attached storage | |
JP2007207173A (en) | Performance analysis program, performance analysis method, and performance analysis device | |
US8788527B1 (en) | Object-level database performance management | |
CN101385276A (en) | Apparatus, system and method for error assessment over a communication link | |
WO2007068667A1 (en) | Method and apparatus for analyzing the effect of different execution parameters on the performance of a database query | |
CN108509634A (en) | Jitterbug monitoring method, monitoring device and computer readable storage medium | |
US10684933B2 (en) | Smart self-healing service for data analytics systems | |
US20200327036A1 (en) | Topology aware real time gpu-to-gpu traffic monitoring method and analyzing tools | |
US20090157923A1 (en) | Method and System for Managing Performance Data | |
US20100011100A1 (en) | Health Check System, Server Apparatus, Health Check Method, and Storage Medium | |
US7925745B2 (en) | Monitoring apparatus, executive program, and information processing system | |
US20140007112A1 (en) | System and method for identifying business critical processes | |
CN106686082B (en) | Storage resource adjusting method and management node | |
CN114896128A (en) | Application program performance testing method and device based on block chain | |
US20050216490A1 (en) | Automatic database diagnostic usage models |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: FUJITSU LIMITED, JAPAN. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KATO, KOSUKE;REEL/FRAME:021512/0370. Effective date: 20080709 |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
 | FPAY | Fee payment | Year of fee payment: 4 |
 | CC | Certificate of correction | |
 | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
 | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
 | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
 | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20190412 |