WO2013069138A1 - Operation information prediction computer, operation information prediction method, and program - Google Patents
- Publication number
- WO2013069138A1 (PCT/JP2011/075980)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- operation information
- value
- prediction
- configuration information
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3452—Performance evaluation by statistical analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/008—Reliability or availability analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/81—Threshold
Definitions
- the present invention relates to a computer that collects device operation information, and more particularly to an operation information prediction computer that predicts future operation information based on the collected operation information.
- the silent failure refers to a failure that cannot be detected by an autonomous diagnosis function prepared in advance on a computer system.
- The apparatus of Patent Document 1 collects performance data of an IT system at a predetermined interval for a certain period and generates a baseline as a weighted average of the collected performance data. The apparatus then calculates a predicted upper and lower limit range (threshold) for the next performance data through a statistical analysis model using parameters such as tendency, time, and sensitivity. When the current performance data exceeds the threshold, the apparatus reports the failure by issuing an event notification.
- Configuration information such as IT system resources (for example, CPU allocation rate and memory allocation rate) may change more frequently according to the load status of the IT system.
- FIG. 23 is an explanatory diagram of the relationship between configuration information of a conventional IT system and response performance (operation information) of the IT system.
- the configuration information of the IT system is a CPU allocation rate and a DB (database) cache.
- The CPU allocation rate indicates, as a percentage, the share of the CPU allocated to the virtual machine generated in the IT system.
- the DB cache indicates the capacity of the DB cache allocated in the IT system in megabytes.
- When configuration information such as IT system resources differs, the response performance of the IT system also differs.
- the configuration information of the IT system has been changed four times and is shown in each state (a) to (e) of the configuration information.
- In state (a), the CPU allocation rate is 20% and the DB cache is 1 MB.
- The CPU allocation rate and the DB cache are changed from state (a) to enter state (b).
- In state (b), the CPU allocation rate is 30% and the DB cache is 1.5 MB.
- The CPU allocation rate is changed from state (b) to enter state (c).
- In state (c), the CPU allocation rate is 45% and the DB cache is 1.5 MB, as in state (b).
- The DB cache is changed from state (c) to enter state (d).
- In state (d), the CPU allocation rate is 45%, as in state (c), and the DB cache is 2 MB.
- The CPU allocation rate is changed from state (d) to enter state (e).
- In state (e), the CPU allocation rate is 35% and the DB cache is 2 MB, as in state (d).
- FIG. 24 is an explanatory diagram of a conventional baseline calculation process.
- the baseline is a value that predicts future response performance.
- the conventional baseline calculation process is a process of calculating an average value of response performance at the same time on different days and using the calculated average value as a baseline.
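The conventional baseline described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the tuple layout `(day, time_of_day, response)`, the function name `conventional_baseline`, and the sample numbers are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def conventional_baseline(samples):
    """Average the response performance observed at the same time on different days.

    samples: iterable of (day, time_of_day, response) tuples (hypothetical layout).
    Returns a dict mapping each time of day to the mean response over all days.
    """
    by_time = defaultdict(list)
    for _day, time_of_day, response in samples:
        by_time[time_of_day].append(response)
    return {t: mean(values) for t, values in by_time.items()}


# Hypothetical response performance values collected on three days.
samples = [
    ("day1", "12:00", 100.0), ("day2", "12:00", 120.0), ("day3", "12:00", 110.0),
    ("day1", "12:01", 90.0), ("day2", "12:01", 110.0),
]
baseline = conventional_baseline(samples)
```

Because every day contributes equally regardless of the configuration in force on that day, this average mixes response performance measured under different configurations.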
- FIG. 25 is an explanatory diagram of the relationship between the conventional baseline and the current response performance.
- The baseline upper limit value is calculated by adding a predetermined value to the baseline, which was calculated from the response performance under configuration information different from the current configuration information, and the baseline lower limit value is calculated by subtracting a predetermined value from the baseline. If the current response performance is not within the range between the baseline upper limit value and the baseline lower limit value, an abnormality in the IT system is detected.
- Because the baseline is calculated based on the response performance under configuration information different from the current configuration information, as shown in FIG. 25, the current response performance was not within the range between the baseline upper limit value and the baseline lower limit value, and the baseline desired by the administrator could not be calculated.
- an object of the present invention is to provide a computer that can predict future operation information immediately after the configuration information is changed.
- A typical example of the invention disclosed in the present application is as follows. An operation information prediction computer that collects operation information from at least one device and predicts future operation information of the device based on the collected operation information includes a storage area and comprises: a state information collection unit that collects, from the device, state information including the operation information and the configuration information of the device at the time the operation information was collected; a state information storage unit that stores the operation information and the configuration information collected by the state information collection unit in the storage area; a correlation value calculation unit that calculates a correlation value for making the past operation information stored in the storage area by the state information storage unit correspond to the current configuration information; and a predicted operation value calculation unit that calculates a future predicted operation value based on the operation information and the correlation value calculated by the correlation value calculation unit.
- a computer capable of predicting future response performance immediately after the configuration information is changed can be provided.
- FIG. 1 is a schematic explanatory diagram of a baseline calculation method according to an embodiment of the present invention.
- The failure sign detection system 500 collects not only operation information but also configuration information from the apparatus to be observed (IT system 550 (see FIG. 5)) and stores the collected information in the storage area in time series.
- the failure sign detection system 500 calculates a correlation function based on the operation information and the configuration information stored in the storage area. Then, the failure sign detection system 500 calculates a correlation value for making the past operation information stored in the storage area correspond to the current configuration information based on the correlation function.
- the failure sign detection system 500 calculates a temporary baseline based on the operation information stored in the storage area.
- the failure sign detection system 500 corrects the temporary baseline by reflecting the correlation value in the temporary baseline, and generates a baseline.
- The failure sign detection system 500 can detect a failure sign of the IT system 550 immediately after the change of the configuration information without collecting the changed configuration information for a predetermined time.
- FIG. 2A is a schematic explanatory diagram of CPU allocation rate correlation value calculation processing according to the embodiment of this invention.
- the failure sign detection system 500 plots the CPU allocation rate and response performance stored in the storage area on the coordinates with the CPU allocation rate as the x-axis and the response performance as the y-axis. Note that the CPU allocation rate, DB cache, and response performance stored in the storage area are the same as those described with reference to FIG.
- For each CPU allocation rate stored in the storage area, the failure sign detection system 500 calculates, based on the correlation function, a correlation value for making the response performance at that CPU allocation rate correspond to the current CPU allocation rate.
- Specifically, the failure sign detection system 500 subtracts the value obtained by substituting each CPU allocation rate stored in the storage area into the correlation function from the value obtained by substituting the current CPU allocation rate into the correlation function, thereby calculating the correlation value for each CPU allocation rate stored in the storage area.
- FIG. 2A illustrates the correlation values when the CPU allocation rate is 20%, 30%, and 45% when the current CPU allocation rate is 35%.
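- When the CPU allocation rate is 20%, the correlation value is f(35) - f(20).
- When the CPU allocation rate is 30%, the correlation value is f(35) - f(30).
- When the CPU allocation rate is 45%, the correlation value is f(35) - f(45).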
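The correlation value calculation for the CPU allocation rate can be sketched as below. The patent does not give the form of f in this passage, so the linear function `f` and its coefficients are purely hypothetical, as is the function name `correlation_values`; only the subtraction f(current) - f(stored) is taken from the text.

```python
def correlation_values(f, current, stored_values):
    """Return {stored value: f(current) - f(stored value)} for each stored value."""
    return {value: f(current) - f(value) for value in stored_values}


def f(cpu_rate):
    # Hypothetical correlation function: response performance as a function
    # of the CPU allocation rate (coefficients chosen only for illustration).
    return 200.0 - 2.0 * cpu_rate


# Current CPU allocation rate 35%, stored rates 20%, 30%, and 45%.
cpu_correlation = correlation_values(f, 35, [20, 30, 45])
```

With this hypothetical f, past samples taken at a lower allocation rate than the current one receive a negative correlation value, and samples taken at a higher rate receive a positive one. The same helper applies unchanged to the DB cache with a function g.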
- FIG. 2B is a schematic explanatory diagram of DB cache correlation value calculation processing according to the embodiment of this invention.
- the correlation value of the DB cache is calculated by the same process as the correlation value of the CPU allocation rate described in FIG. 2A.
- The correlation value when the DB cache is 1 MB is g(2) - g(1).
- The correlation value when the DB cache is 1.5 MB is g(2) - g(1.5).
- FIG. 3 is a schematic explanatory diagram of the baseline correction processing according to the embodiment of the present invention.
- FIG. 3 illustrates a temporary baseline calculated by the same processing as in FIG.
- the response performance configuration information is divided into the area (1) and the area (2) shown in FIG.
- The temporary baseline in the area (1) shown in FIG. 3 is calculated using response performances 3-a, 2-b, 1-c, and 1-d, as shown in FIG.
- The configuration information (CPU allocation rate, DB cache) of these response performances is, in order, 3-a (20, 1), 2-b (30, 1.5), 1-c (45, 1.5), and 1-d (45, 2), as shown in FIG.
- The temporary baseline in the area (2) shown in FIG. 3 is calculated using response performances 3-b, 2-b, 1-e, 2-c, and 1-d, as shown in FIG.
- The configuration information (CPU allocation rate, DB cache) of these response performances is, in order, 3-b (30, 1.5), 2-b (30, 1.5), 1-e (35, 2), 2-c (45, 1.5), and 1-d (45, 2), as shown in FIG.
- The value for correcting the temporary baseline in the area (1) is calculated based on the correlation values of the CPU allocation rates 20%, 30%, and 45% and the correlation values of the DB caches 1 MB and 1.5 MB.
- The value for correcting the temporary baseline in the area (2) does not include the correlation value of the CPU allocation rate 20% or the correlation value of the DB cache 1 MB, and is calculated based on the correlation values of the CPU allocation rates 30% and 45% and the correlation value of the DB cache 1.5 MB.
- The value for correcting the temporary baseline is the sum of the values obtained by multiplying each correlation value by the weighting factor set for the corresponding configuration information.
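One possible reading of the correction step for area (1) is sketched below. The correlation functions `f` and `g`, the weights, and the temporary baseline value are hypothetical, and the choice to count each distinct configuration value once is an assumption about how the correlation values are collected; the source only states that the correction is a weighted sum of correlation values.

```python
def correction_value(configs, f, g, current_cpu, current_db, w_cpu, w_db):
    """Weighted sum of the correlation values of the configurations in one area.

    configs: (cpu_rate, db_cache) pairs of the samples behind the temporary baseline.
    Assumption: each distinct configuration value contributes one correlation value.
    """
    cpu_rates = sorted({cpu for cpu, _ in configs})
    db_caches = sorted({db for _, db in configs})
    cpu_term = sum(f(current_cpu) - f(cpu) for cpu in cpu_rates)
    db_term = sum(g(current_db) - g(db) for db in db_caches)
    return w_cpu * cpu_term + w_db * db_term


def f(cpu_rate):          # hypothetical correlation function for the CPU allocation rate
    return 200.0 - 2.0 * cpu_rate


def g(db_cache):          # hypothetical correlation function for the DB cache
    return 100.0 - 20.0 * db_cache


# Area (1): samples 3-a, 2-b, 1-c, and 1-d; current configuration is (35, 2).
area1 = [(20, 1.0), (30, 1.5), (45, 1.5), (45, 2.0)]
temporary_baseline = 150.0                       # hypothetical value
correction = correction_value(area1, f, g, 35, 2.0, w_cpu=0.5, w_db=0.5)
corrected_baseline = temporary_baseline + correction
```

A sample whose configuration value equals the current one (here the DB cache of 2 MB) contributes a correlation value of zero, which is why it does not need to be listed among the correlation values for the area.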
- FIG. 4 is a schematic explanatory diagram of the correlation value changing process according to the embodiment of the present invention.
- the difference (d) between the current response performance and the corrected baseline is reduced.
- the correlation value of each piece of configuration information is changed by changing at least one of the correlation function and the weighting coefficient.
- FIG. 5 is a system configuration diagram of the failure sign detection system 500 according to the embodiment of the present invention.
- the failure sign detection system 500 is connected to an IT system 550 to be monitored.
- the failure sign detection system 500 includes a CPU 521, a memory (storage area) 522, an external storage device 523, and a communication interface (I / F) 524.
- the CPU 521 executes various programs stored in the memory 522.
- The memory 522 stores programs corresponding to the state management unit 501, the stream data processing unit 502, the baseline (BL) generation unit 503, the correction unit 504, the threshold generation unit 505, the abnormality detection unit 506, the notification unit 507, and the relative comparison unit 508.
- the memory 522 stores a state value storage database (DB) 511 and a correlation value storage database (DB) 512 as databases.
- the communication I / F 524 is connected to a device that communicates with the failure sign detection system 500. Specifically, the communication I / F 524 is connected to an IT system 550 to be observed and a client PC (not shown) operated by an administrator.
- The programs for realizing the functions of the above-described units need not be stored in one memory; they may be distributed and stored in the memories of a plurality of computers, and the failure sign detection system 500 may be realized by a plurality of computers.
- the IT system 550 includes a CPU 551, a storage device 552, an input / output device 553, and a tuning parameter 554.
- the CPU 551 executes various programs stored in the storage device 552.
- the storage device 552 stores various programs and the like.
- the input / output device 553 includes a device (for example, a mouse and a keyboard) for inputting various data to the IT system 550 and a device (for example, a display and a printer) for outputting various data.
- the tuning parameter 554 is a value of various parameters of various software, and is normally stored in the storage device 552.
- FIG. 6 is a functional block diagram of the failure sign detection system 500 according to the embodiment of the present invention.
- the state management unit 501 collects operation information and configuration information from the IT system 550 and inputs state information (state value) obtained by adding the collected configuration information to the collected operation information to the stream data processing unit 502.
- the stream data processing unit 502 temporarily stores the state value input from the state management unit 501 and averages the operation information and configuration information included in the state value at predetermined time intervals (for example, 1 minute).
- the operation information and the configuration information are stored in the state value storage DB 511.
- the process in which the stream data processing unit 502 stores operation information and configuration information in the state value storage DB 511 and the state value storage DB 511 will be described in detail with reference to FIGS. 9A and 9B.
- The series of processes in which the state management unit 501 collects operation information and configuration information and the stream data processing unit 502 stores the operation information and configuration information in the state value storage DB 511 is called observation target information collection processing. Details of the collection process will be described with reference to FIG.
- The BL generation unit 503 acquires past state values stored in the state value storage DB 511, calculates, as a temporary baseline, a statistic that uses the acquired past operation information as a parameter, and inputs the calculated temporary baseline to the correction unit 504. Specifically, as described with reference to FIG. 24, the BL generation unit 503 calculates the temporary baseline by averaging operation information at the same time on different days.
- The correction unit 504 acquires the correlation function stored in the correlation value storage DB 512, calculates a correlation value based on the acquired correlation function, and reflects the calculated correlation value in the temporary baseline to calculate a baseline corresponding to the current configuration information.
- the correlation function is a function indicating the correspondence between the operation information stored in the state value storage DB 511 and the configuration information.
- The correlation function setting process includes a process in which the failure sign detection system 500 automatically sets the correlation function based on past operation information and configuration information stored in the state value storage DB 511, and a process in which the administrator sets the correlation function manually.
- Although the correction unit 504 has been described as calculating the baseline by correcting the temporary baseline, the BL generation unit 503 need not generate a temporary baseline; instead, the correction unit 504 may reflect the correlation value in the operation information stored in the state value storage DB 511, and the baseline may be calculated based on the operation information in which the correlation value is reflected. Details of this baseline calculation processing will be described with reference to FIG.
- the relative comparison unit 508 verifies the tendency of the change of the operation information due to the change of the configuration information by comparing the temporary baseline and the baseline obtained by correcting the temporary baseline with the current operation information.
- the relative comparison unit 508 determines whether or not the current operation information is within the range of the difference between the temporary baseline and the corrected baseline. If the relative comparison unit 508 determines that the current operation information is not within the difference between the temporary baseline and the corrected baseline, the relative comparison unit 508 detects that the corrected baseline is abnormal. Details of the process in which the relative comparison unit 508 detects the corrected baseline abnormality will be described with reference to FIG.
- If the relative comparison unit 508 determines that the current operation information is within the difference between the temporary baseline and the corrected baseline, it calculates the difference between the current operation information and the corrected baseline and changes the correlation value so as to reduce that difference. As a result, the corrected baseline becomes closer to the current operation information, and the accuracy of detecting an abnormality in the IT system 550 can be improved.
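The check performed by the relative comparison unit can be sketched as follows, under the assumption that "within the difference between the temporary baseline and the corrected baseline" means lying between the two baseline values. The function name and the numbers are hypothetical, and the sketch only reports the remaining difference; how the correlation value is then adjusted is not shown here.

```python
def relative_comparison(current, temporary_baseline, corrected_baseline):
    """Return (baseline_ok, difference).

    baseline_ok is False when the current operation information lies outside the
    interval spanned by the two baselines, i.e. the corrected baseline is suspect.
    difference is the gap that a change of the correlation value should reduce.
    """
    low, high = sorted((temporary_baseline, corrected_baseline))
    baseline_ok = low <= current <= high
    difference = current - corrected_baseline
    return baseline_ok, difference


# Hypothetical values: temporary baseline 150, corrected baseline 125.
inside = relative_comparison(130.0, 150.0, 125.0)
outside = relative_comparison(160.0, 150.0, 125.0)
```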
- a series of processing of the relative comparison unit 508 will be described later with reference to FIG. Details of the process in which the relative comparison unit 508 changes the correlation value will be described with reference to FIG.
- the threshold generation unit 505 sets a threshold for detecting an abnormality in the IT system 550 based on the corrected baseline.
- the threshold includes an upper limit threshold and a lower limit threshold.
- The upper limit threshold is calculated by adding a predetermined value to the corrected baseline, and the lower limit threshold is calculated by subtracting the predetermined value from the corrected baseline.
- The abnormality detection unit 506 determines whether the current operation information is within the threshold range calculated by the threshold generation unit 505. When the abnormality detection unit 506 determines that the current operation information is not within the range of the threshold calculated by the threshold generation unit 505, the abnormality detection unit 506 detects an abnormality in the IT system 550 and notifies the notification unit 507 accordingly. On the other hand, if the abnormality detection unit 506 determines that the current operation information is within the threshold range calculated by the threshold generation unit 505, the abnormality detection unit 506 determines that there is no abnormality in the IT system 550.
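Threshold generation and abnormality detection can be sketched as follows. The sketch compares the current operation information (for example, response performance) against the thresholds, in line with the later description of the abnormality detection process; the function names, the margin, and the sample values are assumptions.

```python
def generate_thresholds(corrected_baseline, margin):
    """Return (lower, upper) thresholds placed a fixed margin around the baseline."""
    return corrected_baseline - margin, corrected_baseline + margin


def is_abnormal(current_operation_value, corrected_baseline, margin):
    """True when the current operation value falls outside the threshold range."""
    lower, upper = generate_thresholds(corrected_baseline, margin)
    return not (lower <= current_operation_value <= upper)


# Hypothetical corrected baseline of 125 with a predetermined value of 20.
limits = generate_thresholds(125.0, 20.0)
alarm = is_abnormal(150.0, 125.0, 20.0)
normal = is_abnormal(130.0, 125.0, 20.0)
```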
- the threshold value calculation processing by the threshold value generation unit 505 and the abnormality detection processing of the IT system 550 by the abnormality detection unit 506 are collectively referred to as abnormality detection processing, and will be described in detail with reference to FIGS. 19 and 20.
- When the notification unit 507 is notified by the abnormality detection unit 506 that an abnormality of the IT system 550 has been detected, the notification unit 507 notifies the administrator that the abnormality of the IT system 550 has been detected.
- Notification methods include displaying an abnormality detection screen 2100 (see FIG. 21) on the screen of a client PC (not shown) connected to the failure sign detection system 500, outputting sound from a speaker of the client PC, and sending e-mail.
- FIG. 7 is a flowchart of overall processing performed by the failure sign detection system 500 according to the embodiment of this invention.
- This entire process is executed by the CPU 521 provided in the failure sign detection system 500.
- the failure sign detection system 500 executes observation target information collection processing (701).
- In the observation target information collection processing, the state management unit 501 collects operation information and configuration information from the IT system 550, and the stream data processing unit 502 stores the operation information and configuration information collected by the state management unit 501 in the state value storage DB 511.
- the baseline generation process is a process in which the correction unit 504 generates a baseline corresponding to the current configuration information.
- the relative comparison process is a process in which the relative comparison unit 508 compares the temporary baseline and the baseline.
- the abnormality detection process is a process of detecting an abnormality of the IT system 550 by the threshold value generation unit 505 generating a threshold value and the abnormality detection unit 506 determining whether or not the current operation information is within the threshold value range.
- the notification process is a process in which the notification unit 507 notifies the administrator of the abnormality when an abnormality of the IT system 550 is detected in the abnormality detection process.
- FIG. 8 is an explanatory diagram of configuration information collected from the IT system 550 according to the embodiment of this invention.
- the IT system 550 is configured by at least one physical machine 810. On the hypervisor (not shown) of the physical machine 810, the virtual machine 820 operates.
- the configuration information collected from the IT system 550 includes physical configuration information of the IT system 550 and logical configuration information of the IT system 550.
- the physical configuration information is configuration information of physical resources (such as a CPU, a memory, and a hard disk) provided in the physical machine 810.
- the physical configuration information includes information on the number of CPU clocks and the number of cores, information on the number of clocks and capacity of the memory, and information on the capacity and buffer size of the hard disk.
- Logical configuration information is information related to software 805 executed by the physical machine 810.
- the logical configuration information includes, for example, OS version information 844 executed by the physical machine 810, a database cache size 845, and the like. Further, the logical configuration information includes information related to physical resources allocated to the virtual machine 820.
- the information regarding the physical resources allocated to the virtual machine 820 includes, for example, information regarding the number of CPU cores allocated to the virtual machine 820, information regarding the capacity of the memory allocated to the virtual machine 820, and information allocated to the virtual machine 820. Contains information about hard disk capacity.
- FIG. 9A and FIG. 9B are explanatory diagrams of the observation target information collection processing according to the embodiment of the present invention, up to the point where the stream data processing unit 502 stores the state value of the IT system 550 collected by the state management unit 501 in the state value storage DB 511.
- FIG. 9A (A) is a graph of the response performance (operation information) and configuration information of the IT system 550 in time series.
- At 12:01:00, the memory capacity included in the configuration information is changed from 1024 MB to 2048 MB, and the DB cache included in the configuration information is changed from 1 MB to 2 MB.
- The memory capacity is physical configuration information, and the DB cache is logical configuration information.
- FIG. 9A illustrates a state in which the state value collected from the IT system 550 by the state management unit 501 is temporarily stored in the stream data processing unit 502.
- the state value temporarily held by the stream data processing unit 502 includes a collection time 901, operation information 902, and configuration information 903.
- In the collection time 901, the time at which the state management unit 501 collected the state value including operation information and configuration information is registered. Note that the state management unit 501 collects state values from the IT system 550, for example, in units of one second.
- In the operation information 902, the response performance of the IT system 550 collected at the time registered in the collection time 901 is registered.
- In the configuration information 903, the physical configuration information (memory capacity) and logical configuration information (DB cache) of the IT system 550 collected at the time registered in the collection time 901 are registered.
- the stream data processing unit 502 averages temporarily held state values in a predetermined time unit (for example, one minute unit).
- the state values averaged by the stream data processing unit 502 are shown in (C) of FIG. 9A.
- the stream data processing unit 502 stores the averaged state value in the state value storage DB 511.
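The per-minute averaging of temporarily held state values can be sketched as follows. The record layout `(timestamp, response, memory_mb, db_cache_mb)`, the function name, the date, and the sample values are hypothetical; only the one-second collection, the one-minute averaging, and the 12:01:00 configuration change are taken from the description.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def average_per_minute(state_values):
    """Average operation and configuration fields of state values per minute.

    state_values: iterable of (timestamp, response, memory_mb, db_cache_mb).
    Returns {minute: (mean response, mean memory_mb, mean db_cache_mb)}.
    """
    buckets = defaultdict(list)
    for timestamp, *fields in state_values:
        minute = timestamp.replace(second=0, microsecond=0)
        buckets[minute].append(fields)
    return {
        minute: tuple(mean(column) for column in zip(*rows))
        for minute, rows in sorted(buckets.items())
    }


# Hypothetical one-second samples around the configuration change at 12:01:00.
raw = [
    (datetime(2011, 11, 10, 12, 0, 58), 100.0, 1024, 1.0),
    (datetime(2011, 11, 10, 12, 0, 59), 110.0, 1024, 1.0),
    (datetime(2011, 11, 10, 12, 1, 0), 80.0, 2048, 2.0),
    (datetime(2011, 11, 10, 12, 1, 1), 90.0, 2048, 2.0),
]
averaged = average_per_minute(raw)
```

Each averaged row is what would be written to the state value storage DB 511 for that minute.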
- the state value storage DB 511 is shown in FIG.
- the state value storage DB 511 includes a collection time 901, operation information 902, and configuration information 903, as with the state values held in the stream data processing unit 502 shown in FIG. 9A (B). These descriptions are the same as (B) in FIG.
- FIG. 10 is a flowchart of observation target information collection processing according to the embodiment of the present invention.
- the observation target information collection process is executed by the CPU 521 calling a program corresponding to the state management unit 501 and a program corresponding to the stream data processing unit 502 and executing these programs.
- the stream data processing unit 502 sets the elapsed time to zero in order to measure the time for averaging the state values collected by the state management unit 501 (1001).
- the state management unit 501 collects operation information from the IT system 550 to be observed (1002), and collects configuration information from the IT system 550 to be observed (1003).
- The state management unit 501 inputs, to the stream data processing unit 502, a state value in which the configuration information collected in step 1003 is added to the operation information collected in step 1002, and the stream data processing unit 502 temporarily holds the input state value (1004).
- the stream data processing unit 502 determines whether or not the elapsed time after executing the processing of step 1001 has exceeded the time for averaging the state values (1005).
- If it is determined in step 1005 that the elapsed time since the process of step 1001 has not exceeded the time for averaging the state values, the process returns to step 1002.
- If it is determined in step 1005 that the elapsed time has exceeded the time for averaging the state values, the stream data processing unit 502 averages the operation information and configuration information of the state values and stores the averaged operation information and configuration information in the state value storage DB 511 (1006).
- the state management unit 501 determines whether or not a state value for a predetermined period (for example, one day) for generating a baseline is stored in the state value storage DB 511 (1007).
- If it is determined in step 1007 that the state value for the predetermined period (for example, one day) for generating the baseline is stored in the state value storage DB 511, the observation target information collection process is terminated, and the CPU 521 then executes the baseline generation process, which is the process of step 702 shown in FIG. 7.
- On the other hand, if it is determined in step 1007 that the state value for the predetermined period (for example, one day) for generating the baseline is not stored in the state value storage DB 511, the process returns to step 1001.
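The control flow of steps 1001 to 1007 can be sketched as a pair of nested loops. To keep the example deterministic, it counts samples instead of measuring elapsed time, which is an assumption; the callables `read_operation` and `read_configuration`, the list standing in for the state value storage DB 511, and the sample values are likewise hypothetical.

```python
from statistics import mean


def collect_observations(read_operation, read_configuration,
                         samples_per_average, averages_needed):
    """Collect state values, average them per window, and store the averages."""
    stored = []                                    # stands in for the state value storage DB
    while len(stored) < averages_needed:           # step 1007: enough data for a baseline?
        held = []                                  # step 1001: start a new averaging window
        while len(held) < samples_per_average:     # step 1005: window not yet complete
            operation = read_operation()           # step 1002: collect operation information
            configuration = read_configuration()   # step 1003: collect configuration information
            held.append((operation, configuration))  # step 1004: hold the state value
        stored.append((mean(o for o, _ in held),   # step 1006: average and store
                       mean(c for _, c in held)))
    return stored


# Hypothetical readings: response performance and memory capacity in MB.
operations = iter([100.0, 110.0, 80.0, 90.0])
configurations = iter([1024, 1024, 2048, 2048])
result = collect_observations(lambda: next(operations),
                              lambda: next(configurations),
                              samples_per_average=2, averages_needed=2)
```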
- FIG. 11 is an explanatory diagram until a baseline is generated in the baseline generation processing of the present invention.
- FIG. 11A illustrates a temporary baseline generated by the BL generation unit 503.
- the provisional baseline is calculated by the BL generation unit 503 referring to the state value storage DB 511 and averaging the operation information at the same time on different days.
- FIG. 11B shows an explanatory diagram of the correlation value storage DB 512.
- a correlation function indicating the correspondence between the configuration information and the operation information is registered for each type of configuration information and each type of operation information.
- The correlation value storage DB 512 includes a configuration value X 1101, an operation value Y 1102, and a correlation value 1103.
- The type of configuration information is registered in the configuration value X 1101, and the response time is registered in the operation value Y 1102.
- As the correlation function, a function that passes through the configuration values and operation values when the configuration value is plotted on the X axis and the operation value on the Y axis is registered.
- FIG. 11C illustrates a baseline generated by the correction unit 504 reflecting the correlation value with respect to the temporary baseline.
- the correlation value is for associating past operation information for which a temporary baseline has been calculated with current configuration information.
- the correlation value is calculated based on the correlation function, and the correlation function calculation process will be described in detail with reference to FIG.
- For example, when the current memory capacity is 2048 MB, the correlation value of the operation information collected when the memory capacity was 1024 MB is calculated, using the correlation function f(x), by subtracting the operation information for a memory capacity of 1024 MB (f(1024)) from the operation information for a memory capacity of 2048 MB (f(2048)).
- Reflecting the correlation value in the temporary baseline means adding, to the temporary baseline, a value obtained by multiplying the correlation value corresponding to the configuration information of the operation information used for calculating the temporary baseline by a predetermined weight coefficient (see (1) in (C) of FIG. 11).
- the failure sign detection system 500 can calculate a baseline corresponding to the current configuration information from the temporary baseline.
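The correction described for FIG. 11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the correlation function `f`, its coefficients, the weight of 0.5, and the sample baseline values are all hypothetical and are not taken from the patent. The sign convention (current configuration minus past configuration) follows the memory capacity example above.

```python
from typing import Callable, List


def correlation_value(f: Callable[[float], float],
                      past_config: float,
                      current_config: float) -> float:
    """Offset that maps operation information measured under past_config
    onto the current configuration: f(current) - f(past)."""
    return f(current_config) - f(past_config)


def correct_baseline(temporary_baseline: List[float],
                     corr_value: float,
                     weight: float = 1.0) -> List[float]:
    """Add the weighted correlation value to every point of the temporary baseline."""
    return [point + weight * corr_value for point in temporary_baseline]


def f(memory_mb: float) -> float:
    """Hypothetical correlation function: response time versus memory capacity (MB)."""
    return 600.0 - 0.125 * memory_mb


# Past data was collected with 1024 MB; the current configuration is 2048 MB.
corr = correlation_value(f, past_config=1024, current_config=2048)
baseline = correct_baseline([480.0, 500.0, 470.0], corr, weight=0.5)
```

With a weight of 1.0 the whole predicted change is applied; a smaller weight applies the predicted change only partially.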
- FIG. 12 is an explanatory diagram of the correlation function calculation process according to the embodiment of the present invention.
- The correlation function calculation process includes a correlation function automatic setting process, in which the failure sign detection system 500 automatically calculates a correlation function with reference to the state value storage DB 511, and a correlation function manual setting process, in which the correlation function is set manually by an administrator. First, the correlation function automatic setting process will be described.
- The correction unit 504 plots the configuration information and the operation information stored in the state value storage DB 511 with the configuration information on the x axis and the operation information on the y axis. When there are a plurality of pieces of operation information for the same configuration information, the correction unit 504 plots the average value of those pieces of operation information as the operation information.
- The correction unit 504 calculates a function that fits the plotted points of configuration information and operation information using, for example, the least squares method, and sets the calculated function as the correlation function.
- FIG. 12 illustrates the correlation function (f (x) ⁇ 0.15x + 500) for the case where the configuration information is the memory capacity and the operation information is the response time, and the correlation function (g (x) (350 / x) ⁇ 50) for the case where the configuration information is the DB cache and the operation information is the response time.
- The correction unit 504 registers the set correlation function in the correlation value storage DB 512.
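The automatic setting process can be illustrated as follows. This is a minimal sketch that assumes a straight-line correlation function; the patent only names the least squares method as an example and does not fix the functional form. The function name `fit_correlation_function` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def fit_correlation_function(
    samples: Iterable[Tuple[float, float]],
) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the plotted points.

    samples holds (configuration value, operation value) pairs. Operation values
    that share a configuration value are averaged first, so each configuration
    value contributes exactly one point.
    """
    grouped: Dict[float, List[float]] = defaultdict(list)
    for config, operation in samples:
        grouped[config].append(operation)
    points = [(x, sum(ys) / len(ys)) for x, ys in grouped.items()]
    if len(points) < 2:
        raise ValueError("need at least two distinct configuration values")

    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: two samples at 512 MB, one each at 1024 MB and 2048 MB.
slope, intercept = fit_correlation_function(
    [(512, 520.0), (512, 480.0), (1024, 400.0), (2048, 200.0)]
)
```

Averaging before fitting mirrors the plotting rule above, but it also discards the spread of the samples, which is one reason the description later allows the fitted function to be adjusted.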
- Next, the correlation function manual setting process will be described. The correction unit 504 transmits an instruction to display the correlation function registration screen 1200 to a client PC (not shown) connected to the failure sign detection system 500.
- When the client PC receives the instruction, it displays the correlation function registration screen 1200.
- The correlation function registration screen 1200 includes a configuration value input field 1201, an operation value input field 1202, a correlation function input field 1203, and a registration button 1204.
- The configuration value input field 1201 is a field in which the name of the configuration information for which a correlation function is to be set is input.
- The operation value input field 1202 is a field in which the name of the operation information for which a correlation function is to be set is input.
- The correlation function input field 1203 is a field in which the correlation function between the configuration information input in the configuration value input field 1201 and the operation information input in the operation value input field 1202 is input.
- When the registration button 1204 is operated, the client PC transmits the configuration information input in the configuration value input field 1201, the operation information input in the operation value input field 1202, and the correlation function input in the correlation function input field 1203 to the failure sign detection system 500 as correlation function input data.
- The correction unit 504 registers the received correlation function input data in the correlation value storage DB 512.
- FIG. 13 is a flowchart of the baseline generation process according to the embodiment of the present invention in which the correlation value is reflected in the temporary baseline.
- The baseline generation process is executed by the CPU 521 calling and executing a program corresponding to the BL generation unit 503 and a program corresponding to the correction unit 504.
- The BL generation unit 503 acquires, from the state values stored in the state value storage DB 511, the past state values for a period from which a temporary baseline can be generated (1301).
- The BL generation unit 503 calculates, as a temporary baseline, a statistic based on the operation information of the state values acquired in step 1301 (1302).
- The BL generation unit 503 calculates, as the statistic, the average of the operation information having the same collection time among the operation information of the state values acquired in step 1301.
- The BL generation unit 503 determines whether the current configuration information differs from the configuration information of the past operation information used to calculate the temporary baseline (1303).
- If it is determined in step 1303 that the configuration information differs, the correction unit 504 refers to the correlation value storage DB 512 and acquires the correlation function of the configuration information determined to be different (1304).
- The correction unit 504 calculates the correlation value of the operation information used to calculate the temporary baseline, based on the correlation function acquired in step 1304 (1305).
- The correction unit 504 calculates the correlation value for the operation information whose configuration information differs from the current configuration information, among the operation information used to calculate the temporary baseline.
- The correlation value is calculated by subtracting the value obtained by substituting the past configuration information into the correlation function from the value obtained by substituting the current configuration information into the correlation function.
- The correction unit 504 calculates a baseline corresponding to the current configuration information by reflecting the correlation value calculated in step 1305 in the temporary baseline (1306), and ends the baseline generation process.
- On the other hand, if it is determined in step 1303 that the configuration information does not differ, the correction unit 504 sets the temporary baseline as the baseline without executing steps 1304 to 1306, and ends the baseline generation process.
- An object is to provide a computer that can predict future operation information immediately after the configuration information is changed.
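The FIG. 13 flow (steps 1301 to 1306) can be sketched as below. The `StateValue` record, the function names, and the sample correlation function are hypothetical. The sketch also assumes, for simplicity, that all past state values were collected under a single past configuration; handling a history with mixed configurations is closer to the FIG. 14 variant.

```python
from collections import defaultdict
from typing import Callable, Dict, List, NamedTuple


class StateValue(NamedTuple):
    time_of_day: str   # collection time, for example "09:00"
    config: float      # configuration information at collection time
    operation: float   # operation information, for example response time


def temporary_baseline(history: List[StateValue]) -> Dict[str, float]:
    """Step 1302: average the operation information that shares a collection time."""
    by_time: Dict[str, List[float]] = defaultdict(list)
    for state in history:
        by_time[state.time_of_day].append(state.operation)
    return {t: sum(values) / len(values) for t, values in by_time.items()}


def generate_baseline(history: List[StateValue],
                      current_config: float,
                      f: Callable[[float], float],
                      weight: float = 1.0) -> Dict[str, float]:
    """Steps 1301 to 1306, assuming one past configuration for the whole history."""
    if not history:
        raise ValueError("no past state values available")
    temp = temporary_baseline(history)
    past_config = history[0].config          # assumption: uniform past configuration
    if past_config == current_config:        # step 1303: configuration unchanged
        return temp
    corr = f(current_config) - f(past_config)                  # steps 1304-1305
    return {t: value + weight * corr for t, value in temp.items()}  # step 1306


history = [
    StateValue("09:00", 1024, 470.0),
    StateValue("09:00", 1024, 490.0),
    StateValue("10:00", 1024, 500.0),
    StateValue("10:00", 1024, 520.0),
]
corrected = generate_baseline(history, 2048, lambda x: 600.0 - 0.125 * x)
```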
- FIG. 14 is a flowchart of the baseline generation process according to the embodiment of the present invention in which the correlation value is reflected in the state values.
- The BL generation unit 503 acquires, from the state values stored in the state value storage DB 511, the past state values for a period from which a temporary baseline can be generated (1401).
- The BL generation unit 503 determines whether the current configuration information differs from the configuration information of the past state values acquired in step 1401 (1402).
- If it is determined in step 1402 that the configuration information differs, the correction unit 504 refers to the correlation value storage DB 512 and acquires the correlation function of the configuration information determined to be different (1403).
- The correction unit 504 calculates the correlation value of the operation information of the past state values acquired in step 1401, based on the correlation function acquired in step 1403 (1404).
- The correction unit 504 calculates the correlation value for the operation information whose configuration information differs from the current configuration information, among the operation information of the past state values acquired in step 1401.
- The correlation value is calculated by subtracting the value obtained by substituting the past configuration information into the correlation function from the value obtained by substituting the current configuration information into the correlation function.
- The correction unit 504 reflects the correlation value calculated in step 1404 in the operation information of the past state values acquired in step 1401 (1405).
- The BL generation unit 503 calculates, as the baseline, a statistic based on the operation information of the past state values in which the correlation value was reflected in step 1405 (1406), and ends the baseline generation process.
- On the other hand, if it is determined in step 1402 that the configuration information does not differ, the BL generation unit 503 calculates, as the baseline, a statistic based on the operation information of the past state values acquired in step 1401 (1407), and ends the baseline generation process. As described above, a baseline corresponding to the current configuration information is calculated.
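The FIG. 14 variant corrects each past sample before averaging, so it copes naturally with a history collected under several configurations. The following sketch uses hypothetical names and a hypothetical correlation function, and takes the average per collection time as the statistic.

```python
from collections import defaultdict
from typing import Callable, Dict, List, NamedTuple


class StateValue(NamedTuple):
    time_of_day: str   # collection time, for example "09:00"
    config: float      # configuration information at collection time
    operation: float   # operation information, for example response time


def generate_baseline_per_sample(history: List[StateValue],
                                 current_config: float,
                                 f: Callable[[float], float]) -> Dict[str, float]:
    """Steps 1401 to 1407: shift each past sample to the current configuration,
    then average the samples that share a collection time."""
    by_time: Dict[str, List[float]] = defaultdict(list)
    for state in history:
        value = state.operation
        if state.config != current_config:                 # step 1402
            value += f(current_config) - f(state.config)   # steps 1403-1405
        by_time[state.time_of_day].append(value)
    return {t: sum(values) / len(values) for t, values in by_time.items()}


def f(memory_mb: float) -> float:
    """Hypothetical correlation function: response time versus memory capacity (MB)."""
    return 600.0 - 0.125 * memory_mb


mixed_history = [
    StateValue("09:00", 1024, 472.0),   # collected before the memory upgrade
    StateValue("09:00", 2048, 344.0),   # collected after the memory upgrade
]
baseline = generate_baseline_per_sample(mixed_history, 2048, f)
```

Samples already collected under the current configuration pass through unchanged, which corresponds to the branch that ends at step 1407.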
- FIG. 15 is an explanatory diagram of the process in which the relative comparison unit 508 according to the embodiment of this invention detects an abnormality in the corrected baseline.
- The relative comparison unit 508 determines whether the current operation information is within the range between the corrected baseline and the temporary baseline.
- If the current operation information is not within that range, the relative comparison unit 508 detects that the baseline is abnormal and notifies the notification unit 507 to that effect.
- When the notification unit 507 is notified by the abnormality detection unit 506 that an abnormality in the corrected baseline has been detected, it notifies the administrator to that effect.
- Notification methods include outputting an abnormality detection screen on the screen of a client PC (not shown) connected to the failure sign detection system 500, outputting audio from a speaker of the client PC, and outputting by e-mail or the like. Details of the abnormality detection screen are described later.
- An abnormality in the baseline is caused by an abnormality in the correlation function, so notifying the administrator of a baseline abnormality amounts to notifying the administrator of an abnormality in the correlation function.
- FIG. 16 is an explanatory diagram of the process in which the relative comparison unit 508 of the embodiment of the present invention changes the correlation function. As illustrated in FIG. 16, when the current operation information is within the range between the corrected baseline and the temporary baseline, the relative comparison unit 508 calculates the difference between the current operation information and the corrected baseline, and determines whether the calculated difference is larger than a predetermined value.
- When it is determined that the calculated difference is larger than the predetermined value, the relative comparison unit 508 changes the correlation function so that the difference between the current operation information and the baseline calculated based on the changed correlation function becomes the predetermined value. Changing the correlation function also changes the correlation value.
- Here, the correlation function f is the one for which the configuration information is the memory capacity and the operation information is the response time; a corresponding case exists in which the configuration information is the DB cache.
- The correction unit 504 calculates a correlation value based on the changed correlation function, and corrects the baseline by reflecting the calculated correlation value in future temporary baselines.
- FIG. 17 is a flowchart of the relative comparison process performed by the relative comparison unit 508 according to the embodiment of this invention.
- The relative comparison process is executed by the CPU 521 calling and executing a program corresponding to the relative comparison unit 508.
- The relative comparison unit 508 specifies the range of operation information between the corrected baseline and the temporary baseline (1701).
- The relative comparison unit 508 determines whether the current operation information is within the range of operation information specified in step 1701 (1702).
- If it is determined in step 1702 that the current operation information is not within the range specified in step 1701, the relative comparison unit 508 detects that the baseline is abnormal (1703), notifies the notification unit 507 to that effect, and ends the relative comparison process.
- If it is determined in step 1702 that the current operation information is within the range specified in step 1701, the relative comparison unit 508 determines whether the difference between the current operation information and the baseline is greater than or equal to a predetermined value (1704).
- If it is determined in step 1704 that the difference between the current operation information and the baseline is greater than or equal to the predetermined value, the relative comparison unit 508 changes the correlation function so that the difference between the current operation information and the baseline calculated based on the changed correlation function becomes smaller than the predetermined value (1705).
- The correction unit 504 then calculates, based on the correlation function changed in step 1705, the correlation value of the operation information used to calculate the temporary baseline from the current time onward, reflects the calculated correlation value in the temporary baseline from the current time onward to generate a new baseline (1706), and ends the relative comparison process.
- Although the correlation function is changed so that the difference from the baseline calculated based on the changed correlation function is smaller than the predetermined value, the predetermined value may be zero. That is, the correlation function may be changed so that any difference between the current operation information and the baseline is eliminated.
- The failure sign detection system 500 can thereby accurately detect failures of the IT system 550.
- The correlation value is changed by changing the correlation function so that the difference between the current operation information and the baseline calculated based on the changed correlation function becomes smaller.
- The correlation function set in the correlation function automatic setting process or the correlation function manual setting process may not accurately represent the correspondence between the operation information and the configuration information. This is because the correlation function set in the automatic setting process is derived by averaging a plurality of pieces of operation information when a plurality of pieces exist for one piece of configuration information, and because the correlation function set in the manual setting process is designated arbitrarily by the administrator. In the present embodiment, such a correlation function can be changed by the correlation function changing process so that it corresponds to the current operation information.
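The relative comparison process of FIG. 17 can be sketched for a single point in time as follows. The text does not say how the correlation function is changed, so the update rule here is an assumption: the correlation value is replaced by the one that places the corrected baseline exactly on the current operation information, which corresponds to the case where the predetermined value is zero. The names, the inclusive range check, and a weight coefficient of 1 are likewise assumptions.

```python
from typing import NamedTuple, Optional


class Comparison(NamedTuple):
    baseline_abnormal: bool
    new_corr_value: Optional[float]   # None when the correlation value is left as is


def relative_comparison(current: float,
                        temp_baseline: float,
                        corr_value: float,
                        threshold: float) -> Comparison:
    """Steps 1701 to 1705 for one point in time (weight coefficient assumed to be 1)."""
    corrected = temp_baseline + corr_value
    low, high = sorted((temp_baseline, corrected))        # step 1701
    if not (low <= current <= high):                      # step 1702
        return Comparison(True, None)                     # step 1703
    if abs(current - corrected) >= threshold:             # step 1704
        # Step 1705 (assumed rule): choose the correlation value that moves the
        # corrected baseline onto the current operation information.
        return Comparison(False, current - temp_baseline)
    return Comparison(False, None)


adjusted = relative_comparison(current=450.0, temp_baseline=500.0,
                               corr_value=-128.0, threshold=10.0)
out_of_range = relative_comparison(current=520.0, temp_baseline=500.0,
                                   corr_value=-128.0, threshold=10.0)
close_enough = relative_comparison(current=375.0, temp_baseline=500.0,
                                   corr_value=-128.0, threshold=10.0)
```

A value outside the range between the two baselines means the observed change went in a direction or to an extent that the correlation function cannot explain, which is why that branch reports a baseline abnormality instead of adjusting anything.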
- FIG. 18 is an explanatory diagram of the abnormality detection process according to the embodiment of the present invention.
- The abnormality detection unit 506 determines whether the current operation information is within the range between the upper limit threshold and the lower limit threshold generated by the threshold generation unit 505. When it determines that the current operation information is not within that range, the abnormality detection unit 506 detects an abnormality in the IT system 550.
- The threshold generation unit 505 calculates the thresholds either by a statistic use threshold calculation process, which uses a statistic of past operation information, or by a constant value use threshold calculation process, which uses a preset constant value. Details of the statistic use threshold calculation process will be described with reference to FIG. 19, and details of the constant value use threshold calculation process will be described with reference to FIG. 20.
- FIG. 19 is a flowchart of the abnormality detection process when the thresholds are calculated by the statistic use threshold calculation process according to the embodiment of the present invention.
- The abnormality detection process is executed by the CPU 521 calling and executing a program corresponding to the threshold generation unit 505 and a program corresponding to the abnormality detection unit 506.
- The threshold generation unit 505 acquires, from the state value storage DB 511, the past state values used to generate the baseline (1901).
- The threshold generation unit 505 calculates, as a threshold generation value used to generate the thresholds, a statistic that takes the operation information of the past state values acquired in step 1901 as a parameter (1902). Specifically, the threshold generation unit 505 calculates the average of the operation information of the past state values (the overall average), and calculates, as the threshold generation value, the standard deviation from the overall average of the averages of the operation information having the same collection time among the past state values.
- The threshold generation unit 505 calculates the upper limit threshold by adding the threshold generation value at each time calculated in step 1902 to the baseline at each time, and calculates the lower limit threshold by subtracting the threshold generation value at each time calculated in step 1902 from the baseline at each time (1903). The thresholds are thereby generated.
- The abnormality detection unit 506 determines whether the current operation information is within the range between the upper limit threshold and the lower limit threshold (1904). Specifically, when the current operation information is less than or equal to the upper limit threshold and greater than or equal to the lower limit threshold, the abnormality detection unit 506 determines that the current operation information is within the range. When the current operation information is greater than the upper limit threshold or less than the lower limit threshold, the abnormality detection unit 506 determines that the current operation information is not within the range.
- If it is determined in step 1904 that the current operation information is not within the range between the upper limit threshold and the lower limit threshold, the abnormality detection unit 506 detects an abnormality in the IT system 550, notifies the notification unit 507 to that effect (1905), and ends the abnormality detection process.
- On the other hand, if it is determined in step 1904 that the current operation information is within the range, the abnormality detection unit 506 ends the abnormality detection process without detecting an abnormality in the IT system 550.
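Steps 1902 to 1904 can be sketched as below. The wording of the statistic in step 1902 leaves room for interpretation, so this sketch makes an explicit assumption: the threshold generation value at each collection time is taken to be the population standard deviation of the past samples collected at that time. The function names and sample values are hypothetical.

```python
import statistics
from typing import Dict, List, Tuple


def threshold_generation_values(samples_by_time: Dict[str, List[float]]) -> Dict[str, float]:
    """Step 1902 (assumed reading): population standard deviation per collection time."""
    return {t: statistics.pstdev(values) for t, values in samples_by_time.items()}


def thresholds(baseline: Dict[str, float],
               generation: Dict[str, float]) -> Dict[str, Tuple[float, float]]:
    """Step 1903: (lower, upper) = baseline minus/plus the threshold generation value."""
    return {t: (baseline[t] - generation[t], baseline[t] + generation[t])
            for t in baseline}


def is_abnormal(current: float, lower: float, upper: float) -> bool:
    """Step 1904: the range is inclusive, so only values strictly outside are abnormal."""
    return current > upper or current < lower


past = {"09:00": [470.0, 490.0]}
generation = threshold_generation_values(past)
limits = thresholds({"09:00": 480.0}, generation)
lower, upper = limits["09:00"]
```

Because the range check is inclusive at both ends, an observation exactly equal to a threshold is treated as normal, matching the determination described for step 1904.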
- FIG. 20 is a flowchart of the abnormality detection process when the thresholds are calculated by the constant value use threshold calculation process according to the embodiment of the present invention. Note that the abnormality detection process shown in FIG. 20 shares some of its processing with the statistic-based abnormality detection process described above.
- The threshold generation unit 505 generates the thresholds by adding a preset constant value to the baseline to calculate the upper limit threshold and subtracting the preset constant value from the baseline to calculate the lower limit threshold (2001).
- Steps 1904 and 1905 are the same as in the statistic-based abnormality detection process described above.
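The constant value variant of step 2001 reduces to a fixed band around the baseline. A minimal sketch, with hypothetical names and values:

```python
from typing import List, Tuple


def constant_thresholds(baseline: List[float], margin: float) -> List[Tuple[float, float]]:
    """Step 2001: (lower, upper) = baseline minus/plus a preset constant value."""
    return [(b - margin, b + margin) for b in baseline]


def abnormal_indices(observed: List[float],
                     bounds: List[Tuple[float, float]]) -> List[int]:
    """Indices of the observations that fall strictly outside their (lower, upper) band."""
    return [i for i, (value, (lower, upper)) in enumerate(zip(observed, bounds))
            if value < lower or value > upper]


bounds = constant_thresholds([480.0, 510.0], margin=30.0)
flagged = abnormal_indices([455.0, 545.0], bounds)
```

A constant margin needs no history beyond the baseline itself, at the cost of ignoring how much the operation information normally varies at each time of day.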
- FIG. 21 is an explanatory diagram of the abnormality detection screen 2100 according to the embodiment of this invention.
- The abnormality detection screen 2100 is a screen displayed to notify the administrator of an abnormality when the notification unit 507 receives input indicating that an abnormality of the IT system 550 or an abnormality of the baseline has been detected.
- The abnormality detection screen 2100 includes an operation information graph display field 2101, an abnormality detection related information display field 2102, and an abnormality detection log display field 2103.
- The operation information graph display field 2101 displays the operation information for a predetermined period up to the detection of the abnormality, the baseline for that period, and the upper and lower limit thresholds for that period.
- The abnormality detection related information display field 2102 displays the measured value of the operation information at the time the abnormality was detected, the baseline at that time, the thresholds at that time, and the correlation value at that time.
- The abnormality detection log display field 2103 displays the detection times of the abnormalities detected so far, together with detailed contents indicating whether each abnormality is an abnormality of the IT system 550 or an abnormality of the baseline.
- For an abnormality of the IT system 550, the detailed contents also indicate whether the abnormality was caused by the operation information exceeding the upper limit threshold or by the operation information falling below the lower limit threshold.
- FIG. 22 is a flowchart of the notification process according to the embodiment of the present invention.
- The notification process is executed by the CPU 521 calling and executing a program corresponding to the notification unit 507.
- The notification unit 507 determines whether an indication that an abnormality in the IT system 550 has been detected, or that an abnormality in the baseline has been detected, has been input from the abnormality detection unit 506 (2201).
- If it is determined in step 2201 that such an indication has been input from the abnormality detection unit 506, the notification unit 507 notifies the administrator that the input abnormality has occurred (2202), and ends the notification process.
- Methods of notifying the administrator include outputting the abnormality detection screen 2100 on the screen of a client PC (not shown) connected to the failure sign detection system 500, outputting audio from a speaker of the client PC, and outputting to an external device by e-mail or the like.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Biology (AREA)
- Computer Hardware Design (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
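The IT system 550 includes a CPU 551, a storage device 552, an input/output device 553, and tuning parameters 554.
As a result, the failure sign detection system 500 can calculate, from the temporary baseline, a baseline corresponding to the current configuration information.
As described above, a baseline corresponding to the current configuration information is calculated.
The administrator is notified that the input abnormality has occurred (2202), and the notification process ends.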
Claims (19)
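- An operation information prediction computer that collects operation information of at least one apparatus from the apparatus, predicts future operation information of the apparatus based on the collected operation information, and includes a storage area, the operation information prediction computer comprising:
a state information collection unit that collects, from the apparatus, state information including the operation information and configuration information of the apparatus at the time the operation information was collected;
a state information storage unit that stores, in the storage area, the operation information and the configuration information collected by the state information collection unit;
a correlation value calculation unit that calculates a correlation value for making past operation information, stored in the storage area by the state information storage unit, correspond to current configuration information; and
an operation prediction value calculation unit that calculates a future operation prediction value based on the past operation information and the correlation value calculated by the correlation value calculation unit.
- The operation information prediction computer according to claim 1, wherein
the correlation value calculation unit:
calculates a correlation function indicating the relationship between the configuration information and the operation information, based on a plurality of pieces of operation information stored in the storage area by the state information storage unit and the configuration information at the time the plurality of pieces of operation information were collected;
calculates, based on the correlation function, operation information corresponding to the current configuration information and operation information corresponding to the past configuration information; and
calculates the correlation value by subtracting the operation information corresponding to the past configuration information from the operation information corresponding to the current configuration information.
- The operation information prediction computer according to claim 1 or 2, wherein
the operation prediction value calculation unit:
calculates a temporary operation prediction value based on the past operation information; and
calculates the future operation prediction value by correcting the calculated temporary operation prediction value based on the correlation value.
- The operation information prediction computer according to claim 1 or 2, wherein
the operation prediction value calculation unit:
converts the past operation information into operation information corresponding to the current configuration information based on the correlation value; and
calculates the future operation prediction value based on the converted operation information.
- The operation information prediction computer according to any one of claims 1, 2, and 4, wherein
the operation prediction value calculation unit calculates a temporary operation prediction value based on the past operation information, and
the operation information prediction computer comprises:
a comparison unit that determines whether current operation information collected by the state information collection unit is within the range between the temporary operation prediction value and the future operation prediction value;
an operation prediction value abnormality notification unit that notifies that the future operation prediction value is abnormal when the comparison unit determines that the current operation information collected by the state information collection unit is not within the range between the temporary operation prediction value and the future operation prediction value; and
a correlation value changing unit that changes the correlation value so that the difference between the current operation information collected by the state information collection unit and the future operation prediction value becomes smaller when the comparison unit determines that the current operation information collected by the state information collection unit is within the range between the temporary operation prediction value and the future operation prediction value.
- The operation information prediction computer according to any one of claims 1 to 5, comprising
a threshold calculation unit that calculates an upper limit threshold by adding a predetermined value to the future operation prediction value and calculates a lower limit threshold by subtracting the predetermined value from the future operation prediction value, wherein
the threshold calculation unit sets, as the predetermined value, a statistic of the past operation information used by the operation prediction value calculation unit to calculate the future operation prediction value, or a preset constant value.
- The operation information prediction computer according to claim 6, comprising
an apparatus abnormality notification unit that reports that the apparatus is abnormal when the current operation information collected by the state information collection unit is not within the range between the upper limit threshold and the lower limit threshold.
- The operation information prediction computer according to any one of claims 1 to 7, wherein
the configuration information includes at least one of physical configuration information and logical configuration information of the apparatus.
- The operation information prediction computer according to claim 8, wherein
the physical configuration information includes at least one of a performance value and a number of physical resources provided in the apparatus, and
the logical configuration information includes at least one of requirements of the device allocated to a virtual apparatus created on the apparatus, version information of software executed on the apparatus, and tuning parameters of the software.
- An operation information prediction method in which a computer that collects operation information of at least one apparatus from the apparatus and includes a storage area predicts future operation information of the apparatus based on the collected operation information,
the method comprising:
a state information collection step of collecting, from the apparatus, state information including the operation information and configuration information of the apparatus at the time the operation information was collected;
a state information storage step of storing, in the storage area, the operation information and the configuration information collected in the state information collection step;
a correlation value calculation step of calculating a correlation value for making past operation information, stored in the storage area in the state information storage step, correspond to current configuration information; and
an operation prediction value calculation step of calculating a future operation prediction value based on the past operation information and the correlation value calculated in the correlation value calculation step.
- The operation information prediction method according to claim 10, wherein
the correlation value calculation step includes:
a step of calculating a correlation function indicating the relationship between the configuration information and the operation information, based on a plurality of pieces of operation information stored in the storage area in the state information storage step and the configuration information at the time the plurality of pieces of operation information were collected;
a step of calculating, based on the correlation function, operation information corresponding to the current configuration information and operation information corresponding to the past configuration information; and
a step of calculating the correlation value by subtracting the operation information corresponding to the past configuration information from the operation information corresponding to the current configuration information.
- The operation information prediction method according to claim 10 or 11, wherein
the operation prediction value calculation step includes:
a step of calculating a temporary operation prediction value based on the past operation information; and
a step of calculating the future operation prediction value by correcting the calculated temporary operation prediction value based on the correlation value.
- The operation information prediction method according to claim 10 or 11, wherein
the operation prediction value calculation step includes:
a step of converting the past operation information into operation information corresponding to the current configuration information based on the correlation value; and
a step of calculating the future operation prediction value based on the converted operation information.
- The operation information prediction method according to any one of claims 10, 11, and 13, wherein
the operation prediction value calculation step calculates a temporary operation prediction value based on the past operation information, and
includes a comparison step of determining whether current operation information collected in the state information collection step is within the range between the temporary operation prediction value and the future operation prediction value, and
the method includes:
an operation prediction value abnormality notification step of notifying that the future operation prediction value is abnormal when it is determined in the comparison step that the current operation information collected in the state information collection step is not within the range between the temporary operation prediction value and the future operation prediction value; and
a correlation value changing step of changing the correlation value so that the difference between the current operation information collected in the state information collection step and the future operation prediction value becomes smaller when it is determined in the comparison step that the current operation information collected in the state information collection step is within the range between the temporary operation prediction value and the future operation prediction value.
- A program that causes a processor of a computer, which collects operation information of at least one apparatus from the apparatus and includes the processor and a storage area, to execute a process of predicting future operation information of the apparatus based on the collected operation information,
the process comprising:
a state information collection step of collecting, from the apparatus, state information including the operation information and configuration information of the apparatus at the time the operation information was collected;
a state information storage step of storing, in the storage area, the operation information and the configuration information collected in the state information collection step;
a correlation value calculation step of calculating a correlation value for making past operation information, stored in the storage area in the state information storage step, correspond to current configuration information; and
an operation prediction value calculation step of calculating a future operation prediction value based on the past operation information and the correlation value calculated in the correlation value calculation step.
- The program according to claim 15, wherein
the correlation value calculation step includes:
a step of calculating a correlation function indicating the relationship between the configuration information and the operation information, based on a plurality of pieces of operation information stored in the storage area in the state information storage step and the configuration information at the time the plurality of pieces of operation information were collected;
a step of calculating, based on the correlation function, operation information corresponding to the current configuration information and operation information corresponding to the past configuration information; and
a step of calculating the correlation value by subtracting the operation information corresponding to the past configuration information from the operation information corresponding to the current configuration information.
- The program according to claim 15 or 16, wherein
the operation prediction value calculation step includes:
a step of calculating a temporary operation prediction value based on the past operation information; and
a step of calculating the future operation prediction value by correcting the calculated temporary operation prediction value based on the correlation value.
- The program according to claim 15 or 16, wherein
the operation prediction value calculation step includes:
a step of converting the past operation information into operation information corresponding to the current configuration information based on the correlation value; and
a step of calculating the future operation prediction value based on the converted operation information.
- The program according to any one of claims 15, 16, and 18, wherein
the operation prediction value calculation step calculates a temporary operation prediction value based on the past operation information, and
includes a comparison step of determining whether current operation information collected in the state information collection step is within the range between the temporary operation prediction value and the future operation prediction value, and
the process includes:
an operation prediction value abnormality reporting step of reporting that the future operation prediction value is abnormal when it is determined in the comparison step that the current operation information collected in the state information collection step is not within the range between the temporary operation prediction value and the future operation prediction value; and
a correlation value changing step of changing the correlation value so that the difference between the current operation information collected in the state information collection step and the future operation prediction value becomes smaller when it is determined in the comparison step that the current operation information collected in the state information collection step is within the range between the temporary operation prediction value and the future operation prediction value.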
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013542774A JP5686904B2 (ja) | 2011-11-10 | 2011-11-10 | 稼働情報予測計算機、稼働情報予測方法及びプログラム |
PCT/JP2011/075980 WO2013069138A1 (ja) | 2011-11-10 | 2011-11-10 | 稼働情報予測計算機、稼働情報予測方法及びプログラム |
US14/352,457 US20140244563A1 (en) | 2011-11-10 | 2011-11-10 | Operation information prediction computer, operation information prediction method and program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2011/075980 WO2013069138A1 (ja) | 2011-11-10 | 2011-11-10 | 稼働情報予測計算機、稼働情報予測方法及びプログラム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013069138A1 true WO2013069138A1 (ja) | 2013-05-16 |
Family
ID=48288752
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/075980 WO2013069138A1 (ja) | 2011-11-10 | 2011-11-10 | 稼働情報予測計算機、稼働情報予測方法及びプログラム |
Country Status (3)
Country | Link |
---|---|
US (1) | US20140244563A1 (ja) |
JP (1) | JP5686904B2 (ja) |
WO (1) | WO2013069138A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018135008A1 (ja) * | 2017-01-23 | 2018-07-26 | 株式会社日立製作所 | 影響分析システム、計測項目最適化方法、および計測項目最適化プログラム |
JP2018163542A (ja) * | 2017-03-27 | 2018-10-18 | 日本電気株式会社 | 予測装置、予測システム、予測方法、および予測プログラム |
JP2021043746A (ja) * | 2019-09-12 | 2021-03-18 | 日本電気株式会社 | 情報処理装置、情報処理方法、及びコンピュータプログラム |
JP2021051452A (ja) * | 2019-09-24 | 2021-04-01 | 日本電気株式会社 | 監視装置、監視方法、およびプログラム |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120046929A1 (en) * | 2010-08-20 | 2012-02-23 | International Business Machines Corporation | Statistical Design with Importance Sampling Reuse |
US9602426B2 (en) * | 2013-06-21 | 2017-03-21 | Microsoft Technology Licensing, Llc | Dynamic allocation of resources while considering resource reservations |
US10410155B2 (en) | 2015-05-01 | 2019-09-10 | Microsoft Technology Licensing, Llc | Automatic demand-driven resource scaling for relational database-as-a-service |
US9471778B1 (en) * | 2015-11-30 | 2016-10-18 | International Business Machines Corporation | Automatic baselining of anomalous event activity in time series data |
CN106909485B (zh) * | 2015-12-23 | 2020-10-23 | 伊姆西Ip控股有限责任公司 | 用于确定存储系统性能下降的原因的方法和设备 |
CN106685752B (zh) * | 2016-06-28 | 2019-01-04 | 腾讯科技(深圳)有限公司 | 一种信息处理方法及终端 |
US10565046B2 (en) * | 2016-09-01 | 2020-02-18 | Intel Corporation | Fault detection using data distribution characteristics |
US10395016B2 (en) * | 2017-01-24 | 2019-08-27 | International Business Machines Corporation | Communication pattern recognition |
US10380863B2 (en) * | 2017-04-03 | 2019-08-13 | Oneevent Technologies, Inc. | System and method for monitoring a building |
JP6681369B2 (ja) | 2017-09-07 | 2020-04-15 | 株式会社日立製作所 | 性能管理システム、管理装置および性能管理方法 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05324358A (ja) * | 1992-05-20 | 1993-12-07 | Hitachi Ltd | 性能予測装置 |
JPH08137725A (ja) * | 1994-11-14 | 1996-05-31 | Hitachi Ltd | 性能予測装置 |
JP2004164637A (ja) * | 2002-10-31 | 2004-06-10 | Hewlett-Packard Development Co Lp | ベースライン化および自動しきい値処理を行う仕組みを与える方法および装置 |
JP2009205208A (ja) * | 2008-02-26 | 2009-09-10 | Nec Corp | 運用管理装置、運用管理方法ならびにプログラム |
WO2011125138A1 (ja) * | 2010-04-06 | 2011-10-13 | 株式会社日立製作所 | 性能監視装置,方法,プログラム |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4739472B2 (ja) * | 1998-12-04 | 2011-08-03 | NS Solutions Corporation | Performance prediction device and method, and recording medium |
JP3966459B2 (ja) * | 2002-05-23 | 2007-08-29 | Hitachi, Ltd. | Storage device management method, system, and program |
US20080033991A1 (en) * | 2006-08-03 | 2008-02-07 | Jayanta Basak | Prediction of future performance of a dbms |
US7801994B2 (en) * | 2007-11-29 | 2010-09-21 | Hitachi, Ltd. | Method and apparatus for locating candidate data centers for application migration |
JP4872944B2 (ja) * | 2008-02-25 | 2012-02-08 | NEC Corporation | Operation management device, operation management system, information processing method, and operation management program |
WO2012153400A1 (ja) * | 2011-05-11 | 2012-11-15 | Hitachi, Ltd. | Data processing system, data processing method, and program |
2011
- 2011-11-10: JP application JP2013542774A, published as JP5686904B2 (ja); not active (Expired - Fee Related)
- 2011-11-10: WO application PCT/JP2011/075980, published as WO2013069138A1 (ja); active (Application Filing)
- 2011-11-10: US application US14/352,457, published as US20140244563A1 (en); not active (Abandoned)
Non-Patent Citations (1)
Title |
---|
DAVID WILSON, HP9000/720CRX, UNIX MAGAZINE, vol. 6, no. 10, 1 October 1991 (1991-10-01), pages 26 - 31 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
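WO2018135008A1 (ja) * | 2017-01-23 | 2018-07-26 | Hitachi, Ltd. | Impact analysis system, measurement item optimization method, and measurement item optimization program |
JPWO2018135008A1 (ja) * | 2017-01-23 | 2019-06-27 | Hitachi, Ltd. | Impact analysis system, measurement item optimization method, and measurement item optimization program |
JP2018163542A (ja) * | 2017-03-27 | 2018-10-18 | NEC Corporation | Prediction device, prediction system, prediction method, and prediction program |
JP2021043746A (ja) * | 2019-09-12 | 2021-03-18 | NEC Corporation | Information processing device, information processing method, and computer program |
JP7331567B2 (ja) | 2019-09-12 | 2023-08-23 | NEC Corporation | Information processing device, information processing method, and computer program |
JP2021051452A (ja) * | 2019-09-24 | 2021-04-01 | NEC Corporation | Monitoring device, monitoring method, and program |
JP7331581B2 (ja) | 2019-09-24 | 2023-08-23 | NEC Corporation | Monitoring device, monitoring method, and program |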
Also Published As
Publication number | Publication date |
---|---|
JP5686904B2 (ja) | 2015-03-18 |
JPWO2013069138A1 (ja) | 2015-04-02 |
US20140244563A1 (en) | 2014-08-28 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP5686904B2 (ja) | Operation information prediction computer, operation information prediction method, and program | |
US9600394B2 (en) | Stateful detection of anomalous events in virtual machines | |
US8677191B2 (en) | Early detection of failing computers | |
US9720823B2 (en) | Free memory trending for detecting out-of-memory events in virtual machines | |
US10452983B2 (en) | Determining an anomalous state of a system at a future point in time | |
US10248561B2 (en) | Stateless detection of out-of-memory events in virtual machines | |
US10558545B2 (en) | Multiple modeling paradigm for predictive analytics | |
US10868744B2 (en) | Influence range identification method and influence range identification apparatus | |
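JP4572251B2 (ja) | Computer system, method for detecting signs of failure in a computer system, and program | |
KR20190070659A (ko) | Cloud computing apparatus and method supporting container-based resource allocation | |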
US9191296B2 (en) | Network event management | |
US20130198370A1 (en) | Method for visualizing server reliability, computer system, and management server | |
JP2011128852A (ja) | Virtual hard disk management server, management method, and management program | |
US11550634B2 (en) | Capacity management in a cloud computing system using virtual machine series modeling | |
JP6683920B2 (ja) | Parallel processing device, power coefficient calculation program, and power coefficient calculation method | |
US9852007B2 (en) | System management method, management computer, and non-transitory computer-readable storage medium | |
US9104612B2 (en) | System stability prediction using prolonged burst detection of time series data | |
CN115443638A (zh) | Diagnosing and mitigating memory leaks in computing nodes | |
US20130091391A1 (en) | User-coordinated resource recovery | |
US8214693B2 (en) | Damaged software system detection | |
US9397921B2 (en) | Method and system for signal categorization for monitoring and detecting health changes in a database system | |
US11113364B2 (en) | Time series data analysis control method and analysis control device | |
US11210159B2 (en) | Failure detection and correction in a distributed computing system | |
US11042463B2 (en) | Computer, bottleneck identification method, and non-transitory computer readable storage medium | |
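JP6213309B2 (ja) | Information processing device, performance information collection program for an information processing device, and performance information collection method for an information processing device |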
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 11875309; Country of ref document: EP; Kind code of ref document: A1 |
ENP | Entry into the national phase | Ref document number: 2013542774; Country of ref document: JP; Kind code of ref document: A |
WWE | WIPO information: entry into national phase | Ref document number: 14352457; Country of ref document: US |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | EP: PCT application non-entry in European phase | Ref document number: 11875309; Country of ref document: EP; Kind code of ref document: A1 |