WO2022196627A1 - Operation support device, system, method, and computer-readable medium
- Publication number
- WO2022196627A1 (PCT/JP2022/011285)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- occurrence
- event
- rule information
- information
- rule
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/542—Event management; Broadcasting; Multicasting; Notifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
Definitions
- The present invention relates to an operation support device, system, method, and program, and more particularly to ones for monitoring an operation system.
- In operation automation, a handling command is determined from the notification information of an event that has occurred in an information system, and the handling command is executed automatically.
- Patent Documents 1 and 2 can be cited as technologies related to the automation of operations.
- Patent Literature 1 discloses a technique related to a failure recovery device capable of trying to recover from a failure other than failures described in a failure handling rule.
- Patent Literature 2 discloses a technology related to a failure recovery device that handles a predetermined rule based on the priority of a plurality of failure handling rules and the operating state of the system.
- Event handling rules that automate the operation of an operation system may stop matching their conditions, and so stop functioning as rules, when the behavior or state of the system changes because of a system modification.
- Moreover, as operation systems grow in scale, it is not always possible to update the related event handling rules whenever the system is modified.
- Rule maintenance is therefore complicated, and it is difficult to keep event handling rules functioning properly.
- The purpose of the present disclosure is to provide an operation support device, system, method, and program that support the appropriate maintenance of rules for dealing with events that occur in an operation system.
- The operation support device includes: a storage unit that stores a plurality of pieces of rule information defining countermeasures corresponding to each of a plurality of events that occur in an operation system; a registration unit that, when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers in the storage unit history information including the date and time of occurrence of the event and the rule information of the event; an identification unit that identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and an output unit that outputs the identified rule information.
- An operation support system includes a management terminal and an operation support device. The operation support device: receives from the management terminal a plurality of pieces of rule information defining countermeasures corresponding to each of a plurality of events that occur in an operation system, and stores them in a storage device; when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers in the storage device history information including the date and time of occurrence of the event and the rule information of the event; identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and outputs the identified rule information to the management terminal.
- In an operation support method, a computer: when, in response to the occurrence of a predetermined event in an operation system, a countermeasure defined in the rule information corresponding to the predetermined event is executed from among a plurality of pieces of rule information that are stored in a storage device and define countermeasures corresponding to each of a plurality of events that occur in the operation system, registers in the storage device history information including the date and time of occurrence of the event and the rule information of the event; identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and outputs the identified rule information.
- The operation support program causes a computer to execute: a process of registering in a storage device, when a countermeasure defined in the rule information corresponding to a predetermined event is executed in response to the occurrence of the predetermined event in an operation system from among a plurality of pieces of rule information that are stored in the storage device and define countermeasures corresponding to each of a plurality of events that occur in the operation system, history information including the date and time of occurrence of the event and the rule information of the event; a process of identifying, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and a process of outputting the identified rule information.
- FIG. 1 is a block diagram showing the configuration of an operation support device according to the first embodiment.
- FIG. 2 is a flow chart showing the flow of an operation support method according to the first embodiment.
- FIG. 3 is a block diagram showing the overall configuration of an operation support system according to the second embodiment.
- FIG. 4 is a block diagram showing the configuration of an operation support device according to the second embodiment.
- FIG. 5 is a flow chart showing the flow of handling processing for an event that has occurred according to the second embodiment.
- FIG. 6 is a sequence diagram showing the flow of inappropriate rule detection and update processing according to the second embodiment.
- FIG. 7 is a diagram showing the concept of an example of detecting a rule for which an event is not resolved even after the countermeasure is taken, according to the second embodiment.
- FIG. 8 is a diagram showing the concept of an example of detecting a rule whose condition no longer matches because of a change in system state, according to the second embodiment.
- FIG. 9 is a diagram showing the concept of an example resolved by a rule update according to the second embodiment.
- FIG. 1 is a block diagram showing the configuration of an operation support device 1 according to the first embodiment.
- the operation support device 1 is an information processing device for performing operation management of an operation system and supporting operation by an administrator.
- the operation system is an information system configured by a plurality of monitored devices such as computers (servers), communication devices (network devices), and storages.
- the operational system is, for example, a service providing system that provides one or more services via a communication network, a business system within a company, or the like. Also, the operation system may cooperate with an external information system.
- the operation support device 1 includes a storage unit 11, a registration unit 12, an identification unit 13, and an output unit 14.
- the storage unit 11 stores rule information 151 to 15n (n is a natural number of 2 or more) and history information 161 to 16m (m is a natural number of 2 or more).
- The rule information 151 and the like define countermeasures corresponding to each of a plurality of events that occur in the operation system. Events are not limited to system failures (hardware, software, network) that lead to a service outage of the operation system; they also include cases where the provided service does not meet requirements even though the system is operating. Countermeasures include processing instructions, commands, and the like for resolving or avoiding the event.
- Examples of countermeasures include a restart command for the OS (Operating System), middleware, or an application, and a command to execute a data correction patch.
- the history information 161 and the like are histories when countermeasures are executed.
- the history information 161 and the like include the date and time when an event occurred and the rule information of the event.
- When a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, the registration unit 12 registers history information including the date and time of occurrence of the event and the rule information of the event in the storage unit 11.
- Based on the history information, the identification unit 13 identifies rule information for which the occurrence interval of a specific event satisfies a predetermined condition.
- the output unit 14 outputs the specified rule information.
- FIG. 2 is a flow chart showing the flow of the operation support method according to the first embodiment.
- As a premise, a predetermined event has occurred in the operation system.
- the operation support device 1 receives an event occurrence notification from the operation system or the monitoring system of the operation system.
- the operation support device 1 identifies the rule information corresponding to the notified event from the storage unit 11 storing the rule information 151 to 15n, and executes the action defined by the identified rule information.
- the registration unit 12 registers history information 161 including the date and time of occurrence of the event and the rule information of the event in the storage unit 11 (S11).
- The identification unit 13 identifies rule information for which the occurrence interval of a specific event satisfies a predetermined condition (S12).
- Note that the "specific event" is not necessarily the same as the "predetermined event".
- the output unit 14 outputs the specified rule information (S13).
- the output unit 14 may output the specified rule information to the management terminal of the administrator.
- the management terminal displays the specified rule information. Therefore, the administrator can grasp the rule information whose occurrence interval satisfies a predetermined condition among the events that have occurred in the operation system and have been dealt with.
- Rule information whose occurrence interval satisfies a predetermined condition covers cases where the occurrence trend of an event has changed from before. For example, an event may occur more frequently than it used to; that is, the event recurs within a short period even though the countermeasure defined in the corresponding rule information has been executed. Alternatively, an event that used to occur regularly may no longer occur, so that no countermeasure is executed. In that case, a change in the state of the system may have caused the event to stop matching the rule condition, or may have made the rule unnecessary.
- In this embodiment, the event occurrence interval is analyzed from the execution history of countermeasures against the event, and when the occurrence interval satisfies a predetermined condition, the corresponding rule information is identified and output. The administrator or the like can therefore use the output rule information as a starting point for examining and carrying out maintenance of the rule information. This supports the appropriate maintenance of rules for dealing with events that occur in the operation system.
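- The flow of steps S11 to S13 can be illustrated with a minimal Python sketch. The class name, the in-memory storage, and the concrete condition (the two most recent occurrences being closer together than a threshold) are all assumptions made for illustration; the description itself only requires that the occurrence interval satisfy "a predetermined condition".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class OperationSupportDevice:
    """Hypothetical stand-in for the operation support device 1."""
    rules: dict                                   # rule_id -> countermeasure text (storage unit 11)
    history: list = field(default_factory=list)   # (rule_id, occurred_at) pairs

    def register(self, rule_id: str, occurred_at: datetime) -> None:
        """S11: record that the countermeasure of `rule_id` was executed."""
        self.history.append((rule_id, occurred_at))

    def identify(self, max_interval: timedelta) -> list:
        """S12: rule IDs whose two latest occurrences are closer than `max_interval`.

        This threshold test is an assumed example of a "predetermined condition".
        """
        flagged = []
        for rule_id in self.rules:
            times = sorted(t for r, t in self.history if r == rule_id)
            if len(times) >= 2 and times[-1] - times[-2] < max_interval:
                flagged.append(rule_id)
        return flagged

    def output(self, max_interval: timedelta) -> list:
        """S13: return the identified rule information for display."""
        return [(rid, self.rules[rid]) for rid in self.identify(max_interval)]
```

- With such a sketch, a rule whose event recurs minutes after its countermeasure ran would be returned by `output`, while a rule whose event recurs only monthly would not.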
- the operation support device 1 includes a processor, memory, and storage device as configurations not shown. Further, the storage device stores a computer program in which processing of the operation support method according to the present embodiment is implemented. Then, the processor loads the computer program from the storage device into the memory and executes the computer program. Thereby, the processor implements the functions of the registration unit 12 , the identification unit 13 and the output unit 14 .
- each component of the operation support device 1 may be realized by dedicated hardware. Also, part or all of each component of each device may be implemented by general-purpose or dedicated circuitry, processors, etc., or combinations thereof. These may be composed of a single chip, or may be composed of multiple chips connected via a bus. A part or all of each component of each device may be implemented by a combination of the above-described circuits and the like and programs.
- As the processor, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), a quantum processor (quantum computer control chip), or the like can be used.
- When part or all of each component of the operation support device 1 is realized by a plurality of information processing devices, circuits, or the like, these may be arranged centrally or in a distributed manner.
- the information processing device, circuits, and the like may be implemented as a form in which each is connected via a communication network, such as a client-server system, a cloud computing system, or the like.
- the functions of the operation support device 1 may be provided in a SaaS (Software as a Service) format.
- As described above, events are not limited to failures that cause system outages; they also include cases where the system itself is operating normally but fails to meet service specifications.
- Likewise, the countermeasures taken in response to the occurrence of events are not limited to recovery from system failures.
- For example, applying a data correction patch, restarting, or the like may be carried out as an operational procedure each time an event occurs.
- In such cases the system should normally be repaired, but from the viewpoint of cost-effectiveness (occurrence frequency, repair cost, repair time, difficulty level, etc.), operation may instead be continued by applying countermeasures. To realize such operation, rule information is used that defines the countermeasures to be executed on the condition that an event has occurred.
- FIG. 3 is a block diagram showing the overall configuration of the operation support system 1000 according to the second embodiment.
- the operation support system 1000 includes an operation system 100 , a management terminal 200 , an operation support device 300 and a monitoring device 400 .
- the operation system 100, the monitoring device 400, and the operation support device 300 are connected via at least a network N.
- the network N is a communication network such as the Internet or a dedicated line.
- the operation system 100 may be the above-described service providing system, a business system within a company, or the like.
- The operation system 100 includes one or more monitored devices such as computer servers, network devices, and storage devices.
- The operation system 100 may be any system from which the monitoring device 400 and the operation support device 300 can acquire monitoring target information.
- the operation system 100 may be connected to an external system (not shown).
- the operation system 100 includes, for example, a GW (GateWay) server, FW (FireWall), WEB server, AP (Application) server, DB (DataBase) server, router, switch, storage device, and the like.
- FIG. 3 shows the server 110 as part of the configuration of the operation system 100.
- the server 110 is an example of the computer server described above, and assumes that an OS (Operating System), middleware, applications, and the like operate.
- the server 110 may be a storage device.
- The server 110 includes setting information 111 and a log file 112.
- the setting information 111 includes setting files for the OS, middleware, applications, and the like.
- the setting information 111 is not limited to files, and may be execution results of various status acquisition commands.
- the log file 112 is a file that records log information output by the OS, middleware, applications, and the like.
- The operation system 100 may also include network devices.
- A network device may likewise include setting information and a log file.
- the monitoring device 400 monitors each monitoring target device of the operation system 100 via the network N and acquires monitoring target information. When the monitoring device 400 detects the occurrence of an event from the monitoring target information, the monitoring device 400 transmits an event occurrence notification to the operation support device 300 via the network N.
- The monitoring device 400 may monitor each monitored device according to a predetermined monitoring schedule.
- the monitoring device 400 may acquire the setting information 111 and the log file 112 as monitoring target information from the server 110 .
- The monitoring device 400 may obtain specific parameter values within the setting information 111.
- the monitoring device 400 may acquire a log message (message ID, event occurrence date and time, etc.) written in the log file 112 .
- the monitoring device 400 may execute a status acquisition command for the server 110 and acquire the execution result of the command.
- the monitoring device 400 may detect the occurrence of an event by extracting an error message or the like from the acquired monitoring target information using a predetermined extraction logic.
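- As an illustration of the "predetermined extraction logic", the following Python sketch pulls error messages out of log lines with a regular expression. The log line layout (timestamp, level, message ID, text), the `ERROR` level name, and the dictionary keys are assumptions for this example; the description does not specify a log format.

```python
import re

# Assumed log layout: "<date> <time> <LEVEL> <MESSAGE-ID> <free text>"
LOG_PATTERN = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<level>[A-Z]+) (?P<msg_id>[A-Z]+-\d+) (?P<text>.*)$"
)


def extract_events(lines, levels=("ERROR",)):
    """Return one event dict per log line whose level is in `levels`."""
    events = []
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m and m.group("level") in levels:
            events.append({
                "message_id": m.group("msg_id"),
                "occurred_at": m.group("ts"),
                "content": m.group("text"),
            })
    return events
```

- Lines that do not match the assumed layout are simply skipped, so a real deployment would need extraction logic tailored to each monitored device.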
- the monitoring device 400 may notify the operation support device 300 of the acquired setting information 111 and log file 112 via the network N.
- The management terminal 200 is a terminal device, for example a personal computer, that the operation manager uses to carry out operational work.
- the management terminal 200 is communicably connected to the operation support device 300 via a network or the like.
- the management terminal 200 receives input of information such as rule information and a countermeasure command execution file according to the operation of the operation manager, and transmits and registers them to the operation support device 300 .
- the management terminal 200 also receives input of update information for rule information from the operation manager, transmits the update information to the operation support device 300, and updates the rule information.
- the operation support device 300 is an example of the operation support device 1 described above.
- the operation support device 300 is an information processing device that performs processing for registering rule information and the like, processing for coping with incidents, inappropriate rule detection and update processing, and the like (operation support method).
- the operation support device 300 may be made redundant by a plurality of servers, and each functional block may be realized by a plurality of computers.
- FIG. 4 is a block diagram showing the configuration of the operation support device 300 according to the second embodiment.
- the operation support device 300 includes a storage unit 310 , a memory 320 , a communication unit 330 and a control unit 340 .
- the storage unit 310 is an example of the storage unit 11 described above.
- the storage unit 310 is an example of a storage device such as a hard disk, flash memory, SSD (Solid State Drive), or the like.
- Storage unit 310 stores program 311 , rule DB 312 , and history DB 313 .
- a program 311 is a computer program in which processing of the operation support method according to the second embodiment is implemented.
- the rule DB 312 is a database that manages a plurality of pieces of rule information 3121 to 312n.
- The rule information 3121 is information in which a rule ID 31211, a condition 31212, and a countermeasure 31213 are associated with each other.
- The rule ID 31211 is identification information of the rule information.
- The condition 31212 is a countermeasure execution condition that includes an event that has occurred. Specifically, the event is a failure, an error, a status change, or the like that has occurred in a monitoring target device of the operation system 100.
- The condition 31212 may include the setting information 111 or the log file 112 of the server 110, or the ID of a specific error message in the event occurrence notification.
- The countermeasure 31213 is information indicating the content of the countermeasure to be executed when the event that has occurred satisfies the condition 31212.
- For example, the countermeasure 31213 is an execution command, a job ID, or the like directed at the monitored device in which the event occurred and at related devices.
- The countermeasure 31213 may be a restart command for the OS, middleware, or an application of the server 110, a command executed via the network N, or the like.
- The rule information 3122 (not shown) through 312n has the same structure as the rule information 3121.
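- One possible in-memory representation of a piece of rule information is sketched below in Python. The class name, field types, and the choice to model the condition as a set of error message IDs are assumptions; the description also allows conditions based on setting information or log file contents, which this sketch does not cover. The rule ID, message ID, and command are invented sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleInformation:
    """Hypothetical record mirroring rule ID / condition / countermeasure."""
    rule_id: str            # corresponds to the rule ID 31211
    condition: frozenset    # error message IDs that trigger the rule (condition 31212)
    countermeasure: str     # command or job ID to run (countermeasure 31213)

    def matches(self, message_id: str) -> bool:
        """True if the notified message ID satisfies this rule's condition."""
        return message_id in self.condition


# Sample values are illustrative only.
rule = RuleInformation("R-001", frozenset({"ERR-1001"}), "systemctl restart app")
```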
- the history DB 313 is a database that manages multiple pieces of history information 3131 to 313m.
- the history information 3131 and the like are histories of countermeasures taken in response to the occurrence of events.
- the history information 3131 is information in which an occurrence event 31311, an occurrence date and time 31312, a rule ID 31313, and an execution result 31314 are associated with each other.
- Occurrence event 31311 is information specifying an event that has occurred.
- the occurrence event 31311 is an event defined in the condition 31212 described above, such as the ID of a specific error message.
- the date and time of occurrence 31312 is the date and time when the event 31311 occurred.
- the date and time of occurrence 31312 may be information included in the event occurrence notification, or the date and time when the operation support device 300 received the occurrence notification. Note that the execution date and time of the countermeasure 31213 may be used instead of the occurrence date and time 31312 .
- the rule ID 31313 is identification information of rule information, and is information corresponding to the rule ID 31211 or the like defining the action taken.
- the execution result 31314 is the result of the action taken.
- the execution result 31314 is, for example, information indicating that the handling has ended normally or abnormally.
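- The history information can be sketched in the same way. The following Python example, with assumed class and function names, stores the four associated fields and groups occurrence dates and times by rule ID, which is the form later analysis of occurrence intervals would need.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class HistoryInformation:
    """Hypothetical record mirroring the four fields of history information."""
    occurrence_event: str    # occurrence event 31311, e.g. an error message ID
    occurred_at: datetime    # date and time of occurrence 31312
    rule_id: str             # rule ID 31313
    execution_result: str    # execution result 31314: "normal" or "abnormal" (assumed values)


def occurrences_by_rule(history):
    """Group occurrence timestamps by rule ID, each list sorted ascending."""
    grouped = defaultdict(list)
    for record in history:
        grouped[record.rule_id].append(record.occurred_at)
    return {rule_id: sorted(times) for rule_id, times in grouped.items()}
```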
- the memory 320 is a volatile storage device such as RAM (Random Access Memory), and is a storage area for temporarily holding information when the control unit 340 operates.
- The communication unit 330 is a communication interface with the network N.
- the control unit 340 is a processor that controls each component of the operation support device 300, that is, a control device.
- the control unit 340 loads the program 311 from the storage unit 310 into the memory 320 and executes the program 311 .
- By doing so, the control unit 340 implements the functions of the registration unit 341, the handling unit 342, the identification unit 343, and the output unit 344.
- the registration unit 341 is an example of the registration unit 12 described above.
- the registration unit 341 performs registration processing, update processing, and the like of rule information.
- the registration unit 341 registers rule information received from the management terminal 200 in the rule DB 312 of the storage unit 310 .
- the format of the received rule information may be in various formats.
- the registration unit 341 may use conversion logic according to the format of the received rule information to convert it into a specific format such as the rule information 3121 described above and register it in the rule DB 312 . Further, the registration unit 341 may register the handling command execution file received from the management terminal 200 in the storage unit 310 .
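- The conversion logic might look like the following Python sketch, which accepts two invented input formats (a JSON object and a comma-separated line) and normalizes both into one dictionary layout. The input formats, the JSON key names `id`, `when`, and `do`, and the output keys are all assumptions; the description only states that received rule information may come in various formats.

```python
import json


def normalize_rule(raw: str) -> dict:
    """Convert a rule in one of two assumed formats into a common dict."""
    raw = raw.strip()
    if raw.startswith("{"):
        # Assumed JSON format: {"id": ..., "when": ..., "do": ...}
        data = json.loads(raw)
        return {
            "rule_id": data["id"],
            "condition": data["when"],
            "countermeasure": data["do"],
        }
    # Assumed text format: "<rule id>, <condition>, <countermeasure>"
    rule_id, condition, countermeasure = (p.strip() for p in raw.split(",", 2))
    return {
        "rule_id": rule_id,
        "condition": condition,
        "countermeasure": countermeasure,
    }
```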
- the registration unit 341 registers the history information in the history DB 313 of the storage unit 310 after the handling unit 342 (to be described later) executes handling. Also, the registration unit 341 updates the corresponding rule information in the rule DB 312 based on the update information of the rule information received from the management terminal 200 .
- The handling unit 342 performs handling processing for an event that has occurred. Upon receiving an event occurrence notification from the monitoring device 400, the handling unit 342 identifies, from the rule DB 312, rule information that defines a condition corresponding to the event, and executes the countermeasure defined in the identified rule information on the monitoring target device or the like. Note that the handling unit 342 may itself acquire monitoring target information from a monitoring target device of the operation system 100 via the network N, analyze it, and detect the occurrence of an event. In that case, the handling unit 342 executes the countermeasure in the same way as described above.
- the identification unit 343 is an example of the identification unit 13 described above.
- The identification unit 343 performs inappropriate rule detection processing.
- The identification unit 343 analyzes each piece of history information in the history DB 313, either when the history DB 313 is updated or at a predetermined timing, and determines whether the occurrence tendency of a specific occurrence event satisfies a predetermined condition. If an event satisfies the predetermined condition, the identification unit 343 identifies the rule ID (rule information) associated with that event.
- Specifically, the identification unit 343 analyzes the occurrence tendency of a specific event from a plurality of dates and times of occurrence of that event.
- When the identification unit 343 detects from the occurrence tendency a change in tendency before and after a predetermined point in time, it determines that the occurrence interval satisfies a predetermined condition. The identification unit 343 then identifies, from among the plurality of pieces of rule information, the rule information defining the event determined to satisfy the predetermined condition. Rule information defining an event for which a change in occurrence tendency has been detected is highly likely to have a rule condition or countermeasure that is no longer appropriate for the current operation system 100. Identifying it therefore helps the administrator consider whether the rule information needs to be modified.
- When the identification unit 343 detects that the frequency of occurrence of a specific event has increased compared with before a predetermined point in time, it preferably determines that the occurrence interval satisfies the predetermined condition. That is, if the most recent occurrence interval of a particular event is (significantly) shorter than the average of past occurrence intervals, the rule is likely to be inappropriate. Identifying it therefore helps the administrator consider whether the rule information needs to be modified. The identification unit 343 may also determine that the occurrence interval satisfies the predetermined condition when a predetermined period or more has passed since the last occurrence of a specific event.
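- The two checks in this paragraph can be sketched in Python as follows. The function names, the `ratio` parameter that quantifies "significantly shorter", and the requirement of at least three occurrences are assumptions introduced for the example; the description does not fix any thresholds.

```python
from datetime import datetime, timedelta
from statistics import mean


def recent_interval_is_short(times, ratio=0.5):
    """True if the latest interval is shorter than `ratio` times the mean
    of all earlier intervals. `ratio` is an assumed tuning parameter."""
    times = sorted(times)
    if len(times) < 3:          # need at least one past interval to compare against
        return False
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return gaps[-1] < ratio * mean(gaps[:-1])


def is_dormant(times, now, max_silence):
    """True if `max_silence` or more has passed since the last occurrence."""
    return bool(times) and now - max(times) >= max_silence
```

- The first check targets rules whose countermeasure no longer resolves the event; the second targets rules whose condition may no longer match anything.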
- The identification unit 343 may calculate, as the occurrence tendency, a first occurrence frequency of the event in a period before the predetermined point in time and a second occurrence frequency of the event in a period after it, from a plurality of dates and times of occurrence of a specific event.
- In this case, the identification unit 343 determines whether the occurrence interval satisfies the predetermined condition from the relationship between the first occurrence frequency and the second occurrence frequency. This makes it possible to detect a change in the occurrence tendency of a specific event more accurately, based on the degree of difference in occurrence frequency before and after the predetermined point in time.
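- A minimal Python sketch of the frequency comparison follows. It assumes equal-length windows on both sides of the predetermined point in time, expresses frequency as occurrences per day, and treats a change by a factor of `factor` or more in either direction as a change in tendency; none of these choices is stated in the description.

```python
from datetime import datetime, timedelta


def occurrence_frequencies(times, pivot, window):
    """Return (first, second): occurrences per day in the `window` before
    `pivot` and in the `window` starting at `pivot`."""
    before = sum(1 for t in times if pivot - window <= t < pivot)
    after = sum(1 for t in times if pivot <= t < pivot + window)
    days = window.total_seconds() / 86400
    return before / days, after / days


def tendency_changed(first, second, factor=2.0):
    """True if the frequency rose or fell by at least `factor` (assumed threshold)."""
    if first == 0:
        return second > 0
    return second >= factor * first or second <= first / factor
```

- Because a drop to zero also counts as a change under this rule, the same comparison covers both events that became more frequent and events that stopped occurring.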
- the output unit 344 is an example of the output unit 14 described above.
- the output unit 344 outputs the specified rule information to the management terminal 200 .
- The output unit 344 outputs to the management terminal 200 the reason for detecting the change in occurrence tendency together with the identified rule information.
- The reason for detecting a change in occurrence tendency is, for example, that a specific event has occurred more frequently than before a predetermined point in time, that a predetermined period of time or more has passed since the last occurrence of a specific event, or the relationship (comparison result) between the first occurrence frequency and the second occurrence frequency described above.
- the output unit 344 may further output information on the event that has occurred.
- the output unit 344 may output to a display device connected to the operation support device 300 or another information system.
- FIG. 5 is a flow chart showing the flow of processing for dealing with an event according to the second embodiment.
- As a premise, the operation support device 300 has a plurality of pieces of rule information 3121 and the like registered in the rule DB 312, and the execution commands and the like corresponding to the countermeasures defined in each piece of rule information are either already registered or at least executable via the network N.
- Suppose that a predetermined event (a failure or the like) occurs in a monitoring target device, for example the server 110, within the operation system 100.
- The monitoring device 400 detects a newly appended error message in the log file 112 of the server 110 or the like, and transmits the error message to the operation support device 300 via the network N as an event occurrence notification.
- the event occurrence notification includes message ID, message content, date and time of occurrence (date and time of detection), identification information of the detected monitoring target device (server 110), and the like.
- the handling unit 342 of the operation support device 300 receives the event occurrence notification from the monitoring device 400 via the network N (S101).
- the handling unit 342 may receive an event occurrence notification via the network N from the monitoring software in the server 110 .
- the handling unit 342 may acquire monitoring target information (such as the log file 112) from the server 110 via the network N, analyze the monitoring target information, and detect the occurrence of a predetermined event.
- The handling unit 342 searches the rule DB 312 for rule information that matches the conditions (S102). Specifically, the handling unit 342 checks whether the event (error message ID, etc.) included in the occurrence notification matches the condition of each piece of rule information in the rule DB 312. The handling unit 342 then determines whether or not there is rule information that matches the conditions (S103). For example, if the condition 31212 includes the error message ID included in the occurrence notification, the handling unit 342 determines that there is matching rule information and identifies the rule information 3121 that defines the condition 31212. The handling unit 342 then executes the handling defined in the matching rule information (S104). For example, the handling unit 342 executes, on the server 110 via the network N, an execution command corresponding to the handling 31213 defined in the identified rule information 3121. It is assumed here that the execution of the execution command has ended.
- The registration unit 341 registers the history information in the history DB 313 (S105). Specifically, the registration unit 341 treats the error message ID included in the occurrence notification as the occurrence event 31311, the date and time of occurrence included in the occurrence notification as the date and time of occurrence 31312, and the rule ID 31211 of the identified rule information 3121 as the rule ID 31313. The registration unit 341 then associates the occurrence event 31311, the date and time of occurrence 31312, the rule ID 31313, and the execution result 31314 of the executed countermeasure with one another and registers them in the history DB 313 as history information 3131.
- The handling unit 342 outputs the occurrence of the event and the completion of handling to the management terminal 200 (S106). For example, the handling unit 342 outputs the error message ID included in the occurrence notification and the execution result 31314 to the management terminal 200. On the other hand, if it is determined in step S103 that there is no rule information that matches the conditions, the handling unit 342 outputs an event occurrence alert to the management terminal 200 (S107).
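The flow of steps S101 to S107 can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names `Rule`, `Notification`, and `HistoryRecord`, the function `handle_event`, and the simplification of a condition to a set of message IDs are all hypothetical, and the countermeasure is modeled as a plain callable rather than a command executed over a network.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class Rule:
    rule_id: str                    # corresponds to rule ID 31211
    condition_message_ids: set      # condition 31212, simplified to message IDs (assumption)
    action: Callable[[str], str]    # handling 31213: takes a device ID, returns a result


@dataclass
class Notification:
    message_id: str
    content: str
    occurred_at: datetime
    device_id: str


@dataclass
class HistoryRecord:
    event: str             # occurrence event 31311
    occurred_at: datetime  # date and time of occurrence 31312
    rule_id: str           # rule ID 31313
    result: str            # execution result 31314


def handle_event(note, rules, history, notify):
    """Process one occurrence notification (S101) and return the history record, if any."""
    # S102/S103: look for a rule whose condition contains the message ID.
    rule = next((r for r in rules if note.message_id in r.condition_message_ids), None)
    if rule is None:
        notify(f"ALERT: no rule for event {note.message_id}")  # S107
        return None
    result = rule.action(note.device_id)                        # S104
    record = HistoryRecord(note.message_id, note.occurred_at, rule.rule_id, result)
    history.append(record)                                      # S105
    notify(f"handled {note.message_id}: {result}")              # S106
    return record
```

Passing `notify` in as a callable keeps the sketch independent of how the management terminal 200 is actually reached.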
- FIG. 6 is a sequence diagram showing the flow of inappropriate rule detection and update processing according to the second embodiment.
- The identifying unit 343 starts the inappropriate rule detection processing after the handling processing in FIG. 5.
- Alternatively, the identifying unit 343 may start the inappropriate rule detection processing at a predetermined timing.
- The identifying unit 343 analyzes the occurrence tendency of a specific event from the history DB 313 (S201). Specifically, the identifying unit 343 identifies, from the history DB 313, the group of history information whose occurrence event is a specific error message ID, and acquires the dates and times of occurrence of that group. It then arranges the acquired dates and times of occurrence in chronological order and calculates the interval (occurrence interval) between adjacent ones. At this time, the identifying unit 343 calculates a first frequency of occurrence from a plurality of occurrence intervals in the period before the predetermined time, and a second frequency of occurrence from one or more occurrence intervals in the period after the predetermined time.
- The first occurrence frequency and the second occurrence frequency are examples of occurrence tendencies.
- The identifying unit 343 may analyze the occurrence tendency using other algorithms, analysis logic, or the like.
- The identifying unit 343 detects a change in occurrence tendency (S202). For example, the identifying unit 343 may detect, as a change in occurrence tendency, that the second occurrence frequency is higher than the first occurrence frequency. The identifying unit 343 may also detect, as a change in occurrence tendency, that the second occurrence frequency is lower than the first occurrence frequency, for example, that the second occurrence frequency is 0. Note that if no change in the occurrence tendency is detected in step S202, the process is terminated, or the inappropriate rule detection processing is performed for other events.
- The identifying unit 343 identifies the rule information corresponding to the event for which the change in occurrence tendency was detected (S203). Specifically, the identifying unit 343 identifies the rule ID 31313 associated with the occurrence event 31311 that is the specific error message ID. The identifying unit 343 also identifies the detection reason (the reason for detecting the change in occurrence tendency).
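Steps S202 and S203 can be sketched as a comparison of the two frequencies followed by a lookup of the rule IDs in the history. The function names `detect_change` and `identify_rules`, the dictionary layout of the history records, and the `ratio` threshold are assumptions for illustration; the source only says that a higher or lower second frequency may be detected as a change.

```python
def detect_change(f1, f2, ratio=2.0):
    """Return a detection reason if the tendency changed, else None (S202).

    `ratio` is a hypothetical threshold for how large the difference must be.
    """
    if f1 > 0 and f2 == 0:
        return "event no longer occurs after the predetermined time"
    if f2 > f1 * ratio:
        return f"occurrence frequency rose from {f1:g} to {f2:g} per period"
    if f2 < f1 / ratio:
        return f"occurrence frequency fell from {f1:g} to {f2:g} per period"
    return None


def identify_rules(history, event_id, f1, f2, ratio=2.0):
    """Return (rule_ids, reason) for the event whose tendency changed (S203)."""
    reason = detect_change(f1, f2, ratio)
    if reason is None:
        return [], None
    # Collect the rule IDs recorded together with this occurrence event.
    rule_ids = sorted({rec["rule_id"] for rec in history if rec["event"] == event_id})
    return rule_ids, reason
```

Returning the reason together with the rule IDs mirrors the description, in which the detection reason is sent to the management terminal 200 along with the rule information.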
- The output unit 344 transmits the identified rule information and the detection reason to the management terminal 200 via the network N (S204).
- The management terminal 200 displays on its screen the rule information and the detection reason received from the operation support device 300 via the network N.
- The operation manager can thus see which rules are likely to be inappropriate and why. The operation manager can therefore examine whether the conditions and countermeasures of the relevant rule information need to be corrected and, if so, how.
- The management terminal 200 receives update information for the rule information from the operation manager (S206). The management terminal 200 then transmits the update information to the operation support device 300 via the network N (S207).
- The registration unit 341 of the operation support device 300 updates the identified rule information based on the update information received from the management terminal 200 (S208). Specifically, the registration unit 341 updates the rule DB 312 with the contents of the update information for the conditions or countermeasures of the rule information to which the update information applies.
- In this way, the operation support device 300 can support the maintenance of rule information by the operation manager through the inappropriate rule detection and update processing.
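The update in step S208 can be pictured as overwriting only the condition or countermeasure of one stored rule. The sketch below assumes, purely for illustration, that the rule DB is a dictionary keyed by rule ID and that update information is a dictionary with the keys `condition` and `countermeasure`; neither layout is given in the source.

```python
def apply_rule_update(rule_db, rule_id, update):
    """Apply update information to one rule and return the updated rule.

    Hypothetical helper: only the condition and the countermeasure may change.
    """
    allowed = {"condition", "countermeasure"}
    unknown = set(update) - allowed
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    if rule_id not in rule_db:
        raise KeyError(rule_id)
    # Merge the update into a copy so untouched fields are preserved.
    rule_db[rule_id] = {**rule_db[rule_id], **update}
    return rule_db[rule_id]
```

Rejecting unknown fields is a design choice of this sketch, intended to keep the rule ID itself stable so that existing history records still refer to the same rule.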
- FIG. 7 is a diagram showing the concept of a detection example of a rule in which the event is not resolved even after the countermeasure according to the second embodiment.
- The black circles on the left side of FIG. 7 conceptually show the times at which the event occurred, in chronological order.
- The history DB 313 on the right side of FIG. 7 is an example in which the pieces of history information on the occurrence event "m002" are arranged in chronological order of date and time of occurrence.
- The first occurrence frequency f1 is about once a month in the first period before the predetermined point in time.
- The second occurrence frequency f2 is once every 30 minutes.
- In this case, the identifying unit 343 may start the inappropriate rule detection processing after the registration unit 341 registers the history information in the history DB 313. As a result, it is possible to quickly detect a change in the occurrence tendency of the occurrence event "m002", and to prompt other countermeasures as well as maintenance of the rule.
- FIG. 8 is a diagram showing an example of detection of rules whose conditions no longer match due to changes in the system state according to the second embodiment.
- The first occurrence frequency f1 is about once a month in the first period before the predetermined point in time.
- The second occurrence frequency f2 is 0 times a month; that is, the event does not occur for six months or more after the predetermined point in time.
- The event "m002" has not occurred even once over a period far longer than the interval at which the countermeasure had been executed until then, which indicates that the countermeasure is no longer being executed. It is therefore likely that there is some reason why the event "m002" no longer occurs.
- In this case, the identifying unit 343 may start the inappropriate rule detection processing periodically. This makes it possible to detect rules that are not functioning or are no longer needed, and to encourage maintenance.
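A periodic check for the FIG. 8 case can be sketched as a comparison between the time elapsed since the last occurrence and the typical past interval. The function name `is_stale` and the `factor` of 3 are assumptions for illustration; the source only refers to a predetermined period having passed since the last occurrence.

```python
from datetime import datetime, timedelta


def is_stale(occurrences, now, factor=3.0):
    """Return True if the event has been absent far longer than its usual interval.

    Hypothetical helper: `factor` times the mean past interval stands in for
    the predetermined period.
    """
    times = sorted(occurrences)
    if len(times) < 2:
        return False  # not enough history to estimate an interval
    intervals = [b - a for a, b in zip(times, times[1:])]
    mean_interval = sum(intervals, timedelta(0)) / len(intervals)
    return now - times[-1] >= mean_interval * factor
```

For an event that used to occur about once a month, this sketch would flag the rule once roughly three months pass without a new occurrence, well before the six months shown in the example.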
- FIG. 9 is a diagram showing the concept of an example solved by rule update according to the second embodiment.
- Assume that an inappropriate rule is detected in the example of FIG. 7 described above, that the rule information and the reason for detection are notified to the management terminal 200, and that the rule information corresponding to the event "m002" is updated accordingly.
- Non-transitory computer readable media include various types of tangible storage media.
- Examples of non-transitory computer-readable media include magnetic recording media (e.g., flexible disks, magnetic tapes, hard disk drives), magneto-optical recording media (e.g., magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, DVD (Digital Versatile Disc), and semiconductor memory (e.g., mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (Random Access Memory)).
- The program may also be delivered to the computer on various types of transitory computer-readable media.
- Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves.
- Transitory computer-readable media can deliver the program to the computer via wired channels, such as wires and optical fibers, or wireless channels.
- (Appendix A1) An operation support device comprising: a storage unit that stores a plurality of pieces of rule information, each defining a countermeasure for one of a plurality of events that occur in an operation system; a registration unit that, when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers, in the storage unit, history information including the date and time of occurrence of the event and the rule information of the event; an identifying unit that identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and an output unit that outputs the identified rule information.
- (Appendix A2) The operation support device according to Appendix A1, wherein the identifying unit analyzes the occurrence tendency of the specific event from a plurality of the dates and times of occurrence of the event, determines that the occurrence interval satisfies the predetermined condition when a change in tendency before and after a predetermined point in time is detected from the occurrence tendency, and identifies, from among the plurality of pieces of rule information, the rule information in which the event determined to satisfy the predetermined condition is defined.
- (Appendix A3) The operation support device according to Appendix A2, wherein the identifying unit determines that the occurrence interval satisfies the predetermined condition when it detects that the specific event occurs more frequently than before the predetermined point in time.
- (Appendix A4) The operation support device according to Appendix A2 or A3, wherein the identifying unit determines that the occurrence interval satisfies the predetermined condition when a predetermined period or more has passed since the last occurrence of the specific event.
- (Appendix A5) The operation support device according to any one of Appendices A2 to A4, wherein the identifying unit calculates, as the occurrence tendency, from a plurality of the dates and times of occurrence of the specific event, a first occurrence frequency of the event in the period up to a predetermined point in time and a second occurrence frequency of the event in the period after the predetermined point in time, and determines whether or not the occurrence interval satisfies the predetermined condition from the relationship between the first occurrence frequency and the second occurrence frequency.
- (Appendix A6) The operation support device according to any one of Appendices A2 to A5, wherein the output unit further outputs, together with the identified rule information, the reason for detecting the change in occurrence tendency.
- (Appendix B1) An operation support system comprising a management terminal and an operation support device, wherein the operation support device: receives from the management terminal a plurality of pieces of rule information, each defining a countermeasure for one of a plurality of events that occur in an operation system, and stores them in a storage device; when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers, in the storage device, history information including the date and time of occurrence of the event and the rule information of the event; identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and outputs the identified rule information to the management terminal.
- (Appendix B2) The operation support system according to Appendix B1, wherein the management terminal displays the rule information output from the operation support device and transmits update information for the rule information to the operation support device, and the operation support device updates the identified rule information based on the update information received from the management terminal.
- (Appendix C1) An operation support method in which a computer: registers, in a storage device that stores a plurality of pieces of rule information each defining a countermeasure for one of a plurality of events that occur in an operation system, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when the countermeasure defined in the rule information corresponding to the predetermined event is executed in response to the occurrence of the predetermined event in the operation system; identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and outputs the identified rule information.
- (Appendix D1) An operation support program that causes a computer to execute: a process of registering, in a storage device that stores a plurality of pieces of rule information each defining a countermeasure for one of a plurality of events that occur in an operation system, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when the countermeasure defined in the rule information corresponding to the predetermined event is executed in response to the occurrence of the predetermined event in the operation system; a process of identifying, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and a process of outputting the identified rule information.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
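A storage unit that stores a plurality of pieces of rule information, each defining a countermeasure for one of a plurality of events that occur in an operation system;
a registration unit that, when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers, in the storage unit, history information including the date and time of occurrence of the event and the rule information of the event;
an identifying unit that identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
an output unit that outputs the identified rule information
are provided.
A management terminal and an operation support device are provided.
The operation support device:
receives from the management terminal a plurality of pieces of rule information, each defining a countermeasure for one of a plurality of events that occur in an operation system, and stores them in a storage device;
when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers, in the storage device, history information including the date and time of occurrence of the event and the rule information of the event;
identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
outputs the identified rule information to the management terminal.
A computer:
registers, in a storage device that stores a plurality of pieces of rule information each defining a countermeasure for one of a plurality of events that occur in an operation system, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when the countermeasure defined in the rule information corresponding to the predetermined event is executed in response to the occurrence of the predetermined event in the operation system;
identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
outputs the identified rule information.
A process of registering, in a storage device that stores a plurality of pieces of rule information each defining a countermeasure for one of a plurality of events that occur in an operation system, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when the countermeasure defined in the rule information corresponding to the predetermined event is executed in response to the occurrence of the predetermined event in the operation system;
a process of identifying, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
a process of outputting the identified rule information
are executed by a computer.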
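FIG. 1 is a block diagram showing the configuration of the operation support device 1 according to the first embodiment. The operation support device 1 is an information processing device for managing the operation of an operation system and for supporting operation by an administrator. Here, the operation system is an information system composed of a plurality of monitoring target devices such as computers (servers), communication devices (network devices), and storage. The operation system is, for example, a service providing system that provides one or more services via a communication network, or a business system within a company. The operation system may also cooperate with an external information system.
Here, the problem to be solved by the present embodiment is described in detail. First, an AI (Artificial Intelligence) model could be used for operation automation. However, using an AI model incurs training costs, which raises the barrier to adoption. In contrast, a rule-based engine that uses rule information defining countermeasures for the events described above allows operation automation to be introduced relatively easily.
Although the above embodiments have been described as hardware configurations, the present disclosure is not limited to this. The present disclosure can also implement any of the processing by causing a CPU to execute a computer program.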
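(Appendix A1)
An operation support device comprising:
a storage unit that stores a plurality of pieces of rule information, each defining a countermeasure for one of a plurality of events that occur in an operation system;
a registration unit that, when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers, in the storage unit, history information including the date and time of occurrence of the event and the rule information of the event;
an identifying unit that identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
an output unit that outputs the identified rule information.
(Appendix A2)
The operation support device according to Appendix A1, wherein the identifying unit:
analyzes the occurrence tendency of the specific event from a plurality of the dates and times of occurrence of the event;
determines that the occurrence interval satisfies the predetermined condition when a change in tendency before and after a predetermined point in time is detected from the occurrence tendency; and
identifies, from among the plurality of pieces of rule information, the rule information in which the event determined to satisfy the predetermined condition is defined.
(Appendix A3)
The operation support device according to Appendix A2, wherein the identifying unit
determines that the occurrence interval satisfies the predetermined condition when it detects that the specific event occurs more frequently than before the predetermined point in time.
(Appendix A4)
The operation support device according to Appendix A2 or A3, wherein the identifying unit
determines that the occurrence interval satisfies the predetermined condition when a predetermined period or more has passed since the last occurrence of the specific event.
(Appendix A5)
The operation support device according to any one of Appendices A2 to A4, wherein the identifying unit:
calculates, as the occurrence tendency, from a plurality of the dates and times of occurrence of the specific event, a first occurrence frequency of the event in the period up to a predetermined point in time and a second occurrence frequency of the event in the period after the predetermined point in time; and
determines whether or not the occurrence interval satisfies the predetermined condition from the relationship between the first occurrence frequency and the second occurrence frequency.
(Appendix A6)
The operation support device according to any one of Appendices A2 to A5, wherein the output unit
further outputs, together with the identified rule information, the reason for detecting the change in occurrence tendency.
(Appendix B1)
An operation support system comprising a management terminal and an operation support device,
wherein the operation support device:
receives from the management terminal a plurality of pieces of rule information, each defining a countermeasure for one of a plurality of events that occur in an operation system, and stores them in a storage device;
when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers, in the storage device, history information including the date and time of occurrence of the event and the rule information of the event;
identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
outputs the identified rule information to the management terminal.
(Appendix B2)
The operation support system according to Appendix B1, wherein the management terminal:
displays the rule information output from the operation support device; and
transmits update information for the rule information to the operation support device,
and wherein the operation support device
updates the identified rule information based on the update information received from the management terminal.
(Appendix C1)
An operation support method in which a computer:
registers, in a storage device that stores a plurality of pieces of rule information each defining a countermeasure for one of a plurality of events that occur in an operation system, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when the countermeasure defined in the rule information corresponding to the predetermined event is executed in response to the occurrence of the predetermined event in the operation system;
identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
outputs the identified rule information.
(Appendix D1)
An operation support program that causes a computer to execute:
a process of registering, in a storage device that stores a plurality of pieces of rule information each defining a countermeasure for one of a plurality of events that occur in an operation system, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when the countermeasure defined in the rule information corresponding to the predetermined event is executed in response to the occurrence of the predetermined event in the operation system;
a process of identifying, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
a process of outputting the identified rule information.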
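11 Storage unit
12 Registration unit
13 Identifying unit
14 Output unit
151 Rule information
15n Rule information
161 History information
16m History information
1000 Operation support system
100 Operation system
110 Server
111 Setting information
112 Log file
200 Management terminal
300 Operation support device
310 Storage unit
311 Program
312 Rule DB
3121 Rule information
31211 Rule ID
31212 Condition
31213 Countermeasure (handling)
312n Rule information
313 History DB
3131 History information
31311 Occurrence event
31312 Date and time of occurrence
31313 Rule ID
31314 Execution result
313m History information
320 Memory
330 Communication unit
340 Control unit
341 Registration unit
342 Handling unit
343 Identifying unit
344 Output unit
400 Monitoring device
N Network
f1 First occurrence frequency
f2 Second occurrence frequency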
Claims (10)
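- An operation support device comprising:
storage means for storing a plurality of pieces of rule information, each defining a countermeasure for one of a plurality of events that occur in an operation system;
registration means for registering, in the storage means, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when a countermeasure defined in the rule information corresponding to the predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system;
identifying means for identifying, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
output means for outputting the identified rule information.
- The operation support device according to claim 1, wherein the identifying means:
analyzes the occurrence tendency of the specific event from a plurality of the dates and times of occurrence of the event;
determines that the occurrence interval satisfies the predetermined condition when a change in tendency before and after a predetermined point in time is detected from the occurrence tendency; and
identifies, from among the plurality of pieces of rule information, the rule information in which the event determined to satisfy the predetermined condition is defined.
- The operation support device according to claim 2, wherein the identifying means
determines that the occurrence interval satisfies the predetermined condition when it detects that the specific event occurs more frequently than before the predetermined point in time.
- The operation support device according to claim 2 or 3, wherein the identifying means
determines that the occurrence interval satisfies the predetermined condition when a predetermined period or more has passed since the last occurrence of the specific event.
- The operation support device according to any one of claims 2 to 4, wherein the identifying means:
calculates, as the occurrence tendency, from a plurality of the dates and times of occurrence of the specific event, a first occurrence frequency of the event in the period up to a predetermined point in time and a second occurrence frequency of the event in the period after the predetermined point in time; and
determines whether or not the occurrence interval satisfies the predetermined condition from the relationship between the first occurrence frequency and the second occurrence frequency.
- The operation support device according to any one of claims 2 to 5, wherein the output means
further outputs, together with the identified rule information, the reason for detecting the change in occurrence tendency.
- An operation support system comprising a management terminal and an operation support device,
wherein the operation support device:
receives from the management terminal a plurality of pieces of rule information, each defining a countermeasure for one of a plurality of events that occur in an operation system, and stores them in a storage device;
when a countermeasure defined in the rule information corresponding to a predetermined event among the plurality of pieces of rule information is executed in response to the occurrence of the predetermined event in the operation system, registers, in the storage device, history information including the date and time of occurrence of the event and the rule information of the event;
identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
outputs the identified rule information to the management terminal.
- The operation support system according to claim 7, wherein the management terminal:
displays the rule information output from the operation support device; and
transmits update information for the rule information to the operation support device,
and wherein the operation support device
updates the identified rule information based on the update information received from the management terminal.
- An operation support method in which a computer:
registers, in a storage device that stores a plurality of pieces of rule information each defining a countermeasure for one of a plurality of events that occur in an operation system, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when the countermeasure defined in the rule information corresponding to the predetermined event is executed in response to the occurrence of the predetermined event in the operation system;
identifies, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
outputs the identified rule information.
- A non-transitory computer-readable medium storing an operation support program that causes a computer to execute:
a process of registering, in a storage device that stores a plurality of pieces of rule information each defining a countermeasure for one of a plurality of events that occur in an operation system, history information including the date and time of occurrence of a predetermined event and the rule information of the event, when the countermeasure defined in the rule information corresponding to the predetermined event is executed in response to the occurrence of the predetermined event in the operation system;
a process of identifying, based on the history information, rule information for which the occurrence interval of a specific event satisfies a predetermined condition; and
a process of outputting the identified rule information.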
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023507098A JPWO2022196627A5 (ja) | 2022-03-14 | 運用支援装置、システム及び方法並びにプログラム | |
US18/281,357 US20240160506A1 (en) | 2021-03-19 | 2022-03-14 | Operation support apparatus, system, method, and computer-readable medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021045848 | 2021-03-19 | ||
JP2021-045848 | 2021-03-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022196627A1 true WO2022196627A1 (ja) | 2022-09-22 |
Family
ID=83320436
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/011285 WO2022196627A1 (ja) | 2021-03-19 | 2022-03-14 | 運用支援装置、システム及び方法並びにコンピュータ可読媒体 |
Country Status (2)
Country | Link |
---|---|
US (1) | US20240160506A1 (ja) |
WO (1) | WO2022196627A1 (ja) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005354280A (ja) * | 2004-06-09 | 2005-12-22 | Fujitsu Ltd | ポリシールール最適化方法および装置 |
JP2012068812A (ja) * | 2010-09-22 | 2012-04-05 | Fujitsu Ltd | 対処提示装置、対処提示方法及び対処提示プログラム |
-
2022
- 2022-03-14 US US18/281,357 patent/US20240160506A1/en active Pending
- 2022-03-14 WO PCT/JP2022/011285 patent/WO2022196627A1/ja active Application Filing
Also Published As
Publication number | Publication date |
---|---|
US20240160506A1 (en) | 2024-05-16 |
JPWO2022196627A1 (ja) | 2022-09-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10635557B2 (en) | System and method for automated detection of anomalies in the values of configuration item parameters | |
US8949676B2 (en) | Real-time event storm detection in a cloud environment | |
JP2018045403A (ja) | 異常検知システム及び異常検知方法 | |
CN107209511B (zh) | 监视控制装置 | |
US11157343B2 (en) | Systems and methods for real time computer fault evaluation | |
JP6756379B2 (ja) | ログ分析方法、システムおよびプログラム | |
JP5198154B2 (ja) | 障害監視システム及びデバイスと監視装置並びに障害監視方法 | |
JP6919438B2 (ja) | 障害解析支援装置、インシデント管理システム、障害解析支援方法及びプログラム | |
JP6878984B2 (ja) | 監視プログラム、監視方法および監視装置 | |
US9621679B2 (en) | Operation task managing apparatus and method | |
CN112527484A (zh) | 工作流断点续跑方法、装置、计算机设备及可读存储介质 | |
WO2022196627A1 (ja) | 運用支援装置、システム及び方法並びにコンピュータ可読媒体 | |
JP6880961B2 (ja) | 情報処理装置、およびログ記録方法 | |
JP5803246B2 (ja) | ネットワーク運用管理システム、ネットワーク監視サーバ、ネットワーク監視方法およびプログラム | |
JP2014153736A (ja) | 障害予兆検出方法、プログラムおよび装置 | |
US9690639B2 (en) | Failure detecting apparatus and failure detecting method using patterns indicating occurrences of failures | |
US20140101260A1 (en) | Processing a technical system | |
JP2015191327A (ja) | システム監視装置、システムの監視方法、及びプログラム | |
US12056033B2 (en) | Anomaly location estimating apparatus, method, and program | |
US12001271B2 (en) | Network monitoring apparatus, method, and program | |
US20180165141A1 (en) | Device driver verification | |
JP2012146049A (ja) | バッチジョブ遅延警告自動発報システムおよび自動発報方法、ならびにそのためのプログラム | |
WO2024135322A1 (ja) | 障害対処装置、システム、方法、及び、プログラム | |
JP2009181497A (ja) | ジョブ処理システムおよびジョブ処理方法 | |
CN111444032A (zh) | 一种计算机系统故障修复方法、系统及设备 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22771373; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 2023507098; Country of ref document: JP |
| WWE | Wipo information: entry into national phase | Ref document number: 18281357; Country of ref document: US |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 22771373; Country of ref document: EP; Kind code of ref document: A1 |