CN112559238B - Troubleshooting strategy generation method and device for Oracle database, processor and storage medium
- Publication number
- CN112559238B CN202110188401.4A
- Authority
- CN
- China
- Prior art keywords
- troubleshooting
- oracle
- rule
- abstract
- fault
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
Abstract
The embodiment of the invention relates to troubleshooting technology and discloses a troubleshooting strategy generation method and device for an Oracle database, a processor and a storage medium. The method comprises: creating an abstract Oracle troubleshooting rule according to Oracle troubleshooting rule data, wherein the abstract Oracle troubleshooting rule comprises abstract configuration events and abstract configuration rules; and, when a troubleshooting start condition is triggered, generating an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and an Oracle troubleshooting knowledge graph, wherein the Oracle troubleshooting knowledge graph comprises fault features and the corresponding fault causes, and the instance Oracle troubleshooting graph comprises the instantiated abstract configuration events and the instantiated abstract configuration rules. Troubleshooting rules for different troubleshooting scenarios are established according to expert experience and domain knowledge, so that known expert troubleshooting experience is effectively codified. Automatic troubleshooting thereby covers the storage of expert experience, fault identification and troubleshooting reasoning, and the whole process is streamlined, visualized and automated, which speeds up troubleshooting.
Description
Technical Field
The embodiment of the invention relates to the technical field of troubleshooting, in particular to a troubleshooting strategy generation method and device for an Oracle database, a processor and a storage medium.
Background
At present, the business systems of companies such as banks, securities firms and internet companies are large and complex. Within the internal network, the dependencies among business components are intricate, which makes operation and maintenance of the whole system very challenging.
In a production network, the database is a typical troubleshooting scenario. Many different types of services and applications rely on different databases to store vast amounts of data. Large companies such as banks and securities firms often have a dedicated database management team to monitor and handle database anomalies. The inventors have found at least the following problems in the prior art: database troubleshooting knowledge is highly specialized, and database monitoring metrics are complex and come in many formats, so establishing a general troubleshooting rule for database scenarios is a major challenge in practice.
Disclosure of Invention
The embodiments of the invention aim to provide a troubleshooting rule specific to the Oracle database scenario, together with algorithms for detecting the related troubleshooting events, so that a troubleshooting engine can establish troubleshooting rules for the different troubleshooting scenarios across the whole network according to expert experience and domain knowledge. In this way, known expert troubleshooting experience can be effectively codified; automatic troubleshooting covers the storage of expert experience, fault identification and troubleshooting reasoning; and the whole process is streamlined, visualized and automated, which speeds up troubleshooting.
In order to solve the technical problem, in one aspect, an embodiment of the present invention provides a method for generating a troubleshooting policy for an Oracle database, including:
acquiring Oracle troubleshooting rule data;
creating an abstract Oracle troubleshooting rule according to the Oracle troubleshooting rule data, wherein the abstract Oracle troubleshooting rule comprises abstract configuration events and abstract configuration rules, each abstract configuration event represents a virtual troubleshooting object, and each abstract configuration rule represents a relationship between abstract configuration events;
acquiring an Oracle troubleshooting knowledge graph, wherein the Oracle troubleshooting knowledge graph comprises fault features and the corresponding fault causes;
when a troubleshooting start condition is triggered, generating an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph, wherein the instance Oracle troubleshooting graph comprises instance configuration events and instance configuration rules, an instance configuration event is an instantiated abstract configuration event, and an instance configuration rule is an instantiated abstract configuration rule;
and troubleshooting the instance configuration events in the instance Oracle troubleshooting graph one by one.
Further optionally, before the troubleshooting start condition is triggered, the method further includes:
acquiring anomaly detection data;
wherein generating the instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph comprises: generating the instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule, the Oracle troubleshooting knowledge graph and the anomaly detection data.
Further optionally, the troubleshooting start condition comprises one or more of the following modes:
mode one, triggering through the API of another monitoring and/or alarm platform;
mode two, streaming data threshold triggering;
mode three, streaming data anomaly detection triggering;
mode four, triggering by other script commands.
Further optionally, the method further includes:
performing root cause localization on the fault information found by the troubleshooting, so as to determine the cause of the fault.
Further optionally, the abstract Oracle troubleshooting rule is a tree graph comprising nodes and edges, the abstract configuration events correspond to the nodes, and the abstract configuration rules correspond to the edges; generating the instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph includes:
for each node that has child nodes, assigning an entity object to each child node, the entity object being determined by the corresponding root node or parent node.
Further optionally, generating the instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph further includes:
determining the entity objects of the child nodes according to the types of the edges in the abstract Oracle troubleshooting rule graph.
Further optionally, the Oracle troubleshooting knowledge graph further includes spatial relationships between entities, and determining the entity objects of the child nodes according to the types of the edges in the abstract Oracle troubleshooting rule graph includes:
if the type of the edge is "same object", the child node directly inherits the entity object of the parent node; and/or,
if the type of the edge is not "same object", calling the corresponding spatial relationship data and looking up the corresponding entity object according to the respective spatial types of the parent node and the child node.
Further optionally, the method further includes:
and graphically displaying the abstract Oracle troubleshooting rule and/or graphically displaying the instance Oracle troubleshooting diagram.
In another aspect, an apparatus for generating a troubleshooting policy for an Oracle database includes:
a rule data acquisition module, configured to acquire Oracle troubleshooting rule data;
a rule creation module, configured to create an abstract Oracle troubleshooting rule according to the Oracle troubleshooting rule data, wherein the abstract Oracle troubleshooting rule comprises abstract configuration events and abstract configuration rules, each abstract configuration event represents a virtual troubleshooting object, and each abstract configuration rule represents a relationship between abstract configuration events;
a graph acquisition module, configured to acquire an Oracle troubleshooting knowledge graph, wherein the Oracle troubleshooting knowledge graph comprises fault features and the corresponding fault causes;
an instance creation module, configured to generate, after a troubleshooting start condition is triggered, an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph, wherein the instance Oracle troubleshooting graph comprises instance configuration events and instance configuration rules, an instance configuration event is an instantiated abstract configuration event, and an instance configuration rule is an instantiated abstract configuration rule;
and a troubleshooting module, configured to troubleshoot the instance configuration events in the instance Oracle troubleshooting graph one by one.
In yet another aspect, a server includes:
at least one processor; and,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above-described troubleshooting policy generation method for an Oracle database.
In yet another aspect, a computer-readable storage medium stores a computer program, which when executed by a processor implements the method for generating a troubleshooting policy for an Oracle database described above.
The embodiments of the invention provide a troubleshooting strategy generation method and device for an Oracle database, a processor and a storage medium. First, an abstract Oracle troubleshooting rule is created according to Oracle troubleshooting rule data. After a troubleshooting start condition is triggered, an instance Oracle troubleshooting graph, comprising instance configuration events and instance configuration rules, is generated according to the abstract Oracle troubleshooting rule and an Oracle troubleshooting knowledge graph, and the instance configuration events in the instance Oracle troubleshooting graph are troubleshot one by one. The troubleshooting engine can thus establish troubleshooting rules for the different troubleshooting scenarios across the whole network according to expert experience and domain knowledge. In this way, known expert troubleshooting experience can be effectively codified; automatic troubleshooting covers the storage of expert experience, fault identification and troubleshooting reasoning; and the whole process is streamlined, visualized and automated, which speeds up troubleshooting.
The foregoing is only an overview of the technical solutions of the invention. Embodiments of the invention are described below so that its technical means, and its above and other objects, features and advantages, can be understood more clearly.
Drawings
One or more embodiments are illustrated by way of example in the corresponding accompanying drawings, in which like reference numerals refer to similar elements; the drawings are not to scale unless otherwise specified.
FIG. 1 is a schematic flow chart of a method for generating a troubleshooting policy for an Oracle database according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an Oracle troubleshooting system framework implementation in an embodiment of the invention;
FIG. 3 is a flowchart illustrating another method for generating a troubleshooting policy for an Oracle database according to an embodiment of the present invention;
FIG. 4 is an example structure of an Oracle troubleshooting graph in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the framework of an Oracle troubleshooting graph system according to an embodiment of the present invention;
FIG. 6 is a functional structure diagram of a troubleshooting strategy generation device for an Oracle database according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to give the reader a better understanding of the present application; the technical solutions claimed in the present application can nevertheless be implemented without these technical details, and with various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description and should not limit the specific implementation of the invention; the embodiments may be combined with and refer to one another where they do not contradict.
To solve the problem of troubleshooting across the whole network environment, the applicant provides an automatic troubleshooting system: a troubleshooting engine based on knowledge graphs and expert domain knowledge, whose core idea is to automate the entire process of manual troubleshooting by experts. With the troubleshooting engine, expert troubleshooting experience is applied automatically, faults in all systems are checked quickly and comprehensively, and troubleshooting information is displayed globally in one place.
Specifically, the first embodiment of the invention relates to a fault-removal strategy generation method for an Oracle database. The flow is shown in fig. 1, and specifically comprises the following steps:
101. acquiring Oracle troubleshooting rule data;
102. creating an abstract Oracle troubleshooting rule according to the Oracle troubleshooting rule data, wherein the abstract Oracle troubleshooting rule comprises abstract configuration events and abstract configuration rules, each abstract configuration event represents a virtual troubleshooting object, and each abstract configuration rule represents a relationship between abstract configuration events;
103. acquiring an Oracle troubleshooting knowledge graph, wherein the Oracle troubleshooting knowledge graph comprises fault features and the corresponding fault causes;
104. when a troubleshooting start condition is triggered, generating an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph, wherein the instance Oracle troubleshooting graph comprises instance configuration events and instance configuration rules, an instance configuration event is an instantiated abstract configuration event, and an instance configuration rule is an instantiated abstract configuration rule;
105. troubleshooting the instance configuration events in the instance Oracle troubleshooting graph one by one.
In the troubleshooting strategy generation method for an Oracle database described above, an abstract Oracle troubleshooting rule is created according to Oracle troubleshooting rule data. After a troubleshooting start condition is triggered, an instance Oracle troubleshooting graph, comprising instance configuration events and instance configuration rules, is generated according to the abstract Oracle troubleshooting rule and an Oracle troubleshooting knowledge graph, and the instance configuration events in the instance Oracle troubleshooting graph are troubleshot one by one. The troubleshooting engine can thus establish troubleshooting rules for the different troubleshooting scenarios across the whole network according to expert experience and domain knowledge. In this way, known expert troubleshooting experience can be effectively codified; automatic troubleshooting covers the storage of expert experience, fault identification and troubleshooting reasoning; and the whole process is streamlined, visualized and automated, which speeds up troubleshooting.
As an improvement on the foregoing embodiment, a second embodiment of the invention provides another method for generating a troubleshooting strategy for an Oracle database. Here the Oracle troubleshooting graph is implemented as one specific troubleshooting scenario of the overall troubleshooting engine, and the scenario is concerned only with troubleshooting related to the Oracle database. Operation and maintenance personnel complete the rule configuration of the Oracle troubleshooting graph through a configuration page, and the graph is displayed through a graphical interface. When input data triggers a troubleshooting analysis, the troubleshooting engine generates an instantiated troubleshooting graph from the Oracle troubleshooting graph preconfigured by the operation and maintenance personnel and from the spatial relationship data, then troubleshoots the event nodes in the graph one by one and performs root cause localization. The final results are displayed together on the graphical interface. The method is implemented by the Oracle troubleshooting system framework shown in FIG. 2 and, as shown in FIG. 3, comprises the following steps:
301. Acquiring Oracle troubleshooting rule data;
The troubleshooting rule data may be entered manually.
302. Creating an abstract Oracle troubleshooting rule according to the Oracle troubleshooting rule data, wherein the abstract Oracle troubleshooting rule comprises abstract configuration events and abstract configuration rules, each abstract configuration event represents a virtual troubleshooting object, and each abstract configuration rule represents a relationship between abstract configuration events;
Based on their experience of troubleshooting Oracle database instances, operation and maintenance personnel can configure the Oracle troubleshooting rule graph through the configuration page and thereby establish the Oracle troubleshooting graph; examples of some of the nodes are given as follows. Note that the "meaning" of a node is only an explanation of the node and does not need to be configured. In actual configuration, besides the node name, it is also necessary to specify whether the node can trigger troubleshooting (and if so, the trigger mode), the entity type corresponding to the node (e.g., an Oracle instance or a database host) and, for metric monitoring nodes, algorithm parameters such as the name of the monitored metric and the anomaly detection mode.
From the events that actually occur in an Oracle database and the associations between them, a troubleshooting graph of the Oracle database can be constructed, in which the nodes are events and the edges indicate the associations between the corresponding events.
The association represented by an edge is determined by the spatial relationship defined for the edge and by the spatial types of the nodes at its two ends. When the spatial types of the two end nodes are the same, the edge can describe a troubleshooting logic relationship within the same entity object (e.g., an Oracle instance object), such as the order in which different monitoring metrics are checked during troubleshooting. When the spatial types of the two end nodes differ, for example an Oracle instance object and a database host, the relationship of the edge is "runs on", which indicates that the Oracle instance object runs on the database host; during actual troubleshooting, the troubleshooting engine looks up the host on which the Oracle instance runs and carries out the subsequent troubleshooting actions on it.
Because most of the metrics in the node examples belong to the same Oracle database instance, the spatial relationship of most edges does not involve upstream or downstream relationships across devices (such edges are marked "same database instance" in the figure) and mainly reflects the logical causal relationship in the troubleshooting process. One exception is the CPU utilization event, which belongs to the database host; its edge is therefore labeled "runs on", meaning that the database instance runs on the database host.
FIG. 4 is a schematic diagram of an Oracle database troubleshooting rule graph. Each event node in the graph contains three lines of text, giving the description of the event, the corresponding entity type and the corresponding stop-loss policy.
Compared with a conventional troubleshooting graph, the troubleshooting rule graph is configured abstractly: when operation and maintenance personnel configure it manually, they do not need to specify concrete entity names, only entity types. That is, objects of the same type, such as Oracle entities, need to be configured only once, which avoids the overhead of repeated configuration in a large network and greatly reduces the complexity of system operation and maintenance.
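By way of non-limiting illustration, the abstract configuration described above might be represented as in the following minimal Python sketch. The class names, field names, the `same_object` and `runs_on` relation labels, and the metric and detector values are hypothetical and are assumed here only for illustration; they are not part of the disclosed embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AbstractNode:
    """An abstract configuration event: bound to an entity type, not an entity."""
    name: str
    entity_type: str
    trigger_mode: Optional[str] = None  # None: the node cannot start troubleshooting
    metric: Optional[str] = None        # monitored metric name (metric nodes only)
    detector: Optional[str] = None      # anomaly detection mode (metric nodes only)


@dataclass(frozen=True)
class AbstractEdge:
    """An abstract configuration rule: a typed relationship between two events."""
    parent: str
    child: str
    relation: str  # "same_object", or a spatial relation such as "runs_on"


@dataclass
class AbstractRuleGraph:
    nodes: Dict[str, AbstractNode] = field(default_factory=dict)
    edges: List[AbstractEdge] = field(default_factory=list)

    def add_node(self, node: AbstractNode) -> None:
        self.nodes[node.name] = node

    def add_edge(self, parent: str, child: str, relation: str) -> None:
        if parent not in self.nodes or child not in self.nodes:
            raise KeyError("both endpoints must be configured before the edge")
        self.edges.append(AbstractEdge(parent, child, relation))

    def trigger_nodes(self) -> List[AbstractNode]:
        return [n for n in self.nodes.values() if n.trigger_mode is not None]

    def validate(self) -> None:
        # The description requires at least one node that can trigger troubleshooting.
        if not self.trigger_nodes():
            raise ValueError("no node is able to trigger troubleshooting")


# Hypothetical two-node configuration; only entity types appear, never entity names.
rules = AbstractRuleGraph()
rules.add_node(AbstractNode("AAS Total anomaly", "oracle_instance",
                            trigger_mode="third_party_alarm"))
rules.add_node(AbstractNode("CPU utilization anomaly", "database_host",
                            metric="cpu_utilization", detector="threshold"))
rules.add_edge("AAS Total anomaly", "CPU utilization anomaly", "runs_on")
rules.validate()
```

Because the configuration refers only to entity types, the same `rules` object could in principle be reused for every Oracle instance in the network, which is the saving the abstract configuration is intended to provide.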
303. Acquiring an Oracle troubleshooting knowledge graph, wherein the Oracle troubleshooting knowledge graph comprises fault characteristics and corresponding fault reasons;
304. Acquiring anomaly detection data;
305. When a troubleshooting start condition is triggered, generating an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule, the Oracle troubleshooting knowledge graph and the anomaly detection data, wherein the instance Oracle troubleshooting graph comprises instance configuration events and instance configuration rules, an instance configuration event is an instantiated abstract configuration event, and an instance configuration rule is an instantiated abstract configuration rule;
Troubleshooting is triggered by the nodes designated for that purpose when the troubleshooting rule graph is configured. When configuring an Oracle troubleshooting rule graph, at least one node must be designated as able to trigger troubleshooting; when the trigger condition of such a node is met, the troubleshooting engine takes the node as the root node and recursively checks each node below it to generate the corresponding troubleshooting graph.
The trigger modes of a node include, but are not limited to, the following:
Mode one, API triggering from other monitoring/alarm platforms: the troubleshooting engine polls other monitoring/alarm platforms or receives callbacks from them, and triggers troubleshooting when a specified monitoring metric becomes anomalous or raises an alarm (generated by the third-party platform).
Mode two, streaming data threshold triggering: troubleshooting is triggered when the streaming data of a specified monitoring metric goes outside a preset threshold range.
Mode three, streaming data anomaly detection triggering: the streaming data of a specified monitoring metric is checked by a real-time anomaly detection algorithm, and troubleshooting is triggered when an anomaly is detected.
Mode four, triggering by other script commands: operation and maintenance personnel trigger troubleshooting through a preset script, for example on a timer.
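Modes two and three can be illustrated with the following minimal Python sketch. The embodiment does not name a particular real-time anomaly detection algorithm, so a rolling z-score is assumed here purely as a stand-in; the function name, class name, window size and limits are likewise hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def threshold_trigger(value: float, low: float, high: float) -> bool:
    """Mode two: fire when the value leaves the preset [low, high] range."""
    return not (low <= value <= high)


class ZScoreTrigger:
    """Mode three stand-in: rolling z-score (an assumed detector, not from the patent)."""

    def __init__(self, window: int = 30, z_limit: float = 3.0, min_points: int = 5):
        self.history = deque(maxlen=window)
        self.z_limit = z_limit
        self.min_points = min_points

    def update(self, value: float) -> bool:
        """Feed one streaming data point; return True if it should trigger."""
        fired = False
        if len(self.history) >= self.min_points:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0:
                fired = abs(value - mu) / sigma > self.z_limit
            else:
                # A perfectly flat history: any change counts as anomalous.
                fired = value != mu
        self.history.append(value)
        return fired


# Hypothetical metric stream: steady values followed by a spike.
detector = ZScoreTrigger(window=10, z_limit=3.0, min_points=5)
stream = [10, 11, 10, 9, 10, 11, 50]
fired = [detector.update(v) for v in stream]
```

Modes one and four originate outside the engine (a third-party alarm callback or an operator script), so in a sketch of this kind they would simply call the engine's entry point directly rather than evaluate metric data.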
When a fault occurs, the troubleshooting engine generates a concrete instantiated troubleshooting graph from the spatial relationships and other information configured in the abstract troubleshooting graph, the Oracle troubleshooting knowledge graph and the actual time at which the fault occurred; the nodes in the graph are the events to be troubleshot and the edges are the relationships between the events. In some alternative embodiments, step 305 may be implemented by, but is not limited to, the following process:
When the troubleshooting graph is instantiated, the troubleshooting engine assigns a concrete entity object to each node; the entity object is determined by the triggered root node or by the node's parent node. For example, the root node is "Oracle instance AAS Total anomaly" and its trigger mode is a third-party alarm. When an alarm occurs, the troubleshooting engine reads the alarm content and extracts the ID of the Oracle instance that raised the alarm, and the node is instantiated as "Oracle instance [Oracle_ID_A] AAS Total anomaly". For the child nodes of a node, the entity objects are determined according to the types of the edges in the troubleshooting rule graph:
(1) the type of the edge is "same object": the child node directly inherits the entity object of the parent node; this case mostly covers different monitoring metrics of the same object;
(2) the type of the edge is not "same object": the troubleshooting engine calls the corresponding spatial relationship data, whose sources include third-party CMDB data, knowledge graph data, machine configuration files and the like, and looks up the corresponding entity objects according to the respective spatial types of the parent node and the child node. For example, in an Oracle troubleshooting graph the type of an edge may be "runs on", in which case the type of the parent node is an Oracle instance (assume its ID is Oracle_ID_A when instantiated) and the type of the child node (determined by the preconfiguration) is a logical host. During instantiation, the troubleshooting engine queries the spatial data for the database host on which Oracle_ID_A runs (assume the result is HOST_A) and generates the corresponding child node, whose spatial object is assigned HOST_A. Note that in some cases the spatial relationship may be one-to-many, in which case multiple child nodes (e.g., for an Oracle instance that corresponds to multiple storage units) may be generated in parallel.
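The two cases above can be sketched as a short recursive procedure. The following Python code is a minimal illustration only: the dictionary layouts, the `same_object` and `runs_on` labels, and the "Session count anomaly" node are hypothetical, and the rule graph is assumed to be a tree so that the recursion terminates.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class InstanceNode:
    event: str
    entity_type: str
    entity: str
    children: List["InstanceNode"] = field(default_factory=list)


# (parent type, relation, child type) -> {parent entity: [child entities]}
SpatialData = Dict[Tuple[str, str, str], Dict[str, List[str]]]


def instantiate(node_types: Dict[str, str],
                edges: Dict[str, List[Tuple[str, str]]],
                spatial: SpatialData,
                event: str,
                entity: str) -> InstanceNode:
    """Instantiate the rule subtree rooted at `event` for the concrete `entity`."""
    parent_type = node_types[event]
    node = InstanceNode(event, parent_type, entity)
    for child_event, relation in edges.get(event, []):
        child_type = node_types[child_event]
        if relation == "same_object":
            # Case (1): the child inherits the parent's entity object.
            targets = [entity]
        else:
            # Case (2): look the entity up in the spatial relationship data,
            # keyed by the spatial types of the parent and the child.
            key = (parent_type, relation, child_type)
            targets = spatial.get(key, {}).get(entity, [])
        # A one-to-many relationship yields one child node per target entity.
        for target in targets:
            node.children.append(
                instantiate(node_types, edges, spatial, child_event, target))
    return node


node_types = {
    "AAS Total anomaly": "oracle_instance",
    "Session count anomaly": "oracle_instance",   # hypothetical metric node
    "CPU utilization anomaly": "database_host",
}
edges = {
    "AAS Total anomaly": [("Session count anomaly", "same_object"),
                          ("CPU utilization anomaly", "runs_on")],
}
spatial = {("oracle_instance", "runs_on", "database_host"):
           {"Oracle_ID_A": ["HOST_A"]}}

root = instantiate(node_types, edges, spatial, "AAS Total anomaly", "Oracle_ID_A")
```

In this sketch the root entity `Oracle_ID_A` stands for the ID extracted from the alarm content; in practice the spatial data would come from the CMDB, knowledge graph or configuration files rather than from an in-memory dictionary.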
The specific system framework of the Oracle troubleshooting graph is shown in fig. 5. The Oracle troubleshooting graph accesses various kinds of data, including but not limited to the following:
index data/alarm data: used to detect whether an event is abnormal;
CMDB data (knowledge graph): used to query the association relationships among troubleshooting graph events when the instantiated troubleshooting flow is generated, and to determine the specific entity object connected by each edge in the troubleshooting graph;
ES data: stores non-standardized data, such as text and system logs.
306. Troubleshoot the instance configuration events in the instance Oracle troubleshooting graph one by one.
307. Perform root cause localization on the fault information found, so as to determine the cause of the fault.
308. Graphically display the abstract Oracle troubleshooting rule and/or the instance Oracle troubleshooting graph.
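As a rough illustration of steps 306 and 307, the sketch below walks an instantiated graph depth-first, checks events one by one, and reports candidate root causes. It rests on assumptions the patent does not state: the `InstanceEvent` class, the precomputed `abnormal` flag (which in practice would come from index or alarm data), and the rule that a candidate root cause is an abnormal event with no abnormal child are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceEvent:
    """An instance configuration event (hypothetical model)."""
    name: str
    abnormal: bool  # assumed to be the result of an index/alarm check
    children: list = field(default_factory=list)


def locate_root_causes(event, path=()):
    """Check events one by one, depth-first.

    Returns the paths (tuples of event names) leading to candidate root causes.
    Assumption: a candidate is an abnormal event none of whose children is abnormal.
    """
    if not event.abnormal:
        return []  # a normal event ends this branch of the investigation
    path = path + (event.name,)
    found = []
    for child in event.children:
        found.extend(locate_root_causes(child, path))
    return found or [path]


root = InstanceEvent("Oracle instance [Oracle_ID_A] AAS Total abnormal", True, [
    InstanceEvent("logical host [HOST_A] CPU usage high", False),
    InstanceEvent("Oracle instance [Oracle_ID_A] lock wait high", True, [
        InstanceEvent("Oracle instance [Oracle_ID_A] blocking session present", True),
    ]),
])

causes = locate_root_causes(root)
```

Returning the full path rather than only the last event keeps the chain of reasoning available, which is useful if the result is later displayed graphically as in step 308.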
The embodiment of the invention provides a troubleshooting strategy generation method for an Oracle database. First, an abstract Oracle troubleshooting rule is created according to Oracle troubleshooting rule data. After a troubleshooting starting condition is triggered, an instance Oracle troubleshooting graph is generated according to the abstract Oracle troubleshooting rule and an Oracle troubleshooting knowledge graph; the instance Oracle troubleshooting graph comprises instance configuration events and instance configuration rules. The instance configuration events in the instance Oracle troubleshooting graph are then troubleshot one by one. For the different troubleshooting scenarios across the whole network, the troubleshooting engine can establish scenario-specific troubleshooting rules according to expert experience and domain knowledge. In this way, known expert troubleshooting experience can be effectively solidified; expert experience storage, fault identification, and troubleshooting reasoning are realized in automatic troubleshooting; and the whole process is streamlined, visualized, and automated, which accelerates troubleshooting.
The steps of the above methods are divided as they are only for clarity of description. In implementation, several steps may be combined into one step, or one step may be split into multiple steps; as long as the same logical relationship is included, such variations fall within the protection scope of this patent. Adding insignificant modifications to the algorithms or processes, or introducing insignificant design changes, without changing the core design of the algorithms and processes, also falls within the protection scope of this patent.
A third embodiment of the present invention relates to a troubleshooting strategy generation device for an Oracle database, as shown in fig. 6, including:
the rule data acquisition module 61, configured to acquire Oracle troubleshooting rule data;
the rule creating module 62, configured to create an abstract Oracle troubleshooting rule according to the Oracle troubleshooting rule data, where the abstract Oracle troubleshooting rule includes an abstract configuration event and an abstract configuration rule, the abstract configuration event represents a virtual troubleshooting object, and the abstract configuration rule represents a relationship between the abstract configuration events;
the graph acquisition module 63, configured to acquire an Oracle troubleshooting knowledge graph, where the Oracle troubleshooting knowledge graph includes fault characteristics and corresponding fault causes;
the instance creating module 64, configured to generate an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph after a troubleshooting starting condition is triggered, where the instance Oracle troubleshooting graph includes an instance configuration event and an instance configuration rule, the instance configuration event is the instantiated abstract configuration event, and the instance configuration rule is the instantiated abstract configuration rule;
and the troubleshooting module 65, configured to troubleshoot the instance configuration events in the instance Oracle troubleshooting graph one by one.
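One possible way to picture how modules 61 to 65 cooperate is the following Python sketch, in which each module is an injected callable. The class name `TroubleshootingDevice`, its method names, and the toy stand-in functions are all hypothetical; the patent describes the modules only as logical units and does not prescribe any interface.

```python
class TroubleshootingDevice:
    """Wires modules 61-65 together; each module is an injected callable."""

    def __init__(self, acquire_rule_data, create_rule, acquire_knowledge_graph,
                 create_instance_graph, troubleshoot_event):
        self.acquire_rule_data = acquire_rule_data              # module 61
        self.create_rule = create_rule                          # module 62
        self.acquire_knowledge_graph = acquire_knowledge_graph  # module 63
        self.create_instance_graph = create_instance_graph      # module 64
        self.troubleshoot_event = troubleshoot_event            # module 65
        self.abstract_rule = None
        self.knowledge_graph = None

    def prepare(self):
        """Build the abstract rule and load the knowledge graph ahead of time."""
        self.abstract_rule = self.create_rule(self.acquire_rule_data())
        self.knowledge_graph = self.acquire_knowledge_graph()

    def on_trigger(self, trigger):
        """Instantiate the graph for this trigger, then check events one by one."""
        events = self.create_instance_graph(
            self.abstract_rule, self.knowledge_graph, trigger)
        return [self.troubleshoot_event(event) for event in events]


# Toy stand-ins so the sketch runs end to end; real modules would query
# rule stores, a CMDB or knowledge graph, and monitoring data.
device = TroubleshootingDevice(
    acquire_rule_data=lambda: ["AAS Total abnormal", "host CPU usage high"],
    create_rule=lambda data: list(data),
    acquire_knowledge_graph=lambda: {"runs-on": {"Oracle_ID_A": "HOST_A"}},
    create_instance_graph=lambda rule, kg, trig: [f"[{trig}] {name}" for name in rule],
    troubleshoot_event=lambda event: (event, "checked"),
)
device.prepare()
results = device.on_trigger("Oracle_ID_A")
```

Injecting the modules as callables reflects the later remark that each module is a logical unit that may map onto one or several physical units.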
The embodiment of the invention provides a troubleshooting strategy generation device for an Oracle database. First, the rule creating module creates an abstract Oracle troubleshooting rule according to the Oracle troubleshooting rule data acquired by the rule data acquisition module. After a troubleshooting starting condition is triggered, the instance creating module generates an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph; the instance Oracle troubleshooting graph comprises instance configuration events and instance configuration rules. The troubleshooting module then troubleshoots the instance configuration events in the instance Oracle troubleshooting graph one by one. For the different troubleshooting scenarios across the whole network, the troubleshooting engine can establish scenario-specific troubleshooting rules according to expert experience and domain knowledge. In this way, known expert troubleshooting experience can be effectively solidified; expert experience storage, fault identification, and troubleshooting reasoning are realized in automatic troubleshooting; and the whole process is streamlined, visualized, and automated, which accelerates troubleshooting.
It should be understood that this embodiment is a device embodiment corresponding to the first embodiment, and it can be implemented in cooperation with the first embodiment. The related technical details mentioned in the first embodiment remain valid in this embodiment and are not repeated here. Accordingly, the related technical details mentioned in this embodiment can also be applied to the first embodiment.
It should be noted that each module referred to in this embodiment is a logical module. In practical applications, one logical unit may be one physical unit, may be a part of one physical unit, or may be implemented by a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, elements that are not closely related to solving the technical problem addressed by the present invention are not introduced in this embodiment, but this does not mean that no other elements exist in this embodiment.
Since the second embodiment corresponds to this embodiment, this embodiment can be implemented in cooperation with the second embodiment. The related technical details mentioned in the second embodiment remain valid in this embodiment, and the technical effects achievable in the second embodiment can also be achieved in this embodiment; they are not repeated here. Accordingly, the related technical details mentioned in this embodiment can also be applied to the second embodiment.
A fourth embodiment of the present invention relates to a server including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions, when executed, enable the at least one processor to perform any of the above troubleshooting strategy generation methods for an Oracle database.
The memory and the processor are connected by a bus. The bus may comprise any number of interconnected buses and bridges, which connect together various circuits of the one or more processors and the memory. The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further here. A bus interface provides an interface between the bus and a transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by the processor is transmitted over a wireless medium via an antenna; the antenna also receives data and passes it to the processor.
The processor is responsible for managing the bus and for general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory may be used to store data used by the processor in performing operations.
A fifth embodiment of the present invention relates to a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the above embodiments of the troubleshooting strategy generation method for an Oracle database.
That is, as those skilled in the art can understand, all or part of the steps of the methods in the above embodiments may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that in practice various changes in form and detail may be made to them without departing from the spirit and scope of the invention.
Claims (11)
1. A method for generating a troubleshooting strategy for an Oracle database is characterized by comprising the following steps:
acquiring Oracle troubleshooting rule data;
creating an abstract Oracle troubleshooting rule according to the Oracle troubleshooting rule data, wherein the abstract Oracle troubleshooting rule comprises an abstract configuration event and an abstract configuration rule, the abstract configuration event represents a virtual troubleshooting object, and the abstract configuration rule represents the relationship between the abstract configuration events;
acquiring an Oracle troubleshooting knowledge graph, wherein the Oracle troubleshooting knowledge graph comprises fault characteristics and corresponding fault reasons;
when a troubleshooting starting condition is triggered, generating an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph, wherein the instance Oracle troubleshooting graph comprises an instance configuration event and an instance configuration rule, the instance configuration event is the instantiated abstract configuration event, and the instance configuration rule is the instantiated abstract configuration rule;
and troubleshooting the instance configuration events in the instance Oracle troubleshooting graph one by one.
2. The method of claim 1, wherein before the troubleshooting initiation condition is triggered, the method further comprises:
acquiring anomaly detection data;
and the generating an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph comprises: generating the instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule, the Oracle troubleshooting knowledge graph, and the anomaly detection data.
3. The method of any of claims 1 to 2, wherein the troubleshooting initiation conditions include one or more of the following:
mode one: API triggering by other monitoring and/or alarm platforms;
mode two: streaming data threshold triggering;
mode three: streaming data anomaly detection triggering;
and mode four: triggering by other script commands.
4. The troubleshooting strategy generation method for an Oracle database as claimed in any one of claims 1 to 2, further comprising:
performing root cause localization on the fault information found, so as to determine the cause of the fault.
5. The method of claim 1, wherein the abstract Oracle troubleshooting rule is a tree graph including nodes and edges, the abstract configuration event corresponds to a node, the abstract configuration rule corresponds to an edge, and generating the instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph comprises:
for each node containing child nodes, assigning an entity object to each child node, the entity object being determined by the corresponding root node or parent node.
6. The method of claim 5, wherein generating the instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph further comprises:
determining the entity objects of the child nodes according to the types of the edges in the abstract Oracle troubleshooting rule graph.
7. The method of claim 6, wherein the Oracle troubleshooting knowledge graph further comprises spatial relationships of entities, and determining the entity objects of the child nodes according to the types of the edges in the abstract Oracle troubleshooting rule graph comprises:
if the type of the edge is "same object", the child node directly inherits the entity object of the parent node; and/or
if the type of the edge is not "same object", calling the corresponding spatial relationship data, and searching for the corresponding entity object according to the respective spatial types of the parent node and the child node.
8. The method of claim 1, further comprising:
graphically displaying the abstract Oracle troubleshooting rule and/or graphically displaying the instance Oracle troubleshooting graph.
9. A troubleshooting strategy generation device for an Oracle database, comprising:
a rule data acquisition module, configured to acquire Oracle troubleshooting rule data;
a rule creating module, configured to create an abstract Oracle troubleshooting rule according to the Oracle troubleshooting rule data, wherein the abstract Oracle troubleshooting rule comprises an abstract configuration event and an abstract configuration rule, the abstract configuration event represents a virtual troubleshooting object, and the abstract configuration rule represents the relationship between the abstract configuration events;
a graph acquisition module, configured to acquire an Oracle troubleshooting knowledge graph, wherein the Oracle troubleshooting knowledge graph comprises fault characteristics and corresponding fault causes;
an instance creating module, configured to generate an instance Oracle troubleshooting graph according to the abstract Oracle troubleshooting rule and the Oracle troubleshooting knowledge graph after a troubleshooting starting condition is triggered, wherein the instance Oracle troubleshooting graph comprises an instance configuration event and an instance configuration rule, the instance configuration event is the instantiated abstract configuration event, and the instance configuration rule is the instantiated abstract configuration rule;
and a troubleshooting module, configured to troubleshoot the instance configuration events in the instance Oracle troubleshooting graph one by one.
10. A server, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions, when executed, enable the at least one processor to perform the troubleshooting strategy generation method for an Oracle database as set forth in any one of claims 1 to 8.
11. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the troubleshooting strategy generation method for an Oracle database as set forth in any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110188401.4A CN112559238B (en) | 2021-02-19 | 2021-02-19 | Troubleshooting strategy generation method and device for Oracle database, processor and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112559238A | 2021-03-26 |
CN112559238B | 2021-05-11 |
Family
ID=75034343
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110188401.4A Active CN112559238B (en) | 2021-02-19 | 2021-02-19 | Troubleshooting strategy generation method and device for Oracle database, processor and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112559238B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH1091450A (en) * | 1996-09-11 | 1998-04-10 | Nippon Telegr & Teleph Corp <Ntt> | Inference processing method |
CN101388085A (en) * | 2007-09-14 | 2009-03-18 | 李清东 | Rapid failure diagnosis reasoning machine |
CN107769967A (en) * | 2017-10-16 | 2018-03-06 | 中国电子科技集团公司第五十四研究所 | A kind of inter-network system trouble correlation analytic method in knowledge based storehouse |
CN112241734A (en) * | 2020-10-15 | 2021-01-19 | 首域科技(杭州)有限公司 | Method and system for diagnosing equipment fault through knowledge graph and Bayesian network |
CN112348213A (en) * | 2020-11-27 | 2021-02-09 | 新华三大数据技术有限公司 | Operation and maintenance troubleshooting implementation method, device, medium and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN112559238A (en) | 2021-03-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||