CN113051147A - Database cluster monitoring method, device, system and equipment - Google Patents
- Publication number
- CN113051147A; CN202110448149.6A
- Authority
- CN
- China
- Prior art keywords
- alarm
- database cluster
- index
- user
- routing inspection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/324—Display of status information
- G06F11/327—Alarm or error message display
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Debugging And Monitoring (AREA)
Abstract
The application discloses a monitoring method, apparatus, system and device for a database cluster. A result file produced by the master node's information collection is acquired. When the value of an alarm indicator is detected to be abnormal, an alarm prompt is sent to the user, indicating that a functional fault has occurred in the database cluster. The values of the inspection indicators are input into a health model, and the health model is displayed to the user through a preset interface. Compared with the prior art, functional faults of the database cluster are detected without manual intervention, which significantly improves operation and maintenance efficiency and saves operation and maintenance manpower. In addition, the health model and the inspection indicators help the user judge how well the hardware of the database cluster is performing, so the health state of the database cluster can be effectively perceived.
Description
Technical Field
The present application relates to the field of database technologies, and in particular, to a method, an apparatus, a system, and a device for monitoring a database cluster.
Background
GP refers to the Greenplum database, a massively parallel database developed on the basis of PostgreSQL, whose architecture is designed for managing large-scale analytical data warehouses and business intelligence workloads. GPCC, the existing GP monitoring tool, is GP's native automated operation and maintenance tool; it is aimed at database administrators and users and provides monitoring and management functions through a browser-based graphical interface. However, with the existing monitoring tool, hardware inspection of GP clusters must be performed manually. In large GP cluster environments the inspection efficiency is very low and the health state of each GP cluster cannot be perceived, which reduces the working efficiency of the GP clusters.
Therefore, how to improve the hardware inspection efficiency of GP clusters and effectively perceive their health state has become an urgent problem in the field.
Disclosure of Invention
The application provides a monitoring method, apparatus, system and device for a database cluster, with the aim of improving the hardware inspection efficiency of a GP cluster and effectively perceiving its health state.
In order to achieve the above object, the present application provides the following technical solutions:
A monitoring method for a database cluster, comprising:
when a trigger operation of a user is received, distributing a pre-configured collection task to a master node of the database cluster, so that the master node collects information according to the collection task; the collection task comprises information collection tasks for a plurality of indicators; the indicators comprise alarm indicators and inspection indicators; an alarm indicator denotes an information item that affects the service function of the database cluster; an inspection indicator denotes an information item that affects the hardware performance of the database cluster but does not affect the service function;
acquiring a result file produced by the master node's information collection; the result file comprises an alarm log and an inspection log; the alarm log records the values of the alarm indicators; the inspection log records the values of the inspection indicators;
when the value of an alarm indicator is detected to be abnormal, sending an alarm prompt to the user, indicating that the database cluster has a functional fault;
and inputting the values of the inspection indicators into a pre-constructed health model, and displaying the health model to the user through a preset front-end interface.
Optionally, the method further comprises:
inputting the values of the inspection indicators recorded in the inspection log during a preset historical period into a preset trend prediction model to obtain predicted values of the inspection indicators; the predicted values reflect how well the database cluster hardware performs;
counting the inspection indicators whose predicted values are greater than a preset threshold;
and prompting the user that the health state of the database cluster is poor when the number of inspection indicators whose predicted values are greater than the preset threshold is greater than a preset number.
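The optional rule above can be stated compactly. The symbols are introduced here for illustration only and do not appear in the application: \hat{y}_i is the predicted value of the i-th inspection indicator, \tau is the preset threshold, and K is the preset number.

```latex
N = \bigl|\{\, i : \hat{y}_i > \tau \,\}\bigr|,
\qquad
\text{the health state is reported as poor if } N > K .
```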
Optionally, sending an alarm prompt to the user when the value of an alarm indicator is detected to be abnormal, indicating that the database cluster has a functional fault, comprises:
performing keyword detection on the alarm log;
determining that the value of the alarm indicator is abnormal when the alarm log is detected to contain preset alarm characters;
and sending an alarm prompt to a preset monitoring and alarm platform, triggering the platform to notify the user that the database cluster has a functional fault.
A monitoring apparatus for a database cluster, comprising:
a distribution unit, configured to distribute a pre-configured collection task to a master node of the database cluster when a trigger operation of a user is received, so that the master node collects information according to the collection task; the collection task comprises information collection tasks for a plurality of indicators; the indicators comprise alarm indicators and inspection indicators; an alarm indicator denotes an information item that affects the service function of the database cluster; an inspection indicator denotes an information item that affects the hardware performance of the database cluster but does not affect the service function;
an acquisition unit, configured to acquire a result file produced by the master node's information collection; the result file comprises an alarm log and an inspection log; the alarm log records the values of the alarm indicators; the inspection log records the values of the inspection indicators;
an alarm unit, configured to send an alarm prompt to the user when the value of an alarm indicator is detected to be abnormal, indicating that the database cluster has a functional fault;
and a display unit, configured to input the values of the inspection indicators into a pre-constructed health model and display the health model to the user through a preset front-end interface.
Optionally, the apparatus further comprises:
an evaluation unit, configured to input the values of the inspection indicators recorded in the inspection log during a preset historical period into a preset trend prediction model to obtain predicted values of the inspection indicators, where the predicted values reflect how well the database cluster hardware performs; to count the inspection indicators whose predicted values are greater than a preset threshold; and to prompt the user that the health state of the database cluster is poor when the number of inspection indicators whose predicted values are greater than the preset threshold is greater than a preset number.
Optionally, the alarm unit is configured to:
perform keyword detection on the alarm log;
determine that the value of the alarm indicator is abnormal when the alarm log is detected to contain preset alarm characters;
and send an alarm prompt to a preset monitoring and alarm platform, triggering the platform to notify the user that the database cluster has a functional fault.
A monitoring system for a database cluster, comprising:
a scheduling module, an acquisition module and an analysis module;
the scheduling module is configured to pre-configure a collection task according to the information of the database cluster when a trigger operation of a user is received, and to send the collection task to the acquisition module; the collection task comprises information collection tasks for a plurality of indicators; the indicators comprise alarm indicators and inspection indicators; an alarm indicator denotes an information item that affects the service function of the database cluster; an inspection indicator denotes an information item that affects the hardware performance of the database cluster but does not affect the service function;
the acquisition module is configured to distribute the collection task to the master node of the database cluster when the collection task is received, so that the master node collects information according to the collection task, and to acquire the result file produced by the master node's information collection; the result file comprises an alarm log and an inspection log; the alarm log records the values of the alarm indicators; the inspection log records the values of the inspection indicators;
the acquisition module is further configured to send the alarm log and the inspection log to the analysis module;
the analysis module is configured to send an alarm prompt to the user when the value of an alarm indicator is detected to be abnormal, indicating that the database cluster has a functional fault;
the analysis module is further configured to input the values of the inspection indicators into a pre-constructed health model and display the health model to the user through a preset front-end interface.
Optionally, the analysis module is further configured to:
input the values of the inspection indicators recorded in the inspection log during a preset historical period into a preset trend prediction model to obtain predicted values of the inspection indicators; the predicted values reflect how well the database cluster hardware performs;
count the inspection indicators whose predicted values are greater than a preset threshold;
and prompt the user that the health state of the database cluster is poor when the number of inspection indicators whose predicted values are greater than the preset threshold is greater than a preset number.
A computer-readable storage medium comprising a stored program, wherein the program, when run, performs the above monitoring method for a database cluster.
A monitoring device for a database cluster, comprising: a processor, a memory, and a bus; the processor and the memory are connected through the bus;
the memory is configured to store a program, and the processor is configured to run the program, wherein the above monitoring method for a database cluster is performed when the program runs.
According to the above technical solution, when a trigger operation of a user is received, a pre-configured collection task is distributed to the master node of the database cluster, so that the master node collects information according to the collection task. The collection task comprises information collection tasks for a plurality of indicators. The indicators comprise alarm indicators, which denote information items that affect the service function of the database cluster, and inspection indicators, which denote information items that affect the hardware performance of the database cluster but do not affect the service function. The result file produced by the master node's information collection is then acquired. The result file includes an alarm log, which records the values of the alarm indicators, and an inspection log, which records the values of the inspection indicators. When the value of an alarm indicator is detected to be abnormal, an alarm prompt is sent to the user, indicating that a functional fault has occurred in the database cluster. The values of the inspection indicators are input into a pre-constructed health model, and the health model is displayed to the user through a preset front-end interface. Compared with the prior art, functional faults of the database cluster are detected without manual intervention, which significantly improves operation and maintenance efficiency and saves a large amount of operation and maintenance manpower.
In addition, the health model and the inspection indicators help the user judge how well the hardware of the database cluster is performing, so the health state of the database cluster can be effectively perceived. With this solution, therefore, the hardware inspection efficiency of a GP cluster can be significantly improved and the health state of the GP cluster can be effectively perceived.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1a is a schematic architecture diagram of a monitoring system for a database cluster according to an embodiment of the present application;
Fig. 1b is a schematic diagram of a health model provided by an embodiment of the present application;
Fig. 1c is a schematic diagram of a monitoring process implemented by a monitoring system for a database cluster according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a monitoring method for a database cluster according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a monitoring apparatus for a database cluster according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present application.
Fig. 1a is a schematic architecture diagram of the monitoring system for a database cluster provided in an embodiment of the present application. The system includes:
a scheduling module 101, an acquisition module 102, and an analysis module 103.
The scheduling module 101 is configured to pre-configure a collection task according to the information of the database cluster (hereinafter, the cluster) when a trigger operation of a user is received. The cluster information includes but is not limited to: the cluster name, the floating IP of the cluster's master node, the database name, the cluster user, and that user's password. The scheduling module 101 is further configured to issue the collection task to the acquisition module 102. The task may be sent when a trigger operation of a user is received, or on a timed schedule.
It should be noted that timed sending of the collection task to the acquisition module 102 is common knowledge for those skilled in the art; for example, a preset crontab process may be invoked to issue the collection task on a schedule.
It is emphasized that the collection task comprises information collection tasks for a plurality of indicators, and the indicators comprise alarm indicators and inspection indicators. An alarm indicator denotes an information item that affects the cluster's service function; an inspection indicator denotes an information item that affects the cluster's hardware performance but not its service function.
In the embodiment of the application, the collection time specified in each indicator's information collection task can be set by technical personnel according to actual conditions. Specifically, the specified collection interval is shorter for alarm indicators and longer for inspection indicators.
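As a minimal sketch of how such a collection task might be represented, the Python below groups the cluster information with per-indicator collection intervals. The class names, field names, indicator names and interval values are all hypothetical and are not taken from the application; the user's password is deliberately left out of the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndicatorTask:
    name: str              # hypothetical indicator name
    kind: str              # "alarm" or "inspection"
    interval_seconds: int  # collection interval for this indicator


@dataclass
class CollectionTask:
    cluster_name: str
    master_floating_ip: str
    database: str
    user: str
    indicator_tasks: List[IndicatorTask] = field(default_factory=list)


def build_task(cluster_name: str, master_floating_ip: str,
               database: str, user: str) -> CollectionTask:
    """Pre-configure a collection task from the cluster information."""
    task = CollectionTask(cluster_name, master_floating_ip, database, user)
    # Alarm indicators are collected at a shorter interval (assumed: 1 minute).
    task.indicator_tasks.append(IndicatorTask("segment_status", "alarm", 60))
    # Inspection indicators are collected at a longer interval (assumed: 1 hour).
    task.indicator_tasks.append(IndicatorTask("disk_usage_pct", "inspection", 3600))
    return task


task = build_task("gp_cluster_a", "192.0.2.10", "analytics", "gpadmin")
```

Keeping the interval on each indicator, rather than on the task as a whole, lets one task mix frequent alarm checks with infrequent inspection checks.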
The acquisition module 102 is configured to distribute the collection task, once received, to the master node of the cluster through a preset collection server, so that the master node of the cluster collects information according to the collection task. The acquisition module 102 is further configured to acquire the result file produced by the master node's information collection and send the result file to the analysis module 103.
It should be noted that, during information collection, the master nodes of different clusters all use the same collection script (for example, the same shell script with the same parameter items), and the result files of different clusters share a uniform file format that the preset monitoring and alarm platform can recognize.
Specifically, the directory structure of the collection script can be seen from table 1.
TABLE 1
In table 1, the columns "serial number", "first-level directory", "second-level directory", "third-level directory", and "description" are those commonly used in the art to describe a directory structure, and the content of each column is well known to those skilled in the art.
It should be noted that the contents shown in table 1 are only for illustration.
In the embodiment of the application, the result file comprises an alarm log and an inspection log, wherein the alarm log is used for recording the numerical value of the alarm index, and the inspection log is used for recording the numerical value of the inspection index.
Specifically, the specific setting style of the alarm indicator can be seen in table 2.
TABLE 2
In table 2, a "check item" is the information item described in the embodiment of the present application; "function index" and "performance index" are further subdivisions of the information item; "monitoring frequency" represents the collection interval specified in the information collection task. The contents of the "serial number", "check item", "alarm or not", and "log code" columns are common knowledge for those skilled in the art and are not described here.
It should be noted that the contents described in table 2 above are only for illustration.
Specifically, the specific setting style of the routing inspection index can be seen in table 3.
TABLE 3
In table 3, a "check item" is the information item described in the embodiment of the present application; "function index" and "performance index" are further subdivisions of the information item; "timing inspection", "daily monitoring", and "real-time inspection" all represent the collection interval specified in the information collection task. The contents of the "serial number", "check item", and "log code" columns are common knowledge for those skilled in the art and are not described here.
It should be noted that the contents described in table 3 above are only for illustration.
The analysis module 103 is configured to send an alarm prompt to the user when the value of an alarm indicator is detected to be abnormal, indicating that the database cluster has a functional fault. Specifically, the analysis module 103 performs keyword detection on the alarm log, determines that the value of the alarm indicator is abnormal when the alarm log is detected to contain preset alarm characters, and sends an alarm prompt to a preset monitoring and alarm platform, triggering the platform to notify the user that the database cluster has a functional fault.
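The keyword detection step can be sketched as follows. This is only an illustration under assumptions: the alarm keywords, the log line format and the notify callback (standing in for the monitoring and alarm platform) are all hypothetical, since the application does not specify them.

```python
from typing import Callable, Iterable, List, Sequence

# Hypothetical preset alarm characters; the application does not list them.
ALARM_KEYWORDS = ("ERROR", "FATAL")


def find_alarm_lines(log_lines: Iterable[str],
                     keywords: Sequence[str] = ALARM_KEYWORDS) -> List[str]:
    """Return the alarm-log lines that contain any preset alarm keyword."""
    return [line.rstrip("\n") for line in log_lines
            if any(k in line for k in keywords)]


def check_alarm_log(log_lines: Iterable[str],
                    notify: Callable[[str, List[str]], None]) -> bool:
    """Treat the alarm indicator as abnormal if any keyword is found,
    and forward a prompt through `notify` in that case."""
    hits = find_alarm_lines(log_lines)
    if hits:
        notify("database cluster functional fault", hits)
    return bool(hits)


# Usage with a made-up log format and an in-memory stand-in for the platform.
sample_log = [
    "2021-04-25 10:00:00 segment_status OK",
    "2021-04-25 10:01:00 segment_status ERROR segment down",
]
sent = []
abnormal = check_alarm_log(sample_log, lambda msg, lines: sent.append((msg, lines)))
```

Passing the notifier in as a callback keeps the detection logic independent of whichever alarm platform is actually used.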
The analysis module 103 is further configured to input the values of the inspection indicators into a pre-constructed health model and display the health model to the user through a preset front-end interface.
The health model is a health evaluation indicator system, that is, an indicator system that expresses the health state of the cluster through multi-dimensional indicators. An example health model of the cluster is shown in Fig. 1b. The health model of the cluster can also be optimized using a machine learning algorithm.
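The application does not say how the dimensions of the health model are combined. Purely as an assumption for illustration, the sketch below aggregates per-dimension scores (0 to 100) into one overall score with a weighted average; the dimension names and weights are hypothetical.

```python
from typing import Dict


def health_score(dimension_scores: Dict[str, float],
                 weights: Dict[str, float]) -> float:
    """Weighted average of per-dimension health scores.

    This aggregation is an assumption, not the method of the application.
    """
    total_weight = sum(weights[d] for d in dimension_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(dimension_scores[d] * weights[d] for d in dimension_scores)
    return weighted / total_weight


# Hypothetical dimensions derived from inspection indicator values.
scores = {"cpu": 80.0, "memory": 60.0, "disk": 100.0}
weights = {"cpu": 1.0, "memory": 1.0, "disk": 2.0}
overall = health_score(scores, weights)  # (80 + 60 + 200) / 4
```

A front end could display both the overall score and the per-dimension scores, so the user sees which dimension drags the cluster's health down.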
Furthermore, the analysis module 103 is configured to: input the values of the inspection indicators recorded in the inspection log during a preset historical period into a preset trend prediction model to obtain predicted values of the inspection indicators, where the predicted values reflect how well the cluster hardware performs; count the inspection indicators whose predicted values are greater than a preset threshold; and prompt the user that the health state of the cluster is poor when the number of inspection indicators whose predicted values are greater than the preset threshold is greater than a preset number.
It should be noted that the trend prediction model is common knowledge for those skilled in the art and is not described here.
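Since the application leaves the trend prediction model unspecified, the sketch below substitutes a simple least-squares line fitted to the historical values and extrapolated one step ahead, then applies the counting rule. The indicator names, threshold and preset number are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def predict_next(values: Sequence[float]) -> float:
    """Extrapolate one step ahead with an ordinary least-squares line.

    A stand-in for the unspecified trend prediction model.
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one sample")
    if n == 1:
        return float(values[0])
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


def health_is_poor(history: Dict[str, Sequence[float]],
                   threshold: float, max_count: int) -> Tuple[bool, List[str]]:
    """Count indicators whose predicted value exceeds the threshold and
    report poor health when that count exceeds max_count."""
    exceeding = [name for name, vals in history.items()
                 if predict_next(vals) > threshold]
    return len(exceeding) > max_count, exceeding


# Hypothetical inspection history over a preset period.
history = {
    "disk_usage_pct": [70, 75, 80, 85],
    "cpu_load_pct": [40, 40, 40, 40],
    "mem_usage_pct": [60, 70, 80, 90],
}
poor, exceeding = health_is_poor(history, threshold=85, max_count=1)
```

In this sketch a single threshold is shared by all indicators, as in the text; a per-indicator threshold would be a natural variation.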
It is emphasized that existing GP cluster monitoring suffers from a narrow set of monitoring indicators: it lacks a GP performance monitoring indicator system with rich collected indicators and strong extensibility that could support the operation and maintenance of an enterprise-level big data platform. In addition, existing GP cluster monitoring tools do not distinguish between alarm indicators and inspection indicators. The system of the embodiment of the application separates functional alarms from hardware inspection according to the type of indicator. The alarm indicators, which affect GP cluster functions, are integrated with the preset monitoring and alarm platform, so that alarms are raised automatically and operation and maintenance efficiency is improved. The inspection indicators, which affect GP cluster hardware performance, are displayed through the health model, giving the user (for example, operation and maintenance personnel) an effective reference for optimizing cluster performance.
Based on the functions of the above modules, Fig. 1c shows the process by which the monitoring system for a database cluster according to the present application monitors the hardware performance and service function of the database cluster. The process includes the following steps:
S101: when a trigger operation of a user is received, the scheduling module pre-configures a collection task according to the information of the database cluster.
S102: the scheduling module sends the collection task to the acquisition module.
S103: the acquisition module distributes the collection task to the master node of the database cluster, so that the master node collects information according to the collection task.
S104: the acquisition module acquires the result file produced by the master node's information collection.
The result file comprises an alarm log and an inspection log.
S105: the acquisition module sends the alarm log and the inspection log to the analysis module.
S106: when the value of an alarm indicator is detected to be abnormal, the analysis module sends an alarm prompt to the user, indicating that the database cluster has a functional fault.
S107: the analysis module inputs the values of the inspection indicators into a pre-constructed health model and displays the health model to the user through a preset front-end interface.
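Steps S101 to S107 can be wired together roughly as in the sketch below. It is an illustration under assumptions: the collect, notify and show_health callables are hypothetical stand-ins for the acquisition module, the alarm platform and the front-end interface, and the shape of the result file is invented for the example.

```python
from typing import Any, Callable, Dict, Sequence


def run_monitoring_cycle(cluster: Dict[str, str],
                         collect: Callable[[Dict[str, Any]], Dict[str, Any]],
                         alarm_keywords: Sequence[str],
                         notify: Callable[[str], None],
                         show_health: Callable[[Any], None]) -> bool:
    # S101: pre-configure the collection task from the cluster information.
    task = {"cluster": cluster["name"], "indicator_kinds": ("alarm", "inspection")}
    # S102-S104: hand the task over and get the result file back.
    result = collect(task)
    # S105: split the result file into its two logs.
    alarm_log = result["alarm_log"]
    inspection_log = result["inspection_log"]
    # S106: keyword check on the alarm log, then alarm prompt if abnormal.
    abnormal = any(k in line for line in alarm_log for k in alarm_keywords)
    if abnormal:
        notify(f"functional fault in database cluster {cluster['name']}")
    # S107: pass the inspection values on for display via the health model.
    show_health(inspection_log)
    return abnormal


def fake_collect(task: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the master node's collection; returns made-up data."""
    return {"alarm_log": ["segment_status ERROR"],
            "inspection_log": {"disk_usage_pct": 71.0}}


alerts, shown = [], []
abnormal = run_monitoring_cycle({"name": "gp_cluster_a"}, fake_collect,
                                ("ERROR",), alerts.append, shown.append)
```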
In summary, compared with the prior art, functional faults of the database cluster no longer need to be discovered through manual intervention, so operation and maintenance efficiency is significantly improved and a large amount of operation and maintenance manpower is saved. In addition, the health model and the inspection indicators help the user judge how well the hardware of the database cluster is performing, so the health state of the database cluster can be effectively perceived. With the solution of this embodiment, therefore, the hardware inspection efficiency of a GP cluster can be significantly improved and the health state of the GP cluster can be effectively perceived.
Fig. 2 is a schematic diagram of a monitoring method for a database cluster provided in an embodiment of the present application. The method includes the following steps:
S201: when a trigger operation of a user is received, distribute the pre-configured collection task to the master node of the database cluster, so that the master node collects information according to the collection task.
The collection task comprises information collection tasks for a plurality of indicators. The indicators comprise alarm indicators, which denote information items that affect the service function of the database cluster, and inspection indicators, which denote information items that affect the hardware performance of the database cluster but do not affect the service function.
S202: acquire the result file produced by the master node's information collection.
The result file comprises an alarm log, which records the values of the alarm indicators, and an inspection log, which records the values of the inspection indicators.
S203: when the value of an alarm indicator is detected to be abnormal, send an alarm prompt to the user, indicating that a functional fault has occurred in the database cluster.
Optionally, perform keyword detection on the alarm log; determine that the value of the alarm indicator is abnormal when the alarm log is detected to contain preset alarm characters; and send an alarm prompt to a preset monitoring and alarm platform, triggering the platform to notify the user that the database cluster has a functional fault.
S204: input the values of the inspection indicators into a pre-constructed health model, and display the health model to the user through a preset front-end interface.
Optionally, input the values of the inspection indicators recorded in the inspection log during a preset historical period into a preset trend prediction model to obtain predicted values of the inspection indicators, where the predicted values reflect how well the database cluster hardware performs; count the inspection indicators whose predicted values are greater than a preset threshold; and prompt the user that the health state of the database cluster is poor when the number of inspection indicators whose predicted values are greater than the preset threshold is greater than a preset number.
In summary, compared with the prior art, the functional fault condition of the database cluster no longer needs to be learned through manual intervention, so operation and maintenance efficiency is significantly improved and a large amount of operation and maintenance manpower is saved. In addition, the health degree model and the inspection indicators help the user learn how good or bad the hardware performance of the database cluster is, so that the health state of the database cluster can be effectively perceived. Therefore, with the scheme of this embodiment, the hardware inspection efficiency of the GP cluster can be significantly improved and the health state of the GP cluster can be effectively perceived.
Corresponding to the monitoring method for the database cluster provided by the embodiment of the application, the embodiment of the application also provides a monitoring device for the database cluster.
Fig. 3 is an architecture diagram of a monitoring device for a database cluster provided in an embodiment of the present application. The device includes:
a distribution unit 100, configured to distribute a preconfigured acquisition task to a master node of a database cluster when a triggering operation of a user is received, so that the master node performs information acquisition according to the acquisition task; the acquisition task comprises information acquisition tasks for a plurality of indicators; the indicators comprise alarm indicators and inspection indicators; an alarm indicator indicates an information item that affects the service function of the database cluster; an inspection indicator indicates an information item that affects the hardware performance of the database cluster but does not affect the service function;
an obtaining unit 200, configured to obtain the result file obtained after the master node performs information acquisition; the result file comprises an alarm log and an inspection log; the alarm log records the values of the alarm indicators; the inspection log records the values of the inspection indicators;
an alarm unit 300, configured to send an alarm prompt to the user when the value of an alarm indicator is detected to be abnormal, prompting the user that the database cluster has a functional fault;
wherein the alarm unit 300 is specifically configured to: perform keyword detection on the alarm log; determine that the value of the alarm indicator is abnormal when the alarm log is detected to contain preset alarm characters; and send an alarm prompt to a preset monitoring alarm platform, triggering the monitoring alarm platform to send the user a prompt that the database cluster has a functional fault;
a display unit 400, configured to input the values of the inspection indicators into a pre-constructed health degree model and display the health degree model to the user through a preset front-end interface; and
an evaluation unit 500, configured to input the values of the inspection indicators recorded in the inspection log over a preset historical time period into a preset trend prediction model to obtain predicted values of the inspection indicators, the predicted values reflecting the hardware performance of the database cluster; count the number of inspection indicators whose predicted values are greater than a preset threshold; and prompt the user that the health state of the database cluster is poor when that number is greater than a preset value.
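One way to picture how these units cooperate is the Python sketch below. It is only a hedged illustration: the class names `ResultFile` and `Monitor`, the field names, and the choice to model each unit as a callable are assumptions introduced here, and the evaluation unit 500 is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ResultFile:
    """Assumed in-memory form of the result file from the master node."""
    alarm_log: List[str]
    inspection_log: Dict[str, float]


@dataclass
class Monitor:
    """Wires together stand-ins for units 100 to 400 (all hypothetical)."""
    collect: Callable[[], ResultFile]                # units 100 and 200
    is_abnormal: Callable[[List[str]], bool]         # unit 300: detection
    send_alarm: Callable[[str], None]                # unit 300: prompt
    show_health: Callable[[Dict[str, float]], None]  # unit 400

    def run_once(self) -> bool:
        """Run one monitoring pass; return True if an alarm was sent."""
        result = self.collect()
        abnormal = self.is_abnormal(result.alarm_log)
        if abnormal:
            self.send_alarm("database cluster functional fault")
        # Inspection values are displayed whether or not an alarm fired.
        self.show_health(result.inspection_log)
        return abnormal
```

Keeping each unit behind a callable mirrors the separation of responsibilities in the device description and lets each part be replaced or tested on its own.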
In summary, compared with the prior art, the functional fault condition of the database cluster no longer needs to be learned through manual intervention, so operation and maintenance efficiency is significantly improved and a large amount of operation and maintenance manpower is saved. In addition, the health degree model and the inspection indicators help the user learn how good or bad the hardware performance of the database cluster is, so that the health state of the database cluster can be effectively perceived. Therefore, with the scheme of this embodiment, the hardware inspection efficiency of the GP cluster can be significantly improved and the health state of the GP cluster can be effectively perceived.
The application also provides a computer readable storage medium, which includes a stored program, wherein the program executes the monitoring method of the database cluster provided by the application.
The present application further provides a monitoring device for a database cluster, including a processor, a memory, and a bus. The processor is connected with the memory through the bus, the memory is used for storing a program, and the processor is used for running the program, wherein the program, when run, executes the monitoring method for a database cluster provided by the application, which comprises the following steps:
under the condition that a triggering operation of a user is received, a pre-configured acquisition task is distributed to a main node of a database cluster, so that the main node acquires information according to the acquisition task; the acquisition task comprises a plurality of index information acquisition tasks; the indexes comprise alarm indexes and routing inspection indexes; the alarm indicator is used for indicating information items influencing the service function of the database cluster; the inspection index is used for indicating information items which affect the hardware performance of the database cluster but do not affect the service function;
acquiring a result file obtained after the main node acquires information; the result file comprises an alarm log and a routing inspection log; the alarm log is used for recording the numerical value of the alarm index; the inspection log is used for recording the numerical value of the inspection index;
sending an alarm prompt to the user when the numerical value of the alarm index is detected to be abnormal, and prompting the user that the database cluster has a functional fault;
and inputting the numerical value of the routing inspection index into a pre-constructed health degree model, and displaying the health degree model to the user through a preset front-end interface.
Optionally, the method further includes:
inputting the numerical value of the routing inspection index recorded in the routing inspection log in a preset historical time period into a preset trend prediction model to obtain the predicted value of the routing inspection index; the predicted value is used for reflecting the performance of the database cluster hardware;
counting the number of routing inspection indexes with the predicted values larger than a preset threshold value;
and prompting the user that the health state of the database cluster is poor when the number of routing inspection indexes whose predicted values are greater than the preset threshold value is greater than a preset numerical value.
Optionally, sending an alarm prompt to the user when the numerical value of the alarm index is detected to be abnormal, and prompting the user that the database cluster has a functional fault, includes:
carrying out keyword detection on the alarm log;
determining that the numerical value of the alarm index is abnormal under the condition that the alarm log is detected to contain preset alarm characters;
and sending an alarm prompt to a preset monitoring alarm platform, and triggering the monitoring alarm platform to send a prompt that the database cluster has a functional fault to the user.
If the functions described in the methods of the embodiments of the present application are implemented in the form of software functional units and sold or used as independent products, they may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the embodiments of the present application that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A monitoring method for a database cluster is characterized by comprising the following steps:
under the condition that a triggering operation of a user is received, a pre-configured acquisition task is distributed to a main node of a database cluster, so that the main node acquires information according to the acquisition task; the acquisition task comprises a plurality of index information acquisition tasks; the indexes comprise alarm indexes and routing inspection indexes; the alarm indicator is used for indicating information items influencing the service function of the database cluster; the inspection index is used for indicating information items which affect the hardware performance of the database cluster but do not affect the service function;
acquiring a result file obtained after the main node acquires information; the result file comprises an alarm log and a routing inspection log; the alarm log is used for recording the numerical value of the alarm index; the inspection log is used for recording the numerical value of the inspection index;
sending an alarm prompt to the user when the numerical value of the alarm index is detected to be abnormal, and prompting the user that the database cluster has a functional fault;
and inputting the numerical value of the routing inspection index into a pre-constructed health degree model, and displaying the health degree model to the user through a preset front-end interface.
2. The method of claim 1, further comprising:
inputting the numerical value of the routing inspection index recorded in the routing inspection log in a preset historical time period into a preset trend prediction model to obtain the predicted value of the routing inspection index; the predicted value is used for reflecting the performance of the database cluster hardware;
counting the number of routing inspection indexes with the predicted values larger than a preset threshold value;
and prompting the user that the health state of the database cluster is poor when the number of routing inspection indexes whose predicted values are greater than the preset threshold value is greater than a preset numerical value.
3. The method according to claim 1, wherein sending an alarm prompt to the user when the numerical value of the alarm index is detected to be abnormal, and prompting the user that the database cluster has a functional fault, comprises:
carrying out keyword detection on the alarm log;
determining that the numerical value of the alarm index is abnormal under the condition that the alarm log is detected to contain preset alarm characters;
and sending an alarm prompt to a preset monitoring alarm platform, and triggering the monitoring alarm platform to send a prompt that the database cluster has a functional fault to the user.
4. A monitoring apparatus for a database cluster, comprising:
the distribution unit is used for distributing a pre-configured acquisition task to a main node of a database cluster under the condition of receiving a triggering operation of a user, so that the main node acquires information according to the acquisition task; the acquisition task comprises a plurality of index information acquisition tasks; the indexes comprise alarm indexes and routing inspection indexes; the alarm indicator is used for indicating information items influencing the service function of the database cluster; the inspection index is used for indicating information items which affect the hardware performance of the database cluster but do not affect the service function;
the acquisition unit is used for acquiring a result file obtained after the information acquisition is carried out by the main node; the result file comprises an alarm log and a routing inspection log; the alarm log is used for recording the numerical value of the alarm index; the inspection log is used for recording the numerical value of the inspection index;
the alarm unit is used for sending an alarm prompt to the user when the numerical value of the alarm index is detected to be abnormal, and prompting the user that the database cluster has a functional fault;
and the display unit is used for inputting the numerical value of the inspection index into a pre-constructed health degree model and displaying the health degree model to the user through a preset front-end interface.
5. The apparatus of claim 4, further comprising:
the evaluation unit is used for inputting the numerical value of the routing inspection index recorded in the routing inspection log in a preset historical time period into a preset trend prediction model to obtain the predicted value of the routing inspection index; the predicted value is used for reflecting the performance of the database cluster hardware; counting the number of routing inspection indexes with the predicted values larger than a preset threshold value; and prompting the user that the health state of the database cluster is poor when the number of routing inspection indexes whose predicted values are greater than the preset threshold value is greater than a preset numerical value.
6. The apparatus of claim 4, wherein the alert unit is configured to:
carrying out keyword detection on the alarm log;
determining that the numerical value of the alarm index is abnormal under the condition that the alarm log is detected to contain preset alarm characters;
and sending an alarm prompt to a preset monitoring alarm platform, and triggering the monitoring alarm platform to send a prompt that the database cluster has a functional fault to the user.
7. A monitoring system for a database cluster, comprising:
the system comprises a scheduling module, an acquisition module and an analysis module;
the scheduling module is used for pre-configuring an acquisition task according to the information of the database cluster under the condition of receiving the triggering operation of the user and sending the acquisition task to the acquisition module; the acquisition task comprises a plurality of index information acquisition tasks; the indexes comprise alarm indexes and routing inspection indexes; the alarm indicator is used for indicating an information item influencing the service function of the database cluster; the inspection index is used for indicating information items which affect the hardware performance of the database cluster but do not affect the service function;
the acquisition module is used for distributing the acquisition task to the main node of the database cluster under the condition of receiving the acquisition task, so that the main node acquires information according to the acquisition task and acquires a result file obtained after the main node acquires the information; the result file comprises an alarm log and a routing inspection log; the alarm log is used for recording the numerical value of the alarm index; the inspection log is used for recording the numerical value of the inspection index;
the acquisition module is also used for sending the alarm log and the routing inspection log to the analysis module;
the analysis module is used for sending an alarm prompt to the user when the numerical value of the alarm index is detected to be abnormal, and prompting the user that the database cluster has a functional fault;
the analysis module is further used for inputting the numerical value of the routing inspection index into a pre-constructed health degree model, and displaying the health degree model to the user through a preset front-end interface.
8. The database cluster monitoring system of claim 7, wherein the analysis module is further configured to:
inputting the numerical value of the routing inspection index recorded in the routing inspection log in a preset historical time period into a preset trend prediction model to obtain the predicted value of the routing inspection index; the predicted value is used for reflecting the performance of the database cluster hardware;
counting the number of routing inspection indexes with the predicted values larger than a preset threshold value;
and prompting the user that the health state of the database cluster is poor when the number of routing inspection indexes whose predicted values are greater than the preset threshold value is greater than a preset numerical value.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program, wherein the program performs the method of monitoring a database cluster according to any one of claims 1 to 3.
10. A monitoring device of a database cluster, comprising: a processor, a memory, and a bus; the processor and the memory are connected through the bus;
the memory is used for storing a program and the processor is used for running the program, wherein the program is used for executing the database cluster monitoring method in any one of claims 1-3 during running.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110448149.6A CN113051147A (en) | 2021-04-25 | 2021-04-25 | Database cluster monitoring method, device, system and equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110448149.6A CN113051147A (en) | 2021-04-25 | 2021-04-25 | Database cluster monitoring method, device, system and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113051147A (en) | 2021-06-29 |
Family
ID=76520419
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110448149.6A Pending CN113051147A (en) | 2021-04-25 | 2021-04-25 | Database cluster monitoring method, device, system and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113051147A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140195853A1 (en) * | 2013-01-09 | 2014-07-10 | Microsoft Corporation | Cloud management using a component health model |
CN105337765A (en) * | 2015-10-10 | 2016-02-17 | 上海新炬网络信息技术有限公司 | Distributed hadoop cluster fault automatic diagnosis and restoration system |
CN109857613A (en) * | 2018-12-25 | 2019-06-07 | 南京南瑞信息通信科技有限公司 | A kind of automation operational system based on acquisition cluster |
CN111984499A (en) * | 2020-08-04 | 2020-11-24 | 中国建设银行股份有限公司 | Fault detection method and device for big data cluster |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113505044A (en) * | 2021-09-09 | 2021-10-15 | 格创东智(深圳)科技有限公司 | Database warning method, device, equipment and storage medium |
CN113641567A (en) * | 2021-10-13 | 2021-11-12 | 北京易真学思教育科技有限公司 | Database inspection method and device, electronic equipment and storage medium |
CN113641567B (en) * | 2021-10-13 | 2022-03-25 | 北京易真学思教育科技有限公司 | Database inspection method and device, electronic equipment and storage medium |
CN114090382A (en) * | 2021-11-22 | 2022-02-25 | 北京志凌海纳科技有限公司 | Health inspection method and device for super-converged cluster |
CN114090382B (en) * | 2021-11-22 | 2022-07-22 | 北京志凌海纳科技有限公司 | Health inspection method and device for super-converged cluster |
CN114584455A (en) * | 2022-03-04 | 2022-06-03 | 吉林大学 | Small and medium-sized high-performance cluster monitoring system based on enterprise WeChat |
CN114647551A (en) * | 2022-03-11 | 2022-06-21 | 成都飞机工业(集团)有限责任公司 | Database automatic inspection method, device, equipment and medium |
CN114598624A (en) * | 2022-03-15 | 2022-06-07 | 平安科技(深圳)有限公司 | Cluster monitoring method and device, electronic equipment and readable storage medium |
CN114598624B (en) * | 2022-03-15 | 2023-11-07 | 平安科技(深圳)有限公司 | Cluster monitoring method and device, electronic equipment and readable storage medium |
CN116032574A (en) * | 2022-12-16 | 2023-04-28 | 深圳市网安信科技有限公司 | Intelligent safe operation and maintenance monitoring data processing system |
CN116127149A (en) * | 2023-04-14 | 2023-05-16 | 杭州悦数科技有限公司 | Quantification method and system for health degree of graph database cluster |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113051147A (en) | Database cluster monitoring method, device, system and equipment | |
CN111614491B (en) | Power monitoring system oriented safety situation assessment index selection method and system | |
CN104468282B (en) | cluster monitoring processing system and method | |
CN110162445A (en) | The host health assessment method and device of Intrusion Detection based on host log and performance indicator | |
CN110708316A (en) | Method and system architecture for enterprise network security operation management | |
CN106951360B (en) | Data statistical integrity calculation method and system | |
CN111241059A (en) | Database optimization method and device based on database | |
CN113516565A (en) | Intelligent alarm processing method and device for power monitoring system based on knowledge base | |
CN108809729A (en) | The fault handling method and device that CTDB is serviced in a kind of distributed system | |
CN110109906B (en) | Data storage system and method | |
CN111221890A (en) | Automatic monitoring and early warning method and device for general indexes | |
CN114138601A (en) | Service alarm method, device, equipment and storage medium | |
CN108337100B (en) | Cloud platform monitoring method and device | |
CN115794479B (en) | Log data processing method and device, electronic equipment and storage medium | |
CN108551444A (en) | A kind of log processing method, device and equipment | |
CN114726649B (en) | Situation awareness evaluation method and device, terminal equipment and storage medium | |
CN110162444A (en) | A kind of system performance monitoring method and platform | |
CN115529219A (en) | Alarm analysis method and device, computer readable storage medium and electronic equipment | |
CN112965793B (en) | Identification analysis data-oriented data warehouse task scheduling method and system | |
CN104516916A (en) | Method and device for analyzing network report incidence relation | |
CN113886130A (en) | Method, device and medium for processing database fault | |
CN113469559A (en) | Quality bit design and display method and system based on data quality inspection | |
CN114039878A (en) | Network request processing method and device, electronic equipment and storage medium | |
CN112883253A (en) | Data processing method, device, equipment and readable storage medium | |
CN111080325A (en) | System and method for analyzing civil aviation customer relationship |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||