CN114880153A - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents

Data processing method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN114880153A
Authority
CN
China
Prior art keywords
data
database cluster
detection
detection result
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210488875.5A
Other languages
Chinese (zh)
Inventor
乔丹
李粒
齐方方
马申君
钟本立
徐凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pingkai Star Beijing Technology Co ltd
Original Assignee
Pingkai Star Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pingkai Star Beijing Technology Co ltd filed Critical Pingkai Star Beijing Technology Co ltd
Priority to CN202210488875.5A priority Critical patent/CN114880153A/en
Publication of CN114880153A publication Critical patent/CN114880153A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the present application provide a data processing method and device, electronic equipment, and a computer-readable storage medium, and relate to the technical field of distributed databases. The method is applied to a server and comprises the following steps: obtaining data to be detected of a database cluster, the data having been collected by a client; performing anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result for the data to be detected; detecting the data to be detected based on a preset first operation state detection rule to obtain a second detection result for the operation state of the database cluster; generating a first repair operation instruction based on at least one of the first detection result and the second detection result; and sending the first repair operation instruction to the client so that the client performs a repair operation on the database cluster. By detecting and analyzing the database cluster along multiple dimensions, the embodiments of the present application help guarantee the stability and reliability of the database system.

Description

Data processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of distributed database technologies, and in particular, to a data processing method, an apparatus, an electronic device, and a computer-readable storage medium.
Background
With the development and application of computer, network, and communication technologies, more and more enterprises are deploying computer technology to improve their overall management level and their capacity for continuous operation. As the organizational structure of a department or enterprise expands, its data resources keep growing. A distributed database has a flexible architecture and suits a distributed management and control mechanism, so it can improve the reliability and availability of an enterprise information system; distributed databases are therefore increasingly widely used.
A distributed database system allows an application program to access, over network connections, databases distributed across different geographic locations. When such databases are operated and maintained, time-series data of the database system is usually collected and its trend over time is analyzed in order to detect problems in, and maintain, the distributed database system. However, because the system has many nodes, many components, and a complex logical architecture, this approach can only detect faults that have already occurred; it cannot discover potential risks or analyze the root cause of a problem in time, so the stability and reliability of the distributed database system cannot be guaranteed.
Disclosure of Invention
The embodiments of the present application provide a data processing method and device, electronic equipment, and a computer-readable storage medium that can help guarantee the stability and reliability of a distributed database system. The technical solution is as follows:
According to an aspect of the embodiments of the present application, there is provided a data processing method, which is applied to a server, and includes:
obtaining data to be detected of a database cluster, the data having been collected by a client;
performing anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result for the data to be detected;
detecting the data to be detected based on a preset first operation state detection rule to obtain a second detection result for the operation state of the database cluster;
generating a first repair operation instruction based on at least one of the first detection result and the second detection result;
and sending the first repair operation instruction to the client so that the client performs a repair operation on the database cluster.
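The server-side steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `Finding`, `RepairInstruction`, `run_server_pipeline`, `toy_model`, and `toy_rule` are hypothetical, the "model" is a placeholder callable rather than a trained model, and the metric names and thresholds are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names below are hypothetical; the application does not prescribe a data layout.

@dataclass
class Finding:
    source: str  # "model" for the first detection result, "rule" for the second
    issue: str   # short identifier of the detected problem

@dataclass
class RepairInstruction:
    cluster_id: str
    action: str

Detector = Callable[[Dict[str, float]], List[Finding]]

def run_server_pipeline(cluster_id: str, data: Dict[str, float],
                        model_detector: Detector,
                        rule_detector: Detector) -> List[RepairInstruction]:
    first_result = model_detector(data)   # distribution characteristic detection model
    second_result = rule_detector(data)   # first operation state detection rule
    # Instructions are generated from at least one of the two results.
    return [RepairInstruction(cluster_id, f"repair:{f.issue}")
            for f in first_result + second_result]

def toy_model(data: Dict[str, float]) -> List[Finding]:
    # Placeholder for a trained model: flags latency far from an assumed 10 ms baseline.
    if abs(data.get("latency_ms", 10.0) - 10.0) > 50.0:
        return [Finding("model", "latency_outlier")]
    return []

def toy_rule(data: Dict[str, float]) -> List[Finding]:
    # Placeholder rule with an assumed threshold.
    if data.get("disk_used_ratio", 0.0) > 0.9:
        return [Finding("rule", "disk_nearly_full")]
    return []

instructions = run_server_pipeline(
    "cluster-a", {"latency_ms": 120.0, "disk_used_ratio": 0.95}, toy_model, toy_rule)
for ins in instructions:
    print(ins.cluster_id, ins.action)
```

Passing the two detectors in as callables keeps the model-based check and the rule-based check independent, which mirrors the "at least one of" wording: either detector may return an empty list without affecting the other.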
Optionally, the method further includes:
when a rule update request from the client is received, obtaining a subset of the first operation state detection rule and sending the subset to the client.
Optionally, before the detecting the data to be detected based on the preset first operation state detection rule, the method further includes:
acquiring distribution characteristics of historical abnormal data;
and determining a first operation state detection rule based on the distribution characteristics of the historical abnormal data.
Optionally, the detecting the data to be detected based on the preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster includes:
determining the distribution characteristics of the target historical abnormal data corresponding to the first operation state detection rule;
and matching the distribution characteristics of the data to be detected and the target historical abnormal data to obtain a second detection result.
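One way to picture the matching step is to compare a categorical distribution computed from the data to be detected with a stored distribution of historical abnormal data. The sketch below is an assumption for illustration only: the application does not say which distance measure is used, so total variation distance is swapped in here, and `distribution`, `total_variation`, `matches_rule`, and the `max_distance` threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

def distribution(values: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each category, e.g. log levels."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()} if total else {}

def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two categorical distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def matches_rule(sample: Iterable[str], historical_abnormal: Dict[str, float],
                 max_distance: float = 0.2) -> bool:
    """True when the sample looks like the historical abnormal pattern (assumed threshold)."""
    return total_variation(distribution(sample), historical_abnormal) <= max_distance

# Hypothetical distribution extracted from historical abnormal data.
historical = {"ERROR": 0.5, "WARN": 0.3, "INFO": 0.2}
sample_bad = ["ERROR"] * 5 + ["WARN"] * 3 + ["INFO"] * 2
sample_ok = ["INFO"] * 9 + ["WARN"] * 1

print(matches_rule(sample_bad, historical))
print(matches_rule(sample_ok, historical))
```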
Optionally, the method further includes:
generating a detection report based on at least one of the first detection result and the second detection result;
and responding to a downloading request of the client, and sending the detection report to the client.
According to another aspect of the embodiments of the present application, there is provided a data processing method, which is applied to a client and includes:
collecting data to be detected of a database cluster;
detecting the numerical value of the data to be detected based on a preset second operation state detection rule to obtain a third detection result of the operation state of the database cluster; wherein the second operation state detection rule is a subset of the first operation state detection rule in the server;
generating a second repair operation instruction based on the third detection result;
and performing repair operation on the database cluster according to the second repair operation instruction.
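The client-side steps above can be illustrated with simple numeric threshold checks. This is a sketch under assumptions: `ThresholdRule`, `detect_values`, `second_repair_instructions`, the metric names, the thresholds, and the action strings are all hypothetical, and a real second operation state detection rule may be richer than a single upper bound.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class ThresholdRule:
    metric: str          # name of the collected value to check
    max_value: float     # assumed upper bound for a healthy cluster
    repair_action: str   # placeholder action identifier

def detect_values(data: Dict[str, float], rules: List[ThresholdRule]) -> List[ThresholdRule]:
    """Third detection result: the rules whose metric exceeds its bound."""
    return [r for r in rules if r.metric in data and data[r.metric] > r.max_value]

def second_repair_instructions(violations: List[ThresholdRule]) -> List[str]:
    """Second repair operation instructions derived from the violated rules."""
    return [v.repair_action for v in violations]

# Hypothetical local rules (a subset of the server's rule set) and collected data.
local_rules = [
    ThresholdRule("cpu_usage", 0.9, "throttle_background_jobs"),
    ThresholdRule("open_connections", 1000, "raise_connection_alert"),
]
collected = {"cpu_usage": 0.97, "open_connections": 120}

third_result = detect_values(collected, local_rules)
instructions = second_repair_instructions(third_result)
print(instructions)
```

Keeping the client to cheap value checks fits the split described in the text: the client handles what it can decide locally, and the heavier distribution analysis stays on the server.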
Optionally, the method further includes:
and acquiring a first repair operation instruction sent by the server, and performing repair operation on the database cluster according to the first repair operation instruction.
Optionally, before detecting the value of the data to be detected based on the second operation state detection rule, the method further includes:
sending a rule update request to the server;
receiving a subset of the first operation state detection rule sent by the server in response to the rule update request;
and updating the second operation state detection rule based on the subset.
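The rule update exchange can be sketched in-process, without any network layer. Everything here is an assumption for illustration: the application does not state how the server selects the subset, so a hypothetical `client_side` flag is used, and `SERVER_RULES`, `handle_rule_update_request`, and `update_local_rules` are invented names.

```python
from typing import Any, Dict

Rule = Dict[str, Any]

# Hypothetical first operation state detection rule set held by the server.
SERVER_RULES: Dict[str, Rule] = {
    "cpu_high": {"metric": "cpu_usage", "max": 0.9, "client_side": True},
    "disk_full": {"metric": "disk_used_ratio", "max": 0.95, "client_side": True},
    "slow_query_pattern": {"metric": "slow_query_log", "max": None, "client_side": False},
}

def handle_rule_update_request(server_rules: Dict[str, Rule]) -> Dict[str, Rule]:
    """Server side: return the subset of rules marked as suitable for the client."""
    return {name: dict(rule) for name, rule in server_rules.items() if rule["client_side"]}

def update_local_rules(local_rules: Dict[str, Rule], subset: Dict[str, Rule]) -> Dict[str, Rule]:
    """Client side: merge the received subset into the second operation state detection rule."""
    merged = dict(local_rules)
    merged.update(subset)
    return merged

# The client starts with an outdated threshold and then refreshes its rules.
local = {"cpu_high": {"metric": "cpu_usage", "max": 0.8, "client_side": True}}
local = update_local_rules(local, handle_rule_update_request(SERVER_RULES))
print(sorted(local))
```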
Optionally, the collecting of the data to be detected of the database cluster includes:
collecting the data to be detected of the database cluster based on a preset collection mode;
and determining the second operation state detection rule that matches the collection mode.
According to another aspect of the embodiments of the present application, there is provided a data processing apparatus, provided in a server, including:
the acquisition module is used for obtaining data to be detected of the database cluster collected by the client;
the first detection module is used for performing anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result for the data to be detected;
the second detection module is used for detecting the data to be detected based on a preset first operation state detection rule to obtain a second detection result for the operation state of the database cluster;
a first generation module, configured to generate a first repair operation instruction based on at least one of the first detection result and the second detection result;
and the sending module is used for sending the first repair operation instruction to the client so that the client can perform a repair operation on the database cluster.
Optionally, the apparatus further includes a request module, configured to:
when a rule update request from the client is received, obtaining a subset of the first operation state detection rule and sending the subset to the client.
Optionally, before the second detection module detects the data to be detected based on the preset first operation state detection rule, the second detection module is further configured to:
acquiring distribution characteristics of historical abnormal data;
and determining a first operation state detection rule based on the distribution characteristics of the historical abnormal data.
Optionally, when detecting the data to be detected based on the preset first operation state detection rule to obtain the second detection result for the operation state of the database cluster, the second detection module is further configured to:
determining the distribution characteristics of the target historical abnormal data corresponding to the first operation state detection rule;
and matching the distribution characteristics of the data to be detected and the target historical abnormal data to obtain a second detection result.
Optionally, the apparatus further includes a report generating module, configured to:
generating a detection report based on at least one of the first detection result and the second detection result;
and responding to a downloading request of the client, and sending the detection report to the client.
According to another aspect of the embodiments of the present application, there is provided a data processing apparatus, provided at a client, including:
the acquisition module is used for acquiring data to be detected of the database cluster;
the third detection module is used for detecting the numerical value of the data to be detected based on a preset second operation state detection rule to obtain a third detection result of the operation state of the database cluster; wherein the second operation state detection rule is a subset of the first operation state detection rule in the server;
the second generation module is used for generating a second repair operation instruction based on the third detection result;
and the repairing module is used for repairing the database cluster according to the second repairing operation instruction.
Optionally, the repair module is further configured to:
and acquiring a first repair operation instruction sent by the server, and performing repair operation on the database cluster according to the first repair operation instruction.
Optionally, the third detection module is further configured to, before detecting the value of the data to be detected based on the second operation state detection rule:
sending a rule update request to the server;
receiving a subset of the first operation state detection rule sent by the server in response to the rule update request;
and updating the second operation state detection rule based on the subset.
Optionally, when the acquisition module acquires data to be detected of the database cluster, the acquisition module is configured to:
acquiring data to be detected of a database cluster based on a preset acquisition mode;
and determining a second operation state detection rule matched with the acquisition mode.
According to another aspect of an embodiment of the present application, there is provided an electronic device comprising a memory, a processor, and a computer program stored on the memory, wherein the processor executes the computer program to implement the steps of the methods shown in the first aspect and the second aspect of the embodiments of the present application.
According to a further aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method as shown in the first and second aspects of embodiments of the present application.
According to an aspect of embodiments of the present application, there is provided a computer program product comprising a computer program that, when executed by a processor, performs the steps of the method shown in the first and second aspects of embodiments of the present application.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
In the present application, the server obtains the data to be detected of the database cluster collected by the client and performs anomaly detection on the distribution characteristics of that data based on a preset distribution characteristic detection model to obtain a first detection result; it also detects the data to be detected based on a preset first operation state detection rule to obtain a second detection result for the operation state of the database cluster. Combining the distribution characteristic detection model with the first operation state detection rule yields a multi-dimensional detection analysis of the database cluster. Compared with the prior art, which detects a database cluster system based only on time-series data of the database system, the present application examines the database cluster from different data dimensions and at different detection levels, based on both the distribution characteristics of the data and the operation state of the system, so that potential problems can be found in advance, potential risks can be eliminated in time, and stable operation of the database cluster system is supported.
In addition, the first repair operation instruction is generated based on at least one of the first detection result and the second detection result and is sent to the client, and the client performs a repair operation on the database cluster based on the first repair operation instruction. This achieves risk avoidance and anomaly repair for the database cluster, further supports the reliability of the database cluster, improves the operation and maintenance efficiency of the database, and provides continuous and efficient data support for the service system that relies on the database cluster.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic system architecture diagram of a data processing method according to an embodiment of the present application;
fig. 2 is a schematic view of an application scenario of a data processing method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of a data processing method according to an embodiment of the present application;
fig. 4 is a schematic flowchart of another data processing method according to an embodiment of the present application;
fig. 5 is a data interaction timing chart of a data processing method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of another data processing apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a data processing electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below in conjunction with the drawings in the present application. It should be understood that the embodiments set forth below in connection with the drawings are exemplary descriptions for explaining technical solutions of the embodiments of the present application, and do not limit the technical solutions of the embodiments of the present application.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the terms "comprises" and/or "comprising," when used in this specification in connection with embodiments of the present application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof that are supported in the art. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items defined by the term, e.g., "A and/or B" may be implemented as "A", or as "B", or as "A and B".
To make the objects, technical solutions and advantages of the present application more clear, the following detailed description of the embodiments of the present application will be made with reference to the accompanying drawings.
A distributed database system typically uses relatively small computer systems, each of which may be placed at a separate location. Each computer may hold a full or partial copy of a DBMS (Database Management System) and has its own local database; many such computers at different locations are interconnected via a network to form a complete large database that is logically centralized at the global level and physically distributed.
A distributed database has good availability and scalability and provides good data support for upper-layer business systems. A customer's business system often requires uninterrupted 24/7 operation, which places extremely high demands on the stability and reliability of the database cluster. Once the database cluster enters an abnormal state, the impact on the upper-layer business system can be immeasurable if the cluster is not adjusted in time.
In the prior art, the monitoring metrics of a distributed database cluster can only be analyzed from periodically sampled monitoring data from Prometheus (an open-source system monitoring and alerting toolkit) or similar tools, and the inventors have found the following problems:
1. Before a serious problem occurs and an alarm is generated, potential problems of the cluster cannot be found in advance through comprehensive analysis of multi-dimensional monitoring data such as logs, configuration, and system information, so stability risks cannot be eliminated in time;
2. After a problem occurs in a distributed database cluster, diagnostic data must be collected and integrated from different nodes because the cluster has many nodes, many components, and a complex logical architecture, and the related operations are very cumbersome. Lacking the relevant diagnostic knowledge, operation and maintenance personnel cannot collect effective diagnostic data, so problem analysis takes a long time and the root cause may not be found. Lengthy communication is also needed before complete diagnostic data can be provided, so problems take a long time to resolve and the impact on upper-layer services is large.
The application provides a data processing method, a data processing device, an electronic device and a computer-readable storage medium, which aim to solve the above technical problems in the prior art.
The embodiment of the present application provides a data processing method, which may be implemented by a data processing system including a client and a server, as shown in fig. 1. The client can be deployed in a central control machine on the database cluster side and used as an operation and maintenance client of the diagnosis system; the server serving as the cloud diagnosis center can be in bidirectional connection with the operation and maintenance client based on wireless communication and serves as an operation and maintenance server of the diagnosis system. There may be multiple database clusters, each database cluster including multiple cluster nodes. The operation and maintenance server obtains data to be detected of a database cluster acquired by an operation and maintenance client, and performs anomaly detection on distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result; meanwhile, the operation and maintenance server detects the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster; and then the operation and maintenance server generates a first repair operation instruction based on at least one of the first detection result and the second detection result, and sends the first repair operation instruction to the operation and maintenance client, and the operation and maintenance client performs repair operation on the database cluster based on the first repair operation instruction. 
The present application can combine the distribution characteristic detection model with the first operation state detection rule to perform multi-dimensional detection analysis on the database cluster, so that potential problems can be found in advance, risk avoidance and anomaly repair of the database cluster are achieved, and the reliability and stability of the database cluster are supported.
The technical solutions of the embodiments of the present application and the technical effects produced by the technical solutions of the present application will be described below through descriptions of several exemplary embodiments. It should be noted that the following embodiments may be referred to, referred to or combined with each other, and the description of the same terms, similar features, similar implementation steps and the like in different embodiments is not repeated.
The data processing method of the present application may be applied to the scenario shown in fig. 2. Specifically, the operation and maintenance server 202 obtains data to be detected of a database cluster collected by the operation and maintenance client 201, and performs anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result; meanwhile, the operation and maintenance server 202 detects the data to be detected based on a preset first operation state detection rule to obtain a second detection result for the operation state of the database cluster; then, the operation and maintenance server 202 generates a first repair operation instruction based on at least one of the first detection result and the second detection result, and sends the first repair operation instruction to the operation and maintenance client 201, and the operation and maintenance client 201 performs a repair operation on the database cluster based on the first repair operation instruction.
In the scenario shown in fig. 2, the data processing method may be implemented in the interaction between the server and the client. As will be appreciated by those skilled in the art, a "client," as used herein, may be an application; a "server" may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
The embodiment of the present application provides a data processing method, as shown in fig. 3, which may be applied to an operation and maintenance server, and the method includes:
S301, obtaining data to be detected of the database cluster collected by the operation and maintenance client.
There is at least one database cluster, and each database cluster comprises a plurality of database nodes. The data to be detected may include various types of logs, configuration, Prometheus monitoring data, hardware system information, goroutine (the core of concurrency design in the Go language) data, database system parameters, slow query data, and the like, from all the database nodes.
Meanwhile, the operation and maintenance server can determine the tag of the data to be detected based on the database cluster corresponding to the data to be detected. The tag may be an ID (identifier) of the database cluster, or may be a location coordinate corresponding to the database cluster.
Specifically, the operation and maintenance server may establish a network connection with the operation and maintenance client over a wired or wireless local area network, obtain the data to be detected sent by the operation and maintenance client over that connection, and store the data in a storage space preset by the operation and maintenance server. The wired LAN may be an Ethernet based on the IEEE 802.3 protocol (a LAN communication standard), and the wireless LAN may be a Wi-Fi (a wireless communication technology) network based on IEEE 802.11 (a standard for wireless network communication).
S302, performing anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result for the data to be detected.
The distribution characteristic detection model may be a machine learning model pre-trained on historical data of the database cluster. The model can be used to detect at least one type of data in the data to be detected.
A machine learning model can acquire knowledge efficiently and meet the continuously growing data analysis requirements of big data scenarios. With machine learning, complex and varied data can be analyzed in depth and information can be used more efficiently. Machine learning algorithms include decision trees, random forests, artificial neural networks, and Bayesian learning, among others.
In an embodiment of the present application, the distribution characteristic detection model is a neural-network-based machine learning model. The operation and maintenance server may construct a training set from historical log data of a database cluster, train an initial model on the training set to obtain the distribution characteristic detection model, perform anomaly detection on the log data in the data to be detected using the model, and extract the abnormal logs from the log data as the first detection result.
S303, detecting the data to be detected based on a preset first operation state detection rule to obtain a second detection result for the operation state of the database cluster.
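To make the train-then-detect flow for log data concrete, the sketch below deliberately swaps the neural network for a much simpler stand-in: a frequency baseline over log templates. It is not the model described in the embodiment; `template`, `train`, `abnormal_logs`, the masking regex, and the `min_count` threshold are all hypothetical choices made for illustration.

```python
import re
from collections import Counter
from typing import Iterable, List

def template(line: str) -> str:
    """Mask hex and decimal numbers so lines that differ only in values share a template."""
    return re.sub(r"0x[0-9a-fA-F]+|\d+", "<*>", line)

def train(history: Iterable[str]) -> Counter:
    """'Training' here is just counting how often each template appears in historical logs."""
    return Counter(template(line) for line in history)

def abnormal_logs(model: Counter, lines: Iterable[str], min_count: int = 2) -> List[str]:
    """Flag lines whose template was rare or unseen in the historical data."""
    return [line for line in lines if model[template(line)] < min_count]

# Hypothetical historical logs and newly collected logs.
history = [
    "region 1 split ok",
    "region 2 split ok",
    "region 3 split ok",
    "heartbeat from store 4",
    "heartbeat from store 5",
]
model = train(history)

new_logs = ["region 9 split ok", "panic: index out of range at 0x1f"]
first_result = abnormal_logs(model, new_logs)
print(first_result)
```

A learned model would generalize beyond exact template matches; the stand-in only shows where the training set, the trained model, and the extracted abnormal logs sit in the flow.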
The first operation state detection rule may be a set of various detection rules determined based on the distribution characteristics of the historical abnormal data. The first operation state detection rule may be used to detect the operation state of the database cluster by combining multiple types of data in the data to be detected.
Meanwhile, the first operation state detection rule may include a plurality of subsets, for example, when the first operation state detection rule is a mixed deployment condition detection of the database cluster, the subset may include a data import log detection, a data storage log and monitoring index detection, a data export log detection, and the like of each node in the database cluster.
Specifically, the operation and maintenance server may cluster the data to be detected based on the first operation state detection rule to obtain the distribution characteristics of multiple types of data, and compare these with the distribution characteristics of abnormal data in the current database cluster or in a preset knowledge base to obtain the abnormal data in the data to be detected. Based on the abnormal data, it determines whether the operation state of the database cluster is normal and obtains the second detection result. The knowledge base is a collection of problem models extracted from the historical abnormal data of a plurality of clusters. Comparing against the knowledge base makes it possible to find abnormal distribution characteristics that have not yet appeared in the current database cluster, so that the data to be detected is analyzed comprehensively.
S304, generating a first repair operation instruction based on at least one of the first detection result and the second detection result.
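The comparison against a knowledge base can be sketched roughly as follows. This is a minimal illustration only: the `ProblemModel` structure, the category names and the thresholds are hypothetical, since the application does not specify how a problem model is represented, and simple grouping by category stands in for the clustering step.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemModel:
    """Hypothetical knowledge-base entry: a category whose share signals a problem."""
    name: str
    category: str
    min_share: float  # the model matches when the category's share reaches this value


def category_shares(records):
    """Group (category, message) records and return each category's share of the total."""
    counts = Counter(category for category, _ in records)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def match_problem_models(records, knowledge_base):
    """Return the names of the problem models whose category share is reached."""
    shares = category_shares(records)
    return [m.name for m in knowledge_base
            if shares.get(m.category, 0.0) >= m.min_share]


# Illustrative data only.
records = [
    ("write_stall", "write stalled for 2s"),
    ("write_stall", "write stalled for 5s"),
    ("slow_query", "query took 3.2s"),
    ("info", "heartbeat ok"),
]
knowledge_base = [
    ProblemModel("storage node overload", "write_stall", 0.4),
    ProblemModel("widespread slow queries", "slow_query", 0.5),
]
matched = match_problem_models(records, knowledge_base)
```

Here only the first problem model matches, because write stalls make up half of the records while slow queries make up a quarter.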
The number of the first repair operation instructions can be multiple; the first repair operation instruction may be an instruction to be automatically executed or an instruction requiring manual authorization.
In this embodiment of the application, when the first detection result or the second detection result indicates that a configuration item is abnormal, the corresponding first repair operation instruction may indicate that the abnormal configuration item is to be modified; when the first detection result or the second detection result indicates that a database index is missing, the corresponding first repair operation instruction may indicate that the corresponding database index is to be created.
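The two cases above can be sketched as a simple mapping from findings to repair instructions. The `RepairInstruction` fields, the finding kinds and the choice of which action requires manual authorization are assumptions made for illustration; the application does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairInstruction:
    """Hypothetical shape of a first repair operation instruction."""
    action: str
    target: str
    requires_authorization: bool


def build_repair_instructions(findings):
    """Map (kind, target) findings to repair instructions; unknown kinds are skipped."""
    instructions = []
    for kind, target in findings:
        if kind == "config_abnormal":
            instructions.append(RepairInstruction("modify_config", target, False))
        elif kind == "index_missing":
            # Assumption: index creation is routed through manual authorization.
            instructions.append(RepairInstruction("create_index", target, True))
    return instructions


# Illustrative findings only.
findings = [
    ("config_abnormal", "log_level"),
    ("index_missing", "orders(customer_id)"),
    ("unknown_kind", "ignored"),
]
instructions = build_repair_instructions(findings)
```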
S305, sending the first repair operation instruction to the operation and maintenance client so that the operation and maintenance client can perform repair operation on the database cluster.
Wherein the number of the first repair operation instructions is at least one. Each first repair operation instruction corresponds to a tag, and the tag is used for indicating the database cluster corresponding to the first repair operation instruction.
Specifically, the operation and maintenance server may determine a label corresponding to the first repair operation instruction based on a label corresponding to the data to be detected, and send the first repair operation instruction and the corresponding label to the operation and maintenance client, so that the operation and maintenance client performs a repair operation on the database cluster corresponding to the label.
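The tagging described above might look like the following sketch, in which the server attaches the tag of the analysed data to each instruction and the client groups incoming instructions by tag. The dictionary layout, tag values and instruction texts are hypothetical.

```python
from collections import defaultdict


def tag_instructions(instructions, data_tag):
    """Server side: attach the tag of the analysed data to each instruction."""
    return [{"tag": data_tag, "instruction": text} for text in instructions]


def group_by_cluster(tagged_instructions):
    """Client side: group instructions by the cluster tag they carry."""
    grouped = defaultdict(list)
    for item in tagged_instructions:
        grouped[item["tag"]].append(item["instruction"])
    return dict(grouped)


# Illustrative tags and instruction texts only.
outgoing = (tag_instructions(["modify config item A"], "cluster-east")
            + tag_instructions(["create index on table T"], "cluster-west")
            + tag_instructions(["modify config item B"], "cluster-east"))
routed = group_by_cluster(outgoing)
```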
In the embodiment of the application, the operation and maintenance server obtains the data to be detected of the database cluster collected by the operation and maintenance client, and performs anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result. Meanwhile, it detects the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster. By combining the distribution characteristic detection model with the first operation state detection rule, multi-dimensional detection and analysis is performed on the database cluster. Compared with the prior art, in which a database cluster system is detected based on time sequence data of the database system, the present application detects the database cluster comprehensively and effectively from different data dimensions and different detection levels, based on both the distribution characteristics of the data and the running state of the system, so that potential problems can be found in advance, potential risks can be eliminated in time, and stable operation of the database cluster system is guaranteed.
Meanwhile, a first repair operation instruction is generated based on at least one of the first detection result and the second detection result and is sent to the operation and maintenance client side, and the operation and maintenance client side performs repair operation on the database cluster based on the first repair operation instruction, so that risk avoidance and abnormal repair of the database cluster are achieved, the reliability of the database cluster is further guaranteed, the operation and maintenance efficiency of the database is improved, and continuous and efficient data support is provided for a service system corresponding to the database cluster.
A possible implementation manner is provided in the embodiment of the present application, and the method further includes:
when a rule update request from the operation and maintenance client is obtained, obtaining a subset of the first operation state detection rule, and sending the subset to the operation and maintenance client.
In some embodiments, the operation and maintenance client may send a rule update request to the operation and maintenance server based on a preset time interval;
in other embodiments, the operation and maintenance client may send a rule update request to the operation and maintenance server in response to a version update instruction of a user; wherein the version update instruction may be triggered based on at least one of the following operations:
dragging or moving the interface element component corresponding to the version update to the preset range of the current interface;
clicking or touching the corresponding interface element component aiming at the version update;
and updating the corresponding input operation of the identifier for the version in a preset input control.
In the embodiment of the application, upon obtaining the rule update request of the operation and maintenance client, the operation and maintenance server obtains the subset of the first operation state detection rule and sends it to the operation and maintenance client. This realizes the issuing of detection rules and ensures that the detection standard in the operation and maintenance client is consistent with that in the operation and maintenance server. The operation and maintenance client can perform basic, simple detection locally on the database cluster, while the operation and maintenance server can perform comprehensive, systematic and complex detection of the database cluster in the cloud, so that the stability of the database cluster is guaranteed at different levels.
A possible implementation manner is provided in the embodiment of the present application, before the step S303 detects data to be detected based on a preset first operation state detection rule, the method further includes:
(1) Acquiring the distribution characteristics of the historical abnormal data.
The distribution characteristics of the historical abnormal data can include the proportion of abnormal nodes in the historical abnormal data, the time sequence change period and change rate of the historical abnormal data, the number statistics of keywords in the historical abnormal data and the like. The historical abnormal data can be obtained from at least one of the historical data of the current database cluster and the abnormal database of the preset database cluster.
(2) Determining a first operation state detection rule based on the distribution characteristics of the historical abnormal data.
Specifically, when the distribution characteristics of the historical abnormal data include the proportion of the abnormal node in the historical abnormal data, the first operation state detection rule may include: when the proportion of abnormal nodes in the data to be detected exceeds a preset threshold value, judging that the data to be detected is abnormal;
when the distribution characteristics of the historical abnormal data include a time sequence change period and a change rate of the historical abnormal data, the first operation state detection rule may include: when the time sequence change period or the change rate of the data to be detected does not meet the preset condition, judging that the data to be detected is abnormal;
when the distribution characteristics of the historical abnormal data include statistics of the number of keywords in the historical abnormal data, the first operation state detection rule may include: when the number of the keywords in the data to be detected exceeds a preset threshold value, judging that the data to be detected is abnormal.
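The three kinds of judgment listed above can be sketched as small predicate functions. The function names, input shapes and thresholds are hypothetical, and the time sequence check below covers only the change rate, not the change period.

```python
def abnormal_node_ratio_exceeded(node_states, threshold):
    """True if the share of abnormal nodes (value False) exceeds the threshold."""
    abnormal = sum(1 for healthy in node_states.values() if not healthy)
    return bool(node_states) and abnormal / len(node_states) > threshold


def change_rate_out_of_range(series, max_rate):
    """True if any step-to-step relative change in the series exceeds max_rate."""
    for previous, current in zip(series, series[1:]):
        if previous != 0 and abs(current - previous) / abs(previous) > max_rate:
            return True
    return False


def keyword_count_exceeded(log_lines, keyword, threshold):
    """True if the keyword occurs more than `threshold` times across the log lines."""
    return sum(line.count(keyword) for line in log_lines) > threshold


# Illustrative inputs only.
nodes = {"node-1": True, "node-2": False, "node-3": False, "node-4": True}
ratio_alarm = abnormal_node_ratio_exceeded(nodes, threshold=0.3)
rate_alarm = change_rate_out_of_range([100, 105, 240], max_rate=0.5)
keyword_alarm = keyword_count_exceeded(["ERROR a", "ok", "ERROR ERROR"], "ERROR", 2)
```

A rule combining several judgments, as the text allows, could then be expressed as a boolean combination of these predicates.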
Specifically, the operation and maintenance server may determine the first operation state detection rule based on a problem analysis type selection instruction sent by the user, a distribution characteristic of the cluster historical data, and a problem model in a preset knowledge base. Wherein, the problem analysis type selecting instruction comprises: analyzing database delay instructions, analyzing execution plan instructions, analyzing node state instructions, and the like. The distribution characteristics of the historical abnormal data in the embodiment of the present application include, but are not limited to, the above examples, and the first operation state detection rule may also be a combination of the above multiple judgments, and is not limited in this embodiment.
In the embodiment of the application, the distribution characteristics of the operation data of the database system are effectively analyzed based on the historical abnormal data and are used as the basis for formulating the first operation state detection rule, which improves the reliability of the detection rule.
In an embodiment of the present application, a possible implementation manner is provided, where the detecting of the data to be detected in step S303 based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster includes:
(1) Determining the distribution characteristics of the target historical abnormal data corresponding to the first operation state detection rule.
The first operation state detection rule may be determined based on a user selection instruction.
Specifically, the operation and maintenance server may receive a user selection instruction, and determine a first operation state detection rule corresponding to the user selection instruction.
In the embodiment of the application, when a user finds that the database cluster has an abnormal problem which cannot be detected by the operation and maintenance client, the user can send a user selection instruction to the operation and maintenance server, so that the database cluster is subjected to in-depth detection using the specified first operation state detection rule. The available first operation state detection rules may depend on the user level: the higher the user level, the more first operation state detection rules the user can select.
(2) Matching the distribution characteristics of the data to be detected against those of the target historical abnormal data to obtain a second detection result.
In the embodiment of the present application, take as an example a first operation state detection rule used for log and configuration state detection of a database cluster. Based on the first operation state detection rule, the operation and maintenance server acquires log data and configuration item data from the data to be detected, classifies the log data, and calculates the proportion and change trend of the different types of logs. Meanwhile, it acquires the log distribution characteristics in the historical abnormal data and compares the proportion and change trend with them to obtain the problem type of the logs, and then, in combination with the configuration item data, judges whether the configuration information of the database cluster needs to be changed. If a configuration change is needed, the modification information of the configuration item is taken as the second detection result.
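The log and configuration example above can be sketched as follows. The `signature` dictionary standing in for the log distribution characteristics of historical abnormal data, the log category `conn_timeout` and the configuration item `max_connections` are all hypothetical names chosen for illustration.

```python
from collections import Counter


def proportions(categories):
    """Return the share of each log category in a window of classified logs."""
    counts = Counter(categories)
    total = len(categories)
    return {name: n / total for name, n in counts.items()} if total else {}


def detect_config_change(previous_logs, current_logs, signature, config):
    """Return modification information for a config item, or None if no change is needed."""
    before = proportions(previous_logs)
    now = proportions(current_logs)
    category = signature["category"]
    share = now.get(category, 0.0)
    increase = share - before.get(category, 0.0)  # change trend between the two windows
    if share < signature["min_share"] or increase < signature["min_increase"]:
        return None
    item = signature["config_item"]
    if config.get(item) == signature["recommended"]:
        return None
    return {"config_item": item, "current": config.get(item),
            "recommended": signature["recommended"]}


# Illustrative data only: logs already classified into categories.
previous_logs = ["info"] * 8 + ["conn_timeout"] * 2
current_logs = ["info"] * 5 + ["conn_timeout"] * 5
signature = {"category": "conn_timeout", "min_share": 0.4, "min_increase": 0.2,
             "config_item": "max_connections", "recommended": 500}
second_result = detect_config_change(previous_logs, current_logs, signature,
                                     {"max_connections": 100})
```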
According to the method and the device, the distribution characteristics of the data to be detected and the target historical abnormal data are matched to obtain the second detection result, and the first operation state detection rule can be combined with various types of data to be detected to perform multi-dimensional detection on the database cluster, so that stable operation of the database cluster is further guaranteed.
A possible implementation manner is provided in the embodiment of the present application, and the method further includes:
generating a detection report based on at least one of the first detection result and the second detection result; and responding to a downloading request of the operation and maintenance client, and sending the detection report to the operation and maintenance client.
The detection report is used for indicating the running state of the database cluster and whether the data to be detected is abnormal or not; the detection report may be generated based on a preset report template. The detection report can help a user to investigate a part of potential risks, can also give out possible root cause judgment of problems, and helps the user to quickly analyze and locate the problems.
In this embodiment of the application, the operation and maintenance server may further generate warning information based on at least one of the first detection result and the second detection result, where the warning information is used to prompt a user to perform a problem repairing operation. The alarm information can comprise a severity level, an alarm level and a notification level, each alarm can be configured with a knowledge base document link, and a user can view the knowledge base to obtain detailed problem analysis and repair operation suggestions.
An embodiment of the present application provides a data processing method, which is applied to an operation and maintenance client, and as shown in fig. 4, the method includes:
S401, collecting data to be detected of the database cluster.
Specifically, the operation and maintenance client may collect the data to be detected of the database cluster based on a preset collection mode. Different collection modes may correspond to different data to be detected.
The collection modes can be selected by a user, each collection mode can correspond to a problem type of a database cluster, and the to-be-detected data corresponding to the collection modes mainly comprises database operation data corresponding to the problem type.
Meanwhile, the number of the database clusters is at least one, and each database cluster comprises a plurality of database nodes. The data to be detected may include various types of logs, configuration, Prometheus monitoring data, hardware system information, routing data, database system parameters, slow query data and the like of all the database nodes. The operation and maintenance client may acquire the data to be detected by means of SCP (Secure Copy) transmission, remote command execution over SSH (Secure Shell, a remote connection tool), HTTP (Hypertext Transfer Protocol) calls, SQL statement queries, and the like, which is not specifically limited in the embodiment of the present application.
In the embodiment of the application, the operation and maintenance client may be deployed on a central control machine of the database cluster so as to collect operation data of the database cluster, and the operation and maintenance client may also perform some basic detections on the database cluster locally, where a specific detection manner will be described in detail below.
In some embodiments, the operation and maintenance client may compress the data to be tested by using a general open source compression algorithm, encrypt the data by using an asymmetric encryption algorithm, and upload the data to be tested to the operation and maintenance server. The operation and maintenance client can upload the data to be tested to the operation and maintenance server in real time and can also automatically upload the data in a specified uploading period according to the configuration of a user; the data to be tested can be database cluster operation data in any time period so as to realize data uploading according to service conditions and save network resources.
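The compress-then-encrypt packaging, and the reverse steps on the server side, can be sketched as below. The sketch uses `zlib` as one example of a general open source compression algorithm and takes the encryption step as a caller-supplied function; the `placeholder_cipher` is a stand-in so that the round trip can be exercised and is not encryption. A real deployment would supply an asymmetric scheme here. The JSON serialization and all names are assumptions.

```python
import json
import zlib


def package(data, encrypt):
    """Serialize, compress, then encrypt the collected data into a packet."""
    raw = json.dumps(data, sort_keys=True).encode("utf-8")
    return encrypt(zlib.compress(raw))


def unpack(packet, decrypt):
    """Reverse of package(): decrypt, decompress, then deserialize."""
    return json.loads(zlib.decompress(decrypt(packet)).decode("utf-8"))


def placeholder_cipher(payload):
    """Stand-in only, NOT encryption: reverses the bytes so the round trip is testable."""
    return payload[::-1]


# Illustrative payload only.
sample = {"cluster": "demo", "logs": ["slow query: 3.2s"] * 50}
packet = package(sample, placeholder_cipher)
restored = unpack(packet, placeholder_cipher)
```

Compressing before encrypting matters for the network saving mentioned above, since well-encrypted data is generally not compressible.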
S402, detecting the numerical value of the data to be detected based on a preset second operation state detection rule to obtain a third detection result of the operation state of the database cluster.
The second operation state detection rule is a subset of the first operation state detection rule in the operation and maintenance server. The second operation state detection rules are obtained from the operation and maintenance server; each second operation state detection rule can detect one problem type of the database cluster, whereas each first operation state detection rule can comprehensively detect a plurality of problem types of the database cluster.
In the embodiment of the application, when the second operation state detection rule is detection of SQL performance, statistics may be collected on the SQL execution duration of the database cluster; if the execution duration is found to increase steeply over time, that is, its time sequence change rate is high, the third detection result indicates that the database cluster has a risk of execution plan deviation.
S403, generating a second repair operation instruction based on the third detection result.
In this embodiment of the application, when the second operation state detection rule is detection of SQL performance and the third detection result indicates that the database cluster has a risk of execution plan deviation, the second repair operation instruction may be to fix the execution plan in place (that is, to bind the current execution plan so that it no longer changes).
S404, performing a repair operation on the database cluster according to the second repair operation instruction.
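One simple way to check for a steep increase in execution duration is to compare the mean of the most recent samples with the mean of the samples just before them. This windowed comparison, the window size and the growth threshold are assumptions for illustration; the application does not state how the change rate is computed.

```python
def steep_increase(durations_ms, window, max_growth):
    """True if the mean of the last `window` samples grew by more than `max_growth`
    (as a fraction) relative to the mean of the `window` samples before them."""
    if len(durations_ms) < 2 * window:
        return False  # not enough samples to compare two windows
    baseline = durations_ms[-2 * window:-window]
    recent = durations_ms[-window:]
    baseline_mean = sum(baseline) / window
    recent_mean = sum(recent) / window
    return baseline_mean > 0 and (recent_mean - baseline_mean) / baseline_mean > max_growth


# Illustrative execution durations in milliseconds.
rising = [20, 22, 21, 19, 80, 95, 110, 120]
plan_deviation_risk = steep_increase(rising, window=4, max_growth=1.0)
```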
The second repair operation instruction may be an instruction to be automatically executed, or may be an instruction requiring manual authorization:
when the second repair operation instruction is an automatically executed instruction, the operation and maintenance client automatically performs corresponding repair and modification on the configuration or key parameters of the database cluster after receiving the instruction;
when the second repair operation instruction is an instruction requiring manual authorization, the operation and maintenance client reminds the user to perform the authorization operation after receiving the instruction, and once the user completes authorization, the operation and maintenance client performs the corresponding repair and modification on the configuration or key parameters of the database cluster.
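The two execution paths above can be sketched with a single function that consults the user only when the instruction requires it. The instruction layout and the callback names are hypothetical, and returning "rejected" when authorization is not granted is an assumption, since the text does not describe that case.

```python
def execute_repair(instruction, apply_change, request_authorization):
    """Apply a repair instruction, asking the user first when it requires authorization.

    Returns "applied" or "rejected".
    """
    if instruction["requires_authorization"] and not request_authorization(instruction):
        return "rejected"
    apply_change(instruction)
    return "applied"


# Illustrative instructions; `applied` records what would have been changed.
applied = []
auto = {"action": "set_config", "requires_authorization": False}
manual = {"action": "bind_execution_plan", "requires_authorization": True}

status_auto = execute_repair(auto, applied.append, lambda i: False)
status_denied = execute_repair(manual, applied.append, lambda i: False)
status_granted = execute_repair(manual, applied.append, lambda i: True)
```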
The operation and maintenance client in the embodiment of the application can repair potential problems and non-standard configurations of the database cluster based on the second repair operation instruction, and reduces the probability and risk of abnormal operation of the subsequent database cluster.
According to the embodiment of the application, the operation and maintenance client acquires the data to be detected of the database cluster, and detects the numerical value of the data to be detected based on the preset second operation state detection rule to obtain a third detection result of the operation state of the database cluster; generating a second repairing operation instruction based on the third detection result, and performing repairing operation on the database cluster according to the second repairing operation instruction; the operation and maintenance client can be deployed in a central control computer of the database cluster to achieve local detection of the database cluster, and detection efficiency is high. Meanwhile, the second operation state detection rule is a subset of the first operation state detection rule in the operation and maintenance server, and the consistency of the detection standards of the operation and maintenance client and the operation and maintenance server can be guaranteed. According to the embodiment of the application, the database cluster is comprehensively and effectively detected from different data dimensions and different detection levels by combining the operation and maintenance client and the operation and maintenance server, so that potential problems can be found in advance, potential risks can be eliminated for the database cluster in time, and stable operation of a database cluster system is guaranteed.
A possible implementation manner is provided in the embodiment of the present application, and the method further includes:
acquiring the first repair operation instruction sent by the operation and maintenance server, and performing a repair operation on the database cluster according to the first repair operation instruction.
Specifically, the operation and maintenance client may obtain the first repair operation instruction and the tag corresponding to the first repair operation instruction from the operation and maintenance server, and may perform a repair operation on the database cluster corresponding to the tag based on the first repair operation instruction.
A possible implementation manner is provided in the embodiment of the present application, and the acquiring data to be tested of the database cluster in step S401 includes:
(1) Acquiring the data to be detected of the database cluster based on a preset collection mode.
The collection modes include: a slow SQL (Structured Query Language) mode, a CPU (central processing unit) rising mode, an OOM (out of memory) mode, an execution plan deviation mode, a stability mode, a data migration mode, a data synchronization mode, an index problem mode, a high concurrency mode, a cluster capacity expansion mode, a DDL (Data Definition Language) detection mode, an object-related problem mode, and the like.
Specifically, when the collection mode is the slow SQL mode, the data to be detected includes latency-related logs of each database node in the database cluster (latency here meaning the time taken to complete a request), slow SQL logs, performance data of the system, and the like;
when the collection mode is the CPU rising mode, the data to be detected comprises CPU data, database load data, complete logs and performance data of the nodes whose CPU usage has risen, and the like;
when the acquisition mode is an OOM mode, the data to be detected comprises memory data, complete logs of all database nodes, system performance data and the like;
and when the acquisition mode is an execution plan deviation mode, the data to be detected comprises a slow log, an execution plan historical log and the like.
(2) Determining a second operation state detection rule matched with the collection mode.
Specifically, the operation and maintenance client may pre-construct a corresponding relationship between the acquisition mode and the second operation state detection rule, and determine the second operation state detection rule matched with the acquisition mode based on the corresponding relationship.
For example, when the collection mode is the CPU rising mode, there may be a plurality of second operation state detection rules, including detection rules for CPU logs, performance data, or database load conditions. The CPU log detection rule can find high-risk log content based on log keywords, and the performance data detection rule can detect whether the performance-related parameter values in the configuration items are compliant.
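The correspondence between collection modes and second operation state detection rules can be sketched as a lookup table of rule functions. The mode keys, keyword list, configuration item `worker_threads` and its allowed range are hypothetical values for illustration.

```python
def find_high_risk_logs(data, keywords=("panic", "fatal", "out of memory")):
    """Keyword rule: return log lines that contain any high-risk keyword."""
    return [line for line in data.get("logs", [])
            if any(k in line.lower() for k in keywords)]


def find_non_compliant_config(data, limits=(("worker_threads", 1, 64),)):
    """Compliance rule: return config items whose value is outside the allowed range."""
    config = data.get("config", {})
    return [name for name, low, high in limits
            if name in config and not low <= config[name] <= high]


# Pre-constructed correspondence between collection mode and detection rules.
RULES_BY_MODE = {
    "cpu_rising": [find_high_risk_logs, find_non_compliant_config],
    "slow_sql": [find_high_risk_logs],
}


def run_rules(mode, data):
    """Run every rule matched to the collection mode and collect the findings by rule name."""
    return {rule.__name__: rule(data) for rule in RULES_BY_MODE.get(mode, [])}


# Illustrative collected data only.
collected = {"logs": ["INFO started", "FATAL thread pool exhausted"],
             "config": {"worker_threads": 256}}
third_result = run_rules("cpu_rising", collected)
```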
A possible implementation manner is provided in the embodiment of the present application, before detecting the value of the data to be detected based on the second operation state detection rule in step S402, the method further includes:
(1) Sending a rule update request to the operation and maintenance server.
Specifically, the operation and maintenance client may send a rule update request to the operation and maintenance server based on a rule update instruction of the user.
The rule updating instruction can be sent to the operation and maintenance client by the user through external equipment such as a mouse, a keyboard, a touch screen and the like.
(2) Receiving the subset of the first operation state detection rule sent by the operation and maintenance server in response to the rule update request, and updating the second operation state detection rule based on the subset.
Specifically, the operation and maintenance client may update the second operation state detection rule in an overwrite update mode or a differential update mode.
In some embodiments, the operation and maintenance client may add the subset as a newly added second operation state detection rule to the first operation state detection rule;
in other embodiments, the operation and maintenance client may replace the original second operation state detection rule with the subset as a new second operation state detection rule, and add the new second operation state detection rule to the first operation state detection rule.
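Reading the two embodiments above as an additive (differential) update and a replacing (overwrite) update of the rule set held by the client, the update step might be sketched as follows. Representing rules as a dictionary keyed by rule name, and the mode strings, are assumptions.

```python
def update_rules(local_rules, subset, mode):
    """Return the client's new rule set after applying the subset sent by the server.

    "differential": keep existing rules and add or refresh those in the subset.
    "overwrite":    discard existing rules and keep only the subset.
    """
    if mode == "differential":
        merged = dict(local_rules)
        merged.update(subset)
        return merged
    if mode == "overwrite":
        return dict(subset)
    raise ValueError(f"unknown update mode: {mode}")


# Illustrative rule sets: rule name -> rule version.
local_rules = {"cpu_log_keywords": 1, "config_compliance": 1}
subset = {"config_compliance": 2, "slow_sql_duration": 1}

after_differential = update_rules(local_rules, subset, "differential")
after_overwrite = update_rules(local_rules, subset, "overwrite")
```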
In order to better understand the above data processing method, an example of the data processing method of the present application is set forth in detail below with reference to fig. 5, and the method includes the following steps:
S501, the operation and maintenance client collects the data to be detected of the database cluster based on a preset collection mode.
The collection modes can be selected by a user, each collection mode can correspond to a problem type of a database cluster, and the to-be-detected data corresponding to the collection modes mainly comprises database operation data corresponding to the problem type.
S502, the operation and maintenance client can compress the data to be tested by using a general open source compression algorithm, encrypt the data by using an asymmetric encryption algorithm to obtain a data packet, and then upload the data packet to the operation and maintenance server.
S503, the operation and maintenance server receives the data packet and decrypts the data packet to obtain the data to be detected.
S504, the operation and maintenance server performs anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model, and obtains a first detection result for the data to be detected.
S505, the operation and maintenance server detects the data to be detected based on a preset first operation state detection rule, and obtains a second detection result of the operation state of the database cluster.
S506, the operation and maintenance server generates a first repair operation instruction based on at least one of the first detection result and the second detection result, and sends the first repair operation instruction to the operation and maintenance client.
S507, the operation and maintenance client detects the numerical value of the data to be detected based on a preset second operation state detection rule to obtain a third detection result of the operation state of the database cluster, and generates a second repair operation instruction based on the third detection result.
S508, the operation and maintenance client performs a repair operation on the database cluster according to the first repair operation instruction and the second repair operation instruction.
In the embodiment of the application, the operation and maintenance server obtains the data to be detected of the database cluster collected by the operation and maintenance client, and performs anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result. Meanwhile, it detects the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster. By combining the distribution characteristic detection model with the first operation state detection rule, multi-dimensional detection and analysis is performed on the database cluster. Compared with the prior art, in which a database cluster system is detected based on time sequence data of the database system, the present application detects the database cluster comprehensively and effectively from different data dimensions and different detection levels, based on both the distribution characteristics of the data and the running state of the system, so that potential problems can be found in advance, potential risks can be eliminated in time, and stable operation of the database cluster system is guaranteed.
Meanwhile, a first repair operation instruction is generated based on at least one of the first detection result and the second detection result and is sent to the operation and maintenance client side, and the operation and maintenance client side performs repair operation on the database cluster based on the first repair operation instruction, so that risk avoidance and abnormal repair of the database cluster are achieved, the reliability of the database cluster is further guaranteed, the operation and maintenance efficiency of the database is improved, and continuous and efficient data support is provided for a service system corresponding to the database cluster.
An embodiment of the present application provides a data processing apparatus, where the apparatus is disposed in an operation and maintenance server, and as shown in fig. 6, the data processing apparatus 60 may include: an acquisition module 601, a first detection module 602, a second detection module 603, a first generation module 604 and a sending module 605;
the acquisition module 601 is used for acquiring to-be-detected data of a database cluster acquired by an operation and maintenance client;
a first detection module 602, configured to perform anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model, so as to obtain a first detection result for the data to be detected;
the second detection module 603 is configured to detect data to be detected based on a preset first operation state detection rule, and obtain a second detection result of the operation state of the database cluster;
a first generating module 604 for generating a first repair operation instruction based on at least one of the first detection result and the second detection result;
a sending module 605, configured to send the first repair operation instruction to the operation and maintenance client, so that the operation and maintenance client performs a repair operation on the database cluster.
In an embodiment of the present application, a possible implementation manner is provided, where the apparatus 60 further includes a request module, configured to:
when a rule update request from the operation and maintenance client is obtained, obtain a subset of the first operation state detection rule, and send the subset to the operation and maintenance client.
In the embodiment of the present application, a possible implementation manner is provided, and before the second detection module 603 detects the data to be detected based on the preset first operation state detection rule, the second detection module is further configured to:
acquiring distribution characteristics of historical abnormal data;
and determining a first operation state detection rule based on the distribution characteristics of the historical abnormal data.
In a possible implementation manner provided by an embodiment of the present application, when detecting the data to be detected based on the preset first operation state detection rule to obtain the second detection result of the operation state of the database cluster, the second detection module 603 is configured to:
determine the distribution characteristics of the target historical abnormal data corresponding to the first operation state detection rule;
and match the data to be detected against the distribution characteristics of the target historical abnormal data to obtain the second detection result.
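The matching step may, for example, be sketched as below. This is only one possible reading and is not taken from the disclosure: the feature set (mean, population standard deviation, maximum), the relative tolerance and the function names are all hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple

Features = Tuple[float, float, float]  # (mean, population std, max): assumed feature set


def distribution_features(samples: List[float]) -> Features:
    """Summarize a sample series by a few simple distribution characteristics."""
    return (mean(samples), pstdev(samples), max(samples))


def matches_history(samples: List[float],
                    historical: Features,
                    tolerance: float = 0.25) -> bool:
    """Return True when every feature of the data to be detected lies within a
    relative tolerance of the corresponding historical anomaly feature."""
    current = distribution_features(samples)
    for cur, hist in zip(current, historical):
        scale = max(abs(hist), 1e-9)  # avoid division by zero for zero-valued features
        if abs(cur - hist) / scale > tolerance:
            return False
    return True
```

A match would indicate that the data to be detected resembles a previously observed anomaly, which in this sketch would yield an abnormal second detection result.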
In a possible implementation manner provided by an embodiment of the present application, the apparatus 60 further includes a report generating module, configured to:
generate a detection report based on at least one of the first detection result and the second detection result;
and, in response to a download request from the operation and maintenance client, send the detection report to the operation and maintenance client.
An embodiment of the present application provides a data processing apparatus disposed at an operation and maintenance client. As shown in fig. 7, the data processing apparatus 70 may include: an acquisition module 701, a third detection module 702, a second generation module 703 and a repair module 704;
the acquisition module 701 is configured to collect the data to be detected of the database cluster;
the third detection module 702 is configured to detect the numerical value of the data to be detected based on a preset second operation state detection rule, so as to obtain a third detection result of the operation state of the database cluster; the second operation state detection rule is a subset of the first operation state detection rule in the operation and maintenance server;
the second generation module 703 is configured to generate a second repair operation instruction based on the third detection result;
the repair module 704 is configured to perform a repair operation on the database cluster according to the second repair operation instruction.
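As a non-limiting sketch of modules 702 to 704 on the client side, the numerical check may be pictured as a set of upper-limit thresholds. The `ThresholdRule` class, the metric names, the limits and the repair labels are hypothetical, and the repair itself is only recorded in a log rather than executed against a real cluster.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical form of one entry of the second operation state detection rule."""
    rule_id: str
    metric: str
    upper_limit: float
    repair: str


def detect_values(data: Dict[str, float],
                  rules: List[ThresholdRule]) -> List[ThresholdRule]:
    """Third detection: return the rules whose metric value exceeds its limit.
    Metrics missing from the collected data never trigger a rule."""
    return [r for r in rules if data.get(r.metric, float("-inf")) > r.upper_limit]


def build_repair_instruction(violations: List[ThresholdRule]) -> List[str]:
    """Second repair operation instruction: one repair label per violated rule."""
    return [r.repair for r in violations]


def apply_repairs(instructions: List[str], cluster_log: List[str]) -> None:
    """Stand-in for the repair module: record each repair instead of running it."""
    for instruction in instructions:
        cluster_log.append(f"executed {instruction}")
```

Keeping the client rules to simple value checks is consistent with the statement that the second rule is a subset of the first rule held by the server, although the disclosure does not state how the subset is chosen.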
In a possible implementation manner provided by an embodiment of the present application, the repair module 704 is further configured to:
acquire a first repair operation instruction sent by the operation and maintenance server, and perform a repair operation on the database cluster according to the first repair operation instruction.
In a possible implementation manner provided by an embodiment of the present application, before detecting the numerical value of the data to be detected based on the second operation state detection rule, the third detection module 702 is further configured to:
send a rule update request to the operation and maintenance server;
receive a subset of the first operation state detection rule sent by the operation and maintenance server in response to the rule update request;
and update the second operation state detection rule based on the subset.
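The rule update exchange may be sketched as follows, with in-process calls standing in for the network. The request format, the `client_side` flag used to decide which rules may be sent to the client, and the class names are assumptions; the disclosure does not specify how the subset is selected or transmitted.

```python
from typing import Dict, List


class RuleServer:
    """Holds the full first operation state detection rule (hypothetical structure)."""

    def __init__(self, rules: Dict[str, dict]):
        self.rules = rules

    def handle_update_request(self, request: dict) -> Dict[str, dict]:
        # Return only the requested rules that are marked as usable on the client.
        wanted = set(request.get("rule_ids", []))
        return {rule_id: rule for rule_id, rule in self.rules.items()
                if rule_id in wanted and rule.get("client_side")}


class RuleClient:
    """Holds the second operation state detection rule, a subset of the server's."""

    def __init__(self) -> None:
        self.rules: Dict[str, dict] = {}

    def refresh(self, server: RuleServer, rule_ids: List[str]) -> Dict[str, dict]:
        # Send the rule update request, receive the subset, and update local rules.
        subset = server.handle_update_request(
            {"type": "rule_update", "rule_ids": rule_ids})
        self.rules.update(subset)
        return self.rules
```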
In a possible implementation manner provided by an embodiment of the present application, when collecting the data to be detected of the database cluster, the acquisition module 701 is configured to:
collect the data to be detected of the database cluster based on a preset acquisition mode;
and determine the second operation state detection rule that matches the acquisition mode.
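One way to picture the pairing of an acquisition mode with its matching rules is a lookup table, as sketched below. The mode names ("periodic", "on_demand"), the metric names and the rule identifiers are hypothetical and serve only to illustrate the idea.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical configuration: each acquisition mode names the metrics it collects
# and the second operation state detection rules that apply to that mode.
MODE_CONFIG: Dict[str, Dict[str, List[str]]] = {
    "periodic": {"metrics": ["disk_used_pct", "conn_count"],
                 "rules": ["disk_full", "conn_limit"]},
    "on_demand": {"metrics": ["disk_used_pct"],
                  "rules": ["disk_full"]},
}


def collect_and_select(mode: str,
                       read_metric: Callable[[str], float]
                       ) -> Tuple[Dict[str, float], List[str]]:
    """Collect the data for the given mode and return it with the matching rule ids."""
    if mode not in MODE_CONFIG:
        raise ValueError(f"unknown acquisition mode: {mode}")
    config = MODE_CONFIG[mode]
    data = {name: read_metric(name) for name in config["metrics"]}
    return data, list(config["rules"])
```

Here `read_metric` is a caller-supplied function that reads one metric from the cluster, so the sketch does not assume any particular database interface.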
The apparatus of the embodiments of the present application can execute the method provided in the embodiments of the present application, and its implementation principle is similar. The actions performed by the modules of the apparatus correspond to the steps of the method; for a detailed functional description of the modules, reference may be made to the description of the corresponding method above, which is not repeated here.
According to the embodiments of the present application, the operation and maintenance server acquires the data to be detected of the database cluster collected by the operation and maintenance client, and performs anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result; it also detects the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster. By combining the distribution characteristic detection model with the first operation state detection rule, multi-dimensional detection and analysis is performed on the database cluster. Compared with the prior art, in which a database cluster system is detected based on time series data of the database system, the present application detects the database cluster comprehensively and effectively from different data dimensions and different detection levels, based on both the distribution characteristics of the data and the operation state of the system, so that potential problems can be found in advance, potential risks can be eliminated in time, and stable operation of the database cluster system is ensured.
In addition, a first repair operation instruction is generated based on at least one of the first detection result and the second detection result and is sent to the operation and maintenance client, and the operation and maintenance client performs a repair operation on the database cluster based on the first repair operation instruction. Risk avoidance and anomaly repair of the database cluster are thereby achieved, the reliability of the database cluster is further ensured, the operation and maintenance efficiency of the database is improved, and continuous and efficient data support is provided for the service system corresponding to the database cluster.
In an embodiment of the present application, there is provided an electronic device including a memory, a processor, and a computer program stored in the memory, where the processor executes the computer program to implement the steps of the data processing method. Compared with the related art, the following can be achieved: the operation and maintenance server acquires the data to be detected of the database cluster collected by the operation and maintenance client, and performs anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result; it also detects the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster. By combining the distribution characteristic detection model with the first operation state detection rule, multi-dimensional detection and analysis is performed on the database cluster. Compared with the prior art, in which a database cluster system is detected based on time series data of the database system, the present application detects the database cluster comprehensively and effectively from different data dimensions and different detection levels, based on both the distribution characteristics of the data and the operation state of the system, so that potential problems can be found in advance, potential risks can be eliminated in time, and stable operation of the database cluster system is ensured.
In addition, a first repair operation instruction is generated based on at least one of the first detection result and the second detection result and is sent to the operation and maintenance client, and the operation and maintenance client performs a repair operation on the database cluster based on the first repair operation instruction. Risk avoidance and anomaly repair of the database cluster are thereby achieved, the reliability of the database cluster is further ensured, the operation and maintenance efficiency of the database is improved, and continuous and efficient data support is provided for the service system corresponding to the database cluster.
In an alternative embodiment, an electronic device is provided. As shown in fig. 8, the electronic device 80 includes a processor 801 and a memory 803, where the processor 801 is coupled to the memory 803, for example via a bus 802. Optionally, the electronic device 80 may further include a transceiver 804, which may be used for data interaction between the electronic device and other electronic devices, such as the transmission and/or reception of data. It should be noted that, in practical applications, the number of transceivers 804 is not limited to one, and the structure of the electronic device 80 does not constitute a limitation on the embodiments of the present application.
The processor 801 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor 801 may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor 801 may also be a combination that implements computing functions, e.g., a combination of one or more microprocessors, a combination of a DSP and a microprocessor, or the like.
Bus 802 may include a path that transfers information between the above components. The bus 802 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 802 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 8, but this is not intended to represent only one bus or type of bus.
The memory 803 may be, without limitation, a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, blu-ray disc, etc.), a magnetic disk storage medium, other magnetic storage devices, or any other medium that can be used to carry or store a computer program and that can be read by a computer.
The memory 803 is used for storing the computer program that executes the embodiments of the present application, and execution of the computer program is controlled by the processor 801. The processor 801 is configured to execute the computer program stored in the memory 803 to implement the steps shown in the foregoing method embodiments.
The electronic device includes, but is not limited to: mobile terminals such as mobile phones, notebook computers and PADs, and fixed terminals such as digital TVs and desktop computers.
Embodiments of the present application provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps and corresponding contents of the foregoing method embodiments.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the following:
acquiring the data to be detected of a database cluster collected by an operation and maintenance client;
performing anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result for the data to be detected;
detecting the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster;
generating a first repair operation instruction based on at least one of the first detection result and the second detection result;
and sending the first repair operation instruction to the operation and maintenance client so that the operation and maintenance client performs a repair operation on the database cluster.
The terms "first," "second," "third," "fourth," "1," "2," and the like in the description and in the claims of the present application and in the above-described drawings (if any) are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or otherwise described herein.
It should be understood that, although each operation step is indicated by an arrow in the flowchart of the embodiment of the present application, the implementation order of the steps is not limited to the order indicated by the arrow. In some implementation scenarios of the embodiments of the present application, the implementation steps in the flowcharts may be performed in other sequences as desired, unless explicitly stated otherwise herein. In addition, some or all of the steps in each flowchart may include multiple sub-steps or multiple stages based on an actual implementation scenario. Some or all of these sub-steps or stages may be performed at the same time, or each of these sub-steps or stages may be performed at different times, respectively. In a scenario where execution times are different, an execution sequence of the sub-steps or the phases may be flexibly configured according to requirements, which is not limited in the embodiment of the present application.
The foregoing is only an optional implementation manner of a part of implementation scenarios in this application, and it should be noted that, for those skilled in the art, other similar implementation means based on the technical idea of this application are also within the protection scope of the embodiments of this application without departing from the technical idea of this application.

Claims (13)

1. A data processing method, applied to a server, comprising:
acquiring data to be detected of a database cluster collected by a client;
performing anomaly detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result aiming at the data to be detected;
detecting the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster;
generating a first repair operation instruction based on at least one of the first detection result and the second detection result;
and sending the first repair operation instruction to the client so that the client performs repair operation on the database cluster.
2. The method of claim 1, further comprising:
and when the rule updating request of the client is obtained, obtaining a subset in the first operation state detection rule, and sending the subset to the client.
3. The method according to claim 1, wherein before the detecting the data to be detected based on the preset first operation state detection rule, the method further comprises:
acquiring distribution characteristics of historical abnormal data;
and determining the first operation state detection rule based on the distribution characteristics of the historical abnormal data.
4. The method according to claim 3, wherein the detecting the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster comprises:
determining the distribution characteristics of the target historical abnormal data corresponding to the first operation state detection rule;
and matching the distribution characteristics of the data to be detected and the target historical abnormal data to obtain the second detection result.
5. The method of claim 1, further comprising:
generating a detection report based on at least one of the first detection result and the second detection result;
and responding to a downloading request of the client, and sending the detection report to the client.
6. A data processing method, applied to a client, comprising:
collecting data to be detected of a database cluster;
detecting the numerical value of the data to be detected based on a preset second operation state detection rule to obtain a third detection result of the operation state of the database cluster; wherein the second operation state detection rule is a subset of the first operation state detection rule in the server;
generating a second repair operation instruction based on the third detection result;
and repairing the database cluster according to the second repairing operation instruction.
7. The method of claim 6, further comprising:
and acquiring a first repairing operation instruction sent by the server, and repairing the database cluster according to the first repairing operation instruction.
8. The method according to claim 6, wherein before detecting the value of the data to be detected based on the second operation state detection rule, the method further comprises:
sending a rule update request to the server;
receiving a subset of the first operation state detection rules sent by the server according to the rule updating request;
updating the second operation state detection rule based on the subset.
9. The method of claim 6, wherein collecting data to be detected of a database cluster comprises:
acquiring data to be detected of the database cluster based on a preset acquisition mode;
and determining a second operation state detection rule matched with the acquisition mode.
10. A data processing apparatus provided in a server, comprising:
the acquisition module is used for acquiring data to be detected of the database cluster acquired by the client;
the first detection module is used for carrying out abnormal detection on the distribution characteristics of the data to be detected based on a preset distribution characteristic detection model to obtain a first detection result aiming at the data to be detected;
the second detection module is used for detecting the data to be detected based on a preset first operation state detection rule to obtain a second detection result of the operation state of the database cluster;
a first generation module, configured to generate a first repair operation instruction based on at least one of the first detection result and the second detection result;
and the sending module is used for sending the first repairing operation instruction to the client so that the client can carry out repairing operation on the database cluster.
11. A data processing apparatus, wherein the data processing apparatus is provided at a client, comprising:
the acquisition module is used for acquiring data to be detected of the database cluster;
the third detection module is used for detecting the numerical value of the data to be detected based on a preset second operation state detection rule to obtain a third detection result of the operation state of the database cluster; wherein the second operation state detection rule is a subset of the first operation state detection rule in the server;
a second generating module, configured to generate a second repair operation instruction based on the third detection result;
and the repairing module is used for repairing the database cluster according to the second repairing operation instruction.
12. An electronic device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to implement the steps of the method of any of claims 1 to 9.
13. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 9.
CN202210488875.5A 2022-05-06 2022-05-06 Data processing method and device, electronic equipment and computer readable storage medium Pending CN114880153A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210488875.5A CN114880153A (en) 2022-05-06 2022-05-06 Data processing method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210488875.5A CN114880153A (en) 2022-05-06 2022-05-06 Data processing method and device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114880153A true CN114880153A (en) 2022-08-09

Family

ID=82673352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210488875.5A Pending CN114880153A (en) 2022-05-06 2022-05-06 Data processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114880153A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination