CN116662059A - MySQL database CPU fault diagnosis and self-healing method and readable storage medium - Google Patents

Publication number
CN116662059A
CN116662059A CN202310904872.XA CN202310904872A CN116662059A CN 116662059 A CN116662059 A CN 116662059A CN 202310904872 A CN202310904872 A CN 202310904872A CN 116662059 A CN116662059 A CN 116662059A
Authority
CN
China
Prior art keywords
fault
self
cpu
healing
mysql database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310904872.XA
Other languages
Chinese (zh)
Other versions
CN116662059B (en)
Inventor
麻振华
周文雅
黄炎
陈书俊
李恒
梁广涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Aikesheng Information Technology Co ltd
Original Assignee
Shanghai Aikesheng Information Technology Co ltd
Application filed by Shanghai Aikesheng Information Technology Co ltd
Priority to CN202310904872.XA (patent CN116662059B)
Publication of CN116662059A
Application granted
Publication of CN116662059B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0721 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80 Database-specific techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to the technical field of computers, and in particular to a MySQL database CPU fault diagnosis and self-healing method and a readable storage medium. The method comprises: presetting fault scenes, each comprising a fault cause, collection indexes, fault conditions and a fault event; presetting self-healing rules for the different fault scenes, each rule comprising a fault scene and the self-healing strategy corresponding to it; when a CPU (central processing unit) fault occurs on the server during operation and maintenance of the MySQL database, collecting the collection indexes, matching the collection results against the fault conditions, and generating the fault event; and matching the fault event to the fault scene in the self-healing rules and applying the corresponding self-healing strategy to repair the CPU fault automatically. Because the fault scenes and the self-healing rules adapted to them are preset, a suitable self-healing rule can be matched to the fault scene when a CPU fault occurs, so the fault is repaired automatically, with accurate diagnosis and high repair efficiency.

Description

MySQL database CPU fault diagnosis and self-healing method and readable storage medium
Technical Field
The invention relates to the technical field of computers, in particular to a MySQL database CPU fault diagnosis and self-healing method and a readable storage medium.
Background
Structured Query Language (SQL) is a special-purpose database query and programming language used to access data and to query, update and manage relational database systems. MySQL is a relational database management system that keeps data in separate tables rather than placing all data in one large repository, which increases speed and flexibility.
The MySQL database runs on a server and uses the server's system resources, such as CPU, memory and disk, while running. If the server's CPU utilization approaches or exceeds 100%, the service that the MySQL database provides externally may respond slowly or even become unavailable, and a CPU fault is deemed to have occurred. During database operation and maintenance, a server CPU fault can have many causes, for example high CPU utilization, memory overflow, high business concurrency, MySQL slow queries, or server hardware problems. The user has to collect the indexes of both the server and the database, analyze how they relate to one another, rule out irrelevant factors one by one, and only then locate the real fault cause and repair the fault.
Conventional fault diagnosis collects indexes such as CPU utilization and SQL connection threads, then analyzes them to locate the fault cause. Because the relationships among the indexes are unclear, they must be checked one by one with several diagnostic tools and considerable manual judgment, which is inefficient and prone to error.
Disclosure of Invention
The invention aims to provide a MySQL database CPU fault diagnosis and self-healing method and a readable storage medium that solve the problem of inefficient and error-prone fault diagnosis caused by users not knowing how the various indexes of the database and the server relate to one another.
In order to achieve the above purpose, the present invention provides a method for diagnosing and self-healing faults of a CPU in a MySQL database, comprising:
presetting a fault scene, wherein the fault scene comprises a fault cause, collection indexes, fault conditions and a fault event;
presetting self-healing rules according to different fault scenes, wherein each self-healing rule comprises a fault scene and the self-healing strategy corresponding to that fault scene;
when a CPU (central processing unit) fault occurs on the server during operation and maintenance of the MySQL database, collecting the collection indexes, matching the collection results against the fault conditions in the fault scene, and generating the fault event;
and matching the fault scene in the self-healing rules according to the fault event, and completing the automatic repair of the CPU fault by using the corresponding self-healing strategy.
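The four steps above can be sketched as a small data model. This is only an illustrative sketch, not the patented implementation: the class names `FaultScene` and `SelfHealingRule`, the functions `diagnose` and `select_rule`, and the metric name `longest_sql_seconds` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple

Metrics = Dict[str, float]  # collected index name -> value (assumed shape)


@dataclass(frozen=True)
class FaultScene:
    cause: str                            # fault cause, e.g. "slow SQL"
    indexes: Tuple[str, ...]              # collection indexes this scene needs
    condition: Callable[[Metrics], bool]  # fault condition over collected values
    event: str                            # fault event generated on a match


@dataclass(frozen=True)
class SelfHealingRule:
    scene_event: str  # fault event of the scene this rule belongs to
    strategy: str     # description of the self-healing strategy


def diagnose(metrics: Metrics, scenes: Iterable[FaultScene]) -> Optional[str]:
    """Return the fault event of the first scene whose condition matches."""
    for scene in scenes:
        if scene.condition(metrics):
            return scene.event
    return None


def select_rule(event: str, rules: Iterable[SelfHealingRule]) -> Optional[SelfHealingRule]:
    """Return the self-healing rule preset for the given fault event, if any."""
    for rule in rules:
        if rule.scene_event == event:
            return rule
    return None
```

A scene is just a named predicate over collected values, so adding a new scene does not require changing `diagnose` or `select_rule`.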
Optionally, the fault causes include slow SQL, high business concurrency, and the presence of MySQL spin locks.
Optionally, the collection indexes include the CPU usage of the server, the CPU usage of the MySQL database, the SQL connection threads, the queries per second, slow log statistics, and partition table statistics of the MySQL database.
Optionally, a slow SQL fault event is generated when the collection results simultaneously match the following fault conditions: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is low and plots as a flat line; there are currently unfinished SQL statements whose execution time exceeds the abnormal value and is at least at the minute level; the queries-per-second curve shows a non-rising trend during a period before the fault and its current value is below the historical average; and the MySQL database contains no partition tables.
Optionally, when the slow SQL fault event is generated, the connection thread of the slow SQL is killed to prevent the slow SQL from continuing to execute.
Optionally, a high business concurrency fault event is generated when the collection results simultaneously match the following fault conditions: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is high and its curve is rising; the execution time of the currently unfinished SQL statements is below the abnormal value and at the second level; and the queries per second are high and their curve is rising during a period before the fault.
Optionally, when the high business concurrency fault event is generated, the resource configuration and concurrency parameters of the server are adjusted.
Optionally, a MySQL spin lock fault event is generated when the collection results simultaneously match the following fault conditions: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is high and its curve is rising; there are currently unfinished SQL statements whose execution time exceeds the abnormal value and is at least at the minute level; the queries-per-second curve shows a non-rising trend during a period before the fault and its current value is below the historical average; and the MySQL database contains partition tables.
Optionally, when the MySQL spin lock fault event is generated, the number of partition tables of the MySQL database is adjusted and the CPU model is changed.
Based on the same technical concept, the invention also provides a readable storage medium on which a computer program is stored, the computer program implementing the MySQL database CPU fault diagnosis and self-healing method when executed.
In the MySQL database CPU fault diagnosis and self-healing method and the readable storage medium, the fault scenes and the self-healing rules adapted to them are preset, so that when a CPU fault occurs a suitable self-healing rule can be matched to the fault scene and the CPU fault is repaired automatically, with accurate diagnosis and high repair efficiency.
Drawings
Those of ordinary skill in the art will appreciate that the figures are provided for a better understanding of the present invention and do not constitute any limitation on the scope of the present invention. Wherein:
FIG. 1 is a step diagram of a method for CPU fault diagnosis and self-healing of MySQL database according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for diagnosing and self-healing a CPU failure in a MySQL database according to an embodiment of the present invention.
Detailed Description
The invention will be described in further detail with reference to the drawings and the specific embodiments thereof in order to make the objects, advantages and features of the invention more apparent. It should be noted that the drawings are in a very simplified form and are all to a non-precise scale, merely for the purpose of facilitating and clearly aiding in the description of embodiments of the invention. For a better understanding of the invention with objects, features and advantages, refer to the drawings. It should be understood that the structures, proportions, sizes, etc. shown in the drawings are shown only in connection with the present disclosure for the understanding and reading of the present disclosure, and are not intended to limit the scope of the invention, which is defined by the appended claims, and any structural modifications, proportional changes, or dimensional adjustments, which may be made by the present disclosure, should fall within the scope of the present disclosure under the same or similar circumstances as the effects and objectives attained by the present invention.
As used in this disclosure, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise. As used in this disclosure, the term "or" is generally employed in its sense including "and/or" unless the content clearly dictates otherwise. As used in this disclosure, the term "plurality" is generally employed in its sense including "at least one" unless the content clearly dictates otherwise. As used in this disclosure, the term "at least two" is generally employed in its sense including "two or more", unless the content clearly dictates otherwise. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", "a third" may include one or at least two such features, either explicitly or implicitly.
In the description of the present invention, unless explicitly stated and limited otherwise, the terms "mounted," "connected," and "secured" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communicated with the inside of two elements or the interaction relationship of the two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Referring to fig. 1-2, fig. 1 is a step diagram of a method for diagnosing and self-healing a CPU failure in a MySQL database according to an embodiment of the present invention, and fig. 2 is a flowchart of a method for diagnosing and self-healing a CPU failure in a MySQL database according to an embodiment of the present invention. The embodiment of the invention provides a method for diagnosing and self-healing a CPU fault of a MySQL database, which comprises the following steps:
S1, presetting a fault scene, wherein the fault scene comprises a fault cause, collection indexes, fault conditions and a fault event;
S2, presetting self-healing rules according to different fault scenes, wherein each self-healing rule comprises a fault scene and the self-healing strategy corresponding to that fault scene;
S3, when a CPU (central processing unit) fault occurs on the server during operation and maintenance of the MySQL database, collecting the collection indexes, matching the collection results against the fault conditions in the fault scene, and generating a fault event;
S4, matching the fault scene in the self-healing rules according to the fault event, and completing the automatic repair of the CPU fault by using the corresponding self-healing strategy.
Through presetting the fault scene and presetting the adaptive self-healing rules according to different fault scenes, the proper self-healing rules can be matched according to the fault scene when the CPU faults occur, so that the automatic repair of the CPU faults is finished, and the diagnosis is accurate and the repair efficiency is high.
Specifically, step S1 is executed first: a fault scene is preset, comprising a fault cause, collection indexes, fault conditions and a fault event. In this embodiment, a database diagnosis platform may be set up and a diagnosis strategy configured on it, the diagnosis strategy comprising the preset fault scenes, the preset self-healing rules and the collection indexes.
The fault causes include, but are not limited to, slow SQL, high business concurrency and the presence of MySQL spin locks. The collection indexes comprise the CPU usage of the server, the CPU usage of the MySQL database, the SQL connection threads, the queries per second, slow log statistics and partition table statistics of the MySQL database.
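The patent does not say how each index is obtained. As one possible illustration, a queries-per-second value can be derived from two samples of a cumulative query counter (MySQL exposes such a counter as the `Questions` status variable), and the "below the historical average" test used later can be a plain mean comparison. The function names below are hypothetical.

```python
from statistics import mean
from typing import Sequence


def queries_per_second(prev_count: int, curr_count: int, interval_seconds: float) -> float:
    """Average query rate between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if curr_count < prev_count:
        # A cumulative counter only shrinks after a restart; refuse to guess.
        raise ValueError("counter went backwards (server restart?)")
    return (curr_count - prev_count) / interval_seconds


def below_historical_average(current: float, history: Sequence[float]) -> bool:
    """True when the current value is lower than the mean of past values."""
    return current < mean(history)
```

For example, a counter that moves from 1000 to 1600 over 60 seconds gives a rate of 10 queries per second.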
In this embodiment, when the fault cause is slow SQL, the fault conditions include: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is low and plots as a flat line; there are currently unfinished SQL statements whose execution time exceeds the abnormal value and is at least at the minute level; the queries-per-second curve shows a non-rising trend during a period before the fault and its current value is below the historical average; and the MySQL database contains no partition tables. The corresponding preset fault event is slow SQL.
When the fault cause is high business concurrency, the fault conditions include: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is high and its curve is rising; the execution time of the currently unfinished SQL statements is below the abnormal value and at the second level; and the queries per second are high and their curve is rising during a period before the fault. The corresponding fault event is high business concurrency.
When the fault cause lies in MySQL's own mechanisms (for example, the presence of MySQL spin locks), the fault conditions include: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is high and its curve is rising; there are currently unfinished SQL statements whose execution time exceeds the abnormal value and is at least at the minute level; the queries-per-second curve shows a non-rising trend during a period before the fault and its current value is below the historical average; and the MySQL database contains partition tables. The corresponding fault event is MySQL spin lock.
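The three sets of fault conditions differ in only a few observations, which a small classifier makes explicit. The sketch below assumes the raw indexes have already been reduced to boolean and numeric features; the `Observation` fields, the event names, and the 60-second `ABNORMAL_SECONDS` threshold are hypothetical choices, since the patent leaves the abnormal value unspecified.

```python
from dataclasses import dataclass
from typing import Optional

ABNORMAL_SECONDS = 60.0  # assumed "abnormal value" (minute level); not given in the source


@dataclass(frozen=True)
class Observation:
    cpu_growth_matches: bool    # MySQL and server CPU usage grew at the same rate
    threads_high: bool          # number of SQL connection threads is high
    threads_rising: bool        # connection thread curve is rising
    longest_sql_seconds: float  # execution time of the longest unfinished SQL
    qps_rising: bool            # queries-per-second curve is rising
    qps_below_history: bool     # current QPS is below the historical average
    has_partition_tables: bool  # the database contains partition tables


def classify(o: Observation) -> Optional[str]:
    """Map an observation to a fault event, or None when no scene matches."""
    if not o.cpu_growth_matches:
        return None
    long_running = o.longest_sql_seconds > ABNORMAL_SECONDS
    qps_depressed = (not o.qps_rising) and o.qps_below_history
    threads_climbing = o.threads_high and o.threads_rising
    threads_flat_low = (not o.threads_high) and (not o.threads_rising)

    if threads_flat_low and long_running and qps_depressed and not o.has_partition_tables:
        return "slow_sql"
    if threads_climbing and not long_running and o.qps_rising:
        return "high_concurrency"
    if threads_climbing and long_running and qps_depressed and o.has_partition_tables:
        return "mysql_spin_lock"
    return None
```

Under these conditions, slow SQL and spin lock share the long-running statements and the depressed query rate; what separates them is the connection thread curve and the presence of partition tables.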
After the fault scenes have been preset, step S2 is executed: self-healing rules are preset in the database diagnosis platform according to the different fault scenes, each self-healing rule comprising a fault scene and the self-healing strategy corresponding to it. For example, when a slow SQL fault event is generated, the self-healing strategy is to kill the connection thread of the slow SQL to prevent it from continuing to execute; when a high business concurrency fault event is generated, the self-healing strategy is to adjust the resource configuration and concurrency parameters of the server; when a MySQL spin lock fault event is generated, the self-healing strategy is to adjust the number of partition tables of the MySQL database and change the CPU model.
Furthermore, in order to facilitate the completion of automatic repair of the CPU fault, the self-healing rule may further preset an execution object and an automatic execution instruction.
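A self-healing rule that also carries an execution object and an automatic execution instruction might look like the sketch below. The rule table, the event names, and the choice of `innodb_thread_concurrency` as the concurrency parameter are assumptions made for illustration; the patent names no specific parameter. `KILL <id>` and `SET GLOBAL` are standard MySQL statements, but nothing here is executed.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SelfHealingRule:
    fault_event: str       # fault event of the matching fault scene
    strategy: str          # human-readable self-healing strategy
    execution_object: str  # where the instruction runs (hypothetical labels)
    instruction: str       # instruction template with named placeholders


RULES: Dict[str, SelfHealingRule] = {
    "slow_sql": SelfHealingRule(
        "slow_sql", "kill the slow SQL connection thread",
        "mysql", "KILL {thread_id}"),
    "high_concurrency": SelfHealingRule(
        "high_concurrency", "adjust concurrency parameters",
        "mysql", "SET GLOBAL innodb_thread_concurrency = {limit}"),
    "mysql_spin_lock": SelfHealingRule(
        "mysql_spin_lock", "adjust partition table count and change CPU model",
        "operator", "review partition table count and CPU model"),
}


def plan(fault_event: str, **params: int) -> Optional[Tuple[str, str]]:
    """Return (execution object, rendered instruction) for a fault event."""
    rule = RULES.get(fault_event)
    if rule is None:
        return None
    # Force every parameter to int so nothing else can reach the SQL text.
    safe = {name: int(value) for name, value in params.items()}
    return rule.execution_object, rule.instruction.format(**safe)
```

The spin lock rule is routed to an operator because changing the CPU model is not something a SQL statement can do.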
After the fault scenes and the self-healing rules have been preset, step S3 is executed: when a CPU fault occurs on the server during operation and maintenance of the MySQL database, the collection indexes are collected and the collection results are matched against the fault conditions in the fault scenes to generate a fault event. Specifically, when a slow-response or abnormal event of the MySQL database is received during operation and maintenance, a CPU fault has been detected and a diagnosis must be initiated: the database diagnosis platform collects the preset collection indexes and matches the collection results against the fault conditions in the fault scenes to generate a fault event. If a fault condition matches, the adapted self-healing rule is selected. If no fault condition matches, the diagnosis strategy can be updated once the fault cause has been found by other means.
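The control flow of step S3, including the fallback when nothing matches, can be sketched as follows. The function `run_diagnosis`, the result keys, and the metric name in the usage are hypothetical; the collector is passed in as a callable so that the sketch stays independent of any particular monitoring tool.

```python
from typing import Callable, Dict, List, Optional, Tuple

Metrics = Dict[str, float]
Condition = Callable[[Metrics], bool]


def run_diagnosis(
    collect: Callable[[], Metrics],
    scenes: List[Tuple[str, Condition]],  # (fault event, fault condition)
    rules: Dict[str, str],                # fault event -> self-healing strategy
) -> Dict[str, object]:
    """Collect indexes once, match fault conditions, and pick a strategy."""
    metrics = collect()
    for event, condition in scenes:
        if condition(metrics):
            strategy: Optional[str] = rules.get(event)
            return {
                "event": event,
                "strategy": strategy,
                # A matched scene without a rule also calls for an update.
                "needs_strategy_update": strategy is None,
            }
    # No scene matched: the cause must be found by other means, after which
    # the diagnosis strategy can be extended with a new scene.
    return {"event": None, "strategy": None, "needs_strategy_update": True}
```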
In this embodiment, a slow SQL fault event is generated when the collection results simultaneously match the following fault conditions: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is low and plots as a flat line; there are currently unfinished SQL statements whose execution time exceeds the abnormal value and is at least at the minute level; the queries-per-second curve shows a non-rising trend during a period before the fault and its current value is below the historical average; and the MySQL database contains no partition tables.
A high business concurrency fault event is generated when the collection results simultaneously match the following fault conditions: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is high and its curve is rising; the execution time of the currently unfinished SQL statements is below the abnormal value and at the second level; and the queries per second are high and their curve is rising during a period before the fault.
A MySQL spin lock fault event is generated when the collection results simultaneously match the following fault conditions: the CPU usage of the MySQL database and that of the server increase at the same rate during a period before the fault; the number of SQL connection threads is high and its curve is rising; there are currently unfinished SQL statements whose execution time exceeds the abnormal value and is at least at the minute level; the queries-per-second curve shows a non-rising trend during a period before the fault and its current value is below the historical average; and the MySQL database contains partition tables.
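Several of these conditions depend on whether a curve is "rising", "non-rising" or a "flat line", which the patent does not define numerically. One simple way to decide, shown here purely as an assumption, is to fit a least-squares line to equally spaced samples and compare its slope, relative to the mean level, against a tolerance. The `tolerance` default is an arbitrary illustrative value.

```python
from statistics import mean
from typing import Sequence


def slope(values: Sequence[float]) -> float:
    """Least-squares slope of values sampled at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(values))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den


def trend(values: Sequence[float], tolerance: float = 0.05) -> str:
    """Classify a series as 'rising', 'falling' or 'flat'.

    The slope per sample is divided by the mean level so that the same
    tolerance works for CPU percentages, thread counts and query rates.
    """
    scale = max(abs(mean(values)), 1e-9)  # avoid division by zero
    relative = slope(values) / scale
    if relative > tolerance:
        return "rising"
    if relative < -tolerance:
        return "falling"
    return "flat"
```

With this helper, "non-rising" is simply any result other than `"rising"`, and two CPU curves can be compared by checking that their slopes agree within a chosen margin.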
Preferably, an automatic collection program can be set up to collect the indexes periodically and store the collected data in a collection database.
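A periodic collector of this kind can be sketched with the standard-library `sqlite3` module standing in for the collection database. The table layout, the function names, and the injected `sample` callable are hypothetical; a real deployment would read the values from the server and from MySQL.

```python
import sqlite3
import time
from typing import Callable, Dict


def init_store(conn: sqlite3.Connection) -> None:
    """Create the sample table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples (ts REAL, name TEXT, value REAL)"
    )


def collect_once(conn: sqlite3.Connection,
                 sample: Callable[[], Dict[str, float]],
                 now: float) -> int:
    """Store one reading of every index; return the number of rows written."""
    rows = [(now, name, value) for name, value in sample().items()]
    conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def collect_periodically(conn: sqlite3.Connection,
                         sample: Callable[[], Dict[str, float]],
                         period_seconds: float,
                         rounds: int,
                         sleep: Callable[[float], None] = time.sleep,
                         clock: Callable[[], float] = time.time) -> None:
    """Run a fixed number of collection rounds, pausing between them."""
    for i in range(rounds):
        collect_once(conn, sample, clock())
        if i < rounds - 1:
            sleep(period_seconds)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without waiting for real time to pass.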
Finally, step S4 is executed: the fault scene in the self-healing rules is matched according to the fault event, and the corresponding self-healing strategy is used to complete the automatic repair of the CPU fault.
For example, when a slow SQL fault event is generated, the connection thread of the slow SQL is killed to prevent it from continuing to execute; when a high business concurrency fault event is generated, the resource configuration and concurrency parameters of the server are adjusted; when a MySQL spin lock fault event is generated, the number of partition tables of the MySQL database is adjusted and the CPU model is changed.
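For the slow SQL case, the strategy amounts to finding long-running statements and terminating their connection threads. In MySQL the process list reports an id, a command and an elapsed time for each thread, and `KILL <id>` terminates a thread. The sketch below only builds the statements from rows that have already been fetched; the row shape and the 60-second default are assumptions.

```python
from typing import Iterable, List, Tuple

# (thread id, command, seconds in current state); assumed row shape
ProcessRow = Tuple[int, str, int]


def kill_statements(processlist: Iterable[ProcessRow],
                    min_seconds: int = 60) -> List[str]:
    """Build KILL statements for queries running at least min_seconds."""
    return [
        f"KILL {int(thread_id)};"
        for thread_id, command, seconds in processlist
        # Only active queries count; idle ("Sleep") connections are left alone.
        if command == "Query" and seconds >= min_seconds
    ]
```

Generating the statements separately from executing them leaves room for a review or dry-run step before any connection is actually terminated.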
Once the adapted self-healing rule has been found, its self-healing strategy is used to complete the automatic repair of the CPU fault; if no adapted self-healing rule can be found, the diagnosis strategy is updated.
Based on the same inventive concept, the embodiment of the invention also provides a readable storage medium, on which a computer program is stored, and the computer program can realize the fault diagnosis and self-healing method of the MySQL database CPU when being executed.
The readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device, such as, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the preceding. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. The computer program described herein may be downloaded from a readable storage medium to a respective computing/processing device or to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives the computer program from the network and forwards the computer program for storage in a readable storage medium in the respective computing/processing device.
Computer programs for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer program may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing electronic circuitry, such as programmable logic circuits, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), with state information for a computer program, which can execute computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, systems, and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer programs. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the programs, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer programs may also be stored in a readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the readable storage medium storing the computer program includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the computer program which is executed on the computer, other programmable apparatus or other devices implements the functions/acts specified in the flowchart and/or block diagram block or blocks.
In summary, the embodiment of the invention provides a MySQL database CPU fault diagnosis and self-healing method and a readable storage medium, and by presetting fault scenes and presetting adaptive self-healing rules according to different fault scenes, when a CPU fault occurs, the proper self-healing rules can be matched according to the fault scenes, so that the automatic repair of the CPU fault is completed, and the diagnosis is accurate and the repair efficiency is high.
The above description is only illustrative of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention, and any alterations and modifications made by those skilled in the art based on the above disclosure shall fall within the scope of the present invention. It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, the present invention is intended to include such modifications and alterations insofar as they come within the scope of the invention or the equivalents thereof.

Claims (10)

1. A MySQL database CPU fault diagnosis and self-healing method, characterized in that the method comprises the following steps:
presetting fault scenes, wherein each fault scene comprises a fault cause, collection indexes, a fault condition and a fault event;
presetting self-healing rules for the different fault scenes, wherein each self-healing rule comprises a fault scene and the self-healing strategy corresponding to that fault scene;
when a CPU (central processing unit) fault occurs on a server during operation and maintenance of the MySQL database, collecting the collection indexes, and generating the fault event when the collected results match the fault condition in the fault scene;
and matching the fault scene in the self-healing rules according to the fault event, and completing the automatic repair of the CPU fault by using the corresponding self-healing strategy.
2. The MySQL database CPU fault diagnosis and self-healing method according to claim 1, wherein the fault causes comprise slow SQL, high business concurrency, and the presence of a MySQL spin lock.
3. The MySQL database CPU fault diagnosis and self-healing method according to claim 2, wherein the collection indexes comprise the CPU usage of the server, the CPU usage of the MySQL database, the number of SQL connection threads, the query rate per second, slow log statistics, and partition table statistics of the MySQL database.
4. The MySQL database CPU fault diagnosis and self-healing method according to claim 3, wherein a slow SQL fault event is generated when the collected results simultaneously satisfy the following fault conditions: the CPU usage of the MySQL database and the CPU usage of the server grew at the same rate over a period of time before the fault; the number of SQL connection threads is low and its curve is flat; the execution time of the currently unfinished SQL is larger than an abnormal value and is at least at the minute level; the curve of the query rate per second over a period of time before the fault shows a non-rising trend and its current value is lower than the historical average; and no partition table exists in the MySQL database.
5. The MySQL database CPU fault diagnosis and self-healing method according to claim 4, wherein when the slow SQL fault event is generated, the connection threads of the slow SQL are killed to prevent the slow SQL from continuing to execute.
6. The MySQL database CPU fault diagnosis and self-healing method according to claim 3, wherein a high business concurrency fault event is generated when the collected results simultaneously satisfy the following fault conditions: the CPU usage of the MySQL database and the CPU usage of the server grew at the same rate over a period of time before the fault; the number of SQL connection threads is high and its curve shows a rising trend; the execution time of the currently unfinished SQL is smaller than an abnormal value and is at the second level; and the curve of the query rate per second over a period of time before the fault shows a rising trend.
7. The MySQL database CPU fault diagnosis and self-healing method according to claim 6, wherein when the high business concurrency fault event is generated, the resource configuration and concurrency parameters of the server are adjusted.
8. The MySQL database CPU fault diagnosis and self-healing method according to claim 3, wherein a MySQL spin lock fault event is generated when the collected results simultaneously satisfy the following fault conditions: the CPU usage of the MySQL database and the CPU usage of the server grew at the same rate over a period of time before the fault; the number of SQL connection threads is high and its curve shows a rising trend; the execution time of the currently unfinished SQL is larger than an abnormal value and is at least at the minute level; the curve of the query rate per second over a period of time before the fault shows a non-rising trend and its current value is lower than the historical average; and partition tables exist in the MySQL database.
9. The MySQL database CPU fault diagnosis and self-healing method according to claim 8, wherein when the MySQL spin lock fault event is generated, the number of partition tables of the MySQL database is adjusted and the model of the CPU is changed.
10. A readable storage medium having stored thereon a computer program, wherein the computer program, when executed, implements the MySQL database CPU fault diagnosis and self-healing method according to any one of claims 1-9.
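As an informal illustration of the fault conditions and strategies recited in claims 4 to 9, the following sketch classifies a snapshot of collected results into one of the three fault events. It is a reading aid under stated assumptions, not part of the claims: the `Snapshot` fields, the single 60-second threshold standing in for the "abnormal value" and the "minute level", and the helper names are hypothetical. `KILL <id>` is the standard MySQL statement for terminating a connection thread.

```python
from dataclasses import dataclass
from typing import List, Optional

MINUTE_LEVEL_S = 60.0  # assumed stand-in for the claims' "abnormal value"


@dataclass
class Snapshot:
    cpu_growth_matches: bool     # MySQL and server CPU usage grew at the same rate
    threads_high: bool           # number of SQL connection threads is high
    threads_rising: bool         # thread-count curve trends upward
    longest_unfinished_s: float  # execution time of the longest unfinished SQL
    qps_rising: bool             # query-rate curve trends upward before the fault
    qps_below_history: bool      # current query rate is below the historical average
    has_partition_tables: bool   # partition tables exist in the database


def classify(s: Snapshot) -> Optional[str]:
    """Map a snapshot to a fault event following claims 4, 6 and 8."""
    if not s.cpu_growth_matches:
        return None
    long_running = s.longest_unfinished_s >= MINUTE_LEVEL_S
    qps_depressed = (not s.qps_rising) and s.qps_below_history
    busy_threads = s.threads_high and s.threads_rising
    quiet_threads = (not s.threads_high) and (not s.threads_rising)
    if quiet_threads and long_running and qps_depressed and not s.has_partition_tables:
        return "slow SQL"                   # claim 4
    if busy_threads and not long_running and s.qps_rising:
        return "high business concurrency"  # claim 6
    if busy_threads and long_running and qps_depressed and s.has_partition_tables:
        return "MySQL spin lock"            # claim 8
    return None


# Self-healing strategies paraphrased from claims 5, 7 and 9.
STRATEGIES = {
    "slow SQL": "kill the connection threads of the slow SQL",
    "high business concurrency": "adjust server resource configuration and concurrency parameters",
    "MySQL spin lock": "adjust the number of partition tables and change the CPU model",
}


def kill_statements(thread_ids: List[int]) -> List[str]:
    """Build MySQL KILL statements for the given connection thread ids."""
    return [f"KILL {tid};" for tid in thread_ids]


snapshot = Snapshot(True, False, False, 300.0, False, True, False)
fault = classify(snapshot)
print(fault, "->", STRATEGIES.get(fault))
print(kill_statements([101, 205]))
```

The three conditions are mutually exclusive in this sketch: slow SQL and spin lock differ in thread count and in the presence of partition tables, while high business concurrency is the only case with short-running SQL and a rising query rate.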
CN202310904872.XA 2023-07-24 2023-07-24 MySQL database CPU fault diagnosis and self-healing method and readable storage medium Active CN116662059B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310904872.XA CN116662059B (en) 2023-07-24 2023-07-24 MySQL database CPU fault diagnosis and self-healing method and readable storage medium

Publications (2)

Publication Number Publication Date
CN116662059A true CN116662059A (en) 2023-08-29
CN116662059B CN116662059B (en) 2023-10-24

Family

ID=87722658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310904872.XA Active CN116662059B (en) 2023-07-24 2023-07-24 MySQL database CPU fault diagnosis and self-healing method and readable storage medium

Country Status (1)

Country Link
CN (1) CN116662059B (en)

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005111867A2 (en) * 2004-05-03 2005-11-24 Microsoft Corporation Systems and methods for automatic database or file system maintenance and repair
US20080313498A1 (en) * 2007-06-15 2008-12-18 Jennings Derek M Diagnosing Changes in Application Behavior Based on Database Usage
US20110087517A1 (en) * 2009-10-12 2011-04-14 Abbott Patrick D Targeted Equipment Monitoring System and Method for Optimizing Equipment Reliability
US20120144234A1 (en) * 2010-12-03 2012-06-07 Teradata Us, Inc. Automatic error recovery mechanism for a database system
KR101296778B1 (en) * 2012-09-18 2013-08-14 (주)카디날정보기술 Method of eventual transaction processing on nosql database
KR20130091130A (en) * 2012-02-07 2013-08-16 에스케이씨앤씨 주식회사 Monitoring method for estimating system failure with multiple failure condition and monitoring server using the same
US20140052826A1 (en) * 2012-08-20 2014-02-20 International Business Machines Corporation Techniques for performing processing for database
US9274872B1 (en) * 2013-09-27 2016-03-01 Emc Corporation Set-based bugs discovery system via SQL query
US20180285239A1 (en) * 2017-03-31 2018-10-04 Microsoft Technology Licensing, Llc Scenarios based fault injection
CN108874642A (en) * 2018-05-25 2018-11-23 平安科技(深圳)有限公司 SQL method for monitoring performance, device, computer equipment and storage medium
CN109088773A (en) * 2018-08-24 2018-12-25 广州视源电子科技股份有限公司 Fault self-healing method and device, server and storage medium
CN110674014A (en) * 2019-09-16 2020-01-10 中国银联股份有限公司 Method and device for determining abnormal query request
CN112506951A (en) * 2020-12-07 2021-03-16 海南车智易通信息技术有限公司 Processing method, server, computing device and system for database slow query log
CN113886130A (en) * 2021-10-21 2022-01-04 深信服科技股份有限公司 Method, device and medium for processing database fault
CN114265860A (en) * 2021-12-22 2022-04-01 中国电信股份有限公司 Execution statement identification method and device
CN114996104A (en) * 2022-06-30 2022-09-02 建信金融科技有限责任公司 Data processing method and device
CN115658420A (en) * 2022-09-15 2023-01-31 交控科技股份有限公司 Database monitoring method and system
CN115994044A (en) * 2023-01-09 2023-04-21 苏州浪潮智能科技有限公司 Database fault processing method and device based on monitoring service and distributed cluster
CN116049146A (en) * 2023-02-13 2023-05-02 北京优特捷信息技术有限公司 Database fault processing method, device, equipment and storage medium
CN116166338A (en) * 2023-02-16 2023-05-26 平安付科技服务有限公司 Database fault removal method, device, equipment and storage medium
CN116186777A (en) * 2023-02-24 2023-05-30 企知道科技有限公司 Audit method and device for MPP database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王伟;: "以开放式数据库玩转互联网金融", 金融电子化, no. 03 *

Also Published As

Publication number Publication date
CN116662059B (en) 2023-10-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant