CN117938624A - Online service fault exercise method, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117938624A
Authority
CN
China
Prior art keywords
fault
log
service
exercise
service instance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311816267.3A
Other languages
Chinese (zh)
Inventor
刘春锋
赵宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seashell Housing Beijing Technology Co Ltd
Original Assignee
Seashell Housing Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seashell Housing Beijing Technology Co Ltd
Priority to CN202311816267.3A
Publication of CN117938624A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the invention provides an online service fault exercise method, electronic equipment and a storage medium. The method comprises the following steps: sending a fault exercise task by calling an interface corresponding to a service instance, wherein the fault exercise task comprises a service ID, a service instance ID and fault exercise information, the fault exercise information comprises pre-written log content, and the service ID and the service instance ID uniquely correspond to the service instance; writing the log content into a temporary log file of the service instance; monitoring the updated log content in the temporary log file by using a log collection component; and making an alarm decision based on the monitored log content and a preset alarm rule. The embodiment of the invention provides a service fault exercise scheme that does not harm online business, which reduces the cost of online service fault exercises and improves their safety and effectiveness.

Description

Online service fault exercise method, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to an online service fault exercise method, electronic equipment and a storage medium.
Background
In the prior art, an online service fault exercise is carried out by actually intruding into the code of the online service, using a self-developed or open-source tool, so as to trigger the service's monitoring alarms. In practice, different services are written in different languages, the intrusion means and the cost of code intrusion differ from language to language, and for some languages no code-intrusion tool has been developed so far. Fault exercises based on code intrusion therefore have poor universality and high cost. More importantly, the faults injected through code intrusion can affect the operation of the service.
Disclosure of Invention
To address the defects of the prior art, the embodiment of the invention provides an online service fault exercise method, electronic equipment and a storage medium.
The embodiment of the invention provides an online service fault exercise method, which comprises the following steps: sending a fault exercise task by calling an interface corresponding to the service instance; the fault exercise task comprises a service ID, a service instance ID and fault exercise information, wherein the fault exercise information comprises pre-written log content, and the service ID and the service instance ID uniquely correspond to the service instance; writing the log content into a temporary log file of the service instance, monitoring the updated log content in the temporary log file by using a log acquisition component, and performing alarm decision based on the monitored log content and a preset alarm rule.
According to the online service fault exercise method provided by the embodiment of the invention, the alarm decision is made based on the monitored log content and the preset alarm rule, and the method comprises the following steps: acquiring a log storage mode when the service normally runs, and storing the monitored log content based on the log storage mode; and acquiring the log content stored based on the log storage mode, and performing alarm decision based on the log content and a preset alarm rule.
According to the online service fault exercise method provided by the embodiment of the invention, the interface is provided by the proxy component of the service instance; the writing the log content into the temporary log file of the service instance includes: replacing preset content in a template script of the proxy component with the log content to obtain a proxy component running script; and writing the log content into the temporary log file of the service instance by running the proxy component running script.
According to the online service fault exercise method provided by the embodiment of the invention, the fault exercise information also comprises the writing frequency; the writing the log content into the temporary log file of the service instance includes: and writing the log content into the temporary log file of the service instance according to the writing frequency.
According to the online service fault exercise method provided by the embodiment of the invention, the writing the log content into the temporary log file of the service instance includes: in response to the temporary log file not existing under the preset directory of the service instance, creating the temporary log file under the preset directory and writing the log content into the temporary log file; and in response to the temporary log file existing under the preset directory of the service instance, appending the log content to the temporary log file.
According to the online service fault exercise method provided by the embodiment of the invention, the method further comprises the following step: after detecting that the log content meets the alarm rule, sending corresponding alarm information to terminal equipment held by a preset receiver, so that the preset receiver can perform a loss-stopping operation according to the alarm information.
According to the online service fault exercise method provided by the embodiment of the invention, the method further comprises the following steps: and acquiring loss stopping records generated by the loss stopping operation, judging the correctness of the loss stopping operation according to the loss stopping records and the task content of the fault exercise task, and giving out a fault exercise score.
According to the online service fault exercise method provided by the embodiment of the invention, the sending a fault exercise task by calling the interface corresponding to the service instance includes: sending the fault exercise task to a dispatching center that has the operation authority to call the interface of the service instance, the dispatching center then sending the fault exercise task by calling the interface of the service instance.
The embodiment of the invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of any one of the online service fault exercise methods when executing the program.
The embodiments of the present invention also provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the online service fault exercise method as described in any of the above.
The embodiment of the invention also provides a computer program product, which comprises a computer program, wherein the computer program realizes the steps of the online service fault exercise method when being executed by a processor.
According to the online service fault exercise method, the electronic device and the storage medium provided by the embodiments of the invention, a fault exercise task is sent by calling the interface corresponding to the service instance; the fault exercise task comprises the service ID, the service instance ID and the fault exercise information; the fault exercise information comprises the pre-written log content; and the service ID and the service instance ID uniquely correspond to the service instance. The log content is written into the temporary log file of the service instance, the log collection component monitors the updated log content in the temporary log file, and an alarm decision is made based on the monitored log content and a preset alarm rule.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of an online service fault exercise method according to an embodiment of the present invention;
FIG. 2 is a second flowchart of an online service fault exercise method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an online service fault exercise device according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a schematic flow chart of an online service fault exercise method according to an embodiment of the present invention. As shown in fig. 1, the method includes:
Step S1, sending a fault exercise task by calling an interface corresponding to a service instance; the fault exercise task comprises a service ID, a service instance ID and fault exercise information, wherein the fault exercise information comprises pre-written log content, and the service ID and the service instance ID uniquely correspond to the service instance.
One service may run multiple service instances. Multiple service instances may run on the same or different virtual machines. The service is uniquely identified with a service ID. The service instance ID is an identification of the service instance of the service. The service ID and service instance ID combination may uniquely identify a service instance of a certain service. I.e. the service ID and the service instance ID uniquely correspond to the service instance.
A fault exercise needs to simulate the operating condition of a service instance, so the fault exercise information must be sent to the corresponding service instance. The fault exercise task is sent by calling the interface corresponding to the service instance; the task comprises a service ID, a service instance ID and fault exercise information, and the fault exercise information comprises log content.
A device that hosts service instances may run multiple instances of multiple services, and different service instances may run on different virtual machines. The fault exercise task therefore needs to include the service ID and the service instance ID in order to determine which virtual machine's interface, that is, which service instance's interface, to call.
The fault exercise task also comprises log content, which is written in advance according to the fault exercise task. For example, log content corresponding to various fault situations can be prepared in advance, based on a prediction of the logs that those faults would generate.
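The task described in Step S1 can be pictured as a small record. The following is a minimal sketch only; the class names, field names and sample values (`FaultExerciseTask`, `svc-order`, `inst-01`) are hypothetical and do not come from the patent, which does not specify a wire format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class FaultExerciseInfo:
    # Pre-written log content that imitates what a real fault would log.
    log_content: str


@dataclass(frozen=True)
class FaultExerciseTask:
    service_id: str            # identifies the service
    service_instance_id: str   # identifies one instance of that service
    info: FaultExerciseInfo

    def target_key(self) -> Tuple[str, str]:
        # The (service ID, service instance ID) pair identifies the target instance.
        return (self.service_id, self.service_instance_id)

    def to_json(self) -> str:
        # Serialized form that could be passed to the instance's interface.
        return json.dumps(asdict(self), ensure_ascii=False)


# Hypothetical sample task.
task = FaultExerciseTask(
    service_id="svc-order",
    service_instance_id="inst-01",
    info=FaultExerciseInfo(log_content="ERROR db connection timeout"),
)
```

Keeping the two IDs as separate fields mirrors the text: neither ID alone is assumed to be enough to pick the interface to call.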
Step S2, writing the log content into a temporary log file of the service instance, monitoring the updated log content in the temporary log file by using a log collection component, and making an alarm decision based on the monitored log content and a preset alarm rule.
The service instance needs to be able to process the received log content so as to simulate the way the service generates logs during normal operation. The log content is written into a temporary log file of the service instance. This temporary log file and the service log file generated when the service instance operates normally are two different files. A log collection component, such as Filebeat, is deployed in advance on the machine hosting the service instance and is used to monitor the updated log content in the temporary log file. An alarm decision is then made based on the monitored log content and a preset alarm rule.
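To illustrate what "monitoring the updated log content" means, the sketch below uses a toy stand-in for a log collection component that remembers how far it has read and returns only the lines added since the last poll. It is an assumption-laden illustration, not a description of how Filebeat is configured or implemented; `TempLogTailer` and `demo_tail` are hypothetical names.

```python
import os
import tempfile
from typing import List, Tuple


class TempLogTailer:
    """Toy stand-in for a log collection component: yields only new lines."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.offset = 0  # number of bytes already collected

    def poll(self) -> List[str]:
        # Nothing to collect until the temporary log file has been created.
        if not os.path.exists(self.path):
            return []
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        return data.decode("utf-8").splitlines()


def demo_tail() -> Tuple[List[str], List[str]]:
    # Write twice to a temporary a.log and poll after each write.
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "a.log")
        tailer = TempLogTailer(path)
        with open(path, "a", encoding="utf-8") as f:
            f.write("ERROR first\n")
        first = tailer.poll()
        with open(path, "a", encoding="utf-8") as f:
            f.write("ERROR second\n")
        second = tailer.poll()
        return first, second
```

The second poll returns only the second line, which is the behaviour the alarm decision relies on: each update to the temporary file is seen once.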
According to the online service fault exercise method provided by the embodiment of the invention, a fault exercise task is sent by calling the interface corresponding to the service instance; the fault exercise task comprises the service ID, the service instance ID and the fault exercise information; the fault exercise information comprises the pre-written log content; and the service ID and the service instance ID uniquely correspond to the service instance. The log content is written into the temporary log file of the service instance, the log collection component monitors the updated log content in the temporary log file, and an alarm decision is made based on the monitored log content and a preset alarm rule. An online service fault exercise scheme that does not harm online business is thus provided: no code intrusion into the original service code is needed, the cost of online service fault exercises is reduced, and their safety and effectiveness are improved.
According to the online service fault exercise method provided by the embodiment of the invention, the alarm decision is made based on the monitored log content and the preset alarm rule, and the method comprises the following steps: acquiring a log storage mode when the service normally runs, and storing the monitored log content based on the log storage mode; and acquiring the log content stored based on the log storage mode, and performing alarm decision based on the log content and a preset alarm rule.
When a service runs normally, the service logs it generates are usually stored in a log center, and whether the service is running normally is judged from the log content obtained from the log center. The log content obtained from the log center also carries information such as the identification of the log center. To stay consistent with the normal operation of the service, to avoid errors relating to data source, data format and the like, and to reduce cost by reusing existing equipment capabilities, in the embodiment of the invention the alarm decision is made as follows: the log storage mode used when the service runs normally is obtained, and the monitored log content is stored based on that log storage mode; the log content stored in this way is then obtained, and the alarm decision is made based on it and the preset alarm rule. The log storage mode comprises the manner in which logs are stored into the log center and the information of the corresponding log center.
When the log content is stored in the log center, if the service instance stores its service log directly in the log center, the monitored log content can also be stored directly in the log center. If the service instance first stores its service log in Kafka and the log then reaches the log center through the transmission of Kafka data, the monitored log content may likewise first be stored in the Kafka topic used for the service log of the service instance and then be written into the log center through the transmission of Kafka data.
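The two storage paths above can be sketched with in-memory stand-ins. This is a hedged illustration: `LogCenter`, `QueueTopic`, `store_monitored` and the mode names `"direct"` and `"via_queue"` are hypothetical, and no real Kafka client is involved.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class LogCenter:
    """In-memory stand-in for the log center."""
    records: List[str] = field(default_factory=list)


@dataclass
class QueueTopic:
    """In-memory stand-in for a message-queue topic such as a Kafka topic."""
    name: str
    messages: List[str] = field(default_factory=list)

    def drain_to(self, center: LogCenter) -> None:
        # Models the later transmission of queued data into the log center.
        center.records.extend(self.messages)
        self.messages.clear()


def store_monitored(
    lines: Iterable[str],
    mode: str,
    center: LogCenter,
    topic: Optional[QueueTopic] = None,
) -> None:
    # Follow the same path the service uses for its normal logs.
    if mode == "direct":
        center.records.extend(lines)
    elif mode == "via_queue":
        if topic is None:
            raise ValueError("mode 'via_queue' requires a topic")
        topic.messages.extend(lines)
    else:
        raise ValueError(f"unknown log storage mode: {mode}")
```

The point of the branch is only that the exercise log takes whichever route the real service log takes, so the alarm platform sees it in the same form.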
According to the online service fault exercise method provided by the embodiment of the invention, the log storage mode of the normal operation of the service is obtained, the monitored log content is stored based on the log storage mode, the log content stored based on the log storage mode is obtained, the alarm decision is made based on the log content and the preset alarm rule, and the reliability of the online service fault exercise is improved.
According to the online service fault exercise method provided by the embodiment of the invention, the interface is provided by the proxy component of the service instance; the writing the log content into the temporary log file of the service instance includes: replacing preset content in a template script of the proxy component with the log content to obtain a proxy component running script; and writing the log content into the temporary log file of the service instance by running the proxy component running script.
The interface through which the service instance receives the fault exercise information is provided by a proxy component of the service instance; the proxy component is also referred to as an agent component.
The agent component receives the fault exercise task through the interface and writes the log content in the fault exercise task into the temporary log file. A template script of the agent component can be preset. After the fault exercise task is received, the preset content in the template script is replaced with the log content in the fault exercise information to obtain the agent component running script, and the log content is written into the temporary log file by running that script.
According to the online service fault exercise method provided by the embodiment of the invention, the agent component running script is obtained by replacing the preset content in the template script of the agent component with the log content, and the log content is written into the temporary log file by running the agent component running script, so that the log content is written into the temporary log file more quickly and accurately.
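The template-replacement step can be sketched with Python's `string.Template`. The shell one-liner used as the template, the placeholder names and the path `/tmp/a.log` are all assumptions made for illustration; the patent does not disclose the actual script. The sketch only renders the script text and does not execute it.

```python
import shlex
from string import Template

# Hypothetical template script: appends one line of log content to a file.
# ${LOG_CONTENT} and ${LOG_PATH} are the preset content to be replaced.
SCRIPT_TEMPLATE = Template("printf '%s\\n' ${LOG_CONTENT} >> ${LOG_PATH}\n")


def render_agent_script(log_content: str, log_path: str) -> str:
    # shlex.quote keeps quotes or spaces in the log content from breaking
    # (or injecting commands into) the generated shell script.
    return SCRIPT_TEMPLATE.substitute(
        LOG_CONTENT=shlex.quote(log_content),
        LOG_PATH=shlex.quote(log_path),
    )


# Rendered text of the hypothetical running script.
script = render_agent_script("ERROR db timeout", "/tmp/a.log")
```

Quoting the substituted values is a design choice of this sketch: the log content arrives from outside the agent, so it is treated as untrusted input to the script.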
According to the online service fault exercise method provided by the embodiment of the invention, the fault exercise information also comprises the writing frequency; the writing the log content into the temporary log file of the service instance includes: and writing the log content into the temporary log file of the service instance according to the writing frequency.
The fault drill information includes log content and a write frequency, which is used to indicate a write speed of the log content, for example, in units of lines/second.
When the log content is written into the temporary log file, the log content is written into the temporary log file according to the writing frequency in the fault drilling information.
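A writing frequency in lines per second can be honoured by pausing between lines. The sketch below is one simple way to do this under stated assumptions: `write_at_frequency` is a hypothetical helper, and the `sleep` parameter is injectable so the pacing can be observed without actually waiting.

```python
import os
import tempfile
import time
from typing import Callable, Iterable


def write_at_frequency(
    path: str,
    lines: Iterable[str],
    lines_per_second: float,
    sleep: Callable[[float], object] = time.sleep,
) -> int:
    """Append lines to path, pausing 1/lines_per_second after each line."""
    if lines_per_second <= 0:
        raise ValueError("lines_per_second must be positive")
    interval = 1.0 / lines_per_second
    written = 0
    with open(path, "a", encoding="utf-8") as f:
        for line in lines:
            f.write(line.rstrip("\n") + "\n")
            f.flush()  # make the line visible to a tailing collector promptly
            written += 1
            sleep(interval)
    return written


# Usage with a recording stand-in for sleep, so the example runs instantly.
pauses = []
demo_path = os.path.join(tempfile.mkdtemp(), "a.log")
count = write_at_frequency(demo_path, ["ERROR a", "ERROR b"], 4, sleep=pauses.append)
```

With a frequency of 4 lines/second the helper requests a 0.25 s pause after each line; a production agent would likely also account for the time spent writing.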
According to the online service fault exercise method provided by the embodiment of the invention, the log content is written into the temporary log file according to the writing frequency, so that the rate control of writing the log content is improved.
According to the online service fault exercise method provided by the embodiment of the invention, the writing the log content into the temporary log file of the service instance includes: in response to the temporary log file not existing under the preset directory of the service instance, creating the temporary log file under the preset directory and writing the log content into the temporary log file; and in response to the temporary log file existing under the preset directory of the service instance, appending the log content to the temporary log file.
After receiving the fault exercise task, the agent component of the service instance checks whether the temporary log file exists in the preset directory. If not, it creates the temporary log file in the preset directory and writes the log content in the fault exercise information into it. If the temporary log file already exists in the preset directory, the log content is written in append mode, that is, the newly received log content is appended to the temporary log file.
Thus, content can be dynamically appended to the newly created temporary log file. In addition, the writing frequency and the log content used when appending can be dynamically adjusted according to the requirements of the alarm rule.
According to the online service fault exercise method provided by the embodiment of the invention, in response to the temporary log file not existing in the preset directory, the temporary log file is created there and the log content is written into it; in response to the temporary log file existing in the preset directory, the log content is appended to it. Writing and appending of log content are thus both realized by means of the temporary log file.
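The create-or-append behaviour can be sketched as follows. `write_temp_log` is a hypothetical helper, and the explicit existence check is kept only to mirror the two branches in the text; in Python, opening a file in `"a"` mode alone already creates it when it is missing.

```python
import os
import tempfile


def write_temp_log(preset_dir: str, filename: str, content: str) -> str:
    """Create the temporary log file if missing, otherwise append to it."""
    path = os.path.join(preset_dir, filename)
    if not os.path.exists(path):
        # Branch 1: the file does not exist, so create it and write.
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)
        return "created"
    # Branch 2: the file exists, so append the newly received content.
    with open(path, "a", encoding="utf-8") as f:
        f.write(content)
    return "appended"


# Usage in a throwaway directory standing in for the preset directory.
preset_dir = tempfile.mkdtemp()
first_result = write_temp_log(preset_dir, "a.log", "ERROR 1\n")
second_result = write_temp_log(preset_dir, "a.log", "ERROR 2\n")
```

The returned string is only there to make the chosen branch visible; it is not part of the method described in the source.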
According to the online service fault exercise method provided by the embodiment of the invention, the method further comprises the following step: after detecting that the log content meets the alarm rule, sending corresponding alarm information to terminal equipment held by a preset receiver, so that the preset receiver can perform a loss-stopping operation according to the alarm information.
After it is detected that the log content meets the alarm rule, corresponding alarm information is sent to the terminal equipment held by a preset receiver, who can be a fault alarm receiver. According to the alarm information, the fault alarm receiver can log in to a cloud operation platform that has the authority to deploy and operate service instances and perform the relevant loss-stopping operation.
According to the online service fault exercise method provided by the embodiment of the invention, after it is detected that the log content meets the alarm rule, the corresponding alarm information is sent to the terminal equipment held by the preset receiver, so that the preset receiver can perform a loss-stopping operation according to the alarm information, which promotes the elimination of online faults.
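The source does not say what form the preset alarm rule takes. As one plausible illustration, the sketch below assumes a rule of the form "at least N collected lines contain a keyword" and passes the resulting message to a caller-supplied send function. `AlarmRule`, `decide_alarm` and `notify_receivers` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class AlarmRule:
    # Assumed rule shape: fire when `threshold` or more lines contain `keyword`.
    keyword: str
    threshold: int


def decide_alarm(lines: Iterable[str], rule: AlarmRule) -> Optional[str]:
    """Return an alarm message if the rule is met, otherwise None."""
    hits = sum(1 for line in lines if rule.keyword in line)
    if hits >= rule.threshold:
        return f"ALARM: {hits} lines matched '{rule.keyword}'"
    return None


def notify_receivers(
    message: str,
    receivers: List[str],
    send: Callable[[str, str], object],
) -> int:
    """Send the alarm message to every preset receiver; return how many were sent."""
    for receiver in receivers:
        send(receiver, message)
    return len(receivers)
```

Because the rule counts matching lines, the writing frequency and the amount of injected log content together determine whether an exercise actually trips the alarm.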
According to the online service fault exercise method provided by the embodiment of the invention, the method further comprises the following steps: and acquiring loss stopping records generated by the loss stopping operation, judging the correctness of the loss stopping operation according to the loss stopping records and the task content of the fault exercise task, and giving out a fault exercise score.
The method comprises the steps of obtaining loss stopping records generated by loss stopping operation through communication with equipment (such as a cloud operation platform) for performing the loss stopping operation, obtaining a preset loss stopping scheme according to task content of a fault exercise task, comparing the loss stopping records with the preset loss stopping scheme, judging the accuracy of the loss stopping operation, and giving a fault exercise score.
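The comparison between the loss-stopping records and the preset loss-stopping scheme can be sketched as a simple coverage score. The scoring formula here (percentage of expected steps that appear in the records) is an assumption made for illustration; the source does not define how the score is computed, and `score_exercise` and the step names are hypothetical.

```python
from typing import Iterable, List


def score_exercise(records: Iterable[str], expected_steps: List[str]) -> float:
    """Score = percentage of expected loss-stopping steps found in the records."""
    if not expected_steps:
        raise ValueError("the preset loss-stopping scheme must not be empty")
    performed = set(records)
    matched = [step for step in expected_steps if step in performed]
    return round(100.0 * len(matched) / len(expected_steps), 1)


# Hypothetical records and preset scheme.
example_score = score_exercise(
    records=["restart_instance", "rollback"],
    expected_steps=["rollback", "scale_out"],
)
```

In this example one of the two expected steps was performed, so the score is 50.0; extra actions that are not in the scheme are ignored rather than penalized.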
According to the online service fault exercise method provided by the embodiment of the invention, the loss stopping record generated by the loss stopping operation is obtained, the correctness of the loss stopping operation is judged according to the task content of the fault exercise task, and the fault exercise score is given, so that the evaluation of the fault exercise is realized.
According to the online service fault exercise method provided by the embodiment of the invention, the sending a fault exercise task by calling the interface corresponding to the service instance includes: sending the fault exercise task to a dispatching center that has the operation authority to call the interface of the service instance, the dispatching center then sending the fault exercise task by calling the interface of the service instance.
To improve security, communication with the service instance requires the corresponding permission. When the fault exercise task is sent by calling the interface corresponding to the service instance, the task can first be sent to a dispatching center that has the operation authority to call the interface of the service instance, and the dispatching center then sends the task by calling that interface. Alternatively, the operation authority to call the interface of the service instance may be obtained before the fault exercise task is sent by calling the interface.
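The role of the dispatching center can be sketched as a thin gate in front of the instance interface. This is a minimal illustration under assumptions: `DispatchCenter`, its permission set and the `call_interface` callback are hypothetical, and a real system would authenticate callers rather than consult an in-memory set.

```python
from typing import Callable, Dict, Iterable, Tuple

InstanceKey = Tuple[str, str]  # (service ID, service instance ID)


class DispatchCenter:
    """Forwards tasks only to instances it is authorized to operate."""

    def __init__(
        self,
        permitted: Iterable[InstanceKey],
        call_interface: Callable[[InstanceKey, Dict], object],
    ) -> None:
        self._permitted = set(permitted)
        self._call_interface = call_interface

    def send(self, service_id: str, instance_id: str, payload: Dict) -> object:
        key = (service_id, instance_id)
        if key not in self._permitted:
            raise PermissionError(f"no operation authority for {key}")
        # Authorized: call the interface of the selected service instance.
        return self._call_interface(key, payload)
```

Routing every task through one authorized component means the fault exercise platform itself never needs direct access to the service instances.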
According to the online service fault exercise method provided by the embodiment of the invention, the fault exercise task is sent through the dispatching center with the operation authority of the interface for calling the service instance, so that the safety is improved.
Fig. 2 is a second flowchart of an on-line service fault exercise method according to an embodiment of the present invention. As shown in fig. 2, the method includes the following:
Deploying the Filebeat log collection component when the virtual machine of the online instance is created, monitoring the a.log log file under the /data0/www/applogs directory (a directory set or selected according to the requirements of the log collection component) in real time, and redirecting the file content into a designated kafka-log-topic;
Deploying an agent component when the virtual machine of the online instance is created, for receiving tasks issued by an api-server (a controller or dispatching center that has the authority to operate the service instance);
Creating a fault exercise task on a fault exercise platform, and setting the ID of the service to be exercised (serviceId), the log content to be injected and the writing frequency (lines/second);
Selecting a service instance (which has a service instance ID) on the fault exercise platform, and issuing the fault exercise task (comprising the serviceId, the service instance ID, the log content and the writing frequency) to the selected service instance through the api-server dispatching center;
After receiving the task issued by the api-server, the agent program on the service instance creates a temporary a.log file (the temporary log file) under /data0/www/applogs and writes the set log content into the a.log file at the writing frequency set in the exercise task;
The Filebeat program on the service instance detects that /data0/www/applogs/a.log has been updated, automatically collects the updated log content and redirects it to the designated kafka-log-topic;
After receiving the message in the kafka-log-topic, the log center sends the log content to an alarm platform; the alarm platform detects that the log meets an alarm rule and sends alarm information to the terminal equipment held by the persons responsible for the service;
The persons responsible for the service go to the cloud operation platform and perform the loss-stopping operation according to the alarm content;
The fault exercise platform queries the loss-stopping records related to the exercised service in real time, judges whether the loss-stopping action is correct according to the content of the exercise task, gives an exercise score, and informs the persons responsible for the service.
According to the embodiment of the invention, abnormal content is written directly into a log file, replacing the prior-art approach of producing abnormal content through code intrusion. This reduces the cost of online service fault exercises, improves their safety and effectiveness, and effectively promotes the practical adoption of online fault exercises.
The preferred embodiments of the present embodiment may be freely combined on the premise that the logic or structure does not conflict with each other, and the present invention is not limited to this.
The online service fault exercise device provided by the embodiment of the invention is described below, and the online service fault exercise device described below and the online service fault exercise method described above can be referred to correspondingly.
Fig. 3 is a schematic structural diagram of an on-line service fault exercise device according to an embodiment of the present invention. As shown in fig. 3, the apparatus includes a task sending module 10 and a fault exercise module 20, wherein: the task sending module 10 is configured to: sending a fault exercise task by calling an interface corresponding to the service instance; the fault exercise task comprises a service ID, a service instance ID and fault exercise information, wherein the fault exercise information comprises pre-written log content, and the service ID and the service instance ID uniquely correspond to the service instance; the fault drill module 20 is configured to: writing the log content into a temporary log file of the service instance, monitoring the updated log content in the temporary log file by using a log acquisition component, and performing alarm decision based on the monitored log content and a preset alarm rule.
According to the online service fault exercise device provided by the embodiment of the invention, a fault exercise task is sent by calling the interface corresponding to the service instance; the fault exercise task comprises the service ID, the service instance ID and the fault exercise information; the fault exercise information comprises the pre-written log content; and the service ID and the service instance ID uniquely correspond to the service instance. The log content is written into the temporary log file of the service instance, the log collection component monitors the updated log content in the temporary log file, and an alarm decision is made based on the monitored log content and a preset alarm rule. An online service fault exercise scheme that does not harm online business is thus provided: no code intrusion into the original service code is needed, the cost of online service fault exercises is reduced, and their safety and effectiveness are improved.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 4, the electronic device may include a processor 410, a communication interface 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform an online service fault exercise method, the method including: sending a fault exercise task by calling an interface corresponding to a service instance, wherein the fault exercise task includes a service ID, a service instance ID and fault exercise information, the fault exercise information includes pre-written log content, and the service ID and the service instance ID uniquely correspond to the service instance; and writing the log content into a temporary log file of the service instance, monitoring the updated log content in the temporary log file by using a log collection component, and making an alarm decision based on the monitored log content and a preset alarm rule.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, an embodiment of the present invention further provides a computer program product including a computer program that may be stored on a non-transitory computer-readable storage medium. When executed by a processor, the computer program performs the online service fault exercise method provided by the foregoing method embodiments, the method including: sending a fault exercise task by calling an interface corresponding to a service instance, wherein the fault exercise task includes a service ID, a service instance ID and fault exercise information, the fault exercise information includes pre-written log content, and the service ID and the service instance ID uniquely correspond to the service instance; and writing the log content into a temporary log file of the service instance, monitoring the updated log content in the temporary log file by using a log collection component, and making an alarm decision based on the monitored log content and a preset alarm rule.
In still another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium having a computer program stored thereon. When executed by a processor, the computer program performs the online service fault exercise method provided by the foregoing method embodiments, the method including: sending a fault exercise task by calling an interface corresponding to a service instance, wherein the fault exercise task includes a service ID, a service instance ID and fault exercise information, the fault exercise information includes pre-written log content, and the service ID and the service instance ID uniquely correspond to the service instance; and writing the log content into a temporary log file of the service instance, monitoring the updated log content in the temporary log file by using a log collection component, and making an alarm decision based on the monitored log content and a preset alarm rule.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence, or the part of it that contributes to the prior art, may be embodied in the form of a software product. The software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An online service fault exercise method, comprising:
sending a fault exercise task by calling an interface corresponding to a service instance, wherein the fault exercise task comprises a service ID, a service instance ID and fault exercise information, the fault exercise information comprises pre-written log content, and the service ID and the service instance ID uniquely correspond to the service instance; and
writing the log content into a temporary log file of the service instance, monitoring the updated log content in the temporary log file by using a log collection component, and making an alarm decision based on the monitored log content and a preset alarm rule.
2. The online service fault exercise method according to claim 1, wherein the making an alarm decision based on the monitored log content and a preset alarm rule comprises:
acquiring the log storage mode used when the service runs normally, and storing the monitored log content based on the log storage mode; and
acquiring the log content stored based on the log storage mode, and making the alarm decision based on the log content and the preset alarm rule.
3. The online service fault exercise method according to claim 1, wherein the interface is provided by a proxy component of the service instance, and the writing the log content into the temporary log file of the service instance comprises:
replacing preset content in a templated script of the proxy component with the log content to obtain a proxy component running script; and
writing the log content into the temporary log file of the service instance by running the proxy component running script.
4. The online service fault exercise method according to claim 1, wherein the fault exercise information further comprises a write frequency, and the writing the log content into the temporary log file of the service instance comprises:
writing the log content into the temporary log file of the service instance according to the write frequency.
5. The online service fault exercise method according to claim 1, wherein the writing the log content into the temporary log file of the service instance comprises:
in response to the temporary log file not existing under a preset directory of the service instance, creating the temporary log file under the preset directory and writing the log content into the temporary log file; and
in response to the temporary log file existing under the preset directory of the service instance, writing the log content into the temporary log file by appending.
6. The online service fault exercise method according to claim 1, further comprising:
after detecting that the log content satisfies the alarm rule, sending corresponding alarm information to a terminal device held by a preset recipient, so that the preset recipient performs a loss-stopping operation according to the alarm information.
7. The online service fault exercise method according to claim 6, wherein the method further comprises:
acquiring a loss-stopping record generated by the loss-stopping operation, judging the correctness of the loss-stopping operation according to the loss-stopping record and the task content of the fault exercise task, and giving a fault exercise score.
8. The online service fault exercise method according to claim 1, wherein the sending a fault exercise task by calling an interface corresponding to a service instance comprises:
sending the fault exercise task to a dispatch center that has operation authority to call the interface of the service instance, the dispatch center sending the fault exercise task by calling the interface of the service instance.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the online service fault exercise method according to any one of claims 1 to 8.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the online service fault exercise method according to any one of claims 1 to 8.
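
As a non-limiting illustration of the script substitution of claim 3 and the write frequency of claim 4, the two steps might be sketched as follows. The template text, the placeholder names `LOG_CONTENT` and `LOG_FILE`, the function names, and the interpretation of the write frequency as writes per second are all assumptions made for this sketch; the claims do not fix any of them.

```python
import shlex
import time
from string import Template

# Hypothetical templated script; ${LOG_CONTENT} and ${LOG_FILE} stand for the
# preset content that is replaced before the script is run.
SCRIPT_TEMPLATE = Template("printf '%s\\n' ${LOG_CONTENT} >> ${LOG_FILE}\n")


def render_proxy_script(log_content: str, log_file: str) -> str:
    """Replace the placeholders to obtain a runnable shell script."""
    return SCRIPT_TEMPLATE.substitute(
        LOG_CONTENT=shlex.quote(log_content),  # quoted so the content is never interpreted by the shell
        LOG_FILE=shlex.quote(log_file),
    )


def write_at_frequency(log_content: str, log_file: str,
                       writes_per_second: float, total_writes: int,
                       sleep=time.sleep) -> None:
    """Append the log content total_writes times, pacing the writes."""
    if writes_per_second <= 0:
        raise ValueError("writes_per_second must be positive")
    interval = 1.0 / writes_per_second
    for i in range(total_writes):
        with open(log_file, "a", encoding="utf-8") as f:
            f.write(log_content + "\n")
        if i < total_writes - 1:
            sleep(interval)  # pause between writes, not after the last one
```

Pacing the writes lets an exercise reproduce rate-based alarm conditions, for example a rule that fires only when errors recur within a time window.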
CN202311816267.3A 2023-12-26 2023-12-26 Online service fault exercise method, electronic equipment and storage medium Pending CN117938624A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311816267.3A CN117938624A (en) 2023-12-26 2023-12-26 Online service fault exercise method, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311816267.3A CN117938624A (en) 2023-12-26 2023-12-26 Online service fault exercise method, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117938624A true CN117938624A (en) 2024-04-26

Family

ID=90749824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311816267.3A Pending CN117938624A (en) 2023-12-26 2023-12-26 Online service fault exercise method, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117938624A (en)

Similar Documents

Publication Publication Date Title
US9229844B2 (en) System and method for monitoring web service
CN104834602B (en) A kind of program dissemination method, device and program delivery system
CN107800783B (en) Method and device for remotely monitoring server
CN108337266B (en) Efficient protocol client vulnerability discovery method and system
CN107241229A (en) A kind of business monitoring method and device based on interface testing instrument
CN113934621A (en) Fuzzy test method, system, electronic device and medium
CN109273045B (en) Storage device online detection method, device, equipment and readable storage medium
CN112653709A (en) Vulnerability detection method and device, electronic equipment and readable storage medium
CN112068935A (en) Method, device and equipment for monitoring deployment of Kubernetes program
CN108650123B (en) Fault information recording method, device, equipment and storage medium
CN113179180A (en) Basalt client disaster fault repairing method, basalt client disaster fault repairing device and basalt client disaster storage medium
CN117938624A (en) Online service fault exercise method, electronic equipment and storage medium
CN110069382B (en) Software monitoring method, server, terminal device, computer device and medium
CN111367934A (en) Data consistency checking method, device, server and medium
CN115883138A (en) Method, device, equipment and medium for polling running state of airborne entertainment system
CN111680974B (en) Method and device for positioning problems of electronic underwriting process
CN115190008A (en) Fault processing method, fault processing device, electronic device and storage medium
CN115065510A (en) Login method, device, system, electronic equipment and readable storage medium
CN113918405A (en) Log monitoring method and device, electronic equipment and storage medium
CN112363931A (en) Web system testing method and device
CN112230949A (en) Terminal software upgrading method and device, electronic equipment and storage medium
CN111400094A (en) Method, device, equipment and medium for restoring factory settings of server system
CN114816876B (en) Automatic test system for server Redfish interface specifications
CN113233269B (en) Method and device for diagnosing attack on elevator network
CN114356344B (en) Application deployment method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination