CN114201449A

CN114201449A - Log monitoring method and device, computer equipment and storage medium

Info

Publication number: CN114201449A
Application number: CN202111520439.3A
Authority: CN
Inventors: 张宇昂; 肖文浩; 於圣楠; 吴剑飞; 刘柏
Original assignee: Netease Hangzhou Network Co Ltd
Current assignee: Netease Hangzhou Network Co Ltd
Priority date: 2021-12-13
Filing date: 2021-12-13
Publication date: 2022-03-18

Abstract

The embodiments of the application disclose a log monitoring method and device, computer equipment, and a storage medium. In this scheme, log data generated by the same target service in a target application is transmitted to the same topic of a message system, and a log monitoring task is created for that topic. Monitoring subtasks corresponding to the monitoring rule information are then set under the log monitoring task, according to the monitoring rule information that applies to the log data of the target service. Each monitoring subtask performs anomaly detection on the log data according to its monitoring rule information, yielding a detection result for the log data of the target service. Finally, the abnormal log data is displayed according to the detection result, so that a user can check it promptly and the efficiency of log monitoring is improved.

Description

Log monitoring method and device, computer equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a log monitoring method and apparatus, a computer device, and a storage medium.

Background

With the development of internet technology, a wide variety of online games have been developed to meet users' entertainment needs. When an online game runs on a game server, a large amount of log data is generated. The log data helps developers quickly locate problems in a service or program, and allows the results of processing logs and events to be fed back to users in time.

Log data generated on existing game servers is managed as files: the logs are split by hour and stored on the local hard disk of the game server. When a service anomaly occurs, the logs are searched based on problems reported downstream. That is, logs are downloaded for the likely time points at which the problem occurred, and are then filtered by keyword to locate the problem and its context. However, because the volume of log data is large, this search process is complex and time-consuming, which hampers the detection of problems in the log files.

Disclosure of Invention

The embodiment of the application provides a log monitoring method and device, computer equipment and a storage medium, and log monitoring efficiency can be improved.

The embodiment of the application provides a log monitoring method, which comprises the following steps:

determining a target service needing to be monitored under a target application, and establishing a corresponding monitoring task for the target service, wherein the monitoring task is used for monitoring log data generated under the target service;

acquiring at least one piece of monitoring rule information of log data of the target application under the target service;

setting at least one monitoring subtask under the monitoring task based on the at least one monitoring rule information, wherein one monitoring subtask corresponds to one monitoring rule information;

and calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service to obtain a log monitoring result.

Correspondingly, the embodiment of the present application further provides a log monitoring device, including:

a first determining unit, configured to determine a target service to be monitored under a target application and to create a corresponding monitoring task for the target service, wherein the monitoring task is used for monitoring log data generated under the target service;

a first obtaining unit, configured to obtain at least one piece of monitoring rule information of log data of the target application in the target service;

the setting unit is used for setting at least one monitoring subtask under the monitoring task based on the at least one monitoring rule information, wherein one monitoring subtask corresponds to one monitoring rule information;

and the detection unit is used for calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service to obtain a log monitoring result.

In some embodiments, the apparatus further comprises:

the updating unit is used for updating the monitoring rule information when the updating operation of the monitoring rule information is detected to obtain updated monitoring rule information;

a first adjusting unit, configured to adjust a monitoring subtask under the monitoring task based on the updated monitoring rule information to obtain an adjusted monitoring subtask;

In some embodiments, the detection unit comprises:

and the detection subunit is used for calling the adjusted monitoring subtask to perform abnormal data detection on the log data corresponding to the target service.

In some embodiments, the detection unit comprises:

the judging unit is used for judging whether the log data corresponding to the target service meets the monitoring rule information corresponding to the monitoring subtask;

and the first determining subunit is configured to determine, if the log data satisfies the monitoring rule information corresponding to the monitoring subtask, that the log data satisfying the monitoring rule information corresponding to the monitoring subtask is abnormal log data.

In some embodiments, the apparatus further comprises:

the second acquisition unit is used for counting the abnormal times of the abnormal log data within a preset time length if the abnormal log data is determined to exist according to the log detection result;

and the first display unit is used for displaying the abnormal log data and the abnormal times.

In some embodiments, the apparatus further comprises:

the generating unit is used for generating at least one task snapshot corresponding to the monitoring subtask based on a preset time interval in the process of calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service;

a third obtaining unit, configured to obtain a generation time of the task snapshot;

and the restarting unit is used for restarting the monitoring subtask based on the task snapshot and the generation moment when the monitoring subtask is detected to be abnormal in operation.

In some embodiments, the restart unit comprises:

the first acquisition subunit is used for acquiring the abnormal time when the monitoring subtask runs abnormally;

determining a target generation time closest to the abnormal time from a plurality of generation times;

the second determining subunit is configured to determine a task snapshot corresponding to the target generation time to obtain a target task snapshot;

and the restarting subunit is used for restarting the monitoring subtask based on the target task snapshot.

In some embodiments, the apparatus further comprises:

a fourth obtaining unit, configured to obtain log data generated in the target service;

and the uploading unit is used for uploading the log data to a target topic corresponding to the target service in a preset message system.

In some embodiments, the first determination unit comprises:

and the creating subunit is used for creating the monitoring task for the target service based on the target topic.

In some embodiments, the apparatus further comprises:

a fifth obtaining unit, configured to obtain a timestamp that the abnormal log data is uploaded to the preset message system;

the second determining unit is used for determining log data to be displayed from the preset message system based on the timestamp;

and the second display unit is used for displaying the abnormal log data and the log data to be displayed.

In some embodiments, the second display unit comprises:

the second obtaining subunit is configured to obtain a first importance level of the abnormal log data and a second importance level of the log data to be displayed;

a third determining subunit, configured to determine, based on the first importance level and the second importance level, a first display manner of the abnormal log data and a second display manner of the log data to be displayed, respectively;

and the display subunit is used for displaying the abnormal log data according to the first display mode and displaying the log data to be displayed according to the second display mode.

In some embodiments, the apparatus further comprises:

the sixth acquisition unit is used for acquiring the partition number of each topic and determining target processing resources which need to be allocated to the monitoring tasks of each business based on the partition number;

and the second adjusting unit is used for adjusting the processing resources of each service based on the target processing resources and the existing processing resources of each service.

According to the embodiments of the application, log data generated by the same target service in the target application is transmitted to the same topic of the message system, and a log monitoring task is created for that topic. Monitoring subtasks corresponding to the monitoring rule information are then set under the log monitoring task, according to the monitoring rule information of the log data of the target application under the target service. Each monitoring subtask performs anomaly detection on the log data according to its monitoring rule information, yielding a detection result for the log data of the target service. Finally, the abnormal log data is displayed according to the detection result, so that a user can check it promptly, which improves the efficiency of log monitoring.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a log monitoring method according to an embodiment of the present application.

Fig. 2 is a schematic flowchart of another log monitoring method according to an embodiment of the present application.

Fig. 3 is a block diagram of a log monitoring apparatus according to an embodiment of the present application.

Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The embodiments of the application provide a log monitoring method and device, a storage medium, and computer equipment. Specifically, the log monitoring method in the embodiments of the present application may be executed by a computer device, which may be a terminal or a server. The terminal may be a terminal device such as a smartphone, a tablet computer, a notebook computer, a touch screen, a personal computer (PC), or a personal digital assistant (PDA). The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.

For example, the computer device may be a server, and the server may determine a target service that needs to be monitored under a target application, and create a corresponding monitoring task for the target service, where the monitoring task is used to monitor log data generated under the target service; acquiring at least one piece of monitoring rule information of log data of a target application under a target service; setting at least one monitoring subtask under the monitoring task based on at least one piece of monitoring rule information, wherein one monitoring subtask corresponds to one piece of monitoring rule information; and calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service to obtain a log monitoring result.

Based on the above problems, embodiments of the present application provide a log monitoring method, an apparatus, a computer device, and a storage medium, which can improve log monitoring efficiency.

Each of these is described in detail below. Note that the order in which the embodiments are described does not imply any order of preference among them.

The embodiment of the present application provides a log monitoring method, which may be executed by a terminal or a server.

Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a log monitoring method according to an embodiment of the present disclosure. The specific flow of the log monitoring method can be as follows:

101. Determine a target service to be monitored under the target application, and create a corresponding monitoring task for the target service.

In this embodiment, the target application refers to an application that generates log data, and the target application may include a plurality of services, and the service data of the target application is obtained through the log data generated under each service. The target service may be any service in the target application.

Specifically, the log data, that is, the log file, records modification operations on data in a database by means of a transaction log file, where each log entry records either the logical operation that was executed or the before-image and after-image of the modified data. The before-image is a copy of the data before the operation is performed; the after-image is a copy of the data after the operation has been performed. The log file helps developers quickly locate problems in a service or program, and allows the results of processing logs and events to be fed back to users in time.

The monitoring task is used for monitoring log data generated under the target service. Specifically, the monitoring task may be a monitoring program, which is used to perform detection processing on the log data to monitor the log data.

In some embodiments, in order to improve the efficiency of managing log data, before the step "create corresponding monitoring task for target service", the following steps may be further included:

acquiring log data generated under a target service;

uploading the log data to a target topic corresponding to the target service in a preset message system;

the step "create corresponding monitoring task for target service" may include the following operations:

and creating a monitoring task for the target service based on the target topic.

The preset message system allows publishers to publish messages and subscribers to subscribe to them. The preset message system may be Kafka, a distributed, partitioned, multi-replica, multi-subscriber log system coordinated by ZooKeeper, which is commonly used for web logs, access logs, and message services. Kafka's main application scenarios are log collection systems and messaging systems. Kafka is a messaging system that follows the publish-subscribe model.

Further, each message published to Kafka has a category, called a topic. Physically, messages of different topics are stored separately; logically, although the messages of a topic are stored on one or more brokers (message forwarders), the user only needs to specify the topic of the message.

In the embodiments of the present application, the target application contains services of different categories, and each service of the target application may correspond to one topic in the preset message system. For example, if the target application includes a first service, a second service, a third service, and so on, then a first topic, a second topic, a third topic, and so on may be created in the preset message system, where the first service corresponds to the first topic, the second service corresponds to the second topic, and the third service corresponds to the third topic.

Specifically, the log data may be uploaded to the target topic corresponding to the target service in the preset message system by a log collection tool. For example, the log collection tool may be Filebeat, a lightweight shipper for forwarding and centralizing log data. Filebeat monitors the specified log files or locations, collects log events, and forwards them to Elasticsearch or Logstash for indexing.

Filebeat may work as follows: when Filebeat starts, it starts one or more inputs that look for log data in the specified locations. For each log that Filebeat finds, it starts a harvester. Each harvester reads a single log for new content and sends the new log data to libbeat, which aggregates the events and sends the aggregated data to the output configured for Filebeat.

Creating the monitoring task for the target service based on the target topic means creating the monitoring task under the target topic.
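The one-topic-per-service mapping and the creation of one monitoring task per topic can be sketched as follows. This is a minimal illustration only: the topic naming scheme and the names `topic_for`, `MonitorTask`, and `TaskRegistry` are hypothetical and do not come from the application, and no real Kafka client is involved.

```python
from dataclasses import dataclass, field


def topic_for(app: str, service: str) -> str:
    # Hypothetical naming scheme: one topic per service of the application.
    return f"{app}.{service}.logs"


@dataclass
class MonitorTask:
    topic: str                                     # the target topic being monitored
    subtasks: list = field(default_factory=list)   # filled in later, one per rule


class TaskRegistry:
    """Keeps exactly one monitoring task per topic."""

    def __init__(self):
        self._tasks = {}

    def create_task(self, app: str, service: str) -> MonitorTask:
        topic = topic_for(app, service)
        # Reuse the existing task so a service never ends up with two tasks.
        return self._tasks.setdefault(topic, MonitorTask(topic))


registry = TaskRegistry()
login_task = registry.create_task("game", "login")
shop_task = registry.create_task("game", "shop")
```

Asking the registry for the same service twice returns the same task object, which mirrors the idea that a service has a single monitoring task under its topic.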

102. Acquire at least one piece of monitoring rule information for the log data of the target application under the target service.

In the embodiments of the present application, the monitoring rule information is the monitoring rule used for monitoring the log data of the target service. Monitoring rules may be of multiple types, for example: a statistics class, a value class, a value accumulation class, a raw log class, an inclusion class, and the like.

For example, the statistics class counts the number of times a certain field appears in the logs within five minutes and raises an alarm when a threshold is reached. The value class raises an alarm when a certain value in a log exceeds a threshold. The value accumulation class raises an alarm when the accumulated total of a certain value within five minutes exceeds a threshold, for example when the number of props obtained by a player within five minutes exceeds a certain value. The raw log class monitors whether a field such as "error" appears in a log. The inclusion class monitors whether a field belongs to a given set, for example whether a prop obtained by a user is of a type configured in the system.
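The five rule classes described above can be sketched as small predicate functions. This is an illustrative sketch under assumptions: log records are taken to be already parsed into dictionaries with a `ts` timestamp in seconds, and the field names (`props`, `latency_ms`, `raw`, `prop_type`) and function names are hypothetical.

```python
WINDOW_SECONDS = 300  # the five-minute window mentioned in the text


def in_window(records, now):
    """Records whose timestamp falls in the last WINDOW_SECONDS seconds."""
    return [r for r in records if now - WINDOW_SECONDS < r["ts"] <= now]


def statistics_rule(records, now, field, threshold):
    # Alarm when `field` appears at least `threshold` times in the window.
    count = sum(1 for r in in_window(records, now) if field in r)
    return count >= threshold


def value_rule(record, field, threshold):
    # Alarm when a single value in one log exceeds the threshold.
    return record.get(field, 0) > threshold


def accumulation_rule(records, now, field, threshold):
    # Alarm when the summed value over the window exceeds the threshold.
    total = sum(r.get(field, 0) for r in in_window(records, now))
    return total > threshold


def raw_log_rule(record, keyword="error"):
    # Match when the keyword appears in the raw log text.
    return keyword in record.get("raw", "")


def inclusion_rule(record, field, allowed):
    # Report whether the monitored field is in the given set; whether
    # membership or non-membership counts as abnormal is left to the caller.
    return record.get(field) in allowed


records = [
    {"ts": 10, "raw": "login ok", "props": 3},
    {"ts": 100, "raw": "error: db timeout", "props": 5, "latency_ms": 900},
    {"ts": 290, "raw": "error: db timeout", "props": 4},
]
now = 300
```

With this sample data, the accumulation rule on `props` with a threshold of 10 fires at `now = 300` because all three records fall inside the window, but not at `now = 400`, when only the last record remains in it.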

In the embodiment of the present application, when creating the monitoring task, a user may select a monitoring rule, and then set a detection condition for log data based on the monitoring rule.

103. Set at least one monitoring subtask under the monitoring task based on the at least one piece of monitoring rule information.

In the prior art, when the log data of a target service is monitored, a separate monitoring task is created for each configured monitoring rule. For example, if the monitoring rules include a first, a second, and a third monitoring rule, then a first monitoring task is created for the first rule, a second monitoring task for the second rule, and a third monitoring task for the third rule, so the number of monitoring tasks equals the number of monitoring rules. However, each monitoring task must be allocated its own processing resources for storing and processing log data, so creating many monitoring tasks consumes excessive processing resources. Moreover, all of these monitoring tasks store and process the same log data and differ only in the monitoring rule applied; where different monitoring rules share the same processing steps, the log data is processed repeatedly.

Therefore, the embodiments of the present application improve on the existing approach: a single monitoring task is created for the target service to monitor its log data, and the monitoring rule information is then grouped under that monitoring task.

Specifically, the monitoring rule information may be grouped by type. For example, if the monitoring rule information includes a first type, a second type, and a third type of monitoring rule information, it can be divided into three groups, and a plurality of monitoring subtasks are then set under the monitoring task according to these groups.

Each monitoring subtask corresponds to one piece of monitoring rule information. Because a monitoring subtask is a subtask of the monitoring task, no additional processing resources need to be allocated to it; it is processed directly with the processing resources of the monitoring task, which reduces the consumption of processing resources.
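The resource-sharing idea can be illustrated with a single task that decodes each log line once and hands the same record to every subtask. This is a sketch under assumptions: logs are taken to be JSON lines, and `MonitorTask`, `MonitorSubtask`, `add_rule`, and `parse_count` are hypothetical names used only for illustration.

```python
import json


class MonitorSubtask:
    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate   # stands in for one piece of rule information
        self.hits = []               # abnormal records found by this subtask

    def check(self, record):
        if self.predicate(record):
            self.hits.append(record)


class MonitorTask:
    def __init__(self, topic):
        self.topic = topic
        self.subtasks = []
        self.parse_count = 0         # how many times raw lines were decoded

    def add_rule(self, name, predicate):
        # One subtask per rule; no separate task or resources are created.
        self.subtasks.append(MonitorSubtask(name, predicate))

    def process(self, raw_line):
        record = json.loads(raw_line)    # decoded once per log line
        self.parse_count += 1
        for subtask in self.subtasks:    # every subtask reuses the same record
            subtask.check(record)


task = MonitorTask("game.login.logs")
task.add_rule("raw_error", lambda r: "error" in r["msg"])
task.add_rule("slow_request", lambda r: r.get("latency_ms", 0) > 500)
for line in ['{"msg": "error: timeout", "latency_ms": 900}',
             '{"msg": "ok", "latency_ms": 20}']:
    task.process(line)
```

With one task per rule, each of the two lines would be decoded twice; here `parse_count` stays equal to the number of lines no matter how many rules are attached.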

104. Call the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service, obtaining a log monitoring result.

In the embodiment of the application, log data of a target service is monitored, namely abnormal data detection is performed on the log data, whether the log data are abnormal or not is judged, and a log monitoring result is obtained according to an abnormal data detection result.

In some embodiments, in order to improve the accuracy of log data detection, the step "invoking the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service" may include the following steps:

judging whether the log data corresponding to the target service meets monitoring rule information corresponding to the monitoring subtask;

and if the log data meets the monitoring rule information corresponding to the monitoring subtask, determining the log data meeting the monitoring rule information corresponding to the monitoring subtask as abnormal log data.

Specifically, for each monitoring subtask, abnormal data detection is performed on the log data of the target service according to the monitoring rule information corresponding to the monitoring subtask.

For example, the monitoring rule information corresponding to the monitoring subtask may belong to the raw log class, which detects whether an abnormal field, such as "error", appears in the log data. The subtask therefore checks whether the abnormal field "error" appears in the log data of the target service. If it does, the log data in which the field appears is determined to be abnormal log data; if it does not, it is determined that no abnormal log data containing the abnormal field has occurred.

Further, when a plurality of monitoring subtasks exist, each monitoring subtask can be sequentially called to respectively perform abnormal data detection on the log data of the target service, so that an abnormal data detection result of the log data of the target service under each monitoring rule is obtained.

For example, the monitoring subtasks may include a first monitoring subtask, a second monitoring subtask, and a third monitoring subtask, corresponding to first, second, and third monitoring rule information respectively. When abnormal data detection is performed on the log data of the target service, the first monitoring subtask may be called first to detect the log data according to the first monitoring rule information, obtaining a first detection result. The second monitoring subtask is then called to detect the log data according to the second monitoring rule information, obtaining a second detection result. Finally, the third monitoring subtask is called to detect the log data according to the third monitoring rule information, obtaining a third detection result. The first, second, and third detection results are combined to obtain the log monitoring result for the log data of the target service.
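Calling the subtasks one after another and combining their detection results might look like the sketch below. The three rules and the record fields are hypothetical placeholders, and treating membership in the prop set as abnormal is an assumption made only for this example.

```python
def run_subtasks(subtasks, log_records):
    """Call each subtask in list order and merge the per-rule results."""
    result = {}
    for name, rule in subtasks:
        # Each entry holds the records this rule flagged as abnormal.
        result[name] = [rec for rec in log_records if rule(rec)]
    return result


subtasks = [
    ("first_rule", lambda rec: "error" in rec["msg"]),
    ("second_rule", lambda rec: rec.get("latency_ms", 0) > 500),
    ("third_rule", lambda rec: rec.get("prop") in {"sword", "shield"}),
]
logs = [
    {"msg": "error: db", "latency_ms": 30},
    {"msg": "ok", "latency_ms": 800, "prop": "sword"},
    {"msg": "ok"},
]

monitoring_result = run_subtasks(subtasks, logs)
abnormal_found = any(monitoring_result.values())
```

The combined dictionary plays the role of the log monitoring result: it keeps one detection result per monitoring rule, so a reader can see which rule flagged which record.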

In some embodiments, in order to avoid frequent prompting caused by frequent occurrence of abnormal log data, after the step "calling a monitoring subtask to perform abnormal data detection on the log data corresponding to the target service to obtain a log monitoring result", the following steps may be further included:

if it is determined from the log monitoring result that abnormal log data exists, counting the number of occurrences of the abnormal log data (the abnormal count) within a preset duration;

and displaying the abnormal log data and the abnormal count.

Specifically, a time waiting window of a preset duration is set in advance. After abnormal log data is detected in the log data, the number of times the abnormal log data occurs within the time waiting window, that is, the abnormal count, is obtained, and a prompt is then issued based on the abnormal log data and its abnormal count within the window. In this way, the prompting frequency is reduced when abnormal log data is generated frequently, and frequent prompts do not interfere with the running of the program.
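A time waiting window of this kind can be sketched as a small aggregator that collects abnormal logs and emits a single prompt with a count once the window has elapsed. The class name `AlertWindow`, the 60-second duration, and the explicit timestamps (used instead of a real clock so the example is deterministic) are assumptions for illustration.

```python
class AlertWindow:
    """Collect abnormal logs for `duration` seconds, then emit one prompt."""

    def __init__(self, duration):
        self.duration = duration
        self.opened_at = None    # time of the first abnormal log in the window
        self.samples = []

    def add(self, ts, record):
        """Record an abnormal log at time `ts`.

        Returns the summary of the previous window if it had already
        elapsed at `ts`, otherwise None.
        """
        summary = self.flush(ts)
        if self.opened_at is None:
            self.opened_at = ts
        self.samples.append(record)
        return summary

    def flush(self, now):
        """Close the window if its duration has elapsed and return a summary."""
        if self.opened_at is None or now - self.opened_at < self.duration:
            return None
        summary = {"first": self.samples[0], "count": len(self.samples)}
        self.opened_at, self.samples = None, []
        return summary


window = AlertWindow(duration=60)
prompts = [window.add(ts, f"error #{i}") for i, ts in enumerate([0, 10, 59])]
summary = window.flush(now=60)
```

Three abnormal logs arriving within the window produce no prompt individually; one summary carrying the first abnormal log and the count is produced when the window closes.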

In some embodiments, to ensure high availability of the monitoring task, the method may further comprise the steps of:

generating at least one task snapshot corresponding to the monitoring subtask based on a preset time interval in the process of calling the monitoring subtask to perform abnormal data detection on log data corresponding to a target service;

acquiring the generation time of a task snapshot;

and when the monitoring subtask is detected to be abnormal in operation, restarting the monitoring subtask based on the task snapshot and the generation time.

The task snapshot is used for storing a state value of the monitoring task in a memory at a specified time point, namely running information of the monitoring task.

Specifically, the step of generating the task snapshot corresponding to the monitoring subtask based on the preset time interval refers to acquiring the running information of the monitoring subtask at intervals of the preset time interval, and storing the task snapshots of the monitoring subtask at different time points according to the running information. For example, the preset time interval may be 5 seconds, and the like, and a task snapshot of the monitoring subtask may be generated every 5 seconds.
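Periodic snapshot generation can be sketched as follows, assuming the running information of a subtask is a plain dictionary and using a simulated clock. `SnapshotStore` and its fields are hypothetical names; a production system would typically rely on the checkpointing facility of its stream-processing framework instead.

```python
import copy


class SnapshotStore:
    def __init__(self, interval):
        self.interval = interval     # preset time interval, in seconds
        self.snapshots = []          # list of (generation_time, state) pairs
        self._last = None            # generation time of the latest snapshot

    def maybe_snapshot(self, now, state):
        """Store a deep copy of `state` if a full interval has elapsed."""
        if self._last is not None and now - self._last < self.interval:
            return False
        # Deep copy so later changes to the live state do not alter the snapshot.
        self.snapshots.append((now, copy.deepcopy(state)))
        self._last = now
        return True


store = SnapshotStore(interval=5)
state = {"offset": 0, "error_count": 0}
for now in range(0, 12):             # simulated clock, one tick per second
    state["offset"] += 10            # the subtask keeps consuming log data
    store.maybe_snapshot(now, state)

generation_times = [t for t, _ in store.snapshots]
```

Each snapshot is stored together with its generation time, which is the information the restart step needs later.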

Whether a monitoring subtask is running abnormally can be detected through a heartbeat mechanism, in which one side sends requests to the other to detect whether the client or server is still alive. There are two common kinds of heartbeat detection: the socket-level SO_KEEPALIVE mechanism, which periodically sends heartbeat packets to the peer, with the peer replying automatically on receipt; and a heartbeat mechanism implemented by the application itself, which likewise sends requests periodically.

When the heartbeat of the monitoring subtask is detected to have timed out, the monitoring subtask can be determined to be running abnormally, and the monitoring subtask is then restarted based on the task snapshot and the generation time.

In some embodiments, there may be multiple task snapshots corresponding to the monitoring subtask, each with its own generation time. In order to further ensure high availability of the monitoring task, the step "restart the monitoring subtask based on the task snapshot and the generation time" may include the following steps:

acquiring the abnormal time at which the monitoring subtask ran abnormally;
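Both kinds of heartbeat detection can be sketched briefly. The helper below only turns on the standard `SO_KEEPALIVE` socket option, while `HeartbeatMonitor` is a hypothetical application-level detector that treats a subtask as abnormal when no heartbeat has arrived within a timeout; the 15-second timeout and the subtask identifiers are assumptions.

```python
import socket


def enable_tcp_keepalive(sock: socket.socket) -> None:
    # Socket-level mechanism: ask the OS to send keepalive probes on this
    # connection. Call this on a connected TCP socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


class HeartbeatMonitor:
    """Application-level heartbeat: subtasks report in, the monitor checks age."""

    def __init__(self, timeout):
        self.timeout = timeout       # seconds without a heartbeat before timeout
        self.last_beat = {}          # subtask id -> time of last heartbeat

    def beat(self, subtask_id, now):
        self.last_beat[subtask_id] = now

    def timed_out(self, now):
        # Subtasks whose last heartbeat is older than the timeout.
        return sorted(sid for sid, t in self.last_beat.items()
                      if now - t > self.timeout)


monitor = HeartbeatMonitor(timeout=15)
monitor.beat("rule-1", now=0)
monitor.beat("rule-2", now=0)
monitor.beat("rule-1", now=10)       # rule-2 stops reporting after t=0
abnormal = monitor.timed_out(now=20)
```

The time at which a subtask first shows up in `timed_out` corresponds to the heartbeat timeout moment used as the abnormal time when choosing a snapshot.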

determining a target generation time closest to the abnormal time from the plurality of generation times;

determining a task snapshot corresponding to a target generation moment to obtain a target task snapshot;

restarting the monitoring subtask based on the target task snapshot.

The abnormal moment when the monitoring subtask runs abnormally refers to the heartbeat overtime moment of the monitoring subtask when the state of the monitoring subtask is detected based on a heartbeat mechanism.

For example, the task snapshots corresponding to the monitoring subtask may include a first task snapshot, a second task snapshot, a third task snapshot, a fourth task snapshot, and so on, generated at a first time, a second time, a third time, and a fourth time respectively. If, ordered from farthest to nearest to the abnormal time, the generation times are the first time, the second time, the third time, and the fourth time, then the target generation time closest to the abnormal time is the fourth time.

Further, the task snapshot corresponding to the fourth time, namely the fourth task snapshot, is obtained, and the abnormally running monitoring subtask is restarted based on the running state of the monitoring subtask recorded in the fourth task snapshot, thereby ensuring the availability and running accuracy of the monitoring subtask.
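Selecting the target task snapshot and restarting from it can be sketched as below. One assumption is made explicit: only snapshots generated at or before the abnormal time are considered, so "closest" means the most recent snapshot taken before the failure. The function names and the `(generation_time, state)` layout are hypothetical.

```python
def pick_target_snapshot(snapshots, abnormal_time):
    """Return the (generation_time, state) pair closest to the abnormal time."""
    # Assumption: a snapshot taken after the failure is not a valid restore point.
    candidates = [s for s in snapshots if s[0] <= abnormal_time]
    if not candidates:
        return None
    return min(candidates, key=lambda s: abnormal_time - s[0])


def restart_subtask(snapshots, abnormal_time, initial_state):
    """Return the state the restarted subtask should resume from."""
    target = pick_target_snapshot(snapshots, abnormal_time)
    # Fall back to a fresh state when no usable snapshot exists.
    return dict(target[1]) if target else dict(initial_state)


snapshots = [
    (0, {"offset": 10}),     # first task snapshot
    (5, {"offset": 60}),     # second task snapshot
    (10, {"offset": 110}),   # third task snapshot
    (15, {"offset": 160}),   # fourth task snapshot
]
restored = restart_subtask(snapshots, abnormal_time=17, initial_state={"offset": 0})
```

Here the fourth snapshot is the closest to the abnormal time, so the restarted subtask resumes from the state recorded in it rather than reprocessing the log data from the beginning.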

In some embodiments, in order to meet the real-time adjustment of the monitoring rule by the user, after the step "at least one monitoring subtask is set under the monitoring task based on at least one monitoring rule information", before the step "the monitoring subtask is invoked to perform abnormal data detection on log data corresponding to the target service", the method may further include the following steps:

when the updating operation of the monitoring rule information is detected, updating the monitoring rule information to obtain updated monitoring rule information;

adjusting the monitoring subtask under the monitoring task based on the updated monitoring rule information to obtain an adjusted monitoring subtask;

the step of invoking the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service may include the following operations:

and calling the adjusted monitoring subtask to perform abnormal data detection on the log data corresponding to the target service.

The updating operation refers to editing the monitoring rule information under the monitoring task, for example, the updating operation may include adding the monitoring rule information, deleting the monitoring rule information, modifying the monitoring rule information, and the like.

For example, the initial monitoring rule information under the monitoring task may include first monitoring rule information and second monitoring rule information, and the updating operation may be adding monitoring rule information, where the added monitoring rule information may be third monitoring rule information. The updated monitoring rule information obtained for the monitoring task through the updating operation then includes the first monitoring rule information, the second monitoring rule information, and the third monitoring rule information.

Further, because monitoring rule information has been added, a monitoring subtask is correspondingly set for the newly added monitoring rule information. That is, the monitoring subtasks initially set under the monitoring task include a first monitoring subtask corresponding to the first monitoring rule information and a second monitoring subtask corresponding to the second monitoring rule information. These are adjusted according to the updated monitoring rule information: a third monitoring subtask corresponding to the third monitoring rule information is newly set, so that the adjusted monitoring subtasks under the monitoring task include the first monitoring subtask, the second monitoring subtask, and the third monitoring subtask. Finally, the first monitoring subtask, the second monitoring subtask, and the third monitoring subtask may be invoked in sequence to detect the log data of the target service, so as to obtain a detection result of the log data of the target service, that is, a log monitoring result.
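The adjustment of monitoring subtasks after a rule update can be sketched as follows. This is only an illustration under stated assumptions: rules are represented as plain strings keyed by a rule identifier, each subtask is identified by the rule it runs, and the helper names `diff_rules` and `adjust_subtasks` are hypothetical.

```python
from typing import Dict, Set, Tuple


def diff_rules(
    old: Dict[str, str], new: Dict[str, str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return the rule ids that were added, removed, and modified."""
    added = set(new) - set(old)
    removed = set(old) - set(new)
    modified = {rid for rid in set(old) & set(new) if old[rid] != new[rid]}
    return added, removed, modified


def adjust_subtasks(
    subtasks: Dict[str, str], updated_rules: Dict[str, str]
) -> Dict[str, str]:
    """Keep one subtask per rule: drop removed rules, (re)create added or modified ones."""
    added, removed, modified = diff_rules(subtasks, updated_rules)
    adjusted = {rid: rule for rid, rule in subtasks.items() if rid not in removed}
    for rid in added | modified:
        adjusted[rid] = updated_rules[rid]
    return adjusted


# Initial subtasks for the first and second rules; the update adds a third rule.
subtasks = {"rule1": "contains error", "rule2": "amount > 1000"}
updated_rules = {
    "rule1": "contains error",
    "rule2": "amount > 1000",
    "rule3": "count(error) > 5",
}
adjusted = adjust_subtasks(subtasks, updated_rules)
```

After the update, `adjusted` holds three entries, one per rule, mirroring the first, second, and third monitoring subtasks in the example.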

In some embodiments, the log monitoring result includes abnormal log data, and in order to raise an alarm for the abnormal log data, after the step "call the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service, and obtain the log monitoring result", the method may further include the following steps:

acquiring a timestamp for uploading abnormal log data to a preset message system;

determining log data to be displayed from a preset message system based on the timestamp;

and displaying the abnormal log data and the log data to be displayed.

The time stamp of uploading the abnormal log data to the preset message system refers to a time point of uploading the abnormal log data to the preset message system for the first time through the log collection tool.

Determining the log data to be displayed from the preset message system according to the timestamp may consist of obtaining the log data that was uploaded to the preset message system within a preset period before and after the timestamp.

For example, the abnormal log data may be log data containing the exception field error, and the time at which the abnormal log data was sent to the Topic of the preset message system may be: 11/09/17/27 s in 2021, which gives the timestamp; the timestamp can be recorded automatically by the Topic. Further, according to the timestamp, the data within 5 minutes around the timestamp (which may be the 2.5 minutes before the timestamp and the 2.5 minutes after it) may be acquired from the Topic to obtain the context of the abnormal log data, that is, the log data to be displayed. The abnormal log data and the log data to be displayed are then displayed, so that the user can, for example, modify the abnormal log data according to the context.
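The context lookup around the timestamp can be illustrated with a short sketch. An in-memory list of (timestamp, message) pairs stands in for the Topic here, which is an assumption made for illustration; the function name `context_window` and the sample records are hypothetical. The default window of 300 seconds corresponds to the 5 minutes in the example, split evenly before and after the timestamp.

```python
from typing import List, Tuple

LogRecord = Tuple[float, str]  # (timestamp in seconds, raw log line)


def context_window(
    records: List[LogRecord], anomaly_ts: float, window_seconds: float = 300.0
) -> List[LogRecord]:
    """Return records whose timestamps lie within half the window on either side."""
    half = window_seconds / 2.0
    return [r for r in records if anomaly_ts - half <= r[0] <= anomaly_ts + half]


records = [
    (1000.0, "login ok"),
    (1100.0, "payment started"),
    (1200.0, "error: payment failed"),  # the abnormal log data
    (1300.0, "retry scheduled"),
    (1500.0, "logout"),
]
context = context_window(records, anomaly_ts=1200.0)
```

The returned list contains the abnormal record together with its neighbours inside the window, which is the material that would be displayed to the user.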

In some embodiments, in order to improve the effectiveness of prompting the abnormal log data, the step "displaying the abnormal log data and the log data to be displayed" may include the following operations:

acquiring a first importance level of abnormal log data and a second importance level of the log data to be displayed;

respectively determining a first display mode of abnormal log data and a second display mode of the log data to be displayed based on the first importance level and the second importance level;

and displaying the abnormal log data according to a first display mode, and displaying the log data to be displayed according to a second display mode.

In the embodiment of the present application, displaying the abnormal log data means raising an alarm according to the abnormal log data. Multiple alarm manners are possible, for example intercom software, email, telephone, and the like. Further, for a plurality of pieces of log data of the target service, different alarm triggering conditions can be set for different pieces of log data; for example, an immediate alarm, a threshold alarm, a condition-triggered alarm, a folded alarm for repeated logs, and the like can be applied to the log data to be monitored.

Specifically, the respective alarm modes can be determined according to the importance levels of the abnormal log data and of the log data to be displayed. The importance level of log data can be set by the user, or can be set through background monitoring.

For example, the importance levels of the log data may include, from high to low, importance level A, importance level B, importance level C, and so on, and the alarm modes may include, from most to least conspicuous, a first alarm mode, a second alarm mode, a third alarm mode, and so on. It may then be determined that log data with importance level A uses the first alarm mode, log data with importance level B uses the second alarm mode, and log data with importance level C uses the third alarm mode. In this way, the alarm mode of log data is determined by its importance level, so that important log data is flagged conspicuously when it is abnormal, which reduces usage problems in the target service of the target application.
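The mapping from importance level to alarm mode amounts to a lookup, as the following sketch shows. The level labels A, B, and C and the three alarm modes come from the example above; the table name, the function names, and the choice to reject unknown levels with an error are assumptions made for illustration.

```python
from typing import Tuple

# Importance levels from high to low, mapped to alarm modes from most to
# least conspicuous (labels taken from the example in the text).
ALARM_MODE_BY_LEVEL = {
    "A": "first alarm mode",
    "B": "second alarm mode",
    "C": "third alarm mode",
}


def alarm_mode_for(level: str) -> str:
    """Look up the alarm mode for an importance level."""
    try:
        return ALARM_MODE_BY_LEVEL[level]
    except KeyError:
        raise ValueError(f"unknown importance level: {level!r}") from None


def display_plan(abnormal_level: str, context_level: str) -> Tuple[str, str]:
    """Return (first display mode, second display mode) for the two kinds of log data."""
    return alarm_mode_for(abnormal_level), alarm_mode_for(context_level)


plan = display_plan("A", "C")
```

With these inputs, the abnormal log data would be shown using the first alarm mode and the surrounding log data using the third.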

In some embodiments, log data of each service of the target application is stored in each topic of the preset message system, and one service corresponds to one topic, and in order to ensure balanced allocation of processing resources of each task to improve data processing efficiency, the method may further include the following steps:

acquiring the number of partitions of each topic, and determining, based on the number of partitions, the target processing resources that need to be allocated to the monitoring task of each service;

and adjusting the processing resources of each service based on the target processing resources and the existing processing resources of each service.

In the embodiment of the present application, the preset message system may be Kafka. Kafka can divide a topic into a plurality of partitions (Partition), and each message is stored in one of the partitions according to a partition rule, so that load balancing and horizontal scaling can be achieved.

Specifically, multiple partitions can improve data throughput, improve parallelism when data is consumed downstream, accelerate consumption, and reduce latency. A partition can be understood as a file, and one Topic has multiple files, so a program can read multiple log files at the same time, which speeds up reading.

For example, a Topic may be divided into 10 partitions, and Kafka internally distributes the 10 partitions as evenly as possible across different servers according to a certain algorithm; for example, server A is responsible for partition 1 of the Topic and server B is responsible for partition 2. In this case, if the publisher sends messages without specifying a partition, Kafka may, according to a certain algorithm, send one message to partition 1 and the next message to partition 2.

When the target processing resources to be allocated to the monitoring task of each service are determined based on the number of partitions, the allocation may also take into account the traffic of the log data of the service: when the traffic of the log data of a service is large, more processing resources can be allocated to it; when the traffic is small, fewer processing resources can be allocated to it.

The target processing resources are the processing resources, calculated in real time, that the monitoring task of each service requires. The processing resources of the monitoring task of each service are then dynamically adjusted according to the existing processing resources of the service and the target processing resources, so that reasonable allocation of processing resources can be ensured and the processing efficiency of the log data is improved.
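One possible way to derive target processing resources from the partition count and the log traffic is sketched below. The formula is an assumption, not something the embodiment specifies: resources are modelled as a number of parallel consumers, the traffic-based demand is rounded up, and the result is capped at the partition count because a partition is read by at most one consumer within a consumer group. The names `target_parallelism`, `resource_adjustment`, and `per_consumer_capacity` are hypothetical.

```python
import math


def target_parallelism(
    partitions: int, records_per_sec: float, per_consumer_capacity: float
) -> int:
    """Estimate how many parallel consumers a monitoring task needs.

    Assumed model: demand grows with traffic, at least one consumer is kept,
    and consumers beyond the partition count would sit idle.
    """
    if partitions < 1:
        raise ValueError("a topic has at least one partition")
    if per_consumer_capacity <= 0:
        raise ValueError("per_consumer_capacity must be positive")
    needed = math.ceil(records_per_sec / per_consumer_capacity)
    return max(1, min(partitions, needed))


def resource_adjustment(existing: int, target: int) -> int:
    """Positive result: add that many consumers; negative: release them."""
    return target - existing


target = target_parallelism(partitions=10, records_per_sec=4500, per_consumer_capacity=1000)
delta = resource_adjustment(existing=3, target=target)
```

In this illustration a service with 10 partitions and moderate traffic is assigned 5 consumers, so 2 more are added to the 3 it already has.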

The embodiment of the application discloses a log monitoring method, which comprises the following steps: determining a target service needing to be monitored under a target application, and establishing a corresponding monitoring task for the target service, wherein the monitoring task is used for monitoring log data generated under the target service; acquiring at least one piece of monitoring rule information of log data of a target application under a target service; setting at least one monitoring subtask under the monitoring task based on at least one piece of monitoring rule information, wherein one monitoring subtask corresponds to one piece of monitoring rule information; and calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service to obtain a log monitoring result. Therefore, the log monitoring efficiency can be improved.

Based on the above description, the log monitoring method of the present application will be further described below by way of example. Referring to fig. 2, fig. 2 is a schematic flowchart illustrating another log monitoring method according to an embodiment of the present disclosure. Taking the application of the log monitoring method to the log data monitoring of the game application as an example, the specific process can be as follows:

201. Acquire log data generated by the target game under the target game service, and transmit the log data to a target topic of a preset message system through a log collection tool.

In the embodiment of the application, the target game can be classified into different game services according to the game data type. The log data generated by different game services can be monitored respectively.

The log collection tool is used for transmitting log data to the preset message system; the log collection tool may be Filebeat, and the preset message system may be Kafka. Specifically, before the log data of the target game service is transmitted to the preset message system, a topic corresponding to the target game service may be created in the preset message system in advance to obtain the target topic. The log data of the target game service is then transmitted to the target topic of the preset message system, which makes it convenient to manage the log data of the same target game service.

202. Create a corresponding monitoring task for the log data of the target game service according to the target topic.

In the embodiment of the application, a monitoring task can be created for one topic, and the monitoring task is used for performing anomaly detection on log data. For example, the monitoring task may be a Flink monitoring task.

Flink is an open-source stream processing framework whose core is a distributed streaming dataflow engine written in Java and Scala. Flink executes arbitrary dataflow programs in a data-parallel and pipelined manner, and its pipelined runtime system can execute both batch and stream processing programs.

203. Acquire a monitoring rule corresponding to the log data of the target game service, and set a monitoring subtask corresponding to the monitoring rule under the monitoring task.

Specifically, the user may pre-configure different log monitoring modes according to the game data type of the target game service, and the configuration is stored in MySQL (a relational database management system).

The log monitoring mode refers to a monitoring rule. The log monitoring rules may include: monitoring for a certain keyword in the log, such as error; monitoring a certain value in the log, such as a transaction amount greater than a certain value; monitoring the number of times a certain keyword appears; and the like. The user can select a monitoring rule and set the detection condition according to the selected rule. Specifically, monitoring rules may fall into various categories, for example a statistics category, a value accumulation category, an original log category, a containment category, and the like.
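Two of the rule kinds mentioned above, keyword monitoring and value monitoring, can be sketched as simple predicates over a log record. The record layout (a dictionary with a `message` field and numeric fields such as `amount`) and the helper names are assumptions made for illustration only.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Rule = Callable[[Record], bool]


def keyword_rule(keyword: str) -> Rule:
    """Rule that matches when the log message contains the keyword."""
    def check(record: Record) -> bool:
        return keyword in str(record.get("message", ""))
    return check


def threshold_rule(field: str, limit: float) -> Rule:
    """Rule that matches when a numeric field exceeds the limit."""
    def check(record: Record) -> bool:
        value = record.get(field)
        return isinstance(value, (int, float)) and value > limit
    return check


def detect_abnormal(records: List[Record], rule: Rule) -> List[Record]:
    """Return the records that satisfy the rule, i.e. the abnormal log data."""
    return [r for r in records if rule(r)]


records = [
    {"message": "login ok", "amount": 10},
    {"message": "error: db timeout", "amount": 0},
    {"message": "trade done", "amount": 5000},
]
by_keyword = detect_abnormal(records, keyword_rule("error"))
by_value = detect_abnormal(records, threshold_rule("amount", 1000))
```

Each rule is evaluated independently, which matches the idea of one monitoring subtask per rule.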

Specifically, the monitoring rules corresponding to the log data of the target game service can be obtained from MySQL and grouped, where each monitoring rule may form its own group. For each group, a monitoring subtask is then set under the monitoring task, so that a monitoring subtask corresponding to each monitoring rule is obtained.

For example, the monitoring rules may include a statistical monitoring rule, a numerical monitoring rule, and a numerical accumulation monitoring rule, and a first monitoring subtask, a second monitoring subtask, and a third monitoring subtask may be set under the monitoring task according to these three rules respectively. Furthermore, the monitoring program submits the monitoring tasks to YARN for management (YARN, Yet Another Resource Negotiator, is a Hadoop resource manager: a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications). Different groups correspond to different monitoring subtasks, and the monitoring subtasks are isolated from each other.

204. Call the monitoring subtask to perform anomaly detection on the log data according to the corresponding monitoring rule, and determine abnormal log data.

Further, the monitoring task is started, that is, the monitoring program is run. The monitoring program analyzes the log data and sends log data that satisfies an alarm condition downstream as an alarm message; it also counts, per time window, the number of occurrences of all alarm messages within that window and sends the alarm messages downstream at regular intervals.

The alarm condition refers to a monitoring rule; satisfying the alarm condition means satisfying the monitoring rule, and log data that satisfies the monitoring rule can be determined to be abnormal log data. An alarm message is generated from the abnormal log data and then aggregated within a window.
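Counting alarm messages per time window can be sketched as below. The text only speaks of a time window; the sketch assumes fixed, non-overlapping (tumbling) windows aligned to multiples of the window length, and it represents an alarm message as a (timestamp, key) pair. The function name `count_alarms_per_window` is hypothetical, and the logic is plain Python rather than a Flink job.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_alarms_per_window(
    alarms: Iterable[Tuple[float, str]], window_seconds: int
) -> Dict[Tuple[int, str], int]:
    """Count alarm messages per (window start, alarm key)."""
    counts: Counter = Counter()
    for timestamp, key in alarms:
        # Align the timestamp to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


alarms = [
    (0.5, "error"),
    (30.0, "error"),
    (61.0, "error"),
    (62.0, "timeout"),
]
window_counts = count_alarms_per_window(alarms, window_seconds=60)
```

The per-window counts are what would be sent downstream at the end of each window.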

In some embodiments, because the volume of log data is large, multiple threads are required to consume and compute simultaneously, with each thread corresponding to one or more partitions of the Topic. Window statistics are required after the initial computation, and the data is sent to downstream threads in turn through polling by a custom algorithm, which ensures that the data is consumed evenly.

For example, when window aggregation is performed, data is distributed to each parallel instance in round-robin fashion. (The principle of the round-robin algorithm is to assign each incoming user request to the internal servers in turn, from server 1 to server N, where N is the number of internal servers, and then start the cycle again.)
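Round-robin distribution itself is straightforward, as the following sketch shows. The function name `distribute_round_robin` is hypothetical, and lists stand in for the downstream parallel instances.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def distribute_round_robin(items: Sequence[T], parallelism: int) -> List[List[T]]:
    """Assign items to `parallelism` downstream buckets in turn, then wrap around."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    buckets: List[List[T]] = [[] for _ in range(parallelism)]
    for index, item in enumerate(items):
        buckets[index % parallelism].append(item)
    return buckets


buckets = distribute_round_robin(list(range(7)), parallelism=3)
```

Because each bucket receives every third item, the bucket sizes differ by at most one, which is the even consumption the text aims for.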

In some embodiments, the monitoring task may trigger a snapshot at regular intervals while running, and the snapshot may be used to store the state values in memory at a specified point in time. To ensure high availability of the monitoring program, the program may read the Flink application state at regular intervals and continuously update the heartbeats of the different tasks; when a heartbeat timeout is detected, the task is restarted from the latest snapshot, ensuring high availability and accuracy.
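Heartbeat-based timeout detection can be sketched with a small bookkeeping class. This is an illustration only: the class name `HeartbeatMonitor`, the explicit `now` arguments (used instead of a real clock so the behaviour is deterministic), and the timeout value are assumptions, and the restart from a snapshot is left out.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Track the last heartbeat of each task and report tasks that timed out."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_beat: Dict[str, float] = {}

    def beat(self, task_id: str, now: float) -> None:
        """Record a heartbeat for the task at time `now`."""
        self._last_beat[task_id] = now

    def timed_out(self, now: float) -> List[str]:
        """Return ids of tasks whose last heartbeat is older than the timeout."""
        return sorted(
            task_id
            for task_id, last in self._last_beat.items()
            if now - last > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=30.0)
monitor.beat("subtask-1", now=0.0)
monitor.beat("subtask-2", now=20.0)
stale = monitor.timed_out(now=40.0)
```

A task reported here would be the one to restart from its most recent snapshot.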

205. Raise an alarm prompt based on the abnormal log data.

Specifically, after the abnormal log data is determined, it needs to be read from the target Topic. When data is read from the Topic, the timestamp at which the log data was written into the Topic can be parsed, and log data whose timestamps are near that timestamp is then pulled from the Topic; this yields the context data of the abnormal log data. The context data and the abnormal log data are then displayed at the front end so that the user can view the abnormal log data.

In some embodiments, after log data is read from the Topic of Kafka, the log data may pass through different operation logics, including filtering, mapping, aggregation by field, alarm triggering, window statistics triggering, and the like. Each of these operation logics is an operator, for example a filtering operator, a mapping operator, a similar-log aggregation operator, and an alarm-triggering and statistics operator.

For example, the MySQL database that stores the log monitoring configuration may be used as a data source. The monitoring rules in the same group are aggregated into a list and propagated to the downstream operators as a broadcast stream. When monitoring rules in the group are added, modified, or deleted, the changes are automatically detected and automatically broadcast to the downstream operators, so that the monitoring rules in the group are modified dynamically in real time.

The embodiment of the application discloses a log monitoring method, which comprises the following steps: the method comprises the steps of obtaining log data generated by a target game under a target game service, transmitting the log data to a target theme of a preset message system through a log collecting tool, creating a corresponding monitoring task for the log data of the target game service according to the target theme, obtaining a monitoring rule corresponding to the log data of the target game service, setting a monitoring subtask corresponding to the monitoring rule under the monitoring task, calling the monitoring subtask to carry out abnormity detection on the log data according to the corresponding monitoring rule, determining abnormal log data, and carrying out alarm prompt based on the abnormal log data. Thus, the efficiency of detecting abnormal log data can be improved.

In order to better implement the log monitoring method provided by the embodiment of the present application, the embodiment of the present application further provides a log monitoring device based on the log monitoring method. The terms are the same as those in the log monitoring method, and specific implementation details can refer to the description in the method embodiment.

Referring to fig. 3, fig. 3 is a block diagram of a log monitoring apparatus according to an embodiment of the present disclosure, where the apparatus includes:

a first determining unit 301, configured to determine a target service that needs to be monitored in a target application, and create a corresponding monitoring task for the target service, where the monitoring task is used to monitor log data generated in the target service;

a first obtaining unit 302, configured to obtain at least one monitoring rule information of log data of the target application in the target service;

a setting unit 303, configured to set at least one monitoring subtask under the monitoring task based on the at least one monitoring rule information, where one monitoring subtask corresponds to one monitoring rule information;

the detecting unit 304 is configured to invoke the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service, so as to obtain a log monitoring result.

In some embodiments, the apparatus may further comprise:

In some embodiments, the detection unit 304 may include:

In some embodiments, the detection unit 304 may include:

In some embodiments, the apparatus may further comprise:

In some embodiments, the restart unit may include:

In some embodiments, the apparatus may further comprise:

In some embodiments, the first determining unit 301 may include:

In some embodiments, the apparatus may further comprise:

In some embodiments, the second presentation unit may comprise:

In some embodiments, the apparatus may further comprise:

The embodiment of the application discloses a log monitoring device, which determines a target service needing monitoring under a target application through a first determining unit 301, and establishes a corresponding monitoring task for the target service, wherein the monitoring task is used for monitoring log data generated under the target service, a first obtaining unit 302 obtains at least one monitoring rule information of the log data of the target application under the target service, a setting unit 303 sets at least one monitoring subtask under the monitoring task based on the at least one monitoring rule information, one monitoring subtask corresponds to one monitoring rule information, and a detecting unit 304 calls the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service, so as to obtain a log monitoring result. Therefore, the log monitoring efficiency can be improved.

Correspondingly, the embodiment of the application also provides a computer device, and the computer device can be a server. As shown in fig. 4, fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application. The computer apparatus 500 includes a processor 501 having one or more processing cores, a memory 502 having one or more computer-readable storage media, and a computer program stored on the memory 502 and executable on the processor. The processor 501 is electrically connected to the memory 502. Those skilled in the art will appreciate that the computer device configurations illustrated in the figures are not meant to be limiting of computer devices and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components.

The processor 501 is a control center of the computer device 500, connects various parts of the entire computer device 500 using various interfaces and lines, performs various functions of the computer device 500 and processes data by running or loading software programs and/or modules stored in the memory 502, and calling data stored in the memory 502, thereby monitoring the computer device 500 as a whole.

In this embodiment of the application, the processor 501 in the computer device 500 loads instructions corresponding to processes of one or more applications into the memory 502, and the processor 501 runs the applications stored in the memory 502, so as to implement various functions as follows:

determining a target service needing to be monitored under a target application, and establishing a corresponding monitoring task for the target service, wherein the monitoring task is used for monitoring log data generated under the target service; acquiring at least one piece of monitoring rule information of log data of a target application under a target service; setting at least one monitoring subtask under the monitoring task based on at least one piece of monitoring rule information, wherein one monitoring subtask corresponds to one piece of monitoring rule information; and calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service to obtain a log monitoring result.

The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.

Optionally, as shown in fig. 4, the computer device 500 further includes: touch-sensitive display screen 503, radio frequency circuit 504, audio circuit 505, input unit 506 and power 507. The processor 501 is electrically connected to the touch display screen 503, the radio frequency circuit 504, the audio circuit 505, the input unit 506, and the power supply 507, respectively. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 4 does not constitute a limitation of computer devices, and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components.

The touch display screen 503 can be used for displaying a graphical user interface and receiving an operation instruction generated by a user acting on the graphical user interface. The touch display screen 503 may include a display panel and a touch panel. The display panel may be used, among other things, to display information entered by or provided to a user and various graphical user interfaces of the computer device, which may be composed of graphics, guide information, icons, video, and any combination thereof. Alternatively, the Display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. The touch panel may be used to collect touch operations of a user on or near the touch panel (for example, operations of the user on or near the touch panel using any suitable object or accessory such as a finger, a stylus pen, and the like), and generate corresponding operation instructions, and the operation instructions execute corresponding programs. Alternatively, the touch panel may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 501, and can receive and execute commands sent by the processor 501. The touch panel may overlay the display panel, and when the touch panel detects a touch operation thereon or nearby, the touch panel transmits the touch operation to the processor 501 to determine the type of the touch event, and then the processor 501 provides a corresponding visual output on the display panel according to the type of the touch event. 
In the embodiment of the present application, the touch panel and the display panel may be integrated into the touch display screen 503 to implement input and output functions. However, in some embodiments, the touch panel and the display panel may be implemented as two separate components to perform the input and output functions. That is, the touch display 503 can also be used as a part of the input unit 506 to implement an input function.

The radio frequency circuit 504 may be used to transmit and receive radio frequency signals, so as to establish wireless communication with a network device or another computer device and exchange signals with it.

The audio circuit 505 may be used to provide an audio interface between a user and the computer device through a speaker and a microphone. The audio circuit 505 may transmit the electrical signal converted from received audio data to the speaker, which converts it into a sound signal for output; conversely, the microphone converts a collected sound signal into an electrical signal, which the audio circuit 505 receives and converts into audio data. The audio data is then output to the processor 501 for processing and afterwards transmitted to, for example, another computer device via the radio frequency circuit 504, or output to the memory 502 for further processing. The audio circuit 505 may also include an earphone jack to provide communication between a peripheral headset and the computer device.

The input unit 506 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint, iris, facial information, etc.), and generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.

The power supply 507 is used to power the various components of the computer device 500. Optionally, the power supply 507 may be logically connected to the processor 501 through a power management system, so as to implement functions of managing charging, discharging, power consumption management, and the like through the power management system. The power supply 507 may also include any component including one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.

Although not shown in fig. 4, the computer device 500 may further include a camera, a sensor, a wireless fidelity module, a bluetooth module, etc., which are not described in detail herein.

In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

As can be seen from the above, the computer device provided in this embodiment determines a target service to be monitored in a target application, and creates a corresponding monitoring task for the target service, where the monitoring task is used to monitor log data generated in the target service; acquiring at least one piece of monitoring rule information of log data of a target application under a target service; setting at least one monitoring subtask under the monitoring task based on at least one piece of monitoring rule information, wherein one monitoring subtask corresponds to one piece of monitoring rule information; and calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service to obtain a log monitoring result.

It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.

To this end, embodiments of the present application provide a computer-readable storage medium, in which a plurality of computer programs are stored, where the computer programs can be loaded by a processor to execute the steps in any log monitoring method provided by the embodiments of the present application. For example, the computer program may perform the steps of:

acquiring at least one piece of monitoring rule information of log data of a target application under a target service;

setting at least one monitoring subtask under the monitoring task based on at least one piece of monitoring rule information, wherein one monitoring subtask corresponds to one piece of monitoring rule information;

The storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.

Since the computer program stored in the storage medium can execute the steps in any log monitoring method provided in the embodiments of the present application, it can achieve the beneficial effects of any such log monitoring method. These effects are detailed in the foregoing embodiments and are not described herein again.

The log monitoring method, the log monitoring device, the storage medium and the computer device provided by the embodiments of the present application have been introduced in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the embodiments is only intended to help in understanding the method and core idea of the present application. Meanwhile, those skilled in the art may, according to the idea of the present application, vary the specific embodiments and the application scope. In summary, the content of this specification should not be construed as a limitation on the present application.

Claims

1. A method of log monitoring, the method comprising:

2. The method according to claim 1, wherein after the setting of at least one monitoring subtask under the monitoring task based on the at least one piece of monitoring rule information, and before the calling of the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service, the method further comprises:

the calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service includes:

3. The method according to claim 1, wherein the calling of the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service includes:

judging whether the log data corresponding to the target service meets the monitoring rule information corresponding to the monitoring subtask;

4. The method of claim 3, further comprising:

if abnormal log data is determined to exist according to the log detection result, counting the number of times the abnormal log data occurs within a preset time;

and displaying the abnormal log data and the number of abnormal occurrences.
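Counting abnormal occurrences within a preset time can be illustrated with a sliding time window. The sketch below is an assumption-laden example rather than the claimed implementation: the class name `AbnormalCounter`, the use of seconds as the time unit, and the requirement that timestamps arrive in non-decreasing order are all hypothetical.

```python
from collections import deque


class AbnormalCounter:
    """Counts abnormal log entries seen within the last `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps of abnormal entries, oldest first

    def record(self, timestamp):
        """Record one abnormal entry and return the count inside the window."""
        self._times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop entries that have fallen out of the preset time window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times)


counter = AbnormalCounter(window_seconds=60)
counts = [counter.record(t) for t in (0, 10, 50, 65, 130)]
# counts == [1, 2, 3, 3, 1]
```

A deque keeps each update cheap because expired timestamps are only ever removed from the front.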

5. The method of claim 1, further comprising:

generating at least one task snapshot corresponding to the monitoring subtask based on a preset time interval in the process of calling the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service;

acquiring the generation moment of the task snapshot;

6. The method according to claim 5, wherein the number of the task snapshots is multiple, and different task snapshots correspond to different generation moments;

the restarting the monitoring subtask based on the task snapshot and the generation time comprises:

acquiring an abnormal time at which the monitoring subtask runs abnormally;

determining a task snapshot corresponding to the target generation moment to obtain a target task snapshot;

restarting the monitoring subtask based on the target task snapshot.
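One way to pick the target task snapshot is sketched below. The claim text here does not state how the target generation moment is chosen, so the example assumes it is the latest generation moment at or before the abnormal time. The function name `pick_snapshot` and the snapshot contents (an `offset` field) are hypothetical.

```python
import bisect


def pick_snapshot(snapshots, abnormal_time):
    """Return the latest snapshot generated at or before abnormal_time.

    snapshots: list of (generation_time, state) tuples sorted by generation_time.
    Returns None when no snapshot precedes the abnormal time.
    Assumption: "target generation moment" means the closest earlier moment.
    """
    times = [t for t, _ in snapshots]
    idx = bisect.bisect_right(times, abnormal_time)
    if idx == 0:
        return None
    return snapshots[idx - 1]


snapshots = [
    (100, {"offset": 10}),
    (200, {"offset": 25}),
    (300, {"offset": 40}),
]
target = pick_snapshot(snapshots, abnormal_time=250)
# target == (200, {"offset": 25}); the subtask would restart from this state.
```

Restarting from the closest earlier snapshot limits the amount of log data that has to be reprocessed after a failure.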

7. The method of claim 1, further comprising, before the creating the corresponding monitoring task for the target service:

acquiring log data generated under the target service;

uploading the log data to a target topic corresponding to the target service in a preset message system;

the creating of the corresponding monitoring task for the target service includes:

and establishing the monitoring task for the target service based on the target topic.

8. The method of claim 7, wherein the log monitoring result comprises abnormal log data;

after the calling of the monitoring subtask to perform abnormal data detection on the log data corresponding to the target service to obtain a log monitoring result, the method further includes:

acquiring a timestamp of the abnormal log data uploaded to the preset message system;

determining log data to be displayed from the preset message system based on the timestamp;

and displaying the abnormal log data and the log data to be displayed.
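Selecting the log data to be displayed from the timestamp of the abnormal log data can be sketched as a time-window lookup. This is only an illustration under stated assumptions: the claim does not specify the selection rule, so the example assumes that entries within a fixed number of seconds of the abnormal timestamp are shown as context. The function name `context_logs` and the `(timestamp, text)` message shape are hypothetical.

```python
def context_logs(messages, abnormal_ts, window_seconds=5):
    """Return messages whose timestamp lies within window_seconds of abnormal_ts.

    messages: iterable of (timestamp, text) pairs read from the topic.
    Assumption: context is chosen by a symmetric time window; the abnormal
    entry itself falls inside the window and is returned with its neighbours.
    """
    nearby = [
        (ts, text)
        for ts, text in messages
        if abs(ts - abnormal_ts) <= window_seconds
    ]
    return sorted(nearby)


messages = [
    (100, "INFO start"),
    (103, "WARN retry"),
    (104, "ERROR failed"),
    (106, "INFO recovered"),
    (120, "INFO idle"),
]
shown = context_logs(messages, abnormal_ts=104, window_seconds=3)
```

Showing neighbouring entries alongside the abnormal one gives the reader the events that led up to and followed the anomaly.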

9. The method according to claim 8, wherein the displaying of the abnormal log data and the log data to be displayed comprises:

acquiring a first importance level of the abnormal log data and a second importance level of the log data to be displayed;

respectively determining a first display mode of the abnormal log data and a second display mode of the log data to be displayed based on the first importance level and the second importance level;

and displaying the abnormal log data according to the first display mode, and displaying the log data to be displayed according to the second display mode.

10. The method according to claim 1, wherein log data of each service of the target application is stored under each topic of a preset message system, one service corresponding to one topic;

the method further comprises the following steps:

11. A log monitoring apparatus, the apparatus comprising:

12. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the log monitoring method of any one of claims 1 to 10 when executing the program.

13. A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the log monitoring method of any one of claims 1 to 10.