CN114595127A - Log exception handling method, device, equipment and storage medium - Google Patents

Publication number
CN114595127A
Authority
CN
China
Prior art keywords
log
component
data processing
target
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011395195.6A
Other languages
Chinese (zh)
Inventor
昝杰
杨杰
李志阳
邢家树
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202011395195.6A
Publication of CN114595127A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/30 - Monitoring
    • G06F11/34 - Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 - Performance evaluation by tracing or monitoring
    • G06F11/3476 - Data logging

Abstract

The application provides a log exception handling method, apparatus, device and storage medium. The scheme is applied to a service system, in which a log analysis device obtains a log related to component operation in a data processing device; determines, from a plurality of keyword sets, a target keyword set matching the log generation source of the log, wherein the target keyword set includes a plurality of keywords indicating that a log from that log generation source is abnormal; and matches the log against each keyword in the target keyword set. If at least one keyword in the target keyword set matches the log, then, based on the log generation source of the log, a substitute component of the target component or a substitute operating environment of the component operating environment is reconstructed and started in the service system, and the operation of the target component or component operating environment in the data processing device corresponding to the log is ended. The scheme can discover and resolve abnormal component operation in time and restore normal operation of the component.

Description

Log exception handling method, device, equipment and storage medium
Technical Field
The present application relates to the field of intelligent technologies, and in particular, to a log exception handling method, apparatus, device, and storage medium.
Background
A service system such as a cloud platform or a blockchain is generally composed of a plurality of data processing devices.
In such a service system, each data processing device generally runs at least one component, and each component corresponds to a different function. If a component in a data processing device runs abnormally, the data processing device cannot provide the function corresponding to that component, which affects the functions the service system can realize and causes service exceptions in the service system. Therefore, how to reduce the service exceptions caused by abnormal operation of components in the service system is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the present application provides a log exception handling method, apparatus, device and storage medium, so as to find out an exception of component operation in time and solve the exception of component operation.
To achieve the above purpose, the present application provides the following technical solutions:
In one aspect, the present application provides a log exception handling method applied to a log processing device in a service system, where the service system includes a plurality of data processing devices and at least one log processing device, and at least one component runs in each data processing device, the method including:
obtaining a log related to component operation in the data processing equipment, wherein the log comprises log metadata of the log, the log metadata comprises a log generation source of the log, and the log generation source represents a target component or a component operation environment to which the log belongs;
determining a target keyword set matched with the log generation source from a plurality of keyword sets, wherein the target keyword set comprises a plurality of keywords used for representing that the log of the log generation source has abnormity;
matching the log with each keyword in a target keyword set;
if at least one keyword matched with the log exists in the target keyword set, reconstructing and starting a substitute component of the target component or a substitute operation environment of the component operation environment in a service system based on a log generation source of the log, and finishing the operation of the target component or the component operation environment in the data processing equipment corresponding to the log.
In a possible implementation, the step of, if at least one keyword matching the log exists in the target keyword set, reconstructing and starting a substitute component of the target component or a substitute operating environment of the component operating environment in the service system based on the log generation source of the log, and ending the operation of the target component or the component operating environment in the data processing device corresponding to the log, includes:
if at least one keyword matching the log exists in the target keyword set and the log generation source indicates that the log was generated by the target component, starting and running a backup component of the target component in the data processing device, and ending the operation of the target component in the data processing device;
and if at least one keyword matching the log exists in the target keyword set and the log generation source indicates that the log relates to the component operating environment, starting a standby device corresponding to the data processing device and ending the operation of the data processing device, wherein the standby device has the same operating environment as the data processing device.
In yet another possible implementation, each keyword in the target keyword set has a risk level;
the step of, if at least one keyword matching the log exists in the target keyword set, reconstructing and starting a substitute component of the target component or a substitute operating environment of the component operating environment in the service system based on the log generation source of the log, and ending the operation of the target component or the component operating environment in the data processing device corresponding to the log, includes:
if at least one keyword matching the log exists in the target keyword set and the risk level of the at least one keyword belongs to a set low risk level, incrementing by one the risk count corresponding to the log generation source of the log;
if the risk count corresponding to the log generation source of the log reaches a set number, reconstructing and starting the substitute component of the target component or the substitute operating environment of the component operating environment in the service system, and ending the operation of the target component or the component operating environment in the data processing device corresponding to the log.
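The risk-count rule above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the class name `RiskTracker`, the level label `"low"` and the threshold value are hypothetical, and the handling of keywords that are not low risk (immediate replacement) is an assumption, since the passage above only specifies the low-risk case.

```python
from collections import defaultdict

LOW_RISK = "low"  # hypothetical label for the "set low risk level"


class RiskTracker:
    """Counts low-risk keyword hits per log generation source."""

    def __init__(self, threshold):
        self.threshold = threshold      # the "set number" of risk occurrences
        self.counts = defaultdict(int)  # log generation source -> risk count

    def should_replace(self, source, matched_levels):
        """Return True when the component or environment should be rebuilt."""
        if not matched_levels:
            return False  # no keyword matched: the log is not abnormal
        if any(level != LOW_RISK for level in matched_levels):
            return True   # assumption: non-low-risk hits trigger replacement at once
        self.counts[source] += 1
        return self.counts[source] >= self.threshold
```

With a threshold of 3, the first two low-risk hits from the same source only increment the counter, and the third one triggers the replacement.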
In another possible implementation manner, the obtaining a log related to the running of a component in the data processing device includes:
obtaining a log queue to be processed, wherein the log queue comprises a plurality of logs from the plurality of data processing devices;
before determining a target keyword set matching the log generation source from the plurality of keyword sets, the method further comprises:
and distributing a set number of logs in the log queue to a set number of log processing threads according to the sequence of the logs in the log queue, so that the set number of log processing threads process the set number of logs in parallel.
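The queue-to-thread distribution described above might look like the following sketch. The function name `dispatch_batch` and the `handle_log` callback are hypothetical; a standard-library thread pool stands in for the "set number of log processing threads".

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def dispatch_batch(log_queue, thread_count, handle_log):
    """Take up to thread_count logs from the head of the queue, in queue
    order, and process them in parallel, one log per worker thread."""
    batch = [log_queue.popleft() for _ in range(min(thread_count, len(log_queue)))]
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # pool.map yields results in the same order as the submitted batch
        return list(pool.map(handle_log, batch))
```

Logs that do not fit in the current batch stay in the queue, in their original order, for the next call.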
In another aspect, the present application further provides a log exception handling method applied to a service system, where the service system includes multiple data processing devices and at least one log processing device, and at least one component runs in the data processing device, and the method includes:
the data processing equipment obtains a log related to component operation on the data processing equipment, and adds log metadata of the log to the log, wherein the log metadata of the log comprises a log generation source of the log, and the log generation source represents a target component or a component operation environment to which the log belongs;
the data processing equipment sends the log added with the log metadata to the log processing equipment;
the log processing equipment determines a target keyword set matched with the log generation source from a plurality of keyword sets, wherein the target keyword set comprises a plurality of keywords used for representing that the log of the log generation source has abnormity;
the log processing equipment matches the log with each keyword in a target keyword set;
and under the condition that at least one keyword matched with the log exists in the target keyword set, reconstructing and starting a substitute component of the target component or a substitute operation environment of the component operation environment in a service system based on a log generation source of the log by the log processing equipment, and finishing the operation of the target component or the component operation environment in the data processing equipment corresponding to the log.
In another aspect, the present application further provides a log exception handling apparatus, which is applied to a log processing device in a service system, where the service system includes a plurality of data processing devices and at least one log processing device, and at least one component runs in the data processing device, and the apparatus includes:
a log obtaining unit, configured to obtain a log related to component operation in the data processing apparatus, where the log includes log metadata of the log, the log metadata includes a log generation source of the log, and the log generation source represents a target component or a component operation environment to which the log belongs;
a set determining unit, configured to determine a target keyword set matched with the log generation source from a plurality of keyword sets, wherein the target keyword set comprises a plurality of keywords used for representing that the log of the log generation source has abnormity;
a keyword matching unit, configured to match the log with each keyword in the target keyword set;
and a unit configured to, if at least one keyword matched with the log exists in the target keyword set, reconstruct and start a substitute component of the target component or a substitute operation environment of the component operation environment in the service system based on the log generation source of the log, and end the operation of the target component or the component operation environment in the data processing equipment corresponding to the log.
In yet another aspect, the present application further provides a computer device comprising a processor and a memory, wherein the memory has stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the log exception handling method as described in any one of the above.
In yet another aspect, the present application further provides a computer-readable storage medium having at least one instruction, at least one program, a set of codes, or a set of instructions stored therein, which is loaded and executed by a processor to implement the log exception handling method as described in any one of the above.
As can be seen from the above, in the service system of the present application, the data processing device reports logs related to component operation to the log processing device, and the log processing device matches each log against the keyword set that corresponds to the log generation source of that log. Because the keyword set stores keywords indicating that a log from that log generation source is abnormal, a log that matches at least one keyword in the keyword set is an abnormal log, and an abnormal log in turn indicates that the component or component operating environment corresponding to the log generation source is abnormal. Abnormal component operation in each data processing device of the service system can therefore be found in time.
On the basis, under the condition that the log is matched with at least one keyword in the keyword set, the method and the system can reconstruct and start the alternative component of the target component corresponding to the log or the alternative operating environment of the component operating environment corresponding to the log in the service system, so that the abnormity of some components or component operating environments in the service system can be repaired in time, and the availability of the service system is improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only embodiments of the present application, and those skilled in the art can obtain other drawings based on the provided drawings without creative effort.
FIG. 1 is a schematic diagram of a service system architecture to which the present application is applicable;
FIG. 2 is a flow chart illustrating an embodiment of a log exception handling method provided by the present application;
FIG. 3 is a flow chart illustrating a log exception handling method according to yet another embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a principle of parallel processing of logs by log processing threads in the present application;
FIG. 5 is a block diagram of an implementation principle of the log exception handling method of the present application;
FIG. 6 is a schematic view showing a flow interaction of the log exception handling method of the present application;
FIG. 7 is a schematic diagram illustrating an exemplary embodiment of a log exception handling apparatus according to the present application;
FIG. 8 is a schematic diagram illustrating a component architecture of a computer device according to the present application.
Detailed Description
The scheme of the application is suitable for a service system formed by a plurality of data processing devices, and the service system can be used for data storage or data calculation and the like.
For ease of understanding, the following description will first describe the architecture of a service system to which the solution of the present application is applied. Fig. 1 is a schematic diagram illustrating a structure of a service system to which the present application is applied.
As shown in fig. 1, the service system may include: a plurality of data processing devices 101, and at least one log analysis device 102.
The data processing device 101 is a device used by the service system to provide services, for example to process the service data of services related to the service system. The log analysis device is a device, separate from the data processing devices, that monitors components in the data processing devices for abnormal operation; it may also be what is commonly called a peripheral device.
In the application, each data processing device may be in communication with the log analysis device via a network.
At least one component is installed and run in each data processing device. A component is a program, or a set of programs, that runs in the operating system of a computer and provides a specific service to a user. Each component corresponds to a function provided by the data processing device.
Components in a data processing device run in the component operating environment of that device. The component operating environment refers to the software environment in which the component runs, such as a container, together with the hardware environment. For example, the component operating environment may include a Linux system or a virtual machine.
It will be appreciated that the functionality that the service system can implement will vary depending on the application requirements.
For example, the service system may be a cloud service platform, such as a cloud database or a cloud computer platform.
Correspondingly, the data processing device in the service system can be a cloud server providing basic cloud computing services such as cloud service, cloud computing, cloud storage, cloud communication, big data and artificial intelligence platforms and the like.
The cloud computing service is a computing service based on cloud technology. Cloud technology refers to a hosting technology that unifies hardware, software, network and other resources within a wide area network or a local area network to realize the computation, storage, processing and sharing of data.
Cloud technology is also a general term for the network, information, integration, management platform and application technologies applied under the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. The background services of technical network systems, such as video websites, picture websites and web portals, require large amounts of computing and storage resources. As the internet industry develops, each item may have its own identifier that must be transmitted to a background system for logical processing; data at different levels is processed separately, and all kinds of industry data need strong back-end system support, which can only be achieved through cloud computing.
As another example, the service system may be a blockchain system, and based thereon, the data processing device may be a blockchain node in a blockchain.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism and an encryption algorithm. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer.
The blockchain underlying platform may comprise processing modules such as user management, basic service, smart contract and operation monitoring.
The user management module is responsible for identity information management of all blockchain participants, including maintenance of public and private key generation (account management), key management, and maintenance of the correspondence between a user's real identity and blockchain address (authority management). Where authorized, it supervises and audits the transactions of certain real identities and provides rule configuration for risk control (risk control audit).
The basic service module is deployed on all blockchain node devices and is used to verify the validity of service requests and to record valid requests to storage after consensus is reached. For a new service request, the basic service first performs interface adaptation, parsing and authentication (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits the encrypted information completely and consistently to the shared ledger (network communication), and records and stores it.
The smart contract module is responsible for contract registration and issuance, contract triggering and contract execution. Developers can define contract logic in a programming language and publish it to the blockchain (contract registration); execution is triggered by keys or other events according to the logic of the contract terms to complete the contract logic. The module also provides functions for upgrading and cancelling contracts.
The operation monitoring module is mainly responsible for deployment, configuration modification, contract setting and cloud adaptation during product release, and for the visual output of real-time status during product operation, such as alarms, monitoring of network conditions and monitoring of the health status of node devices.
The platform product service layer provides basic capability and an implementation framework of typical application, and developers can complete block chain implementation of business logic based on the basic capability and the characteristics of the superposed business. The application service layer provides the application service based on the block chain scheme for the business participants to use.
Of course, the service system may also be any system with multiple data processing devices deployed to provide corresponding functional implementation through the multiple data processing devices.
In the embodiment of the application, a log collection program runs in the data processing equipment, logs related to component running in the data processing equipment can be collected through the log collection program, and the collected logs are reported to the log analysis equipment.
In an optional manner, the log analysis device may obtain a log collection rule configured by a user, store it, and send it to the data processing device. The log analysis device sends the latest version of the log collection rule; if it detects that the rule has been updated to a new version, it sends that latest version to the data processing device.
For example, the log analysis device may maintain a connection with the data processing device through a heartbeat mechanism, and on this basis, the log analysis device may carry the log collection rule by sending a heartbeat packet to the data processing device. In this case, the log analysis device may transmit the latest version of the log collection rule through the heartbeat packet. For example, the log analysis device may assign a version number to the configured log collection rule, and on this basis, the data processing device may determine whether the log collection rule of the version already exists according to the version number of the log collection rule in the heartbeat packet, and if not, may save the log collection rule of the version and delete the log collection rule saved before.
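The version check on the data processing device side can be sketched as below. The names `CollectionRule`, `RuleStore` and `on_heartbeat`, and the fields carried by the rule, are hypothetical; the actual heartbeat packet format is not specified in the text.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionRule:
    version: int                                   # version number assigned by the log analysis device
    log_types: list = field(default_factory=list)  # hypothetical rule content


class RuleStore:
    """Holds the single log collection rule currently in force on a data processing device."""

    def __init__(self):
        self.current = None

    def on_heartbeat(self, rule):
        """Apply the rule carried by a heartbeat packet.

        Returns True if the rule was new and replaced the stored one,
        False if a rule with the same version number is already stored.
        """
        if self.current is not None and self.current.version == rule.version:
            return False
        self.current = rule  # saving the new rule discards the previous one
        return True
```

Comparing only the version number keeps the heartbeat handler cheap: repeated heartbeats carrying an unchanged rule are ignored.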
The log collection rule includes the log types to be collected, the metadata required for the log, and the like. The log metadata included in the configured log collection rule may include the path information of the log to be collected and the number of the file in which the log is located, such as an index node number (for example, the inode number in Linux), so that the data processing device can locate and collect the file in which the log is located according to the path information and that number.
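As a small illustration of locating a log file by path and inode number, the sketch below uses Python's standard `os.stat`, whose `st_ino` field holds the inode number on Linux. The function name `file_matches` is hypothetical, and how the data processing device actually performs this check is not described in the text.

```python
import os


def file_matches(path, expected_inode):
    """Return True if a file exists at path and has the expected inode number."""
    try:
        return os.stat(path).st_ino == expected_inode
    except FileNotFoundError:
        return False
```

Checking the inode as well as the path lets a collector notice when a log file has been rotated, that is, when a different file now sits at the same path.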
The log type specifies which log data of the component or component operating environment is to be collected. For example, the log types may be divided into run log files, output data logs, component test result logs and state logs of the operating environment; in practical applications, the log collection rule may include one or more log types as needed.
The run log file records errors or key information produced by the component while running. Specifically, the component outputs error information or key information to a specific file in the form of text during operation; for example, if the component accesses a disk during operation and a read fails, an error is reported and recorded in that file. The run log file may also include run logs associated with the operating environment.
That is, the run log files may include both the run log file of the component and the run log file of the operating environment.
For example, taking the service system to be a cloud database, the components running in the cloud database may include Structured Query Language (SQL) instances, and the operating environment of the components may be a Linux system. In this scenario, the logs to be collected may include the SQL error logs and the Linux system log.
The output data log is the data output by the component during operation.
The component test result is a test output obtained by sending an execution command for a test to the component.
The state log of the execution environment includes state values that characterize the state of the execution environment.
Accordingly, the data processing device can collect logs related to component operation in the data processing device according to the log collection rule. For example, collecting the log in the running log file and the output result of the component.
The log analysis device can also be configured with keyword sets used for analyzing log anomalies, where each keyword set includes a plurality of keywords indicating that a log is abnormal. Because logs from different components or operating environments may differ, the present application may configure different keyword sets for different log generation sources. The log generation source indicates which component or operating environment produced the log.
Correspondingly, the log analysis device can use the log generation source of the log to be analyzed to determine which keyword set the anomaly analysis should be based on, and then analyze whether the log is abnormal according to that keyword set.
The log exception handling method of the present application is described below with reference to a flowchart.
First, a log abnormality processing method according to the present application will be described from the log analysis device side. Fig. 2 is a schematic flowchart illustrating an embodiment of a log exception handling method according to the present application. The method of the embodiment is applied to the log analysis equipment.
The method of the embodiment may include:
S201, obtaining logs related to component operation in the data processing equipment.
The log is used for recording information related to the operation of the component, and the log can reflect the operation condition and possible abnormality in the operation of the component. For example, in the present application, the log may be a log generated by a component, or a log corresponding to a component execution environment of a component in the data processing apparatus.
In this application each log includes log metadata. The log metadata of the log is metadata added by the data processing apparatus for the log collected by the data processing apparatus. Specifically, after the data processing device obtains a log related to the operation of the component on the data processing device, log metadata corresponding to the log is added to the log.
The log metadata may be attribute information representing the device that collected the log and the source of the log within the data processing device. In this application, the log metadata of a log includes at least the log generation source of the log. The log generation source indicates the target component or component operating environment to which the log belongs; that is, the log generation source makes it possible to determine whether the log was generated by a component or is a log of the component operating environment. For the sake of distinction, the component that generates the log is referred to as the target component.
It can be understood that, in order to identify which data processing device reported the log, the log metadata of the log may further include information about the data processing device that collected the log.
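A log carrying such metadata could be represented as in the sketch below. All field names and the two source-kind labels are hypothetical; the text only requires that the metadata identify the log generation source and, optionally, the collecting device.

```python
from dataclasses import dataclass

SOURCE_KINDS = ("component", "environment")  # hypothetical labels for the two kinds of source


@dataclass(frozen=True)
class LogRecord:
    text: str         # raw log content
    source_kind: str  # whether a component or its operating environment produced the log
    source_name: str  # e.g. the component category or the environment name
    device_id: str    # the data processing device that collected the log


def tag_log(text, source_kind, source_name, device_id):
    """Attach log metadata to a collected log before it is reported."""
    if source_kind not in SOURCE_KINDS:
        raise ValueError(f"unknown log generation source kind: {source_kind!r}")
    return LogRecord(text, source_kind, source_name, device_id)
```

The log analysis side can then read `source_kind` and `source_name` to pick a keyword set, and `device_id` to know where any replacement has to happen.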
S202, determining a target keyword set matched with the log generation source from the plurality of keyword sets.
In the application, different keyword sets are configured for different types of components and different component operating environments respectively. Each keyword set comprises a plurality of keywords, and the keywords are alarm words for representing that abnormal conditions exist in the components or the component operating environments corresponding to the keyword sets.
For example, a keyword for a component runtime environment may be "device error," which may characterize a disk exception.
Correspondingly, the log generation source of the log can represent the component to which the log belongs or the component operating environment, so that the keyword set suitable for anomaly detection can be determined by combining the log generation source, and for the convenience of distinguishing, the keyword set matched with the log generation source is called a target keyword set. The target keyword set comprises a plurality of keywords used for representing the abnormal condition of the log generation source.
If the log generation source indicates that the log was generated by the target component, the target keyword set corresponding to the target component can be determined according to the category of the target component. If the log generation source indicates that the log is a log of the component operating environment in the data processing device, the target keyword set corresponding to that component operating environment can be looked up.
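The lookup of a target keyword set can be sketched as a simple table keyed by the log generation source. Only the keyword "device error" comes from the text above; the other keywords, the key layout and the function name `select_keyword_set` are hypothetical.

```python
# Hypothetical configuration: one keyword set per component category or operating environment.
KEYWORD_SETS = {
    ("component", "sql"): frozenset({"table is marked as crashed", "out of memory"}),
    ("environment", "linux"): frozenset({"device error"}),
}


def select_keyword_set(source_kind, source_name):
    """Return the target keyword set for a log generation source.

    An empty set is returned when no keyword set is configured for the source.
    """
    return KEYWORD_SETS.get((source_kind, source_name), frozenset())
```

Keeping the sets separate per source means a phrase that is alarming in a system log does not raise false alarms when it appears in the ordinary output of an unrelated component.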
S203, matching the log with each keyword in the target keyword set.
Keywords present in the log that belong to the set of target keywords can be located by matching the log to the keywords.
The specific implementation manner of matching the log with each keyword in the target keyword set may be multiple, and the present application does not limit this.
In an alternative manner, in order to improve matching efficiency, the application may use the AC automaton (Aho-Corasick) algorithm to match the log with each keyword in the target keyword set. The AC automaton is a multi-pattern string matching algorithm that finds all keywords in a single pass over the log, and matching keywords through the AC automaton algorithm can improve matching efficiency and accuracy.
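For illustration, a minimal Aho-Corasick automaton can be written with only the standard library. The function names `build_automaton` and `find_keywords` and the sample keywords are assumptions of this sketch; the application does not prescribe any particular implementation.

```python
from collections import deque


def build_automaton(keywords):
    """Build trie transitions, failure links and output sets for the keywords."""
    goto = [{}]      # goto[state][char] -> next state
    fail = [0]       # failure link of each state
    out = [set()]    # keywords recognised on reaching each state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)
    # Breadth-first pass: shallower states are finished before deeper ones.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure target (proper suffixes).
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def find_keywords(text, automaton):
    """Return the set of keywords that occur anywhere in text."""
    goto, fail, out = automaton
    state, hits = 0, set()
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits |= out[state]
    return hits
```

The automaton is built once per keyword set and can then be reused for every log from the same kind of log generation source, for example `find_keywords(log_text, build_automaton(target_keyword_set))`.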
S204, if at least one keyword matched with the log exists in the target keyword set, reconstructing and starting a substitute component of the target component or a substitute operation environment of the component operation environment in the service system based on the log generation source of the log, and finishing the operation of the target component or the component operation environment in the data processing equipment corresponding to the log.
It can be understood that if at least one keyword matching the log exists in the target keyword set, the keyword indicating that there is an exception is hit in the log, which also indicates that there is an exception in the component or the component operating environment generating the log.
In a possible case, the log generation source of the log characterizes that the log is a log generated by the target component, that is, the log generation source is the target component, then the substitute component of the target component is reconstructed and started in the service system, and the operation of the original target component is ended.
In one specific manner, reconstructing and starting the substitute component of the target component in the service system may be: starting and running a backup component of the target component in the data processing device corresponding to the log, that is, the data processing device that runs the target component and collected the log.
For example, a backup component of the target component may be stored in the data processing apparatus in the form of a component pool or the like, and on this basis, if the target component has an abnormality, the log analysis apparatus may control to start the backup component in the data processing apparatus.
Alternatively, the log analysis device stores backup component packages of a plurality of components. On this basis, if it is determined that the target component in the data processing device is abnormal, the backup component package corresponding to the target component can be transmitted to the data processing device, and the data processing device is controlled to start and run the backup component package, so that the backup component of the target component is started and run in the data processing device.
Of course, the log analysis device may also reconstruct the target component, obtain the data of the backup component of the target component, and transmit the data to the data processing device.
In another specific manner, the replacement component that reconstructs and starts the target component in the service system may be to start the target component in another data processing device than the data processing device corresponding to the log, so as to replace the target component running in the data processing device.
In another possible case, the log generation source of the log characterizes that the log is a log generated by the component operating environment, that is, the log generation source is the component operating environment, and then an alternative operating environment of the target component operating environment is reconstructed and started in the service system.
Specifically, if the log generation source is the component operating environment, the standby device of the data processing device corresponding to the log may be started, and the operation of the data processing device may be ended. Wherein the standby device has the same operating environment as the data processing device.
By starting the standby equipment, the standby equipment can replace the original data processing equipment to provide corresponding services in the service system, so that the services provided by the original data processing equipment in the service system can be recovered in time, and the perception of the user on the abnormity is reduced.
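The two recovery cases described above can be summarised as a small planning function that returns the actions to perform. This is a sketch under stated assumptions: `Action`, `plan_recovery`, the `backup:` naming and the device identifiers are hypothetical, and ordering "start" before "stop" is a choice of the example rather than a requirement stated in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    verb: str     # "start" or "stop"
    target: str   # what is started or stopped
    device: str   # device on which the action is carried out


def plan_recovery(source_kind, device_id, component=None, standby_id=None):
    """Return the ordered actions that replace an abnormal log generation source."""
    if source_kind == "component":
        # Start the backup component on the same device, then end the original.
        return [
            Action("start", f"backup:{component}", device_id),
            Action("stop", component, device_id),
        ]
    if source_kind == "environment":
        # Start a standby device with the same operating environment,
        # then end the operation of the original data processing device.
        return [
            Action("start", "device", standby_id),
            Action("stop", "device", device_id),
        ]
    raise ValueError(f"unknown log generation source kind: {source_kind!r}")
```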
As can be seen from the above, in the service system of the present application, the data processing device reports logs related to component operation to the log processing device, and the log processing device matches each log against the keyword set that corresponds to the log generation source of the log. Because the keyword set stores keywords indicating that a log of the log generation source is abnormal, a log that matches at least one keyword in the keyword set is an abnormal log, and the abnormal log in turn indicates that the component or component operating environment corresponding to the log generation source is abnormal. Abnormal component operation in each data processing device of the service system can therefore be found in time.
On the basis, under the condition that the log is matched with at least one keyword in the keyword set, the method and the system can reconstruct and start the alternative component of the target component corresponding to the log or the alternative operating environment of the component operating environment corresponding to the log in the service system, so that the abnormity of some components or component operating environments in the service system can be repaired in time, and the availability of the service system is improved.
It is understood that the above embodiment is described by taking the processing of one log by the log analysis device as an example, but in the case that a plurality of sets of logs to be processed exist in the log analysis device at the same time, the logs may be processed in sequence according to the method of the above embodiment of the present application, and the specific process is similar.
In an optional manner, the logs reported by each data processing device in the service system are stored in a log queue, and on this basis, the log analysis device can obtain a log queue to be processed. The log queue includes a plurality of logs from a plurality of data processing devices. The logs in the log queue may be ordered; for example, the log queue may be arranged in chronological order.
Correspondingly, the log analysis device can allocate a set number of logs in the log queue to a set number of log processing threads according to the sequence of the logs in the log queue, so that the set number of log processing threads process the set number of logs in parallel.
For each log processing thread, after the log processing thread obtains the allocated log, the process of processing the log may refer to the related operations of S202 to S204 in the foregoing embodiment, which is not described again.
It will be appreciated that for any type of component or component operating environment, there may be instances where there is a low risk of an anomaly being present in the log, or where an anomaly in the log may be a false positive. In order to ensure the normal operation of the components in the data processing equipment, under the condition that the abnormity existing in a certain component or component operation environment is low risk, the corresponding component or component operation environment can be reconstructed and started after the abnormity appears for many times.
For convenience of understanding, reference may be made to fig. 3, which is a schematic flowchart illustrating a log exception handling method according to another embodiment of the present application, where the method according to the present embodiment is applied to a log analysis device. The embodiment may include:
S301, obtaining a log queue to be processed.
The log queue comprises a plurality of logs reported by a plurality of data processing devices in the service system. The log metadata of each log comprises a log generation source of the log, and the log generation source represents a target component or a component operating environment to which the log belongs.
S302, according to the sequence of the logs in the log queue, distributing the set number of logs in the log queue to the set number of log processing threads.
The logs can be sequentially taken out and distributed to idle log processing threads according to the sequence of the logs in the log queue, so that the set number of log processing threads can process the set number of logs in parallel.
For example, assuming that 5 log processing threads are deployed in the log analysis device, 5 different logs are processed in parallel by the 5 log processing threads each time. Meanwhile, if a certain thread completes log processing, so that the log processing thread is idle, a log to be processed is continuously taken out from the log queue and allocated to the log processing thread.
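The allocation of queued logs to a fixed number of log processing threads can be sketched with the standard `queue` and `threading` modules. The function name `process_queue` and the `handle` callback are assumptions of this example; `handle` stands in for the keyword matching and recovery applied to each log.

```python
import queue
import threading


def process_queue(logs, handle, num_threads=5):
    """Process logs in queue order using a fixed number of worker threads."""
    pending = queue.Queue()
    for log in logs:
        pending.put(log)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                # An idle thread takes the next pending log in queue order.
                log = pending.get_nowait()
            except queue.Empty:
                return
            outcome = handle(log)
            with lock:
                results.append((log, outcome))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Logs are taken from the queue in order, but because the threads run in parallel, the order in which results are completed is not guaranteed to match the queue order.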
It should be noted that, in the present embodiment, the case of processing logs in parallel by a plurality of log processing threads is taken as an example for explanation, and in an actual application, the case of processing logs one by one is also applicable to the present embodiment, and the present invention is not limited to this.
S303, for each log processing thread, the log processing thread determines a target keyword set matching the log generation source from the plurality of keyword sets according to the log generation source corresponding to the log allocated to the log processing thread, and performs step S304.
In the application, the target keyword set comprises a plurality of keywords used for characterizing that a log of the log generation source is abnormal. Meanwhile, each keyword in the target keyword set has a risk level.
For example, the risk level may be divided into a high risk level and a low risk level, and accordingly, the risk level of each keyword may be a high risk level or a low risk level according to actual situations.
Of course, in practical applications, the risk level may also be divided into finer granularity, which is not limited.
S304, matching the log with each keyword in the target keyword set through a log processing thread.
The above steps S303 and S304 can refer to the related description of the previous embodiment, and are not described herein again.
S305, if at least one keyword matched with the log exists in the target keyword set, detecting whether the at least one keyword has a keyword belonging to a set high risk level, if so, executing the step S306; if not, step S307 is executed.
S306, reconstructing and starting the substitute component of the target component or the substitute operation environment of the component operation environment in the service system, and ending the operation of the target component or the component operation environment in the data processing device corresponding to the log.
This step S306 can refer to the related description of the previous embodiment, and is not described herein again.
S307, adding one to the risk occurrence count corresponding to the log generation source of the log.
S308, if the risk occurrence count corresponding to the log generation source of the log exceeds the set number, reconstructing and starting the substitute component of the target component or the substitute operating environment of the component operating environment in the service system, and ending the operation of the target component or the component operating environment in the data processing device corresponding to the log.
For example, taking the case where the log generation source of the log is a target component: when none of the keywords matched by the log of the target component belongs to the set high risk level, the target component is determined to have a low-risk anomaly, and the risk occurrence count of the target component is increased by one. On this basis, the risk occurrence count represents the total number of low-risk anomalies that the target component has experienced so far.
It can be understood that, for a certain component or component operating environment in a data processing device, the data processing device generally reports logs of the component and the component operating environment at a set frequency. Therefore, when the risk occurrence count corresponding to a log generation source (such as a target component or a component operating environment) exceeds the set number, the log generation source is experiencing low-risk anomalies frequently.
Correspondingly, if the risk occurrence count corresponding to the log generation source exceeds the set number, the target component or the component operating environment corresponding to the log generation source is regarded as abnormal, and abnormal repair of the target component or the component operating environment is therefore required.
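The risk-level decision described above can be sketched as a small tracker that recovers immediately on a high-risk match and counts low-risk matches per log generation source. The class name `RiskTracker`, the level labels and the threshold value are assumptions of this sketch; whether the count is reset after a recovery is not specified in the description and is left out here.

```python
from collections import Counter

HIGH, LOW = "high", "low"


class RiskTracker:
    def __init__(self, threshold):
        self.threshold = threshold        # the set number of low-risk occurrences
        self.low_risk_hits = Counter()    # keyed by log generation source

    def should_recover(self, source, matched_levels):
        """Decide whether to rebuild the source from the risk levels of matched keywords."""
        if not matched_levels:
            return False                  # no keyword matched: no anomaly
        if HIGH in matched_levels:
            return True                   # high-risk keyword: recover immediately
        # Only low-risk keywords matched: count the occurrence for this source.
        self.low_risk_hits[source] += 1
        # Recover once the count exceeds the set number.
        return self.low_risk_hits[source] > self.threshold
```

With `RiskTracker(threshold=2)`, the third consecutive low-risk log from the same source triggers recovery, while a single high-risk log triggers it at once.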
For the convenience of understanding fig. 3, reference may be made to fig. 4, which is a schematic diagram illustrating an implementation principle of the present application that employs multithreading to process logs in parallel.
In fig. 4, 5 log processing threads may run in the processor, and the 5 log processing threads are respectively represented as 5 thick lines with arrows below the processor.
The small white square in fig. 4 represents a log. In fig. 4 each of the 5 log processing threads in the processor processes one log.
The blocks to the right of the processor in fig. 4 represent logs that have not yet been allocated to the log processing threads, as shown by the white squares on the right side of the processor in fig. 4.
In fig. 4, the log is processed by the log processing thread to obtain the abnormal condition corresponding to the log, for example, the block on the left side of the processor in fig. 4 represents the abnormal condition corresponding to the log, for example, the black block represents that the log has a high abnormal risk, the block with a plurality of vertical stripes represents that the log has a low abnormal risk, and the block with horizontal stripes represents that the log does not have an abnormal risk.
In fig. 4, each log processing thread of the processor matches the log with the corresponding keyword set to obtain the keywords matched by the log. If the keywords matched by the log include a keyword with a high risk level, the log has a high abnormal risk; similarly, if none of the matched keywords has a high risk level but at least one has a low risk level, the log has a low abnormal risk. Accordingly, if the log does not match any keyword, the log has no abnormal risk. After each log processing thread completes processing of one log, it takes another log from the not-yet-processed logs on the right side of the processor and continues processing.
For convenience of description, the log is taken to be a log of a component in the data processing device as an example. The log analysis device can execute different operations according to how the log analyzed by the log processing thread matches the keywords, that is, according to the abnormal risk of the log. For example, if a log is detected to have a high abnormal risk, the data processing device may be controlled to start a backup component of the corresponding component and to end the operation of the original component. Meanwhile, if a log of a certain component is found to have a low abnormal risk, the number of low-abnormal-risk logs of the component can be counted, so that the data processing device is controlled to start the backup component of the component when the counted number of low-abnormal-risk logs corresponding to the component exceeds the set number.
It is understood that the embodiment of fig. 3 also belongs to an implementation manner of reducing abnormal false alarm. In practical applications, in order to further reduce the false alarm of the abnormality of the component or the component operating environment, in any of the embodiments of the present application, before the substitute component of the target component or the substitute operating environment of the component operating environment is reconfigured and started in the service system, the log generation source of the log may be tested according to the set false alarm test rule.
Wherein, the false alarm test rules are different for different components or component operating environments. For the same component, under the condition that the keywords matched with the log of the component are different, the determined abnormal conditions of the component are also different, and correspondingly, the false alarm test rules are also possibly different.
For example, if the keyword matched by the log of the component indicates that a read-write operation of the component cannot be allocated memory, a simulated read-write instruction of the component can be sent to the corresponding data processing device to obtain a simulation result. If the simulation result indicates that the memory required by the read-write operation of the component can be allocated, the abnormal condition in the log is a false alarm.
Correspondingly, if the anomaly corresponding to the log is determined to be a false alarm according to the test result, no substitute is reconstructed and started for the target component or the component operating environment corresponding to the log generation source. If it is determined according to the test result that the anomaly is not a false alarm, the substitute component of the target component or the substitute operating environment of the component operating environment is reconstructed and started in the service system.
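One way to picture the false alarm test is as a gate placed in front of the recovery step. In the sketch below, `probes` maps a matched keyword to a test function that returns True when the simulated operation succeeds, that is, when the anomaly cannot be reproduced. The names `recover_if_confirmed`, `probes` and `recover`, and the rule that a keyword without a probe counts as confirmed, are assumptions of this example.

```python
def recover_if_confirmed(source, matched_keywords, probes, recover):
    """Call recover(source) unless every matched keyword is refuted by its probe.

    Returns the list of keywords whose anomaly was not refuted.
    """
    confirmed = []
    for keyword in matched_keywords:
        probe = probes.get(keyword)
        # A probe returning True means the simulated operation succeeded,
        # so the anomaly reported for this keyword is treated as a false alarm.
        if probe is None or not probe(source):
            confirmed.append(keyword)
    if confirmed:
        recover(source)
    return confirmed
```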
In the application, the specific manner in which the data processing device collects and reports logs may be as follows: a corresponding log collection rule is configured in the log processing device, and the log processing device issues the log collection rule to the data processing device, so as to control the types of logs collected by the data processing device.
For ease of understanding, reference may be made to fig. 5, which illustrates a functional block diagram of one implementation of the log exception handling method of the present application.
As can be seen from fig. 5, a configuration module is provided in the service system, and the configuration module is provided in the log processing device. The configuration module can obtain log collection rules configured by a user and respective keyword sets of different components and component operating environments.
Meanwhile, the service system is also provided with a log acquisition module, wherein each data processing device is provided with the log acquisition module, the log acquisition module can acquire the log acquisition rule configured in the configuration module and acquire logs based on the log acquisition rule, and the acquired logs can be reported to the data processing and warning module.
The data processing and warning module is also arranged in the log processing equipment. The data processing and alarm module can obtain the respective keyword sets of the different components and the component operating environments of the configuration from the configuration module. And meanwhile, performing anomaly detection on the logs reported by the log collection module by combining the keyword set.
And after the data processing and warning module detects that the log is abnormal, the data processing and warning module informs the component replacement module to repair the component or the component running environment corresponding to the abnormal log. The component replacement module is arranged in the log processing device.
For convenience of understanding, the following description takes the log collection rule configured in the configuration module as an example for collecting the log in the running log file.
The log exception handling method of the present application is described below from a flow interaction between a data processing apparatus and a log processing apparatus. As shown in fig. 6, which shows a flow interaction diagram of a log exception handling method according to the present application, the method of this embodiment may include:
S601, the data processing device determines the position point in the running log file up to which log collection is completed.
The running log file at least comprises logs of operation errors that occur while the component is running.
The position point of completed log collection in the running log file indicates the last log in the running log file whose collection has been completed. For example, the position point may be the offset, in the running log file, of the last collected log. The position point therefore marks the starting point of the logs still to be collected in the running log file.
In an alternative mode, the application records the position point of log collection in the running log file through a position point record file. Considering that a change in the file number (such as the index node number) of the file under the same path means that the file has been rewritten, the path information and the file number of the running log file may also be recorded in the position point record file, so that after the file number changes, the logs in the file under the path information are read again.
S602, the data processing device obtains the log which is not collected from the running log file according to the position point, and adds log metadata to the log.
In this embodiment, the log metadata may include a log generation source of the log, and the log generation source may be the specific component that generated the log or the component operating environment to which the log relates.
For example, each log in the log file may be associated with information of the component that generated the log, based on which the log generation source of the log may be obtained. Of course, the data processing device may also determine a log generation source corresponding to the log stored in the data processing device by other manners, which is not limited in this application.
Meanwhile, so that the log processing device can subsequently identify which data processing device reported the log, the log metadata of the log may further include information of the data processing device that collected the log, such as a device identifier of the data processing device.
It will be appreciated that the log metadata also includes an offset of the log in the running log file for purposes of subsequently updating the location point, and allowing the log processing device to determine the log collection progress of the data processing device, etc.
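The log metadata described above can be pictured as a small record attached to each log before it is reported. The field names, the `LogRecord` class and the JSON encoding are assumptions of this sketch; the application does not specify a data format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LogRecord:
    message: str     # raw log text
    source: str      # log generation source: a component or the operating environment
    device_id: str   # identifier of the data processing device that collected the log
    offset: int      # offset of the log in the running log file


def attach_metadata(message, source, device_id, offset):
    """Bundle a collected log with its metadata."""
    return LogRecord(message, source, device_id, offset)


def to_wire(record):
    """Serialise a record for reporting to the log processing device."""
    return json.dumps(asdict(record), sort_keys=True)


def from_wire(payload):
    """Rebuild a record on the receiving side."""
    return LogRecord(**json.loads(payload))
```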
S603, the data processing apparatus transmits the log having the log metadata to the log processing apparatus.
It should be noted that, in this embodiment, the log type to be collected, as configured in the configuration module, is taken to be the running log file as an example; in an actual application, the logs to be collected may further include the other cases mentioned above.
For example, the log to be collected includes output data output during the operation of the component. Correspondingly, the data processing device acquires output data output in the running process of the component, determines the output data as a log, and can send the log to the log processing device after adding log metadata to the log.
For another example, the log to be collected includes a test result obtained by the data processing device performing an operation test on the component by using the test command, and the data processing device obtains the test result and adds the identifier of the component to the test result to obtain a log.
S604, the data processing device records the offset of the log in the running log file as the position point of the running log file where the log collection is completed.
By recording the offset corresponding to the currently reported log, the logs that have already been collected and reported can be determined. On this basis, no matter why log collection is interrupted, the logs that have not yet been collected can be determined from the position point of completed log collection, and log collection can continue without collecting the logs again from the beginning of the running log file.
For example, the location point recorded in the location point record file, at which the log collection is completed, may be updated as the offset of the log.
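Resuming collection from a recorded position point can be sketched as follows. The sketch stores the path, the index node number and a byte offset in a JSON file, and restarts from the beginning when the index node number changes. The function name `collect_new_logs`, the JSON layout, the one-log-per-line assumption, and the choice to record the byte position just after the last collected line are all assumptions of this example.

```python
import json
import os


def collect_new_logs(log_path, checkpoint_path):
    """Return log lines not yet collected and update the recorded position point."""
    try:
        with open(checkpoint_path, encoding="utf-8") as f:
            checkpoint = json.load(f)
    except FileNotFoundError:
        checkpoint = {"path": log_path, "inode": None, "offset": 0}

    inode = os.stat(log_path).st_ino
    # A different index node number under the same path means the file was
    # replaced, so collection restarts from the beginning of the new file.
    start = checkpoint["offset"] if checkpoint["inode"] == inode else 0

    lines = []
    with open(log_path, "rb") as f:
        f.seek(start)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial line that is still being written
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\n"))
            start += len(raw)

    # Record the new position point so an interrupted collector can resume here.
    with open(checkpoint_path, "w", encoding="utf-8") as f:
        json.dump({"path": log_path, "inode": inode, "offset": start}, f)
    return lines
```

Calling the function repeatedly returns only the lines appended since the previous call, which matches the goal of not collecting the running log file from the beginning after an interruption.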
It is understood that the data processing device may generate logs continuously while the log processing device has limited log processing capability. To reduce the processing pressure on the log processing device and avoid log accumulation, when the logs reported by the data processing device and not yet processed by the log processing device exceed the maximum processing number of the log processing device, the logs collected by the data processing device can be discarded. On this basis, the offset of the collected and discarded logs is also recorded as the position point of completed log collection.
S605, the log processing device stores the logs reported by the data processing device into a log queue.
S606, the log processing device allocates a set number of logs in the log queue to a set number of log processing threads according to the sequence of the logs in the log queue.
Note that, the present embodiment is described by taking an example of processing logs in parallel by a plurality of log processing threads, and the case of processing the logs one by one as mentioned above is also applicable to the present embodiment.
S607, for each log processing thread, the log processing thread determines a target keyword set matching the log generation source from the plurality of keyword sets according to the log generation source corresponding to the distributed log, and performs step S608.
S608, the log processing device matches the log with each keyword in the target keyword set through the log processing thread.
S609, in the case that at least one keyword matching the log exists in the target keyword set, if it is determined based on the log generation source of the log that the log was generated by the target component, instructing the data processing device, according to the information of the data processing device in the log metadata of the log, to start and run the backup component of the target component and to end the operation of the target component in the data processing device.
The step S609 is an example of an implementation manner of reconstructing and starting the backup component of the target component, and other implementation manners are also applicable to the embodiment.
S610, in the case that at least one keyword matching the log exists in the target keyword set, if the log generation source indicates that the log is a log related to the component operating environment, starting the standby device corresponding to the data processing device according to the information of the data processing device in the log metadata of the log, and ending the operation of the data processing device.
And the standby equipment corresponding to the data processing equipment has the same operating environment with the data processing equipment.
In this embodiment, the data processing device may further perform a self-check on its log collection in order to detect log collection anomalies.
For example, in a possible case, the location point of the running log file where log collection is completed is saved in the location point record file, and the location point where log collection is completed is an offset of the last log where log collection is completed.
Correspondingly, the data processing device can detect the offset difference between the offset of the last log and the size of the running log file, determine that log collection is abnormal if the offset difference is smaller than a set value, and send alarm information of the log collection anomaly to the log processing device.
An offset difference between the offset of the last log and the size of the running log file that is smaller than the set value indicates that the data processing device has started to collect logs from the initial position of the running log file again, so log collection is abnormal. By sending the alarm information of the log collection anomaly to the log processing device, the log collection anomaly can be handled by the log processing device in time. For example, the log processing device may determine the position point of the collected logs according to the logs it has obtained, and instruct the data processing device to collect the logs in the running log file according to that position point.
In addition, the data processing device can also send a location point update exception alarm to the log processing device when detecting that the location point recorded in the location point recording file and completing log collection has not been updated after exceeding a set time length, so that the log processing device can process the exception condition in time.
The application also provides a log exception handling device corresponding to the operation of the log handling device side in the log exception handling method.
As shown in fig. 7, which is a schematic structural diagram of a log exception handling apparatus according to an embodiment of the present application, the apparatus of this embodiment is applied to a log processing device in a service system, the service system includes a plurality of data processing devices and at least one log processing device, at least one component runs in the data processing device, and the apparatus includes:
a log obtaining unit 701, configured to obtain a log related to component operation in the data processing apparatus, where the log includes log metadata of the log, the log metadata includes a log generation source of the log, and the log generation source represents a target component or a component operation environment to which the log belongs;
a set determining unit 702, configured to determine, from a plurality of keyword sets, a target keyword set matching the log generation source, where the target keyword set includes a plurality of keywords used for characterizing that the log of the log generation source has an abnormality;
a keyword matching unit 703, configured to match the log with each keyword in the target keyword set;
an exception recovery unit 704, configured to, if at least one keyword matching the log exists in the target keyword set, reconstruct and start an alternative component of the target component or an alternative operating environment of the component operating environment in the service system based on a log generation source of the log, and end the operation of the target component or the component operating environment in the data processing device corresponding to the log.
In one possible implementation, the exception recovery unit may include:
a first exception handling unit, configured to, in the case that at least one keyword matching the log exists in the target keyword set, if the log generation source indicates that the log was generated by a target component, start and run a backup component of the target component in the data processing apparatus, and end running of the target component in the data processing apparatus;
and the second exception handling unit is used for starting a standby device corresponding to the data processing device and ending the operation of the data processing device if the log generation source represents that the log is a log related to the component operation environment under the condition that at least one keyword matched with the log exists in the target keyword set, wherein the standby device has the same operation environment as the data processing device.
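The two recovery paths above can be sketched as a simple dispatch on the kind of log generation source. This is a hedged illustration: `SourceKind`, `plan_recovery` and the action names are hypothetical, and the function only returns a plan of actions rather than starting or stopping anything.

```python
from enum import Enum
from typing import List, Optional, Tuple


class SourceKind(Enum):
    COMPONENT = "component"      # the log was generated by a target component
    ENVIRONMENT = "environment"  # the log relates to the component operating environment


def plan_recovery(kind: SourceKind, device: str,
                  component: Optional[str] = None) -> List[Tuple[str, ...]]:
    """Return the ordered recovery actions for a log whose keywords matched."""
    if kind is SourceKind.COMPONENT:
        # First unit: run a backup component on the same device, then end the original.
        return [
            ("start_backup_component", device, str(component)),
            ("stop_component", device, str(component)),
        ]
    # Second unit: bring up a standby device with the same environment, then end the original.
    return [
        ("start_standby_device", device),
        ("stop_device", device),
    ]
```

In both branches the replacement is started before the original is ended, following the order in which the units above describe the two operations.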
In yet another possible implementation manner, each keyword in the set of target keywords determined by the set determining unit has a risk level;
the exception recovery unit includes:
a risk accumulation subunit, configured to add one to the risk count corresponding to the log generation source of the log if at least one keyword matching the log exists in the target keyword set and the risk level of the at least one keyword belongs to a set low risk level;
an exception recovery subunit, configured to, if the risk count corresponding to the log generation source of the log reaches a set count, reconstruct and start the alternative component of the target component or the alternative operating environment of the component operating environment in the service system, and end the operation of the target component or the component operating environment in the data processing device corresponding to the log.
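The risk accumulation described above can be sketched as a per-source counter. This is a minimal sketch with several assumptions that the text above does not state: the class name `RiskTracker`, the label `"low"`, immediate recovery for any level outside the low risk level, and resetting the counter once recovery has been triggered are all hypothetical.

```python
from collections import Counter
from typing import List

LOW_RISK_LEVELS = {"low"}  # hypothetical label for the "set low risk level"


class RiskTracker:
    """Counts low-risk keyword hits per log generation source."""

    def __init__(self, set_count: int) -> None:
        self.set_count = set_count
        self.risk_counts: Counter = Counter()

    def should_recover(self, source: str, matched_levels: List[str]) -> bool:
        """Return True when recovery should be triggered for this source."""
        if not matched_levels:
            return False  # no keyword matched the log
        if any(level not in LOW_RISK_LEVELS for level in matched_levels):
            # Assumption: a match above the low risk level triggers recovery at once.
            return True
        self.risk_counts[source] += 1  # add one to the risk count of this source
        if self.risk_counts[source] >= self.set_count:
            # Assumption: counting starts afresh after recovery is triggered.
            self.risk_counts[source] = 0
            return True
        return False
```

With `set_count` equal to 3, the first two low-risk logs from a source only raise its count, and the third triggers recovery.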
In another possible implementation manner, the log obtaining unit includes:
a queue obtaining unit configured to obtain a log queue to be processed, the log queue including a plurality of logs from the plurality of data processing apparatuses;
the device also includes:
a thread allocation unit, configured to, before the target keyword set matching the log generation source is determined from the plurality of keyword sets, allocate a set number of logs in the log queue to a set number of log processing threads according to the order of the logs in the log queue, so that the set number of log processing threads process the set number of logs in parallel.
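One way to picture the thread allocation is to take a batch from the head of the queue and hand each log to its own worker thread. The sketch below assumes Python's standard `ThreadPoolExecutor` as the thread mechanism; the function name `process_batch` and the choice of one thread per log in the batch are assumptions made for illustration.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Deque, List


def process_batch(log_queue: Deque[str], batch_size: int,
                  process: Callable[[str], str]) -> List[str]:
    """Take up to batch_size logs from the head of the queue and process them in parallel."""
    # Logs are taken in queue order, oldest first.
    count = min(batch_size, len(log_queue))
    batch = [log_queue.popleft() for _ in range(count)]
    if not batch:
        return []
    # One worker thread per log in the batch.
    with ThreadPoolExecutor(max_workers=len(batch)) as pool:
        # map() returns results in the same order as the submitted logs.
        return list(pool.map(process, batch))
```

Logs that do not fit in the current batch stay in the queue and are picked up by the next call.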
In yet another aspect, the present application further provides a computer device, which may be the aforementioned log processing device or data processing device.
Fig. 8 is a schematic diagram illustrating an architecture of a computer device provided in the present application. In fig. 8, the computer device 800 may include: a processor 801 and a memory 802.
Optionally, the computer device may further include: a communication interface 803, an input unit 804, a display 805, and a communication bus 806.
The processor 801, the memory 802, the communication interface 803, the input unit 804 and the display 805 all communicate with each other via a communication bus 806.
In this embodiment, the processor 801 may be a central processing unit, an application specific integrated circuit, or the like.
The memory has stored therein at least one instruction, at least one program, set of codes or set of instructions that is loaded and executed by the processor to implement the log exception handling method as mentioned in the above embodiments.
In one possible implementation, the memory 802 may include a program storage area and a data storage area, wherein the program storage area may store an operating system, the above-mentioned programs, and the like; the data storage area may store data created during use of the computer device.
The communication interface 803 may be an interface of a communication module.
The input unit 804 may include a touch sensing unit, a keyboard, and the like.
The display 805 includes a display panel, such as a touch display panel or the like.
Of course, the computer device structure shown in fig. 8 does not constitute a limitation of the computer device in the embodiment of the present application, and in practical applications, the computer device may include more or fewer components than those shown in fig. 8, or some components may be combined.
In another aspect, the present application further provides a computer-readable storage medium, in which at least one instruction, at least one program, a code set, or a set of instructions is stored, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the log exception handling method according to any one of the above embodiments.
The present application also proposes a computer program product or a computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instruction from the computer-readable storage medium, and executes the computer instruction, so that the computer device executes the methods provided in the various optional implementation manners in the aspect of the log exception handling method or the aspect of the log exception handling apparatus.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. Also, the features described in the embodiments of the present specification may be replaced or combined with each other to enable one skilled in the art to make or use the present application. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that it is obvious to those skilled in the art that various modifications and improvements can be made without departing from the principle of the present invention, and these modifications and improvements should also be considered as the protection scope of the present invention.

Claims (10)

1. A log exception handling method is applied to log processing equipment in a service system, the service system comprises a plurality of data processing equipment and at least one log processing equipment, at least one component runs in the data processing equipment, and the method comprises the following steps:
obtaining a log related to component operation in the data processing equipment, wherein the log comprises log metadata of the log, the log metadata comprises a log generation source of the log, and the log generation source represents a target component or a component operation environment to which the log belongs;
determining a target keyword set matched with the log generation source from a plurality of keyword sets, wherein the target keyword set comprises a plurality of keywords used for representing that the log of the log generation source has abnormity;
matching the log with each keyword in a target keyword set;
if at least one keyword matched with the log exists in the target keyword set, reconstructing and starting a substitute component of the target component or a substitute operation environment of the component operation environment in a service system based on a log generation source of the log, and ending the operation of the target component or the component operation environment in the data processing equipment corresponding to the log.
2. The method according to claim 1, wherein if at least one keyword matching the log exists in the target keyword set, reconstructing and starting an alternative component of the target component or an alternative operating environment of the component operating environment in a service system based on a log generation source of the log, and ending the operation of the target component or the component operating environment in a data processing device corresponding to the log, comprises:
if at least one keyword matched with the log exists in the target keyword set and the log generation source represents that the log is a log generated by a target component, starting and operating a backup component of the target component in the data processing equipment, and ending the operation of the target component in the data processing equipment;
and if at least one keyword matched with the log exists in the target keyword set and the log generation source represents that the log is a log related to the component operating environment, starting standby equipment corresponding to the data processing equipment and ending the operation of the data processing equipment, wherein the standby equipment has the same operating environment as the data processing equipment.
3. The method of claim 1, wherein each keyword in the set of target keywords has a risk level;
the reconstructing and starting, if at least one keyword matched with the log exists in the target keyword set, of a substitute component of the target component or a substitute operation environment of the component operation environment in a service system based on a log generation source of the log, and the ending of the operation of the target component or the component operation environment in the data processing equipment corresponding to the log, comprises:
if at least one keyword matched with the log exists in the target keyword set and the risk level of the at least one keyword belongs to a set low risk level, adding one to the risk count corresponding to the log generation source of the log;
if the risk count corresponding to the log generation source of the log reaches a set count, reconstructing and starting the alternative component of the target component or the alternative operating environment of the component operating environment in the service system, and ending the operation of the target component or the component operating environment in the data processing equipment corresponding to the log.
4. The method of claim 1, wherein obtaining a log related to the operation of a component in the data processing device comprises:
obtaining a log queue to be processed, wherein the log queue comprises a plurality of logs from the plurality of data processing devices;
before determining a target keyword set matching the log generation source from the plurality of keyword sets, the method further comprises:
and distributing a set number of logs in the log queue to a set number of log processing threads according to the sequence of the logs in the log queue, so that the set number of log processing threads process the set number of logs in parallel.
5. A log exception handling method is applied to a service system, the service system comprises a plurality of data processing devices and at least one log processing device, at least one component runs in the data processing device, and the method comprises the following steps:
the data processing equipment obtains a log related to component operation on the data processing equipment, and adds log metadata of the log to the log, wherein the log metadata of the log comprises a log generation source of the log, and the log generation source represents a target component or a component operation environment to which the log belongs;
the data processing equipment sends the log added with the log metadata to the log processing equipment;
the log processing equipment determines a target keyword set matched with the log generation source from a plurality of keyword sets, wherein the target keyword set comprises a plurality of keywords used for representing that the log of the log generation source has abnormity;
the log processing equipment matches the log with each keyword in a target keyword set;
and under the condition that at least one keyword matched with the log exists in the target keyword set, reconstructing and starting a substitute component of the target component or a substitute operation environment of the component operation environment in a service system based on a log generation source of the log by the log processing equipment, and finishing the operation of the target component or the component operation environment in the data processing equipment corresponding to the log.
6. The method of claim 5, wherein the data processing device obtains a log related to the running of the component on the data processing device, and adds log metadata of the log to the log, and the method comprises:
the data processing equipment determines a position point of completed log collection in an operation log file, wherein the operation log file at least comprises a log of operation errors in the operation process of the component;
according to the position point, acquiring a log which is not collected from the running log file, and adding the log metadata to the log, wherein the log metadata also comprises an offset of the log in the running log file;
after the data processing apparatus sends the log added with the log metadata to the log processing apparatus, the method further includes:
and the data processing equipment records the offset of the log in the running log file as a position point of the running log file where log collection is completed.
7. The method of claim 6, wherein the position point of completed log collection in the running log file is saved in a position point record file, and the position point of completed log collection is the offset of the last log for which log collection has been completed,
the method further comprises the following steps:
the data processing equipment detects the offset difference between the offset of the last log and the size of the running log file, and if the offset difference is smaller than a set value, determines that log collection is abnormal and sends alarm information of the log collection abnormality to the log processing equipment;
and when detecting that the position point of completed log collection recorded in the position point record file has not been updated for longer than a set duration, the data processing equipment sends a position point update abnormality alarm to the log processing equipment.
8. A log exception handling device is applied to a log processing device in a service system, wherein the service system comprises a plurality of data processing devices and at least one log processing device, at least one component runs in the data processing device, and the log exception handling device comprises:
a log obtaining unit, configured to obtain a log related to component operation in the data processing apparatus, where the log includes log metadata of the log, the log metadata includes a log generation source of the log, and the log generation source represents a target component or a component operation environment to which the log belongs;
a set determining unit, configured to determine a target keyword set matched with the log generation source from a plurality of keyword sets, wherein the target keyword set comprises a plurality of keywords used for representing that the log of the log generation source has abnormity;
a keyword matching unit, configured to match the log with each keyword in the target keyword set;
and an exception recovery unit, configured to, if at least one keyword matched with the log exists in the target keyword set, reconstruct and start a substitute component of the target component or a substitute operation environment of the component operation environment in a service system based on a log generation source of the log, and end the operation of the target component or the component operation environment in the data processing equipment corresponding to the log.
9. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the log exception handling method of any one of claims 1 to 4 or any one of claims 5 to 7.
10. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the log exception handling method of any one of claims 1 to 4 or any one of claims 5 to 7.
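
For illustration, the position-point based collection recited in claim 6 can be sketched as follows. This is a hedged sketch, not the claimed implementation: the function name `collect_new_logs`, the record layout, one log per line, and taking the offset of a log to be the byte position just after that log are all assumptions, and `send` is a placeholder for transmission to the log processing equipment.

```python
import os
from typing import Callable, Dict


def collect_new_logs(log_path: str, checkpoint_path: str, source: str,
                     send: Callable[[Dict], None]) -> int:
    """Send logs not yet collected and record the new position point after each one."""
    # Determine the position point of completed log collection (0 if none recorded yet).
    position = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            position = int(f.read().strip() or 0)

    with open(log_path, "rb") as f:
        f.seek(position)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a line that is still being written
            offset = f.tell()  # assumption: offset = byte position just after this log
            # Add log metadata (generation source and offset) to the log.
            send({"source": source, "offset": offset,
                  "text": line.decode("utf-8").rstrip("\r\n")})
            # After sending, record the offset as the new position point.
            with open(checkpoint_path, "w") as cp:
                cp.write(str(offset))
            position = offset
    return position
```

Because the position point is written only after a log has been sent, a restart resumes from the last log that was actually delivered rather than from the beginning of the running log file.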
CN202011395195.6A 2020-12-03 2020-12-03 Log exception handling method, device, equipment and storage medium Pending CN114595127A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011395195.6A CN114595127A (en) 2020-12-03 2020-12-03 Log exception handling method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011395195.6A CN114595127A (en) 2020-12-03 2020-12-03 Log exception handling method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114595127A true CN114595127A (en) 2022-06-07

Family

ID=81802949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011395195.6A Pending CN114595127A (en) 2020-12-03 2020-12-03 Log exception handling method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114595127A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115034220A (en) * 2022-08-12 2022-09-09 苏州浪潮智能科技有限公司 Abnormal log detection method and device, electronic equipment and storage medium
CN115034220B (en) * 2022-08-12 2023-01-10 苏州浪潮智能科技有限公司 Abnormal log detection method and device, electronic equipment and storage medium
CN115658441A (en) * 2022-12-13 2023-01-31 济南丽阳神州智能科技有限公司 Method, equipment and medium for monitoring abnormality of household service system based on log
CN115658441B (en) * 2022-12-13 2023-03-10 济南丽阳神州智能科技有限公司 Method, equipment and medium for monitoring abnormality of household service system based on log

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination