CN110990350B - Log analysis method and device - Google Patents

Publication number
CN110990350B
Authority
CN
China
Prior art keywords
regular expression
field
identifier
requirement
regular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911190459.1A
Other languages
Chinese (zh)
Other versions
CN110990350A (en)
Inventor
韩佩利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taikang Insurance Group Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taikang Insurance Group Co Ltd
Priority to CN201911190459.1A
Publication of CN110990350A
Application granted
Publication of CN110990350B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a log analysis method and device. The method comprises: obtaining a current field requirement of a target application service, wherein the current field requirement comprises at least one target field identifier and the order of the at least one target field identifier; searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring a field identifier to be updated and an updating operation for the stored regular expression; updating the stored regular expression according to the field identifier to be updated and the updating operation, to generate a regular expression meeting the current field requirement; and parsing the log of the target application service using the regular expression meeting the current field requirement, to obtain the parsed log content. Compared with the prior art, the method avoids manual modification of the regular expression and improves both the efficiency and the accuracy of modifying it.

Description

Log analysis method and device
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and an apparatus for analyzing logs.
Background
Currently, a company typically stores the log information of the multiple application services it uses, such as applications (APPs), in a log system that applies the same log template to all of them. The log information generated by an APP needs to be ingested into a log analysis platform, and once it has been ingested, each application service expects certain fields to be parsed out by data processing (ETL) into a distributed full-text search system.
The approach adopted at present is to install the Logstash log collection tool on the terminal device on which an APP runs, collect the log information under a designated directory, parse each log entry with a regular expression configured in the Logstash configuration file to extract the target fields, and then write those fields into Elasticsearch for service viewing, statistics, and the like.
In this existing solution, many APPs have log access requirements. Although the log format of each APP is substantially the same, the fields that each application service wants parsed may differ. For example, APP1 may want the A field, the B field, and the C field parsed, while APP2 wants only the A field and the B field. In actual development, therefore, a regular expression has to be written manually for each APP. When the parsing requirement (or "field requirement") of an application service changes, for example when fields are added or removed, the regular expression also has to be modified manually.
Because the regular expressions used for log parsing are long, modifying them manually is tedious and inefficient. Manual modification is also error-prone, so the modified expression may be inaccurate and cause log parsing errors.
Disclosure of Invention
The embodiment of the application provides a log analysis method and device, which solve the problems existing in the prior art, avoid manual modification of regular expressions, and improve modification efficiency and accuracy of the regular expressions.
In a first aspect, a method for parsing a log is provided, where the method may include:
acquiring a current field requirement of the target application service, wherein the current field requirement comprises at least one target field identifier and an order of the at least one target field identifier;
searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring a field identifier to be updated and an updating operation aiming at the stored regular expression, wherein the regular expression library is used for storing a regular expression of the field identifier and a regular expression of a field identifier combination;
updating the stored regular expression according to the field identification to be updated and the updating operation to generate a regular expression meeting the current field requirement;
and analyzing the log of the target application service by adopting a regular expression meeting the current field requirement to acquire log analysis content.
In an alternative implementation, before obtaining the current field requirement of the target application service, the method further includes:
acquiring an initial field requirement of a target application service, wherein the initial field requirement comprises at least one field identifier and an order of the at least one field identifier;
searching a stored regular expression library, and acquiring a regular expression of each field identifier in the at least one field identifier;
and combining, by using an expression combination algorithm, the regular expressions of the at least one field identifier according to the order of the at least one field identifier, to generate a regular expression meeting the initial field requirement and a regular expression of the field identifier combination.
In an alternative implementation, combining, by using an expression combination algorithm, the regular expressions of the at least one field identifier according to the order of the at least one field identifier, to generate a regular expression meeting the initial field requirement and a regular expression of the field identifier combination, includes:
ordering the regular expressions of the at least one field identifier according to the order of the at least one field identifier;
and combining adjacent regular expressions among the ordered regular expressions by inserting a preset regular expression between them, so as to generate the regular expression meeting the initial field requirement and the regular expression of the field identifier combination, wherein the preset regular expression is a regular expression representing matching any character.
In an optional implementation, updating the stored regular expression according to the field identifier to be updated and the updating operation, to generate a regular expression meeting the current field requirement, includes:
acquiring the regular expression of the field identifier to be updated;
if the updating operation is an add operation, adding the regular expression of the field identifier to be updated to the stored regular expression to generate a regular expression meeting the current field requirement;
and if the updating operation is a delete operation, deleting the regular expression of the field identifier to be updated from the stored regular expression to generate a regular expression meeting the current field requirement.
In an optional implementation, adding the regular expression of the field identifier to be updated to the stored regular expression to generate a regular expression meeting the current field requirement includes:
searching the regular expression library according to the order of the at least one target field identifier, and acquiring the regular expression of the field identifier, or the regular expression of the field identifier combination, corresponding to the stored regular expression;
and combining, by using an expression combination algorithm, the regular expression of the field identifier to be updated with the acquired regular expression of the field identifier or of the field identifier combination, to generate a regular expression meeting the current field requirement.
In an alternative implementation, field identifications and corresponding regular expressions in the regular expression library are stored in the form of key-value pairs.
In an alternative implementation, the regular expression library includes a generic regular expression library and a custom regular expression library;
before obtaining the current field requirement of the target application service, the method further comprises:
acquiring an input field identification and a corresponding regular expression, and a field identification combination and a corresponding regular expression;
counting the number of application services using the input field identification or field identification combination;
if the counted number of application services is not smaller than a preset number threshold, adding the input field identifier or field identifier combination and the corresponding regular expression into the generic regular expression library;
and if the counted number of application services is smaller than the preset number threshold, adding the input field identifier or field identifier combination and the corresponding regular expression into the custom regular expression library.
In an alternative implementation, the method further comprises:
counting the number of application services using each field identifier or field identifier combination in the customized regular expression library;
and if the number of application services counted in real time is not smaller than the preset number threshold, removing the counted field identifier or field identifier combination and the corresponding regular expression from the custom regular expression library, and adding them into the generic regular expression library.
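The promotion rule above can be sketched as follows. This is only a minimal illustration, assuming both libraries are in-memory dictionaries and that the usage counts are supplied by the caller; the names `promote_entries`, `usage_counts`, and `threshold` are hypothetical and do not come from the application.

```python
from typing import Dict, List


def promote_entries(
    usage_counts: Dict[str, int],
    threshold: int,
    generic: Dict[str, str],
    custom: Dict[str, str],
) -> List[str]:
    """Move entries whose usage count reaches the threshold from custom to generic.

    Keys are field identifiers (or field identifier combinations); values are
    the corresponding regular expressions. Returns the keys that were moved.
    """
    # Collect the keys first so the dictionary is not modified while iterating.
    to_promote = [key for key in custom if usage_counts.get(key, 0) >= threshold]
    for key in to_promote:
        generic[key] = custom.pop(key)
    return to_promote
```

"Not smaller than the threshold" is expressed as `>=`, so an entry used by exactly `threshold` application services is promoted.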
In a second aspect, a log parsing apparatus is provided, which may include: an acquisition unit, a generation unit and an analysis unit;
the acquiring unit is configured to acquire a current field requirement of the target application service, where the current field requirement includes at least one target field identifier and an order of the at least one target field identifier;
searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring field identification to be updated and updating operation of the stored regular expression, wherein the regular expression library is used for storing a regular expression of a field identification and a regular expression of a field identification combination;
The generating unit is used for updating the stored regular expression according to the field identification to be updated and the updating operation to generate a regular expression meeting the current field requirement;
the analysis unit is used for analyzing the log of the target application service by adopting the regular expression meeting the current field requirement to acquire log analysis content.
In an optional implementation, the obtaining unit is further configured to obtain an initial field requirement of the target application service, where the initial field requirement includes at least one field identifier and an order of the at least one field identifier;
searching a stored regular expression library, and acquiring a regular expression of each field identifier in the at least one field identifier;
the generating unit is further configured to combine the regular expressions of each field identifier in the at least one field identifier according to the sequence of the at least one field identifier by using an expression combining algorithm, so as to generate a regular expression meeting the requirement of the initial field and a regular expression of the field identifier combination.
In an optional implementation, the generating unit is specifically configured to sort the regular expressions of the at least one field identifier according to the order of the at least one field identifier;
and to combine adjacent regular expressions among the sorted regular expressions by inserting a preset regular expression between them, so as to generate the regular expression meeting the initial field requirement and the regular expression of the field identifier combination, wherein the preset regular expression is a regular expression representing matching any character.
In an optional implementation, the generating unit is further configured to obtain the regular expression of the field identifier to be updated;
if the updating operation is an add operation, to add the regular expression of the field identifier to be updated to the stored regular expression to generate a regular expression meeting the current field requirement;
and if the updating operation is a delete operation, to delete the regular expression of the field identifier to be updated from the stored regular expression to generate a regular expression meeting the current field requirement.
In an optional implementation, the generating unit is further specifically configured to search the regular expression library according to the order of the at least one target field identifier, and obtain the regular expression of the field identifier, or the regular expression of the field identifier combination, corresponding to the stored regular expression;
and to combine, by using an expression combination algorithm, the regular expression of the field identifier to be updated with the acquired regular expression of the field identifier or of the field identifier combination, to generate a regular expression meeting the current field requirement.
In an alternative implementation, field identifications and corresponding regular expressions in the regular expression library are stored in the form of key-value pairs.
In an alternative implementation, the regular expression library includes a generic regular expression library and a custom regular expression library; the device also comprises a statistics unit and an adding unit;
the acquisition unit is also used for acquiring the input field identification and the corresponding regular expression, and the field identification combination and the corresponding regular expression;
the statistics unit is used for counting the number of application services using the input field identification or field identification combination;
the adding unit is used for adding the input field identifier or field identifier combination and the corresponding regular expression into the generic regular expression library if the counted number of application services is not smaller than a preset number threshold;
and for adding the input field identifier or field identifier combination and the corresponding regular expression into the custom regular expression library if the counted number of application services is smaller than the preset number threshold.
In an optional implementation, the statistics unit is further configured to count the number of application services using each field identifier or field identifier combination in the custom regular expression library;
the adding unit is further configured to remove the counted field identifier or field identifier combination and the corresponding regular expression from the custom regular expression library and add them to the generic regular expression library if the number counted in real time is not less than the preset number threshold.
In a third aspect, an electronic device is provided, the electronic device comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory are in communication with each other via the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any one of the above first aspects when executing a program stored on a memory.
In a fourth aspect, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the method steps of any of the first aspects.
In the embodiments of the application, the current field requirement of the target application service is acquired, wherein the current field requirement comprises at least one target field identifier and the order of the at least one target field identifier; a stored regular expression library is searched according to the stored regular expression of the target application service and the current field requirement, and the field identifier to be updated and the updating operation for the stored regular expression are acquired, wherein the regular expression library stores regular expressions of field identifiers and regular expressions of field identifier combinations; the stored regular expression is updated according to the field identifier to be updated and the updating operation, to generate a regular expression meeting the current field requirement; and the log of the target application service is parsed using the regular expression meeting the current field requirement, to obtain the parsed log content. Compared with the prior art, the method avoids manual modification of the regular expression and improves both the efficiency and the accuracy of modifying it.
Drawings
Fig. 1 is a system architecture diagram to which a log parsing method according to an embodiment of the present invention is applied;
Fig. 2 is a flow chart of a log parsing method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a log analyzing device according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, and not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
The method for analyzing the log provided by the embodiment of the invention can be applied to a system shown in fig. 1, and the system can comprise a terminal and a server.
The server may be an application server or a cloud server; the terminal may be a Mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a Personal Digital Assistant (PDA), a User Equipment (UE) such as a tablet computer (PAD), a computing device or other processing device connected to a wireless modem, a Mobile Station (MS), etc.
The terminal may include a series of application services, such as a series of different APPs, with the APP generated log being uploaded by the terminal to a server. The log format of each APP may include at least one field, and a series of logs of different APPs are stored in a log system of the server having the same log template.
Because of the same log template, the log formats of different APPs are similar, and have many identical fields, and also have some unique fields, for example, the log format of APP1 comprises field 1, field 2 and field 3; the log format of APP2 includes field 1, field 3.
According to the embodiment of the invention, when the field requirement of an application service changes, for example when fields are added or removed, the requirement is analyzed and split, the field identifiers meeting the field requirement of the application service are extracted, the stored field identifiers are traversed and matched to their regular expressions, and a regular expression meeting the field requirement is finally output. This regular expression can be loaded directly into the Logstash configuration file, so the regular expression in the configuration file no longer has to be modified by hand, which improves both the efficiency and the accuracy of modifying it.
The preferred embodiments of the present application will be described below with reference to the accompanying drawings of the specification, it being understood that the preferred embodiments described herein are for illustration and explanation only, and are not intended to limit the present invention, and the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
Fig. 2 is a flow chart of a log parsing method according to an embodiment of the present invention. As shown in fig. 2, the method may include:
Step 210, obtaining the current field requirement of the target application service.
Prior to performing this step, a regular expression of the target application service may be obtained, including:
Firstly, acquiring an initial field requirement of the target application service, wherein the initial field requirement can comprise at least one field identifier and the order of the at least one field identifier;
The initial field requirement is the initial parsing requirement of the target application service, namely the log content that the target application service initially wants to obtain from the log. For example, the initial field requirement may include at least one of the field identifiers application service name (APPName), user IP address (usersip), device identifier (DeviceId), employee identifier (StaffNumber), and the like, together with the sequence number giving the position of each field identifier.
And secondly, searching a stored regular expression library, and acquiring a regular expression of each field identifier in at least one field identifier.
The regular expression library is used for storing the regular expressions of field identifiers and the regular expressions of field identifier combinations. Each field identifier and its regular expression are stored as a key-value pair, in which the key is the field identifier and the value is the corresponding regular expression. Depending on the needs of the actual project, the regular expression library may be stored in a MySQL database, a Redis database, or a configuration file.
Optionally, if there is a field identifier that fails to match, i.e., there is no field identifier in the regular expression library that matches the field identifier, notification information is generated to notify a technician to update the regular expression library.
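The key-value lookup and the handling of unmatched field identifiers can be sketched as follows. This is a minimal illustration that assumes the library is an in-memory dictionary rather than a database or configuration file; the sample patterns and the names `REGEX_LIBRARY` and `lookup_fields` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical library contents: key = field identifier, value = regular expression.
REGEX_LIBRARY: Dict[str, str] = {
    "APPName": r"(?P<APPName>\w+)",
    "DeviceId": r"(?P<DeviceId>[0-9a-f]+)",
    "StaffNumber": r"(?P<StaffNumber>\d+)",
}


def lookup_fields(
    field_ids: List[str], library: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return the expressions found for the requested fields and the fields with no match."""
    found = {f: library[f] for f in field_ids if f in library}
    missing = [f for f in field_ids if f not in library]
    return found, missing
```

A caller would generate the notification for a technician whenever the returned `missing` list is not empty.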
Further, an expression combination algorithm is adopted, regular expressions of each field identifier in at least one field identifier are combined according to the sequence of the at least one field identifier, and a regular expression meeting the requirement of an initial field and a regular expression of the field identifier combination are generated.
Specifically, the regular expressions of at least one field identifier are ordered according to the sequence of the at least one field identifier;
and combining adjacent regular expressions in the ordered regular expressions by adding a preset regular expression, so as to generate the regular expression meeting the requirement of an initial field and the regular expression combined by the field identification, wherein the preset regular expression is a regular expression for representing matching any character.
Examples of the preset regular expression are as follows: "\s" matches any whitespace character, including a space, a tab, a form feed, and the like; "\w" matches any word character, including the underscore, and is equivalent to "[A-Za-z0-9_]"; "\d" matches a digit character and is equivalent to "[0-9]".
Alternatively, adjacent regular expressions among the ordered regular expressions are concatenated directly to generate the regular expression meeting the initial field requirement and the regular expression of the field identifier combination.
The generated regular expression and the regular expression of the field identifier combination are then stored.
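The combination step described above can be sketched as follows. This is a minimal illustration, assuming that each per-field expression is a named group and that the preset expression placed between adjacent fields is `\s+`; both choices, and the name `combine`, are assumptions made for the example.

```python
import re
from typing import Dict, List


def combine(field_ids: List[str], library: Dict[str, str], preset: str = r"\s+") -> str:
    """Join per-field expressions in the given order, inserting the preset expression between neighbours.

    Passing preset="" gives the direct concatenation variant.
    """
    return preset.join(library[f] for f in field_ids)


# Hypothetical per-field expressions.
library = {
    "APPName": r"(?P<APPName>\w+)",
    "StaffNumber": r"(?P<StaffNumber>\d+)",
}
pattern = combine(["APPName", "StaffNumber"], library)
match = re.match(pattern, "billing 10023")
```

Here `pattern` is `(?P<APPName>\w+)\s+(?P<StaffNumber>\d+)`, and `match.groupdict()` yields the two fields keyed by their identifiers.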
Returning to step 210, the current field requirements of the target application service, i.e., the parsing requirements of the current target application service, are obtained, which may include at least one target field identification and an order of the at least one target field identification.
The current field requirement may be the same as or different from the initial field requirement. When it is different, the target application service has changed the content it wants parsed from the log.
Step 220, searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring the field identifier to be updated and the updating operation for the stored regular expression.
The stored regular expression of the target application service is looked up. If the current field requirement is only the second field requirement acquired for the target application service, the stored regular expression found is the initial regular expression.
The field identifiers corresponding to the stored regular expression are then looked up in the regular expression library.
The field identifiers corresponding to the stored regular expression are compared with the at least one target field identifier in the current field requirement to obtain the field identifier to be updated and the updating operation. That is, the comparison finds which field identifiers of the stored regular expression are redundant or missing relative to the at least one target field identifier. A redundant field identifier is a field identifier to be deleted, and the updating operation is a delete operation; a missing field identifier is a field identifier to be added, and the updating operation is an add operation.
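The comparison can be sketched as a simple difference between two ordered lists of field identifiers. This is a minimal illustration; the function name `diff_fields` and the operation labels `"add"` and `"delete"` are hypothetical.

```python
from typing import List, Tuple


def diff_fields(stored: List[str], target: List[str]) -> List[Tuple[str, str]]:
    """Return (field identifier, operation) pairs needed to turn `stored` into `target`."""
    # Identifiers present in the stored expression but no longer wanted are redundant.
    deletions = [(f, "delete") for f in stored if f not in target]
    # Identifiers wanted by the current requirement but absent from the stored expression are missing.
    additions = [(f, "add") for f in target if f not in stored]
    return deletions + additions
```

For example, `diff_fields(["A", "B", "C"], ["A", "B", "X", "C"])` reports that X must be added, and `diff_fields(["A", "B", "C"], ["A", "C"])` reports that B must be deleted.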
Step 230, updating the stored regular expression according to the field identifier to be updated and the updating operation, and generating the regular expression meeting the current field requirement.
The regular expression of the field identifier to be updated is acquired;
if the updating operation is a delete operation, the regular expression of the field identifier to be updated is deleted from the stored regular expression to generate a regular expression meeting the current field requirement;
if the updating operation is an add operation, the regular expression of the field identifier to be updated is added to the stored regular expression to generate a regular expression meeting the current field requirement.
The process for adding the regular expression may include:
searching the regular expression library according to the order of the at least one target field identifier, and acquiring the regular expression of the field identifier, or the regular expression of the field identifier combination, corresponding to the stored regular expression;
and combining, by using an expression combination algorithm, the regular expression of the field identifier to be updated with the acquired regular expression of the field identifier or of the field identifier combination, to generate the regular expression meeting the current field requirement.
It should be noted that, after the regular expression of the field identifier to be updated is deleted from the stored regular expression, the regular expressions remaining before and after the deleted one are still combined by using the expression combination algorithm.
For example, when the field identifiers corresponding to the stored regular expression are A, B, and C, and the field identifier to be added is X:
if the order of the at least one target field identifier is A, B, C, X, the regular expression of X is appended after the stored regular expression, or the regular expressions of A, B, C, and X are acquired separately and combined by using the expression combination algorithm;
if the order of the at least one target field identifier is A, B, X, C, the regular expression of the combination of A and B is acquired and the regular expressions of X and C are then added in sequence, or the regular expressions of A, B, X, and C are acquired separately and combined by using the expression combination algorithm.
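The A, B, C, X example can be sketched as follows. This is only an illustration under stated assumptions: the library is modeled as a dictionary keyed by tuples of field identifiers, the per-field patterns are invented, the preset expression is taken to be `.*`, and the helper `build_regex` (a hypothetical name) prefers the longest stored field identifier combination before falling back to single identifiers.

```python
from typing import Dict, List, Tuple

ANY = ".*"  # assumed form of the preset "match any character" expression

Library = Dict[Tuple[str, ...], str]


def build_regex(target: List[str], library: Library) -> str:
    """Assemble a regular expression for the target order from stored expressions."""
    parts: List[str] = []
    i = 0
    while i < len(target):
        # Prefer the longest stored combination that starts at position i.
        for j in range(len(target), i, -1):
            key = tuple(target[i:j])
            if key in library:
                parts.append(library[key])
                i = j
                break
        else:
            raise KeyError(f"no regular expression stored for {target[i]!r}")
    return ANY.join(parts)


# Hypothetical per-field patterns, for illustration only.
library: Library = {
    ("A",): r"(?P<A>\d{4}-\d{2}-\d{2})",
    ("B",): r"(?P<B>[A-Z]+)",
    ("C",): r"(?P<C>\d+)",
    ("X",): r"(?P<X>[a-z]+)",
}
# Stored combination of A and B, as produced when the initial expression was generated.
library[("A", "B")] = library[("A",)] + ANY + library[("B",)]

stored = build_regex(["A", "B", "C"], library)         # expression before the update
appended = build_regex(["A", "B", "C", "X"], library)  # X appended after the stored expression
inserted = build_regex(["A", "B", "X", "C"], library)  # A+B combination, then X, then C
deleted = build_regex(["A", "C"], library)             # B removed, A and C rejoined
```

In this sketch a deleting operation is handled by the same routine: the remaining expressions on either side of the removed one are rejoined with the preset expression.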
Step 240, parsing the log of the target application service by using the regular expression meeting the current field requirement, and obtaining log parsing content.
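As an illustration of the parsing step, a generated regular expression with one named group per field identifier could be applied to a log line as sketched below. The pattern, the field names `timestamp`, `level`, and `message`, and the sample log line are all assumptions; the lazy form `.*?` is used between fields here so that each capture starts at the first possible position.

```python
import re
from typing import Dict, Optional, Pattern

# Hypothetical combined expression for the fields timestamp, level, message.
PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r".*?(?P<level>[A-Z]+)"
    r".*?(?P<message>\S.*)"
)


def parse_log_line(line: str, pattern: Pattern[str]) -> Optional[Dict[str, str]]:
    """Return the captured fields of one log line, or None if the line does not match."""
    match = pattern.search(line)
    return match.groupdict() if match else None


parsed = parse_log_line("2019-11-28 10:15:00 INFO user login ok", PATTERN)
print(parsed)
```

Lines that do not match the expression yield `None`, which lets the caller decide whether to skip or report them.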
Further, for ease of management, the regular expression library may include a generic regular expression library (or "generic regular expression template library") and a custom regular expression library (or "custom regular expression template library").
The process of forming the regular expression library may include: acquiring field identifiers and corresponding regular expressions, and field identifier combinations and corresponding regular expressions, that are manually input by technicians; and counting, during use, the number of application services that use each input field identifier or field identifier combination (the usage count);
if the usage count is not less than a preset number threshold, adding the input field identifier or field identifier combination and the corresponding regular expression to the generic regular expression library;
if the usage count is less than the preset number threshold, adding the input field identifier or field identifier combination and the corresponding regular expression to the custom regular expression library.
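The threshold rule above can be sketched as follows. The function name `register_expression`, the tuple-based keys, and the in-memory dictionaries standing in for the two libraries are assumptions made for illustration.

```python
from typing import Dict, Tuple

Key = Tuple[str, ...]  # a single field identifier or a field identifier combination


def register_expression(
    key: Key,
    regex: str,
    usage_count: int,
    threshold: int,
    generic: Dict[Key, str],
    custom: Dict[Key, str],
) -> str:
    """Store the expression in the generic or the custom library and report which one."""
    if usage_count >= threshold:  # "not less than" the preset number threshold
        generic[key] = regex
        return "generic"
    custom[key] = regex
    return "custom"


generic_lib: Dict[Key, str] = {}
custom_lib: Dict[Key, str] = {}
first = register_expression(("timestamp",), r"\d{4}-\d{2}-\d{2}", 12, 10, generic_lib, custom_lib)
second = register_expression(("policy_no",), r"P\d{8}", 2, 10, generic_lib, custom_lib)
```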
For the generic regular expression library, because a large number of application services use its regular expressions, each modification can affect how all of those application services parse the related fields. The regular expressions in the generic regular expression library therefore need to be modified with great care, and stricter permissions need to be assigned for managing the generic regular expression library.
For the custom regular expression library, the number of application services using its regular expressions is small, so somewhat looser permissions can be assigned and a user can update the configuration more flexibly.
Further, in actual business, some fields are not used by every application service at first; only a few application services may use them initially, so the regular expressions corresponding to these fields are stored in the custom regular expression library. However, as the business evolves and field requirements change, fields that were initially used by only a few application services may come to be used by a large number of application services. Because the field identifiers of these fields have been stored in the custom regular expression library for a period of time and have worked stably, these field identifiers may be deleted from the custom regular expression library and added to the generic regular expression library.
Specifically, the number of application services using each field identifier or field identifier combination in the custom regular expression library (the usage count) can be counted in real time or periodically;
and if the counted usage count is not less than the preset number threshold, the counted field identifier or field identifier combination and the corresponding regular expression are removed from the custom regular expression library and added to the generic regular expression library.
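A minimal sketch of this migration, assuming both libraries are dictionaries and the usage counts have already been gathered into a mapping, is given below. The function name `promote_to_generic` and the sample entries are hypothetical.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, ...]


def promote_to_generic(
    usage_counts: Dict[Key, int],
    threshold: int,
    generic: Dict[Key, str],
    custom: Dict[Key, str],
) -> List[Key]:
    """Move every custom entry whose usage count reaches the threshold into the generic library."""
    promoted: List[Key] = []
    for key in list(custom):  # copy the keys because the dictionary is modified in the loop
        if usage_counts.get(key, 0) >= threshold:
            generic[key] = custom.pop(key)
            promoted.append(key)
    return promoted


generic_lib: Dict[Key, str] = {("timestamp",): r"\d{4}-\d{2}-\d{2}"}
custom_lib: Dict[Key, str] = {("policy_no",): r"P\d{8}", ("agent_id",): r"AG\d+"}
counts = {("policy_no",): 15, ("agent_id",): 3}
moved = promote_to_generic(counts, 10, generic_lib, custom_lib)
```

Running the same function on a schedule would correspond to periodic counting; calling it whenever a count changes would correspond to real-time counting.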
Optionally, in order to improve the matching efficiency of field identifiers, when the stored regular expression library is searched, the generic regular expression library can be searched first, and then the custom regular expression library.
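The search order can be sketched as a two-stage lookup. The function name `lookup_expression` and the dictionary representation of the two libraries are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, ...]


def lookup_expression(key: Key, generic: Dict[Key, str], custom: Dict[Key, str]) -> Optional[str]:
    """Search the generic library first and fall back to the custom library."""
    if key in generic:
        return generic[key]
    return custom.get(key)


generic_lib = {("timestamp",): r"\d{4}-\d{2}-\d{2}"}
custom_lib = {("policy_no",): r"P\d{8}"}
```

Because most application services use fields held in the generic library, searching it first means that most lookups end after the first stage.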
The method for analyzing the log provided by the embodiment of the invention acquires the current field requirement of the target application service, wherein the current field requirement comprises at least one target field identifier and the sequence of the at least one target field identifier; searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring field identification to be updated and updating operation of the stored regular expression, wherein the regular expression library is used for storing the regular expression of the field identification and the regular expression of the field identification combination; according to the field identification to be updated and the updating operation, updating the stored regular expression to generate a regular expression meeting the current field requirement; and analyzing the log of the target application service by adopting a regular expression meeting the current field requirement, and obtaining log analysis content. Compared with the prior art, the method avoids manual modification of the regular expression, and improves the modification efficiency and the accuracy of modifying the regular expression.
Corresponding to the above method, an embodiment of the invention further provides a log parsing apparatus. As shown in fig. 3, the log parsing apparatus comprises: an obtaining unit 310, a generating unit 320, and a parsing unit 330;
an obtaining unit 310, configured to obtain a current field requirement of the target application service, where the current field requirement includes at least one target field identifier and an order of the at least one target field identifier;
searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring field identification to be updated and updating operation of the stored regular expression, wherein the regular expression library is used for storing a regular expression of a field identification and a regular expression of a field identification combination;
a generating unit 320, configured to update the stored regular expression according to the field identifier to be updated and the update operation, and generate a regular expression that meets the current field requirement;
and the parsing unit 330 is configured to parse the log of the target application service by using a regular expression that meets the current field requirement, so as to obtain log parsing content.
In an optional implementation, the obtaining unit 310 is further configured to obtain an initial field requirement of the target application service, where the initial field requirement includes at least one field identifier and an order of the at least one field identifier;
searching a stored regular expression library, and acquiring a regular expression of each field identifier in the at least one field identifier;
the generating unit 320 is further configured to combine the regular expressions of each field identifier in the at least one field identifier according to the order of the at least one field identifier by using an expression combining algorithm, to generate a regular expression meeting the initial field requirement, and a regular expression of field identifier combination.
In an optional implementation, the generating unit 320 is specifically configured to sort the regular expressions of the at least one field identifier according to the order of the at least one field identifier;
and combining adjacent regular expressions in the ordered regular expressions by adding a preset regular expression, so as to generate a regular expression meeting the requirement of the initial field and a field identification combined regular expression, wherein the preset regular expression is a regular expression representing matching any character.
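The expression combination algorithm described above (ordering the per-field expressions and joining adjacent ones with a preset expression) can be sketched as follows. The function name `combine_expressions`, the sample patterns, and the choice of `.*` as the preset expression are assumptions; a lazy variant such as `.*?` could be passed as the separator instead.

```python
from typing import Dict, List


def combine_expressions(order: List[str], expressions: Dict[str, str], separator: str = ".*") -> str:
    """Sort the per-field expressions by the required order and join adjacent ones."""
    ordered = [expressions[field] for field in order]  # ordering step
    return separator.join(ordered)                     # joining step with the preset expression


# Hypothetical per-field expressions; dictionary order is irrelevant to the result.
expressions = {"status": r"\d{3}", "ip": r"\d+\.\d+\.\d+\.\d+"}
combined = combine_expressions(["ip", "status"], expressions)
print(combined)
```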
In an optional implementation, the generating unit 320 is further configured to obtain a regular expression of the field identifier to be updated;
if the update operation is an adding operation, adding the regular expression of the field identifier to be updated to the stored regular expression to generate a regular expression meeting the current field requirement;
and if the updating operation is a deleting operation, deleting the regular expression of the field identification to be updated from the stored regular expressions, and generating the regular expression meeting the current field requirement.
In an optional implementation, the generating unit 320 is further specifically configured to search the regular expression library according to the order of the at least one target field identifier, and obtain a regular expression of a field identifier or a regular expression of a field identifier combination corresponding to the stored regular expression;
and combining the regular expression of the field identification to be updated with the acquired regular expression of the field identification or the regular expression of the field identification combination by adopting an expression combination algorithm to generate a regular expression meeting the current field requirement.
In an alternative implementation, field identifications and corresponding regular expressions in the regular expression library are stored in the form of key-value pairs.
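One possible key-value layout is sketched below, assuming a flat mapping in which a field identifier combination is keyed by its identifiers joined with `+`. The separator, the helper name `combination_key`, and the use of JSON for persistence are illustrative assumptions and are not prescribed by the embodiment.

```python
import json
from typing import List


def combination_key(field_identifiers: List[str]) -> str:
    """Build the key under which a field identifier combination is stored."""
    return "+".join(field_identifiers)


# Keys are field identifiers (or combinations); values are the regular expressions.
library = {
    "timestamp": r"\d{4}-\d{2}-\d{2}",
    "level": r"[A-Z]+",
}
library[combination_key(["timestamp", "level"])] = library["timestamp"] + ".*" + library["level"]

serialized = json.dumps(library)   # persist the key-value pairs
restored = json.loads(serialized)  # load them back unchanged
```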
In an alternative implementation, the regular expression library includes a generic regular expression library and a custom regular expression library;
the apparatus further comprises a statistics unit 340 and an adding unit 350;
the obtaining unit 310 is further configured to obtain an input field identifier and a corresponding regular expression, and a field identifier combination and a corresponding regular expression;
a statistics unit 340, configured to count the number of application services that use the input field identifier or field identifier combination, as a usage count;
an adding unit 350, configured to add the input field identifier or field identifier combination and the corresponding regular expression to the generic regular expression library if the usage count is not less than a preset number threshold;
and to add the input field identifier or field identifier combination and the corresponding regular expression to the custom regular expression library if the usage count is less than the preset number threshold.
In an optional implementation, the statistics unit 340 is further configured to count the number of application services that use each field identifier or field identifier combination in the custom regular expression library;
the adding unit 350 is further configured to, if the counted usage count is not less than the preset number threshold, remove the counted field identifier or field identifier combination and the corresponding regular expression from the custom regular expression library and add them to the generic regular expression library.
The functions of each functional unit of the log analyzing device provided in the above embodiment of the present invention may be implemented by the above method steps, so that the specific working process and beneficial effects of each unit in the log analyzing device provided in the embodiment of the present invention are not repeated herein.
The embodiment of the present invention further provides an electronic device, as shown in fig. 4, including a processor 410, a communication interface 420, a memory 430, and a communication bus 440, where the processor 410, the communication interface 420, and the memory 430 complete communication with each other through the communication bus 440.
A memory 430 for storing a computer program;
the processor 410 is configured to execute the program stored in the memory 430, and implement the following steps:
acquiring a current field requirement of the target application service, wherein the current field requirement comprises at least one target field identifier and an order of the at least one target field identifier;
searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring a field identification to be updated and an updating operation of the stored regular expression, wherein the regular expression library is used for storing a regular expression of the field identification and a regular expression of the field identification combination;
Updating the stored regular expression according to the field identification to be updated and the updating operation to generate a regular expression meeting the current field requirement;
and analyzing the log of the target application service by adopting a regular expression meeting the current field requirement to acquire log analysis content.
In an alternative implementation, before obtaining the current field requirement of the target application service, the method further includes:
acquiring an initial field requirement of a target application service, wherein the initial field requirement comprises at least one field identifier and an order of the at least one field identifier;
searching a stored regular expression library, and acquiring a regular expression of each field identifier in the at least one field identifier;
and combining the regular expressions of each field identifier in the at least one field identifier according to the sequence of the at least one field identifier by adopting an expression combination algorithm to generate a regular expression meeting the initial field requirement and a field identifier combined regular expression.
In an alternative implementation, an expression combining algorithm is adopted, and a regular expression of each field identifier in the at least one field identifier is combined according to the sequence of the at least one field identifier, so as to generate a regular expression meeting the requirement of the initial field, and the regular expression of the field identifier combination includes:
Ordering the regular expressions of the at least one field identifier according to the order of the at least one field identifier;
and combining adjacent regular expressions in the ordered regular expressions by adding a preset regular expression, so as to generate a regular expression meeting the requirement of the initial field and a field identification combined regular expression, wherein the preset regular expression is a regular expression representing matching any character.
In an optional implementation, according to the field identifier to be updated and the updating operation, updating the stored regular expression to generate a regular expression meeting the current field requirement, including:
acquiring the regular expression of the field identification to be updated;
if the update operation is an adding operation, adding the regular expression of the field identifier to be updated to the stored regular expression to generate a regular expression meeting the current field requirement;
and if the updating operation is a deleting operation, deleting the regular expression of the field identification to be updated from the stored regular expressions, and generating the regular expression meeting the current field requirement.
In an optional implementation, adding the regular expression of the field identifier to be updated to the stored regular expression, generating a regular expression meeting the current field requirement includes:
searching the regular expression library according to the order of the at least one target field identifier, and acquiring the regular expression of a field identifier or the regular expression of a field identifier combination corresponding to the stored regular expression;
and combining the regular expression of the field identification to be updated with the acquired regular expression of the field identification or the regular expression of the field identification combination by adopting an expression combination algorithm to generate a regular expression meeting the current field requirement.
In an alternative implementation, field identifications and corresponding regular expressions in the regular expression library are stored in the form of key-value pairs.
In an alternative implementation, the regular expression library includes a generic regular expression library and a custom regular expression library;
before obtaining the current field requirement of the target application service, the method further comprises:
acquiring an input field identification and a corresponding regular expression, and a field identification combination and a corresponding regular expression;
counting the number of application services that use the input field identifier or field identifier combination, as a usage count;
if the usage count is not less than a preset number threshold, adding the input field identifier or field identifier combination and the corresponding regular expression to the generic regular expression library;
and if the usage count is less than the preset number threshold, adding the input field identifier or field identifier combination and the corresponding regular expression to the custom regular expression library.
In an alternative implementation, the method further comprises:
counting the number of application services using each field identifier or field identifier combination in the customized regular expression library;
and if the counted usage count is not less than the preset number threshold, removing the counted field identifier or field identifier combination and the corresponding regular expression from the custom regular expression library, and adding them to the generic regular expression library.
The communication bus mentioned above may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single bold line in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include a random access memory (Random Access Memory, RAM) or a non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), and the like; it may also be a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Since the implementation of each component of the electronic device in the foregoing embodiment, and the beneficial effects obtained, can be understood by referring to the steps of the embodiment shown in fig. 2, the specific working process and beneficial effects of the electronic device provided by the embodiment of the present invention are not repeated herein.
In yet another embodiment of the present invention, a computer readable storage medium is provided, where instructions are stored, which when executed on a computer, cause the computer to perform the method for parsing a log according to any of the above embodiments.
In yet another embodiment of the present invention, a computer program product containing instructions that, when run on a computer, cause the computer to perform the method of parsing a log as described in any of the above embodiments is also provided.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted to embrace the preferred embodiments and all such variations and modifications as fall within the scope of the embodiments herein.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present application without departing from the spirit and scope of the embodiments of the present application. Thus, if such modifications and variations of the embodiments in the present application fall within the scope of the claims and the equivalents thereof in the embodiments of the present application, such modifications and variations are also intended to be included in the embodiments of the present application.

Claims (9)

1. A method for parsing a log, the method comprising:
acquiring an initial field requirement of a target application service, wherein the initial field requirement comprises at least one field identifier and an order of the at least one field identifier;
searching a stored regular expression library, and acquiring a regular expression of each field identifier in the at least one field identifier;
combining the regular expressions of each field identifier in the at least one field identifier according to the order of the at least one field identifier by using an expression combination algorithm, to generate a regular expression meeting the initial field requirement and a regular expression of a field identifier combination; storing the regular expression and the regular expression of the field identifier combination;
Acquiring a current field requirement of the target application service, wherein the current field requirement comprises at least one target field identifier and an order of the at least one target field identifier;
searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring a field identifier to be updated and an updating operation aiming at the stored regular expression, wherein the regular expression library is used for storing a regular expression of the field identifier and a regular expression of a field identifier combination;
updating the stored regular expression according to the field identification to be updated and the updating operation to generate a regular expression meeting the current field requirement;
and analyzing the log of the target application service by adopting a regular expression meeting the current field requirement to acquire log analysis content.
2. The method of claim 1, wherein combining the regular expression of each of the at least one field identification in the order of the at least one field identification using an expression combining algorithm to generate a regular expression that meets the initial field requirement, and the field identification combined regular expression comprises:
Ordering the regular expressions of the at least one field identifier according to the order of the at least one field identifier;
and combining adjacent regular expressions in the ordered regular expressions by adding a preset regular expression, so as to generate a regular expression meeting the requirement of the initial field and a field identification combined regular expression, wherein the preset regular expression is a regular expression representing matching any character.
3. The method of claim 1, wherein,
updating the stored regular expression according to the field identifier to be updated and the updating operation to generate a regular expression meeting the current field requirement, wherein the updating operation comprises the following steps:
acquiring the regular expression of the field identification to be updated;
if the update operation is an adding operation, adding the regular expression of the field identifier to be updated to the stored regular expression to generate a regular expression meeting the current field requirement;
and if the updating operation is a deleting operation, deleting the regular expression of the field identification to be updated from the stored regular expressions, and generating the regular expression meeting the current field requirement.
4. The method of claim 3, wherein adding the regular expression of the field identification to be updated to the stored regular expression generates a regular expression that meets the current field requirement, comprising:
searching the regular expression library according to the order of the at least one target field identifier, and acquiring the regular expression of a field identifier or the regular expression of a field identifier combination corresponding to the stored regular expression;
and combining the regular expression of the field identification to be updated with the acquired regular expression of the field identification or the regular expression of the field identification combination by adopting an expression combination algorithm to generate a regular expression meeting the current field requirement.
5. The method of claim 1, wherein the regular expression library comprises a generic regular expression library and a custom regular expression library;
before obtaining the current field requirement of the target application service, the method further comprises:
acquiring an input field identification and a corresponding regular expression, and a field identification combination and a corresponding regular expression;
counting the number of application services using the input field identification or field identification combination;
if the counted number of application services is not less than a preset number threshold, adding the input field identifier or field identifier combination and the corresponding regular expression to the generic regular expression library;
and if the counted number of application services is less than the preset number threshold, adding the input field identifier or field identifier combination and the corresponding regular expression to the custom regular expression library.
6. The method of claim 5, wherein the method further comprises:
counting the number of application services using each field identifier or field identifier combination in the customized regular expression library;
and if the counted number of application services is not less than the preset number threshold, removing the counted field identifier or field identifier combination and the corresponding regular expression from the custom regular expression library, and adding them to the generic regular expression library.
7. A log parsing apparatus, the apparatus comprising: an acquisition unit, a generation unit and an analysis unit;
the acquiring unit is configured to acquire an initial field requirement of a target application service, where the initial field requirement includes at least one field identifier and an order of the at least one field identifier; search a stored regular expression library, and acquire a regular expression of each field identifier in the at least one field identifier; combine the regular expressions of each field identifier in the at least one field identifier according to the order of the at least one field identifier by using an expression combination algorithm, to generate a regular expression meeting the initial field requirement and a regular expression of a field identifier combination; and store the regular expression and the regular expression of the field identifier combination;
Acquiring a current field requirement of the target application service, wherein the current field requirement comprises at least one target field identifier and the sequence of the at least one target field identifier; searching a stored regular expression library according to the stored regular expression of the target application service and the current field requirement, and acquiring a field identifier to be updated and an updating operation aiming at the stored regular expression, wherein the regular expression library is used for storing a regular expression of the field identifier and a regular expression of a field identifier combination;
the generating unit is used for updating the stored regular expression according to the field identification to be updated and the updating operation to generate a regular expression meeting the current field requirement;
the analysis unit is used for analyzing the log of the target application service by adopting the regular expression meeting the current field requirement to acquire log analysis content.
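The division of work among the three units can be illustrated with a short sketch. Everything named here is hypothetical: the library contents, the whitespace separator, the named-group layout and the function names are assumptions for the example, and the claim does not specify the expression combination algorithm. For brevity, the update step lists the add and delete operations and then rebuilds the combined expression from the current order, rather than patching the stored string in place.

```python
import re
from typing import Dict, List, Optional

# Hypothetical per-field regular expression library.
REGEX_LIBRARY: Dict[str, str] = {
    "timestamp": r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}",
    "level": r"[A-Z]+",
    "thread": r"\[[^\]]+\]",
    "message": r".*",
}


def combine(field_ids: List[str], library: Dict[str, str], sep: str = r"\s+") -> str:
    """Join per-field expressions in the required order, one named group per field.

    Field identifiers must be valid Python identifiers to serve as group names.
    """
    parts = [f"(?P<{fid}>{library[fid]})" for fid in field_ids]
    return "^" + sep.join(parts) + "$"


def diff_requirement(stored: List[str], current: List[str]) -> Dict[str, List[str]]:
    """List the field identifiers to add and to delete relative to the stored requirement."""
    return {
        "add": [f for f in current if f not in stored],
        "delete": [f for f in stored if f not in current],
    }


def parse(line: str, pattern: str) -> Optional[Dict[str, str]]:
    """Parse one log line; return the captured fields, or None if it does not match."""
    match = re.match(pattern, line)
    return match.groupdict() if match else None


# Acquisition: initial requirement, then a changed requirement.
initial_fields = ["timestamp", "level", "message"]
current_fields = ["timestamp", "level", "thread", "message"]
operations = diff_requirement(initial_fields, current_fields)

# Generation: expression meeting the current field requirement.
current_pattern = combine(current_fields, REGEX_LIBRARY)

# Analysis: parse a log line of the target application service.
parsed = parse("2019-11-28 10:15:00 INFO [main] service started", current_pattern)
```

Named groups keep each captured value tied to its field identifier, so the parsed result can be read as a field-to-value mapping regardless of field order.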
8. An electronic device, characterized in that the electronic device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are in communication with each other through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any of claims 1-6 when executing the program stored in the memory.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-6.
CN201911190459.1A 2019-11-28 2019-11-28 Log analysis method and device Active CN110990350B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911190459.1A CN110990350B (en) 2019-11-28 2019-11-28 Log analysis method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911190459.1A CN110990350B (en) 2019-11-28 2019-11-28 Log analysis method and device

Publications (2)

Publication Number Publication Date
CN110990350A CN110990350A (en) 2020-04-10
CN110990350B CN110990350B (en) 2023-06-16

Family

ID=70087779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911190459.1A Active CN110990350B (en) 2019-11-28 2019-11-28 Log analysis method and device

Country Status (1)

Country Link
CN (1) CN110990350B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111966641B (en) * 2020-08-18 2022-12-06 国家工业信息安全发展研究中心 Universal log normalization model configuration method and device
CN112667672A (en) * 2021-01-06 2021-04-16 北京启明星辰信息安全技术有限公司 Log analysis method and analysis device
CN115543950B (en) * 2022-09-29 2023-06-16 杭州中电安科现代科技有限公司 Log-normalized data processing system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10161916A (en) * 1996-11-28 1998-06-19 Hitachi Ltd Detection of update conflict accompanying duplication of data base
WO2016161381A1 (en) * 2015-04-03 2016-10-06 Oracle International Corporation Method and system for implementing a log parser in a log analytics system
CN107590169A (en) * 2017-04-14 2018-01-16 南方科技大学 A kind of preprocess method and system of carrier gateway data
US10275449B1 (en) * 2018-02-19 2019-04-30 Sas Institute Inc. Identification and parsing of a log record in a merged log record stream

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105245394A (en) * 2014-07-07 2016-01-13 北京风行在线技术有限公司 Method and equipment for analyzing network access log based on layered approach
CN108156131B (en) * 2017-10-27 2020-08-04 上海观安信息技术股份有限公司 Webshell detection method, electronic device and computer storage medium
CN109783330B (en) * 2018-12-10 2023-04-07 京东科技控股股份有限公司 Log processing method, log display method, and related device and system
CN110321457A (en) * 2019-04-19 2019-10-11 杭州玳数科技有限公司 Access log resolution rules generation method and device, log analytic method and system
CN110175161B (en) * 2019-04-25 2023-11-14 平安科技(深圳)有限公司 Method, device, computer equipment and storage medium for recording log

Also Published As

Publication number Publication date
CN110990350A (en) 2020-04-10

Similar Documents

Publication Publication Date Title
CN110990350B (en) Log analysis method and device
CN112800095B (en) Data processing method, device, equipment and storage medium
CN110908997A (en) Data lineage construction method and device, server and readable storage medium
CN111221726A (en) Test data generation method and device, storage medium and intelligent equipment
CN109582289B (en) Method, system, storage medium and processor for processing rule flow in rule engine
US20170154123A1 (en) System and method for processing metadata to determine an object sequence
CN107797823B (en) Business rule management method and device, storage medium and computer equipment
CN114328566A (en) Relationship graph updating method, device, medium, equipment and generating method
CN107330031B (en) Data storage method and device and electronic equipment
CN106940710B (en) Information pushing method and device
CN112433757A (en) Method and device for determining interface calling relationship
CN111290961A (en) Interface test management method and device and terminal equipment
CN110020166B (en) Data analysis method and related equipment
CN115408034A (en) Vehicle-mounted controller upgrading method and device, electronic equipment and storage medium
CN112667631B (en) Automatic editing method, device, equipment and storage medium for business field
CN111736848B (en) Packet conflict positioning method, device, electronic equipment and readable storage medium
CN114817389A (en) Data processing method, data processing device, storage medium and electronic equipment
CN109298831B (en) Information storage method and device
US20200073928A1 (en) Method and Apparatus for Updating Information
CN111078671A (en) Method, device, equipment and medium for modifying data table field
CN112416401B (en) Data updating method, device and equipment
CN109905475B (en) Method for outputting cloud computing monitoring data in specified format based on SQL
CN114153830B (en) Data verification method and device, computer storage medium and electronic equipment
CN112733516B (en) Method, device, equipment and storage medium for processing quick message
CN114996364B (en) Classification and classification method and device for audit logs of PaaS cloud database and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant