Disclosure of Invention
In view of the above, the present invention is proposed to provide a log data processing method and apparatus that overcomes or at least partially solves the above problems.
In one aspect of the present invention, a method for processing log data is provided, where the method includes:
acquiring an identification code and parameter data of each piece of log information, wherein the log information comprises formatted data and parameter data, and the identification code is a unique log generation identification corresponding to each piece of log information in the original log data;
generating a log file from the identification code and the parameter data of each piece of log information and the correspondence between the identification code and the parameter data, and storing the log file;
acquiring, when log parsing is performed, a format file corresponding to the log file, wherein the format file comprises the identification code and the formatted data of each piece of log information in the original log data and the correspondence between the identification code and the formatted data;
and parsing the log file and the format file according to the identification code to obtain the corresponding log information.
Parsing the log file and the format file according to the identification code comprises:
combining, according to a predetermined rule, the parameter data and the formatted data that share the same identification code in the log file and the format file.
Before the obtaining of the identification code and the parameter data of each piece of log information, the method further includes:
reading the original log data, numbering each piece of log information in the original log data, and taking the corresponding number as an identification code of the log information;
parsing the original log data, extracting the formatted data that appears repeatedly in each piece of log information of the original log data, and generating the format file from the formatted data and the identification code of each piece of log information and the correspondence between the identification code and the formatted data.
Reading the original log data and numbering each piece of log information in the original log data comprises:
reading log information from the original log data according to a log printing interface keyword, and numbering the read log information in sequence.
After the log file and the format file are parsed according to the identification code, the method further comprises:
outputting the parsed log information to a specified file.
In another aspect of the present invention, an apparatus for processing log data is provided, the apparatus includes a log storage unit and a log parsing unit, the log storage unit includes an extraction module and a first generation module, and the log parsing unit includes an acquisition module and a parsing module:
the extraction module is used for extracting an identification code and parameter data of each piece of log information, the log information comprises formatted data and parameter data, and the identification code is a unique log generation identification corresponding to each piece of log information in the original log data;
the first generation module is used for generating a log file from the identification code and the parameter data of each piece of log information and the correspondence between the identification code and the parameter data, and storing the log file;
the acquisition module is used for acquiring a format file corresponding to the log file when log parsing is performed, wherein the format file comprises the identification code and the formatted data of each piece of log information in the original log data and the correspondence between the identification code and the formatted data;
and the parsing module is used for parsing the log file and the format file according to the identification code to obtain the corresponding log information.
The parsing module is specifically configured to combine, according to a predetermined rule, the parameter data and the formatted data that share the same identification code in the log file and the format file, to obtain the corresponding log information.
The device further comprises a log preprocessing unit, wherein the log preprocessing unit comprises an identification module and a second generation module:
the identification module is used for reading the original log data, numbering each piece of log information in the original log data, and taking the corresponding number as an identification code of the log information;
and the second generation module is used for parsing the original log data, extracting the formatted data that appears repeatedly in each piece of log information of the original log data, and generating the format file from the formatted data and the identification code of each piece of log information and the correspondence between the identification code and the formatted data.
Furthermore, the invention also provides a computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method as described above.
Furthermore, the present invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method as described above when executing the program.
In view of the problems in existing log data processing methods, the log data processing method and apparatus provided by the embodiments of the invention can greatly reduce the storage space occupied by log data, improve log writing performance, and reduce the risk of secret leakage caused by printing plaintext logs, without affecting later reading of the logs. The invention can thus store more detailed log data safely and efficiently without affecting the overall performance of the communication system.
The foregoing is only an overview of the technical solutions of the present invention. Embodiments of the present invention are described below so that the technical means of the invention can be understood more clearly and so that the above and other objects, features, and advantages of the invention become more apparent.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The invention aims to reduce the storage space occupied by log data, improve log writing performance, and reduce the risk of secret leakage caused by printing plaintext logs, without affecting later reading of the logs.
Fig. 1 schematically shows a flowchart of a processing method of log data according to an embodiment of the present invention. Referring to fig. 1, a method for processing log data according to an embodiment of the present invention specifically includes the following steps:
S11, acquiring an identification code and parameter data of each piece of log information, wherein the log information comprises formatted data and parameter data, and the identification code is a unique log generation identification corresponding to each piece of log information in the original log data. The identification code may be any value that serves to identify the log information, such as a number or an identifier.
The original log data in this embodiment refers to the system source code, which contains the individual pieces of log information; each piece of log information comprises formatted data and parameter data. The formatted data is the format string in the log information, and the parameter data is the data content of the variable parameters in the log information. In the embodiment of the invention, each piece of log information in the source code has a unique log generation identification code.
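The split between formatted data and parameter data can be illustrated with a minimal Python sketch. The class name `LogStatement` and its field names are hypothetical and chosen only for illustration; the identification code, format string, and parameter value are those of the worked example given later in this description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LogStatement:
    """One log call site (hypothetical model, not part of the original text)."""
    ident: int               # unique identification code
    fmt: str                 # formatted data: the constant format string
    params: Tuple[int, ...]  # parameter data: the variable values


stmt = LogStatement(ident=0x10000001, fmt="hello world,%d\n", params=(2017,))

# Only the identification code and the parameters change at run time;
# the format string is fixed at the call site and can be stored once.
stored_at_runtime = (stmt.ident, stmt.params)
stored_once = (stmt.ident, stmt.fmt)
```

Both tuples carry the identification code, which is what later allows the two halves to be matched up again.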
S12, generating a log file from the identification code and the parameter data of each piece of log information and the correspondence between the identification code and the parameter data, and storing the log file.
In this embodiment, when a log is printed while the system is running, only the identification code of the log and the data content of its variable parameters are stored; the log does not need to be formatted, and the format string is not stored. Finally, the log data is saved as a binary log file, such as log_data.
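One possible binary encoding of a single log entry is sketched below in Python. The record layout (little-endian 32-bit identification code, one-byte parameter count, then 32-bit signed integer parameters) and the function name `encode_record` are assumptions made for illustration; the description does not specify a layout.

```python
import struct


def encode_record(ident: int, params: list[int]) -> bytes:
    """Pack one entry as: uint32 id, uint8 parameter count, int32 parameters.

    The layout is an assumption for this sketch, not taken from the source.
    """
    header = struct.pack("<IB", ident, len(params))
    body = struct.pack(f"<{len(params)}i", *params)
    return header + body


# The example entry needs 4 + 1 + 4 bytes; no format string is stored.
record = encode_record(0x10000001, [2017])
```

Because no string formatting happens on this path, the write amounts to packing a few integers, which is consistent with the stated aim of cheaper log writes.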
S13, acquiring, when log parsing is performed, a format file corresponding to the log file, wherein the format file comprises the identification code and the formatted data of each piece of log information in the original log data and the correspondence between the identification code and the formatted data.
In this embodiment, the identification code and the formatted data of each piece of log information and the correspondence between them are generated into a format file in advance, and the format file is stored as a text file, such as log_description. When a log is parsed, the format file corresponding to the log file is acquired first, so that the log file can be parsed and restored according to the format file.
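A text format file of this kind could look as follows. This Python sketch assumes one entry per line, with the identification code in hexadecimal, a tab, and the format string JSON-escaped so that embedded newlines stay on one line; this line format and the function names are hypothetical, since the description only states that the file is a text file.

```python
import json


def dump_format_file(entries: dict[int, str]) -> str:
    """Serialize id -> format string, one tab-separated entry per line."""
    lines = [f"{ident:#010x}\t{json.dumps(fmt)}"
             for ident, fmt in sorted(entries.items())]
    return "\n".join(lines) + "\n"


def load_format_file(text: str) -> dict[int, str]:
    """Rebuild the id -> format string table from the text form."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        ident_text, fmt_json = line.split("\t", 1)
        table[int(ident_text, 16)] = json.loads(fmt_json)
    return table


text = dump_format_file({0x10000001: "hello world,%d\n"})
table = load_format_file(text)
```

Keeping the file as text means it can be inspected or diffed directly, while the log file itself stays binary.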
S14, parsing the log file and the format file according to the identification code to obtain the corresponding log information.
Parsing the log file and the format file according to the identification code is implemented as follows: the parameter data and the formatted data that share the same identification code in the log file and the format file are combined according to a predetermined rule.
In this embodiment, the log_descriptor.dat and log_data.dat files are processed so that the binary log file is restored to readable information.
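The combining step is essentially a join on the identification code. The Python sketch below assumes that the rule for combining is printf-style substitution of the parameters into the format string; the function name `combine` and the placeholder text emitted for an unknown identification code are illustrative choices, not behaviour described in the source.

```python
def combine(records, formats):
    """Join parameter data with the formatted data sharing the same ID."""
    restored = []
    for ident, params in records:
        fmt = formats.get(ident)
        if fmt is None:
            # Hypothetical fallback: keep the raw values visible rather than fail.
            restored.append(f"<unknown log id {ident:#x}> {params}\n")
        else:
            restored.append(fmt % tuple(params))
    return restored


formats = {0x10000001: "hello world,%d\n"}
records = [(0x10000001, [2017]), (0x10000002, [7])]
lines = combine(records, formats)
```

The second record has no entry in the format table, which is the situation that arises when the log file and the format file come from different builds.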
In view of the problems in existing log data processing methods, the log data processing method provided by the embodiment of the invention acquires, when a log is printed, the identification code and the parameter data of each piece of log information, generates a log file from the identification code and the parameter data of each piece of log information and the correspondence between them, and stores the log file. When the log is parsed, the format file corresponding to the log file is acquired, and the log file and the format file are parsed according to the identification code to obtain the corresponding log information.
The invention can thus store more detailed log data safely and efficiently without affecting the overall performance of the communication system.
In an optional embodiment of the present invention, after the log file and the format file are parsed according to the identification code in step S14, the method further comprises: outputting the parsed log information to a specified file.
According to the embodiment of the invention, the parsed log information is output to a specified file, such as log.txt. The content of log.txt is an ordinary, human-readable text log.
Fig. 2 schematically shows a flowchart of a log data processing method according to another embodiment of the present invention. Referring to fig. 2, the method for processing log data according to the embodiment of the present invention further includes, before acquiring the identification code and parameter data of each piece of log information in step S11, the following steps:
S01, reading the original log data, numbering each piece of log information in the original log data, and taking the corresponding number as the identification code of the log information. In this embodiment, reading the original log data and numbering each piece of log information in the original log data is implemented as follows: log information is read from the original log data according to a log printing interface keyword, and the read log information is numbered in sequence.
S10, parsing the original log data, extracting the formatted data that appears repeatedly in each piece of log information of the original log data, and generating the format file from the formatted data and the identification code of each piece of log information and the correspondence between the identification code and the formatted data.
The original log data in this embodiment is specifically the system source code. The embodiment of the invention scans the system source code and numbers each piece of log information in it sequentially, so that each piece of log information carries a unique number. The source code is then scanned again, and the number and the format string of each piece of log information are extracted and stored into a text file.
The following describes an implementation procedure of the technical solution of the present invention in a specific embodiment, and specifically refer to fig. 3.
Suppose the system source code contains a line of print code such as WRITE_LOG(0xffffff, "hello world,%d\n", 2017). The source code is first scanned, and each print statement is numbered in turn according to the log printing interface keyword WRITE_LOG, each number being unique within the system source code. For example, the print code above becomes WRITE_LOG(0x10000001, "hello world,%d\n", 2017), which is written back into the source code.
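The numbering pass can be sketched with a regular expression in Python. This is a simplified illustration: it assumes every call site starts with a hexadecimal placeholder as its first argument and that WRITE_LOG does not appear inside comments or string literals. The function name `number_log_calls`, the starting value, and the second sample call are assumptions; a production tool would more likely work on a parsed syntax tree.

```python
import re
from itertools import count

# Matches the keyword, the opening parenthesis, and the placeholder number.
CALL = re.compile(r"\bWRITE_LOG\(\s*0x[0-9a-fA-F]+")


def number_log_calls(source: str, first_id: int = 0x10000001) -> str:
    """Replace the first argument of each WRITE_LOG call with a unique number."""
    ids = count(first_id)
    return CALL.sub(lambda m: f"WRITE_LOG({next(ids):#010x}", source)


source = (
    'WRITE_LOG(0xffffff, "hello world,%d\\n", 2017);\n'
    'WRITE_LOG(0xffffff, "retry %d of %d\\n", n, limit);\n'
)
numbered = number_log_calls(source)
```

Numbers are assigned in scan order, so re-running the pass on changed source can renumber existing call sites; the format file then has to be regenerated from the same numbered source that is compiled.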
The source code is then scanned again, and the number and the format string of each print statement, such as 0x10000001 and "hello world,%d\n" in the example, are extracted according to the log printing interface keyword and stored into log_descriptor.
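Extraction of the number and format string from already-numbered source can be sketched in the same way. The Python example below assumes the format string is a single double-quoted literal directly following the number; `extract_formats` is a hypothetical name. Escape sequences are kept exactly as written in the source (a backslash followed by `n`), and decoding them is left to whatever later consumes the table.

```python
import re

# Group 1: the hexadecimal number. Group 2: the body of the string literal,
# allowing backslash escapes such as \" and \n inside it.
ENTRY = re.compile(r'\bWRITE_LOG\(\s*(0x[0-9a-fA-F]+)\s*,\s*"((?:[^"\\]|\\.)*)"')


def extract_formats(source: str) -> dict[int, str]:
    """Map each log number to the format string found at its call site."""
    return {int(m.group(1), 16): m.group(2) for m in ENTRY.finditer(source)}


numbered = 'WRITE_LOG(0x10000001, "hello world,%d\\n", 2017);\n'
formats = extract_formats(numbered)
```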
During system operation, other modules call the log storage unit interface WRITE_LOG to write logs.
The unique log number and the parameter data, such as 0x10000001 and 2017 in the example, are taken from the arguments passed to the WRITE_LOG interface.
The log number 0x10000001 and the parameter data 2017 are saved into a binary file log_data.
The log data is read from log_data.dat item by item to obtain the unique log number and the corresponding print parameter data (e.g., 0x10000001 and 2017).
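Item-by-item reading might be sketched as a generator over a binary stream. The Python example assumes a record layout of a little-endian 32-bit number, a one-byte parameter count, and 32-bit signed integer parameters; that layout, the function name `read_records`, and the second sample record are illustrative assumptions rather than anything the description specifies.

```python
import io
import struct


def read_records(stream):
    """Yield (log number, parameters) until the stream is exhausted."""
    while True:
        header = stream.read(5)
        if len(header) < 5:
            return  # clean end of file, or a truncated header
        ident, n = struct.unpack("<IB", header)
        body = stream.read(4 * n)
        if len(body) < 4 * n:
            return  # truncated record: stop rather than guess
        yield ident, list(struct.unpack(f"<{n}i", body))


data = (struct.pack("<IBi", 0x10000001, 1, 2017)
        + struct.pack("<IBii", 0x10000002, 2, 3, 5))
records = list(read_records(io.BytesIO(data)))
```

The same function works on a file opened in binary mode, since it only relies on `read`.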
The corresponding format string (e.g., "hello world,%d\n") is looked up in log_descriptor.
The format string "hello world,%d\n" and the corresponding parameter data 2017 are restored into the text content "hello world,2017".
The restored text content "hello world,2017" is output to a file, such as log.txt. The content of log.txt is the same as that of an ordinary text log.
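The read, look-up, restore, and output steps can be put together in a short Python round trip. The sketch assumes a record layout of a 32-bit number, a one-byte parameter count, and 32-bit integer parameters, and it passes the descriptor as an in-memory dictionary instead of reading it from a file; the function name `restore` is hypothetical. Temporary files stand in for log_data.dat and log.txt.

```python
import struct
import tempfile
from pathlib import Path


def restore(data_path, descriptor, out_path):
    """Decode each binary record and write the formatted text line."""
    raw = Path(data_path).read_bytes()
    offset = 0
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        while offset < len(raw):
            ident, n = struct.unpack_from("<IB", raw, offset)
            params = struct.unpack_from(f"<{n}i", raw, offset + 5)
            out.write(descriptor[ident] % params)
            offset += 5 + 4 * n


with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "log_data.dat"
    out_path = Path(tmp) / "log.txt"
    data_path.write_bytes(struct.pack("<IBi", 0x10000001, 1, 2017))
    restore(data_path, {0x10000001: "hello world,%d\n"}, out_path)
    restored = out_path.read_text(encoding="utf-8")
```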
Compared with the prior art, the log data processing method provided by the embodiment of the invention has the following beneficial effects. First, only the log numbers and the values of the variable parameters are saved, and the format strings, which occupy more space, are not saved, so the storage space occupied by log data is greatly reduced. Second, because the log does not need to be formatted, log writing performance is improved. Third, because only binary data such as the log number and the variable parameter values are saved, the data cannot be interpreted directly unless it is parsed by the log parsing system, which reduces the risk of secret leakage caused by printing plaintext logs.
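The storage effect can be checked by hand for the single example entry. The Python sketch below compares the size of the formatted text line with that of a binary record, assuming a record of a 32-bit number, a one-byte parameter count, and one 32-bit parameter; that layout is an assumption, and the resulting ratio applies only to this one entry, not to logs in general.

```python
import struct

fmt = "hello world,%d\n"

# Plaintext logging stores the fully formatted line for every entry.
text_entry = (fmt % 2017).encode("utf-8")

# The binary scheme stores only the number, a count, and the value.
binary_entry = struct.pack("<IBi", 0x10000001, 1, 2017)

text_size = len(text_entry)      # 12 + 4 + 1 = 17 bytes
binary_size = len(binary_entry)  # 4 + 1 + 4 = 9 bytes
saved_fraction = 1 - binary_size / text_size
```

The saving grows with the length of the format string and shrinks as the share of parameter data grows; an entry with a very short format string and many parameters may save little.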
For simplicity of explanation, the method embodiments are described as a series of acts or combinations, but those skilled in the art will appreciate that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently with other steps in accordance with the embodiments of the invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Fig. 4 schematically shows a schematic structural diagram of a log data processing apparatus according to an embodiment of the present invention. Referring to fig. 4, the apparatus for processing log data according to the embodiment of the present invention includes a log storage unit 10 and a log parsing unit 20.
In this embodiment, the log storage unit 10 includes an extraction module 101 and a first generation module 102, where: the extraction module 101 is configured to extract an identification code and parameter data of each piece of log information, where the log information includes formatted data and parameter data, and the identification code is a unique log generation identifier corresponding to each piece of log information in the original log data; the first generation module 102 is configured to generate a log file from the identification code and the parameter data of each piece of log information and the correspondence between the identification code and the parameter data, and to store the log file.
In this embodiment, the log parsing unit 20 includes an acquisition module 201 and a parsing module 202, where the acquisition module 201 is configured to acquire a format file corresponding to the log file when log parsing is performed, where the format file includes the identification code and the formatted data of each piece of log information in the original log data and the correspondence between the identification code and the formatted data; and the parsing module 202 is configured to parse the log file and the format file according to the identification code to obtain the corresponding log information.
In an optional embodiment of the present invention, the parsing module 202 is specifically configured to combine, according to a predetermined rule, the parameter data and the formatted data that share the same identification code in the log file and the format file, so as to obtain the corresponding log information.
In an optional embodiment of the present invention, the log parsing unit 20 further includes a print output module, not shown in the figure, configured to output the parsed log information to a specified file after the parsing module 202 parses the log file and the format file according to the identification code.
Fig. 5 is a schematic structural diagram of a log data processing apparatus according to another embodiment of the present invention. Referring to fig. 5, the log data processing apparatus according to the embodiment of the present invention further includes a log preprocessing unit 30, where the log preprocessing unit 30 includes an identification module 301 and a second generation module 302, where: the identification module 301 is configured to read the original log data, number each piece of log information in the original log data, and use the corresponding number as the identification code of the log information; the second generation module 302 is configured to parse the original log data, extract the formatted data that appears repeatedly in each piece of log information of the original log data, and generate the format file from the formatted data and the identification code of each piece of log information and the correspondence between the identification code and the formatted data.
In this embodiment, the identification module 301 is specifically configured to read the original log data, read log information from the original log data according to a log printing interface keyword, and sequentially number the read log information.
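How the three units might fit together can be sketched as three small Python classes. This is only an illustration of the division of responsibilities: the class and method names are hypothetical, the preprocessing unit registers format strings directly instead of scanning source code, the log file is held in memory, and the record layout (32-bit number, one-byte count, 32-bit integer parameters) is an assumption.

```python
import struct


class LogPreprocessingUnit:
    """Numbers call sites and builds the format table (build time)."""

    def __init__(self, first_id=0x10000001):
        self.next_id = first_id
        self.format_table = {}

    def register(self, fmt):
        ident = self.next_id
        self.next_id += 1
        self.format_table[ident] = fmt
        return ident


class LogStorageUnit:
    """Stores only the number and the parameter data (run time)."""

    def __init__(self):
        self.log_file = bytearray()

    def write_log(self, ident, *params):
        self.log_file += struct.pack(
            f"<IB{len(params)}i", ident, len(params), *params)


class LogParsingUnit:
    """Recombines stored records with the format table (analysis time)."""

    def parse(self, log_file, format_table):
        lines, offset = [], 0
        while offset < len(log_file):
            ident, n = struct.unpack_from("<IB", log_file, offset)
            params = struct.unpack_from(f"<{n}i", log_file, offset + 5)
            lines.append(format_table[ident] % params)
            offset += 5 + 4 * n
        return lines


pre = LogPreprocessingUnit()
hello = pre.register("hello world,%d\n")
storage = LogStorageUnit()
storage.write_log(hello, 2017)
restored = LogParsingUnit().parse(bytes(storage.log_file), pre.format_table)
```

Only the storage unit needs to run inside the logged system; the other two units can run offline, at build time and at analysis time respectively.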
Since the apparatus embodiment is substantially similar to the method embodiment, it is described only briefly; for related details, refer to the corresponding description of the method embodiment.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
In view of the problems in existing log data processing methods, the log data processing method and apparatus provided by the embodiments of the invention can greatly reduce the storage space occupied by log data, improve log writing performance, and reduce the risk of secret leakage caused by printing plaintext logs, without affecting later reading of the logs. The invention can thus store more detailed log data safely and efficiently without affecting the overall performance of the communication system.
Furthermore, an embodiment of the present invention also provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the method as described above.
In this embodiment, the modules/units integrated in the log data processing apparatus may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as a stand-alone product. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
Fig. 6 is a schematic diagram of a computer device according to an embodiment of the present invention. The computer device provided by the embodiment of the present invention includes a memory 501, a processor 502, and a computer program that is stored in the memory 501 and can run on the processor 502. When executing the computer program, the processor 502 implements the steps of the above embodiments of the log data processing method, for example the steps shown in fig. 1: S11, acquiring an identification code and parameter data of each piece of log information, where the log information includes formatted data and parameter data, and the identification code is a unique log generation identifier corresponding to each piece of log information in the original log data; S12, generating a log file from the identification code and the parameter data of each piece of log information and the correspondence between the identification code and the parameter data, and storing the log file; S13, acquiring, when log parsing is performed, a format file corresponding to the log file, where the format file includes the identification code and the formatted data of each piece of log information in the original log data and the correspondence between the identification code and the formatted data; S14, parsing the log file and the format file according to the identification code to obtain the corresponding log information. Alternatively, when executing the computer program, the processor 502 implements the functions of the modules/units in the above embodiments of the log data processing apparatus, such as the extraction module 101, the first generation module 102, the acquisition module 201, and the parsing module 202 shown in fig. 4.
Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used for describing the execution process of the computer program in the processing device of the log data. For example, the computer program may be divided into the extraction module 101, the first generation module 102, the acquisition module 201, and the analysis module 202, and each module has the following specific functions: the extraction module 101 is configured to extract an identification code and parameter data of each piece of log information, where the log information includes formatted data and parameter data, and the identification code is a unique log generation identifier corresponding to each piece of log information in original log data; the first generating module 102 is configured to generate a log file from the identification code, the parameter data, and the correspondence between the identification code and the parameter data of each log information, and store the log file; an obtaining module 201, configured to obtain a format file corresponding to the log file when performing log analysis, where the format file includes an identification code of each piece of log information in the original log data, formatted data, and a correspondence between the identification code and the formatted data; and the analysis module 202 is configured to analyze the log file and the format file according to the identification code to obtain corresponding log information.
The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The computer device may include, but is not limited to, a processor and a memory. Those skilled in the art will appreciate that fig. 6 is merely an example of a computer device and does not limit the computer device, which may include more or fewer components than those shown, combine some components, or use different components; for example, the computer device may also include input/output devices, network access devices, buses, and the like.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the computer device and connects the various parts of the whole computer device using various interfaces and lines.
The memory may be used to store the computer programs and/or modules, and the processor implements the various functions of the computer device by running or executing the computer programs and/or modules stored in the memory and calling data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created during use of the device (such as audio data or a phonebook), and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
Those skilled in the art will appreciate that while some embodiments herein include some features included in other embodiments, rather than others, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.