CN114416481A - Log analysis method, device, equipment and storage medium - Google Patents

Publication number
CN114416481A
Authority
CN
China
Prior art keywords
function
log
file
character string
mapping table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210039209.3A
Other languages
Chinese (zh)
Inventor
于亦夫
孙远达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Glodon Co Ltd
Original Assignee
Glodon Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection

Abstract

The application relates to a log analysis method, a log analysis device, log analysis equipment and a storage medium, and in particular to the technical field of electric digital data processing. The method comprises the following steps: acquiring a symbol file, an executable file and a log file; analyzing the symbol file to obtain a function address mapping table; analyzing the executable file to obtain a character string address mapping table; analyzing the log file to obtain a log content table; and constructing, according to the log content table, the character string address mapping table and the function address mapping table, a log content function table containing the function information referenced by each thread at different times, so as to represent the behavior of the application program. The log content function table obtained by this scheme directly represents the execution of each thread of the application program at different times, so no manual step-by-step searching through logs and source code is needed, and the analysis efficiency of the logs is improved.

Description

Log analysis method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of electric digital data processing, in particular to a log analysis method, a device, equipment and a storage medium.
Background
Network equipment, systems, service programs and the like generate event records, called logs, while running; each line of a log records a date, a time, a user and a description of an action.
When an exception occurs in a program, the log generated while the application was running may be analyzed to determine the cause of the exception. Generally, to analyze the logs of an application program, a user has to manually find the relevant log lines and threads around the time point at which the program became abnormal, extract the text of each log line, search for that text in the source code to locate the code that emitted the line, record that code, and reconstruct line by line the sequence of code execution events that preceded the exception, thereby obtaining the reason why the exception occurred in the application program.
However, in the above scheme the log and the source code have to be searched manually, step by step, to obtain the sequence of code execution events, so the analysis efficiency of the log is low.
Disclosure of Invention
The application provides a log analysis method, a log analysis device, log analysis equipment and a storage medium, which improve the analysis efficiency of logs.
In one aspect, a log analysis method is provided, and the method includes:
acquiring a symbol file, an executable file and a log file;
analyzing the symbol file to obtain a function address mapping table; the function address mapping table is used for indicating the mapping relation between the function name and the function address;
analyzing the executable file to obtain a character string address mapping table; the character string address mapping table is used for indicating the mapping relation between character string data and function addresses;
analyzing the log file to obtain a log content table; the log content table is used for indicating character string data corresponding to each thread and time;
and constructing, according to the log content table, the character string address mapping table and the function address mapping table, a log content function table containing the function information referenced by each thread at different times, so as to represent the behavior of the application program.
In yet another aspect, there is provided a log analysis apparatus, the apparatus including:
the file acquisition module is used for acquiring the symbol file, the executable file and the log file;
the symbol file analyzing module is used for analyzing the symbol file to obtain a function address mapping table; the function address mapping table is used for indicating the mapping relation between the function name and the function address;
the executable file analysis module is used for analyzing the executable file to obtain a character string address mapping table; the character string address mapping table is used for indicating the mapping relation between character string data and function addresses;
the log file analysis module is used for analyzing the log file to obtain a log content table; the log content table is used for indicating character string data corresponding to each thread and time;
and the mapping table building module is used for building, according to the log content table, the character string address mapping table and the function address mapping table, a log content function table containing the function information referenced by each thread at different times, so as to represent the behavior of the application program.
In a possible implementation manner, the mapping table constructing module includes:
a function address query unit, configured to query, in the string address mapping table, a target function address corresponding to target string data in the log content table;
a function name query unit, configured to query, in the function address mapping table, a target function name corresponding to the target function address;
and the mapping table acquisition unit is used for inserting the mapping relation between the target function name and the target character string data into the log content table to acquire the log content function table.
In a possible implementation, the function address query unit is further configured to,
performing word segmentation processing on the target character string data to obtain a target word segmentation set of the target character string data;
and determining a function address corresponding to the character string data with the highest similarity with the target word segmentation set in the character string address mapping table as the target function address.
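The similarity-based lookup described above can be sketched as follows. The source does not state which similarity measure is used, so this sketch assumes Jaccard similarity over lowercase alphanumeric tokens; the names `tokenize`, `jaccard` and `find_target_address`, and the sample strings and addresses, are hypothetical.

```python
import re


def tokenize(text):
    """Split a string into a set of lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_target_address(target_text, string_table):
    """Return the reference address of the most similar string, or None.

    string_table maps string data (e.g. a format string found in the
    executable) to its reference address. The similarity measure is an
    assumption of this sketch, not something the source specifies.
    """
    target_tokens = tokenize(target_text)
    best_address, best_score = None, 0.0
    for text, address in string_table.items():
        score = jaccard(target_tokens, tokenize(text))
        if score > best_score:
            best_address, best_score = address, score
    return best_address


# Hypothetical sample data: the log line contains a formatted file name,
# while the executable contains the unformatted template string.
sample_table = {"open file %s failed": 0x401020, "save project %s": 0x402300}
print(hex(find_target_address("open file model.gbq failed", sample_table)))
```

A fuzzy match of this kind is useful because a log line usually contains the formatted message, while the executable stores only the template with its placeholders.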
In one possible implementation, the apparatus further includes:
and the time axis display module is used for displaying the time axis corresponding to each thread in the log analysis interface, and displaying the function names of the functions referenced by each thread on that thread's time axis in chronological order.
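As a rough illustration of the data behind such per-thread time axes, the following sketch groups rows by thread ID and orders them by time. The row layout `(time, thread_id, function_name)` and the function name `build_timelines` are assumptions made for this example, and timestamps are assumed to be fixed-width strings so that string order matches time order.

```python
from collections import defaultdict


def build_timelines(rows):
    """Group (time, thread_id, function_name) rows into per-thread timelines.

    Returns a dict mapping each thread ID to a list of
    (time, function_name) pairs in chronological order.
    """
    timelines = defaultdict(list)
    # Sorting the tuples orders them by time first.
    for time, thread_id, function_name in sorted(rows):
        timelines[thread_id].append((time, function_name))
    return dict(timelines)


# Hypothetical sample rows.
rows = [
    ("10:00:02", 7, "SaveProject"),
    ("10:00:01", 7, "OpenFile"),
    ("10:00:01", 9, "RenderView"),
]
for thread_id, events in build_timelines(rows).items():
    print(thread_id, events)
```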
In one possible implementation, the timeline presentation module is further configured to,
displaying function controls on the time axes corresponding to the threads respectively; the function name of the referenced function is displayed superimposed on the function control;
the device further comprises:
the operation information display module is used for displaying, in response to receiving a specified operation on the function control, the operation information corresponding to the function control on the log analysis interface; the operation information at least comprises the local variables corresponding to the function name of the referenced function.
The device further comprises:
the abnormal time point acquisition module is used for inquiring the abnormal snapshot information table and acquiring an abnormal time point;
and the abnormal time point display module is used for displaying the abnormal time point on the log analysis interface.
In one possible implementation, the apparatus further includes:
and the abnormal snapshot information table generating module is used for reading the memory process snapshot, obtaining each process name, process ID, process version number and abnormal occurrence time stored in the memory process snapshot, and generating an abnormal snapshot information table corresponding to the memory process snapshot.
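A minimal sketch of such an abnormal snapshot information table, assuming the fields have already been read from the memory process snapshot (reading the snapshot file itself is not shown). The table name, column names and sample values are hypothetical, and SQLite is used here only as a convenient relational store.

```python
import sqlite3


def create_snapshot_table(conn):
    """Create the (hypothetical) abnormal snapshot information table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS abnormal_snapshot ("
        "process_name TEXT, process_id INTEGER, "
        "process_version TEXT, abnormal_time TEXT)"
    )


def insert_snapshot(conn, process_name, process_id, process_version, abnormal_time):
    """Insert one record extracted from a memory process snapshot."""
    conn.execute(
        "INSERT INTO abnormal_snapshot VALUES (?, ?, ?, ?)",
        (process_name, process_id, process_version, abnormal_time),
    )


def query_abnormal_times(conn, process_id):
    """Return the abnormal time points recorded for one process ID."""
    cursor = conn.execute(
        "SELECT abnormal_time FROM abnormal_snapshot "
        "WHERE process_id = ? ORDER BY abnormal_time",
        (process_id,),
    )
    return [row[0] for row in cursor.fetchall()]


# Hypothetical sample values.
connection = sqlite3.connect(":memory:")
create_snapshot_table(connection)
insert_snapshot(connection, "app.exe", 4242, "1.0.3", "2022-01-13 10:00:05")
print(query_abnormal_times(connection, 4242))
```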
In yet another aspect, a computer device is provided, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, and the at least one instruction, at least one program, a set of codes, or a set of instructions is loaded and executed by the processor to implement the log analysis method.
In yet another aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, and the at least one instruction is loaded and executed by a processor to implement the log analysis method described above.
In yet another aspect, a computer program product or a computer program is provided, comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the log analysis method described above.
The technical scheme provided by the application can comprise the following beneficial effects:
when a log needs to be analyzed, the computer device can read the symbol file set, the executable file and the log file corresponding to the application program and analyze each of them, thereby obtaining, respectively, the mapping relation between function names and function addresses, the mapping relation between character strings and function addresses, and the time-related character string data; according to these mapping relations, the computer device can determine the function names referenced by the character strings in the log file and construct a log content function table indicating the function information referenced by each thread of the application program at different times. The table directly represents the execution of the application program at different times, so no manual step-by-step searching through the log and the source code is needed, and the analysis efficiency of the log is improved.
Drawings
In order to more clearly illustrate the specific embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram illustrating a structure of a log analysis system according to an exemplary embodiment.
FIG. 2 is a method flow diagram illustrating a method of log analysis in accordance with an exemplary embodiment.
Fig. 3 is a schematic flowchart of converting a symbol file into a mapping table according to an embodiment of the present application.
Fig. 4 is a schematic flowchart of converting the character strings of an executable file into a mapping table according to an embodiment of the present application.
Fig. 5 is a schematic flowchart of converting a log file into a mapping table according to an embodiment of the present application.
FIG. 6 is a method flow diagram illustrating a method of log analysis in accordance with an exemplary embodiment.
Fig. 7 shows a schematic diagram of a log analysis interface according to an embodiment of the present application.
Fig. 8 is a schematic diagram illustrating a memory snapshot file to mapping table according to an embodiment of the present application.
Fig. 9 is a schematic diagram illustrating a memory snapshot file to mapping table according to an embodiment of the present application.
Fig. 10 shows a schematic structural diagram of software modules of a log analysis system according to an embodiment of the present application.
Fig. 11 is a block diagram illustrating a configuration of a log analyzing apparatus according to an exemplary embodiment.
FIG. 12 is a schematic diagram of a computer device provided in accordance with an exemplary embodiment of the present application.
Detailed Description
The technical solutions of the present application will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be understood that "indication" mentioned in the embodiments of the present application may be a direct indication, an indirect indication, or an indication of an association relationship. For example, A indicates B may mean that A directly indicates B, e.g., B may be obtained through A; it may also mean that A indirectly indicates B, e.g., A indicates C, and B may be obtained through C; it may also mean that there is an association between A and B.
In the description of the embodiments of the present application, the term "correspond" may indicate that there is a direct or an indirect correspondence between two items, may indicate that there is an association between them, or may denote a relationship such as indicating and being indicated, or configuring and being configured.
In the embodiment of the present application, "predefining" may be implemented by saving a corresponding code, table, or other manners that may be used to indicate related information in advance in a device (for example, including a terminal device and a network device), and the present application is not limited to a specific implementation manner thereof.
Before describing the various embodiments shown herein, several concepts related to the present application will be described.
1) Log file (Log)
A log file is a record file, or a set of files, that records system operation events; logs can be divided into event logs and message logs. Log files play an important role in processing historical data, tracing and diagnosing problems, understanding system activity and the like. In a computer, a log file is a file that records events that occur while an operating system or other software is running, or messages between different users of communication software. Logging is the act of keeping a log. In the simplest case, the messages are written to a single log file. Many operating systems, software frameworks and programs include logging systems.
2) Symbol file
A symbol file is a data file that contains the debugging information of an application binary, such as an EXE (executable file) or a DLL (Dynamic Link Library). It is used specifically for debugging: the final executable file does not need the symbol file at run time, but the symbol file records all variable information in the program, which makes it very important when debugging an application. Both the Visual C++ and the WinDbg debuggers use this file. On Windows, symbol files have the extension .pdb. For example, every Windows operating system contains a gdi32.dll file; when that DLL is compiled, the compiler generates a gdi32.pdb file, and with this PDB file a debugger can trace into gdi32.dll. The symbol file is closely tied to the compiled version of the binary: if, for example, an exported function of the DLL is modified and the DLL is recompiled, the original PDB (Program Database) file becomes outdated and can no longer be used for debugging, so the latest PDB file must be used.
3) Executable file
An executable file is a file that can be loaded and executed by an operating system. Executable programs take different forms in different operating system environments. Under the Windows operating system, an executable program may be a file of type .exe, .sys, .com and so on. A Win32 executable file is called a PE file. The basic structure of a PE file is very different from that of a DOS executable. A PE file divides the different parts of the program into sections, one of which may hold various resources such as menus, dialog boxes, bitmaps, cursors, icons and sounds. Although the resource section may be understood as resembling the "overlay" part of a DOS executable, its format is fixed, because resources are a standard and very important component of a Win32 executable. Compared with the development of DOS software, the development of Win32 software has the additional step of creating resource files.
Fig. 1 is a schematic diagram illustrating a structure of a log analysis system according to an exemplary embodiment. The log analysis system includes a server 110 and a terminal 120.
When an application program in the terminal 120 runs, a log file corresponding to the application program is generated to record the running state of the application program.
Optionally, after the terminal 120 generates the log file corresponding to the application program, the log file is uploaded to the server 110 and stored in a data storage module in the server 110, so that the server 110 can subsequently analyze the log file.
Optionally, the terminal 120 may also upload an executable file and a symbol file corresponding to the application program to the server 110, and store the executable file and the symbol file in a data storage module in the server 110, so that the server 110 analyzes the log file through contents in the executable file, the symbol file and the log file, thereby determining the historical operating state of the application program.
Optionally, when the terminal 120 detects that the application program is abnormal, the terminal 120 converts the data stored in the memory into a memory snapshot and stores the memory snapshot in a memory snapshot file. The terminal 120 can upload the memory snapshot file to the server 110, so that the server 110 can analyze the log file according to the memory snapshot file.
Optionally, the log file, the symbol file, the executable file and the memory snapshot file corresponding to the application program may also be stored in the terminal 120, in which case the terminal 120 itself may analyze the content of the log file by using these files, so as to determine the historical operating state of the application program.
Optionally, the server may be an independent physical server, a server cluster formed by a plurality of physical servers, or a distributed system, and may also be a cloud server that provides technical computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, and a big data and artificial intelligence platform.
Optionally, the system may further include a management device, where the management device is configured to manage the system (e.g., manage the connection states between the modules and the server), and the management device is connected to the server through a communication network. Optionally, the communication network is a wired network or a wireless network.
Optionally, the wireless network or wired network described above uses standard communication techniques and/or protocols. The network is typically the internet, but may be any other network, including but not limited to a local area network, a metropolitan area network, a wide area network, a mobile, wired or wireless network, a private network, a virtual private network, or any combination thereof. In some embodiments, data exchanged over the network is represented using techniques and/or formats including hypertext markup language, extensible markup language, and the like. All or some of the links may also be encrypted using conventional encryption techniques such as secure sockets layer, transport layer security, virtual private network, internet protocol security, and the like. In other embodiments, custom and/or dedicated data communication techniques may also be used in place of, or in addition to, the data communication techniques described above.
FIG. 2 is a method flow diagram illustrating a method of log analysis in accordance with an exemplary embodiment. The method is performed by a server or a terminal, which may be a server or a terminal in a log analysis system as shown in fig. 1. As shown in fig. 2, the log analysis method may include the steps of:
Step 201, obtaining a symbol file, an executable file and a log file.
In this embodiment of the present application, the log file to be analyzed is the running record generated while the application program runs. Taking the Windows platform as an example, the log is a text document written by the application program's code as it runs, and its content is line-oriented text, for example gxxx_log.1.txt. Its main function is to record actions or exceptions along a timeline while the program runs, so that operation and maintenance personnel can trace the program's runtime behavior by analyzing the log.
In a possible implementation manner of the embodiment of the present application, the log file at least includes a module name, a version number, time, a process ID, a thread ID, and a string text.
The module name in the log file indicates which module generated the log text, the version number is the version number of that module, the time is the time at which the log was generated, the process ID and the thread ID identify the process and the thread that generated the log, and the string text indicates the action actually executed by the application program.
In the embodiment of the application, the log content table is extended as follows: the string text in the log content table is used as a query key in the mapping tables generated from the other files, until the function corresponding to the log text is found.
In this embodiment of the application, taking the Windows platform as an example, the executable file may be a binary executable file, typically in PE (Portable Executable) format, which is the standard file format for portable executable files in the Microsoft Win32 environment (such as exe, dll, vxd, sys and vdm files). It is the execution program module unit generated after the code is compiled, and its main role is to provide specific functional services at run time. Two kinds are common: exe files and dll files.
In one possible implementation of the embodiment of the present application, the executable file includes a process name, a module name, an address, a length, a string text, and a reference address.
In the executable file, the module name is the name of the executable file (i.e. the name of the execution module), and the process name indicates the process corresponding to the executable file when the application program is executed; the address is the address of the executable part in the executable file, the length is the length of the executable part, the string text is an execution resource in the executable file, and the reference address is the reference address of the function that processes the execution resource corresponding to the string text.
In the embodiment of the application, taking the Windows platform as an example, the symbol file generally refers to a pdb file. During debugging it is mainly used to map the name symbols in the source code to the addresses of the machine code in the running binary, so that runtime binary code can be translated through the symbols and the debugged code becomes more readable. For example, the symbol file corresponding to calc.exe is calc.pdb, and the symbol file corresponding to capip.dll is capip.pdb.
In a possible implementation manner of the embodiment of the present application, the symbol file includes a module name, a version number, a function name, a function address, and a function length. The module name is the name of the symbol file and usually corresponds to the module name of the binary executable file, and the version number is the version number of the symbol file; the symbol file also includes the function name, the function address and the function length of each function called during execution of the executable file.
Step 202, analyzing the symbol file to obtain a function address mapping table.
The function address mapping table is used for indicating the mapping relation between function names and function addresses, and is stored in the computer device as relational two-dimensional data (a table). When the computer device obtains the symbol file, it can extract the module name, version number, function name, function address and function length from each line or each text interval of the symbol file, and generate the function address mapping table according to the mapping relation among them.
For example, fig. 3 shows a schematic flowchart of converting a symbol file into a mapping table according to an embodiment of the present application. After the symbol file parser (i.e. software for parsing symbol files) in the computer device opens a symbol file, it reads each function in the symbol file, obtains the module name, version number, function name, function address and function length corresponding to each function, generates a record for each function, and finally inserts each record into the function address mapping table, where each record represents one mapping.
In the generated function address mapping table, once the module name and the version number are determined, the function address and the function length of a function referenced by the module can be determined from its function name.
In a possible implementation manner, because the text stored in a symbol file is laid out according to certain rules, when the computer device needs to extract the contents of the symbol file it can perform word segmentation according to the rules corresponding to the symbol file (for example, dividing each line of text by character count, or dividing the text by identifiers internal to the symbol file), so as to divide the contents of the symbol file into "module name, version number, function name, function address, and function length".
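The record extraction described above can be illustrated with a small sketch. Real PDB files are binary and need a dedicated parser; this example instead assumes a hypothetical text dump in which each line has the form `module|version|function_name|hex_address|length`. The `FunctionRecord` type and `parse_symbol_dump` function are names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionRecord:
    """One row of the function address mapping table."""
    module: str
    version: str
    name: str
    address: int
    length: int


def parse_symbol_dump(text):
    """Parse 'module|version|name|hex_address|length' lines into records.

    The line format is an assumption of this sketch; blank lines and
    lines starting with '#' are skipped.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        module, version, name, address, length = line.split("|")
        records.append(
            FunctionRecord(module, version, name, int(address, 16), int(length))
        )
    return records


# Hypothetical dump for illustration.
dump = """\
# module|version|function|address|length
calc.exe|1.0.0|OpenFile|0x401000|64
calc.exe|1.0.0|SaveProject|0x401040|128
"""
for record in parse_symbol_dump(dump):
    print(record)
```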
Step 203, the executable file is analyzed to obtain a character string address mapping table.
The character string address mapping table is used for indicating the mapping relation between character string data and function addresses, and is stored in the computer device as relational two-dimensional data (a table). When the computer device acquires the executable file, it can extract the process name, module name, address, length, string text and reference address from the executable file, and generate the character string address mapping table according to the mapping relation among them.
For example, fig. 4 is a schematic flowchart of converting the character strings of an executable file into a mapping table according to an embodiment of the present application. As shown in fig. 4, after the binary executable file parser (i.e. software for parsing binary executable files) loads the binary executable file, it reads each string resource (including the string content, length, reference address and so on), treats each one as a record, and finally inserts each record (i.e. each piece of string information) into the mapping table (i.e. the two-dimensional table) as one mapping in the mapping table.
That is, after generating the character string address mapping table, the computer device can query the table with character string data to obtain the reference address corresponding to that character string data.
In a possible implementation manner, because the text stored in an executable file is laid out according to certain rules, when the computer device needs to extract the contents of the executable file it can perform word segmentation according to the rules corresponding to the executable file (for example, dividing each line of text by character count, or dividing the text by identifiers internal to the executable file), so as to divide the contents of the executable file into "process name, module name, address, length, string text, and reference address".
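A simplified sketch of the string extraction step: it scans raw bytes for runs of printable ASCII characters and records each run's offset, length and text. This is only an approximation under stated assumptions. It does not parse PE sections, and it does not resolve the reference address of the function that uses each string, which would require analyzing the code's references to the string data. The function name `extract_strings` and the sample bytes are hypothetical.

```python
import re


def extract_strings(data, min_length=4):
    """Return (offset, length, text) for each printable ASCII run in data.

    Runs shorter than min_length are ignored, which filters out most
    accidental matches in binary data.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [
        (match.start(), len(match.group()), match.group().decode("ascii"))
        for match in pattern.finditer(data)
    ]


# Hypothetical bytes standing in for part of an executable's data section.
blob = b"\x00\x01open file %s failed\x00\x02ab\x00save project\x00"
for offset, length, text in extract_strings(blob):
    print(offset, length, text)
```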
Step 204, analyzing the log file to obtain a log content table.
The log content table is used for indicating the character string data corresponding to each thread and time. When the computer device obtains the log file, each line of text in the log file contains "module name, version number, time, process ID, thread ID, and string text". The computer device can perform word segmentation on each line of the log file to obtain the "module name", "version number", "time", "process ID", "thread ID", and "string text" respectively, and establish a corresponding log content table that stores the mapping relationship between the string text and the "module name, version number, time, process ID, and thread ID".
For example, fig. 5 shows a flowchart of converting a log file into a mapping table according to an embodiment of the present application. As shown in fig. 5, after the log content parser (i.e., software for parsing log content) opens the text file, it reads each line of text (i.e., each log entry), extracts fields such as the time, process ID, thread ID, and content included in the log entry, and generates a record corresponding to that entry. The log content parser then inserts each record into a mapping table (i.e., a two-dimensional table) as one mapping in the table.
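The line-by-line parsing described above can be sketched as follows. This is a minimal illustration only: the "|" separator, the field order, the sample module name, and the names LogRecord, parse_log_line, and build_log_content_table are assumptions made for the example and are not specified by the present application.

```python
from typing import NamedTuple


class LogRecord(NamedTuple):
    """One row of the log content table (field layout is an assumption)."""
    module: str
    version: str
    time: str
    process_id: int
    thread_id: int
    text: str


def parse_log_line(line: str, sep: str = "|") -> LogRecord:
    # Split into at most six fields so the separator may still occur
    # inside the trailing string text.
    module, version, time, pid, tid, text = (
        part.strip() for part in line.split(sep, 5)
    )
    return LogRecord(module, version, time, int(pid), int(tid), text)


def build_log_content_table(lines):
    # Each non-empty line of the log file becomes one record.
    return [parse_log_line(line) for line in lines if line.strip()]


# Hypothetical log lines used only for illustration.
sample = [
    "render.dll|1.2.0|17.18|4012|7|load model start",
    "render.dll|1.2.0|17.20|4012|9|draw frame | pass 2",
]
log_content_table = build_log_content_table(sample)
```

A real log format would need its own splitting rule; the point is only that each line yields one record keyed by time, process ID, and thread ID.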
Step 205, according to the log content table, the string address mapping table and the function address mapping table, a log content function table containing function information referenced by each thread at different times is constructed to represent behavior of the application program.
After the computer device obtains the log content table, the character string address mapping table and the function address mapping table, the computer device may query the character string address mapping table according to the character string text recorded in the log content table to obtain a reference address corresponding to the character string text, where the reference address is an address of a function referenced by the log content.
The computer device then queries the function address mapping table using the reference address. Because the function address mapping table records the function address and function length corresponding to each function name, the function interval in which the reference address falls can be determined, and the function name corresponding to that interval is returned.
The computer device thereby obtains the name of the function referenced by the log content. Once the function name corresponding to each piece of log content (i.e., each line in the log content table) has been obtained, the log content function table can be constructed. Each line of the log content function table contains {module name, version number, time, process ID, thread ID, string text, function name}, so the table shows which process ID and thread ID each execution module occupies at a given time, as well as the string text used and the function referenced in the thread indicated by each thread ID, thereby displaying the running state of the application program more completely.
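The two lookups and the resulting table can be sketched as follows. This is a simplified illustration under stated assumptions: the string address mapping table is modeled as a dictionary with exact-match keys, the function address mapping table as a list of (name, start address, length) tuples, and all names, addresses, and sample rows are hypothetical.

```python
def find_function(function_table, address):
    # Return the name of the function whose interval
    # [start, start + length) contains the reference address.
    for name, start, length in function_table:
        if start <= address < start + length:
            return name
    return None


def build_log_content_function_table(log_rows, string_table, function_table):
    result = []
    for row in log_rows:
        # Step 1: string text -> reference address.
        ref = string_table.get(row["text"])
        # Step 2: reference address -> function name.
        name = find_function(function_table, ref) if ref is not None else None
        # Step 3: append the function name to the original log record.
        result.append({**row, "function": name})
    return result


# Hypothetical tables used only for illustration.
string_table = {"load model start": 0x1010, "draw frame": 0x2040}
function_table = [("LoadModel", 0x1000, 0x80), ("DrawFrame", 0x2000, 0x100)]
log_rows = [
    {"time": "17.18", "thread_id": 7, "text": "load model start"},
    {"time": "17.20", "thread_id": 9, "text": "draw frame"},
    {"time": "17.21", "thread_id": 9, "text": "unknown message"},
]
table = build_log_content_function_table(log_rows, string_table, function_table)
```

Rows whose string text cannot be matched keep a function name of None in this sketch, so that unmatched log entries remain visible rather than being dropped.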
In summary, when the log needs to be analyzed, the computer device may read the symbol file set, the executable file, and the log file corresponding to the application program and parse each file, thereby obtaining the mapping relationship between function names and function addresses, the mapping relationship between character strings and function addresses, and the time-related character string data. Based on these mapping relationships, the computer device can determine the function names referenced by the character strings in the log file and construct a log content function table indicating the function information referenced by each thread of the application program at different times. The execution of the application program at different times can thus be represented directly, without manually searching the log and the source code step by step, which improves the efficiency of log analysis.
FIG. 6 is a method flow diagram illustrating a method of log analysis in accordance with an exemplary embodiment. The method is performed by a server or a terminal, which may be a server or a terminal in a log analysis system as shown in fig. 1. As shown in fig. 6, the log analysis method may include the steps of:
Step 601, obtaining a symbol file, an executable file and a log file.
Step 602, parsing the symbol file to obtain a function address mapping table.
Step 603, the executable file is analyzed to obtain a string address mapping table.
Step 604, analyzing the log file to obtain a log content table.
The specific implementation of steps 601 to 604 is similar to that of steps 201 to 204, and is not described here again.
Step 605, in the character string address mapping table, querying a target function address corresponding to the target character string data in the log content table.
For target character string data in the log content table, because the character string address mapping table holds the mapping relation between character string data and function addresses, querying the character string address mapping table with the target character string data yields the target function address corresponding to it.
In a possible implementation manner, word segmentation processing is performed on the target character string data to obtain a target word segmentation set of the target character string data; the function address corresponding to the character string data in the character string address mapping table that has the highest similarity with the target word segmentation set is determined as the target function address.
The target character string data in the log content table (i.e., the target character string data extracted from the log file) may not completely conform to the format of the character string data in the character string address mapping table (i.e., the character string data in the binary executable file). For example, the target character string data in the log content table may contain an identifier such as a percent sign while the character string data in the character string address mapping table does not. In that case, no identical character string data can be found in the character string address mapping table.
The computer equipment can perform word segmentation processing on the target character string data to obtain a target word segmentation set of the target character string data, then performs similarity comparison on the target word segmentation set and the character string data in the character string address mapping table, and determines a function address corresponding to the character string data with the highest similarity as a target function address, so that query failure caused by format inconsistency is avoided.
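One way to realize the similarity comparison is sketched below. The present application does not name a similarity measure; Jaccard similarity over word sets is used here purely as an assumption, and the tokenizer, function names, and sample strings are hypothetical.

```python
import re


def tokenize(text):
    # Word segmentation: keep alphanumeric runs, drop symbols such as "%".
    return set(re.findall(r"[A-Za-z0-9_]+", text.lower()))


def jaccard(a, b):
    # Size of the intersection divided by size of the union.
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match_address(target, string_table):
    # Return the address of the table string most similar to the target,
    # together with its similarity score.
    target_tokens = tokenize(target)
    best_addr, best_score = None, 0.0
    for text, addr in string_table.items():
        score = jaccard(target_tokens, tokenize(text))
        if score > best_score:
            best_addr, best_score = addr, score
    return best_addr, best_score


# Hypothetical string address mapping table.
string_table = {
    "task progress done": 0x1010,
    "draw frame done": 0x2040,
}
addr, score = best_match_address("task progress 50% done", string_table)
```

In this sketch the log text shares three of its four tokens with the first table entry, so that entry is selected even though the strings are not identical.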
In another possible implementation manner, word segmentation is performed on the target character string data, and a word segmentation result is vectorized to obtain a target character string vector; performing word segmentation on each character string data in the character string address mapping table, and vectorizing word segmentation results to obtain each query vector corresponding to the character string address mapping table; and determining the function address indicated by the query vector with the minimum vector distance with the target character string vector as a target function address.
That is, when the target string data needs to be queried in the string address mapping table, both the target string data and each piece of string data in the string address mapping table may be vectorized. For example, when the target string data is 012042045061, one vector component is generated for every three digits (for example, by an accumulation operation), producing the target string vector (3, 6, 9, 7). Each piece of string data in the string address mapping table may be processed in the same way, and the vector distances between the vectors are then calculated to determine the target function address.
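The accumulation example above can be reproduced with a short sketch. It assumes, as in the example, that the string data is already a sequence of digits; how ordinary text would be encoded as digits is not specified by the present application. Euclidean distance is likewise an assumption, and the function names and table contents are hypothetical.

```python
import math


def vectorize(digits, group=3):
    # Sum the digits of each group of three characters:
    # "012" -> 0 + 1 + 2 = 3, "042" -> 6, and so on.
    return tuple(
        sum(int(ch) for ch in digits[i:i + group])
        for i in range(0, len(digits), group)
    )


def nearest_address(target, string_table):
    # Pick the table entry whose vector is closest to the target vector.
    target_vec = vectorize(target)
    best = None
    for text, addr in string_table.items():
        vec = vectorize(text)
        if len(vec) != len(target_vec):
            continue  # vectors of different dimension are not compared here
        distance = math.dist(target_vec, vec)
        if best is None or distance < best[0]:
            best = (distance, addr)
    return None if best is None else best[1]


# Hypothetical digit-encoded string table.
string_table = {"012042045060": 0x1010, "111222333444": 0x2040}
target_vector = vectorize("012042045061")
matched = nearest_address("012042045061", string_table)
```

Here the first table entry vectorizes to (3, 6, 9, 6), at distance 1 from the target vector, while the second vectorizes to (3, 6, 9, 12), at distance 5, so the first address is chosen.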
Step 606, in the function address mapping table, the name of the target function corresponding to the target function address is queried.
After the target function address corresponding to the target character string data is obtained, the target function name corresponding to the target function address can be inquired in the function address mapping table, so that the target function name corresponding to the target character string data is obtained.
In a possible implementation manner, the function address mapping table stores the correspondence among function names, function addresses, and function lengths; that is, for a given function name, the start address and the length of the function are recorded in the function address mapping table, so that the function interval corresponding to the function name can be obtained. After the function interval containing the target function address is determined according to the function address mapping table, the function name corresponding to that interval is the target function name.
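When the function address mapping table is large, the interval lookup can be done with a binary search over sorted start addresses. The following is a sketch under the assumption that function intervals do not overlap; the class name FunctionAddressTable and the sample entries are hypothetical.

```python
from bisect import bisect_right


class FunctionAddressTable:
    """Maps an address to the function whose interval contains it."""

    def __init__(self, entries):
        # entries: iterable of (function name, start address, length)
        self._entries = sorted(entries, key=lambda e: e[1])
        self._starts = [e[1] for e in self._entries]

    def lookup(self, address):
        # Find the last function whose start address is <= address.
        i = bisect_right(self._starts, address) - 1
        if i < 0:
            return None
        name, start, length = self._entries[i]
        # The address must also lie before the end of that function.
        return name if address < start + length else None


table = FunctionAddressTable([
    ("DrawFrame", 0x2000, 0x100),
    ("LoadModel", 0x1000, 0x80),
])
```

For example, table.lookup(0x1010) falls inside the interval of LoadModel, whereas table.lookup(0x1080) lies just past its end and returns None.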
Step 607, inserting the mapping relation between the target function name and the target character string data into the log content table to obtain the log content function table.
After the target function name corresponding to the target character string data is obtained, the fact that a mapping relation exists between the target character string data and the target function name can be determined, and then the mapping relation between the target function name and the target character string is inserted into the log content table, so that a new log content function table is generated.
In the log content function table, in addition to the module name, the process ID, the thread ID, the time, and the character string data included in the original log content table, there is a function name corresponding to the character string data, so that it can be determined that, in the target process of the target module, the target character string data is processed by the target function indicated by the target function name within the target time on the target thread.
Step 608, displaying the time axis corresponding to each thread in the log analysis interface, and displaying, in time order on each thread's time axis, the function names of the functions referenced by that thread.
Please refer to fig. 7, which illustrates a schematic diagram of a log analysis interface according to an embodiment of the present application. When a log content function table of function information referenced by each thread at different time is constructed, the function name of the function referenced by each thread according to time is recorded in the log content function table, the computer device can generate a time axis for each thread, and display the function name of the function referenced by the thread (such as counter, obj.
In a possible implementation manner, function controls are displayed on the time axis corresponding to each thread, with the function name of the referenced function superimposed on each function control; in response to receiving a specified operation on a function control, operation information corresponding to the function control is displayed on the log analysis interface; the operation information includes at least the local variables corresponding to the function name of the referenced function.
That is, when the log analysis interface shown in fig. 7 is displayed on the terminal device, the user may click a function control displayed on a thread's time axis through a mouse or touch operation, so as to display specific operation information (e.g., the local variables used by the function) of the referenced function.
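The data behind the per-thread time axes can be prepared by grouping the log content function table by thread ID and sorting each group by time. The sketch below covers only that grouping step, not the interface itself; the row layout, numeric time values, and function names are assumptions for illustration.

```python
from collections import defaultdict


def build_timelines(rows):
    # Group (time, function name) pairs by thread ID.
    timelines = defaultdict(list)
    for row in rows:
        timelines[row["thread_id"]].append((row["time"], row["function"]))
    # Order the events of each thread chronologically.
    for events in timelines.values():
        events.sort(key=lambda event: event[0])
    return dict(timelines)


# Hypothetical rows of a log content function table.
rows = [
    {"thread_id": 9, "time": 17.20, "function": "DrawFrame"},
    {"thread_id": 7, "time": 17.18, "function": "LoadModel"},
    {"thread_id": 9, "time": 17.05, "function": "InitScene"},
]
timelines = build_timelines(rows)
```

Each entry of the resulting dictionary corresponds to one time axis, and each (time, function name) pair to one function control on that axis.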
In a possible implementation mode, an abnormal snapshot information table is inquired to obtain an abnormal time point; and displaying the abnormal time point on the log analysis interface.
The computer equipment also stores an abnormal snapshot information table of the application program, the abnormal snapshot information table records an abnormal time point of the application program, and the abnormal time point can be displayed on a log analysis interface after the computer equipment reads the abnormal time point, so that a user can directly analyze the abnormal condition of the log according to the abnormal time point.
In a possible implementation manner, an abnormal time control on which abnormal time information is superimposed is displayed on a time axis in the log analysis interface; in response to receiving a second specified operation for the exception time control, presenting stack information on the log analysis interface.
The abnormal snapshot information table may further store a thread ID of the application program that is abnormal, and according to the thread ID corresponding to the abnormal thread, the abnormal thread shown in fig. 7 may be determined in the log analysis interface, and an abnormal time control is displayed on the time axis on the log analysis interface shown in fig. 7, and the abnormal time point "17.20" is displayed on the abnormal time control in an overlapping manner. When the user clicks or touches the abnormal time control, stack information read from the abnormal memory snapshot can be displayed in the log analysis interface.
In a possible implementation manner, the memory process snapshot is read, the process name, the process ID, the process version number, and the abnormality occurrence time stored in the memory process snapshot are obtained, and the abnormality snapshot information table corresponding to the memory process snapshot is generated.
When an abnormality occurs in an application program on the computer device, a snapshot of the memory at that moment is generated and stored in an abnormal memory process snapshot file. The computer device can parse the abnormal memory process snapshot file (i.e., the memory process snapshot) to obtain each process name, process ID, process version number, and abnormality occurrence time stored in it, and generate the abnormal snapshot information table corresponding to the memory process snapshot.
Please refer to fig. 8, which illustrates a schematic diagram of converting a memory snapshot file into a mapping table according to an embodiment of the present application.
In the process of analyzing the process memory snapshot, a process memory snapshot analyzer (i.e., software for analyzing the process memory snapshot) in the computer device opens a process snapshot file, reads basic information (i.e., a process name, a version number, exception occurrence time, a process ID, and the like) in the process memory snapshot, and reads a call stack of each thread, thereby determining each function (including a function name, a function address, and the like) in the call stack, and inserts the basic information and the call stack into a process exception basic information table (i.e., an exception snapshot information table).
The process memory snapshot analyzer can also parse the basic information (including module name, version number, module loading address, module size, etc.) of the modules used by each thread, and generate a module information table from the basic information of each module and the function information stored in the call stack, thereby indicating the module running state and the memory occupation state of the application program when the abnormality occurred.
In summary, when the log needs to be analyzed, the computer device may read the symbol file set, the executable file, and the log file corresponding to the application program and parse each file, thereby obtaining the mapping relationship between function names and function addresses, the mapping relationship between character strings and function addresses, and the time-related character string data. Based on these mapping relationships, the computer device can determine the function names referenced by the character strings in the log file and construct a log content function table indicating the function information referenced by each thread of the application program at different times. The execution of the application program at different times can thus be represented directly, without manually searching the log and the source code step by step, which improves the efficiency of log analysis.
Please refer to fig. 9, which is a schematic diagram illustrating an exception log analysis process according to an embodiment of the present application. As shown in fig. 9, the steps of the anomaly log analysis are as follows.
Step 901, obtaining upload data.
1) Acquiring the uploaded symbol file set. Symbol files generally come in various formats, such as binary structured storage file formats and character string formats. After the upload is completed, symbol file parsing is triggered (see fig. 7): the function set is extracted from the symbol files, and the function set and the address interval of each function are stored in the function address mapping table of a relational database.
2) Acquiring the uploaded executable file and module file set. Executable files generally use different binary file formats on different operating systems and include executable files and dynamic libraries. Such a file generally contains an independent relocation table and character string constants, and the contents of the character string constants can be extracted into the character string address mapping table.
3) Acquiring the uploaded process memory snapshot file. An abnormal context is generally generated when a process becomes abnormal, and the memory snapshot dumps the current memory of the process into a file; relevant information can be parsed from this file and added to the snapshot information table.
4) Acquiring the uploaded log file. After the upload, a parsing action is triggered: the parser analyzes the text in the log file line by line, extracts the time, process ID, thread ID, and string content from each line, and adds them to the log content table.
Step 902, the computer device starts the analysis. The string content of each log entry is extracted from the log content table and matched in the string address mapping table to obtain its reference address, that is, the address of the function that prints the log. The string reference address is then used to query the function address mapping table, which holds the address interval (function address + function size) of each function; if the reference address falls within a function's address interval, that function's name is obtained.
The function name is associated with the corresponding log content record to obtain a log record containing the function name, which is added to a new log content function table.
Step 903, generating a workflow diagram.
From the log content function table, a multi-threaded runtime graph on the time axis can be generated.
Step 904, displaying the details of the workflow diagram.
When the user clicks the workflow diagram, details are shown. More details can be extracted from the process snapshot, and additional fields, such as stack information including local variables, are stored in the two-dimensional table, so that a more detailed workflow diagram can be constructed.
Please refer to fig. 10, which illustrates a software module structure diagram of a log analysis system according to an embodiment of the present application. The abnormal log analysis system comprises a data parser, an association analyzer, and a chart generator. These are modules abstracted in software, and their functions are actually carried out by software or code.
A data parser: mainly parses input data of different formats and converts it into relational two-dimensional tables. The data parser comprises an abnormal log parser, a symbol file parser, a process memory snapshot parser, and a binary executable file parser, so that each type of file can be parsed.
An association analyzer: establishes association mapping relations among the two-dimensional tables to form new relation tables. The association analyzer includes: a process extractor, used for acquiring the threads and processes corresponding to each mapping relation; a log string positioning analyzer, used for matching the log content in the log file to the source code in the binary executable file; and an abnormal time interval positioning analyzer, used for matching the abnormal time point in the abnormal memory snapshot to the time points in the log content.
A chart generator: through the association analysis, finally forms a multithreaded distribution graph of source code functions on the time axis.
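As an illustration of the abnormal time interval positioning described for the association analyzer, the sketch below finds the last log record of a thread at or before the abnormal time point. Choosing the most recent preceding record is an assumption made for this example, as are the function name locate_abnormal_record and the sample events; the present application does not specify the matching rule.

```python
from bisect import bisect_right


def locate_abnormal_record(thread_events, abnormal_time):
    # thread_events: list of (time, function name), sorted by time.
    times = [time for time, _ in thread_events]
    # Index of the last event whose time is <= abnormal_time.
    i = bisect_right(times, abnormal_time) - 1
    return thread_events[i] if i >= 0 else None


# Hypothetical events of the abnormal thread.
events = [(17.05, "InitScene"), (17.18, "LoadModel"), (17.25, "DrawFrame")]
hit = locate_abnormal_record(events, 17.20)
```

With an abnormal time point of 17.20, the sketch returns the LoadModel record logged at 17.18, which is the last function known to have run on that thread before the abnormality.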
Fig. 11 is a block diagram illustrating a configuration of a log analyzing apparatus according to an exemplary embodiment. The device comprises:
a file obtaining module 1101, configured to obtain a symbol file, an executable file, and a log file;
a symbol file analyzing module 1102, configured to analyze a symbol file to obtain a function address mapping table; the function address mapping table is used for indicating the mapping relation between the function name and the function address;
an executable file parsing module 1103, configured to parse the executable file to obtain a string address mapping table; the character string address mapping table is used for indicating the mapping relation between character string data and function addresses;
a log file analysis module 1104, configured to analyze the log file to obtain a log content table; the log content table is used for indicating character string data corresponding to each thread and time;
a mapping table building module 1105, configured to build a log content function table containing function information referenced by each thread at different times according to the log content table, the string address mapping table, and the function address mapping table, so as to characterize behavior of an application program.
In a possible implementation manner, the mapping table constructing module includes:
a function address query unit, configured to query, in the string address mapping table, a target function address corresponding to target string data in the log content table;
a function name query unit, configured to query, in the function address mapping table, a target function name corresponding to the target function address;
and the mapping table acquisition unit is used for inserting the mapping relation between the target function name and the target character string data into the log content table to acquire the log content function table.
In a possible implementation, the function address query unit is further configured to,
performing word segmentation processing on the target character string data to obtain a target word segmentation set of the target character string data;
and determining a function address corresponding to the character string data with the highest similarity with the target word segmentation set in the character string address mapping table as the target function address.
In one possible implementation, the apparatus further includes:
and the time axis display module is used for displaying the time axis corresponding to each thread in the log analysis interface and displaying, in time order on each thread's time axis, the function names of the functions referenced by that thread.
In one possible implementation, the timeline presentation module is further configured to,
displaying function controls on the time axes corresponding to the threads respectively, with the function name of the referenced function superimposed on each function control;
the device further comprises:
the operation information display module is used for responding to the received specified operation of the function control and displaying the operation information corresponding to the function control on the log analysis interface; the operation information at least comprises a local variable corresponding to the function name of the reference function.
In one possible implementation, the apparatus further includes:
the abnormal time point acquisition module is used for inquiring the abnormal snapshot information table and acquiring an abnormal time point;
and the abnormal time point display module is used for displaying the abnormal time point on the log analysis interface.
In one possible implementation, the apparatus further includes:
and the abnormal snapshot information table generating module is used for reading the memory process snapshot, obtaining each process name, process ID, process version number and abnormal occurrence time stored in the memory process snapshot, and generating an abnormal snapshot information table corresponding to the memory process snapshot.
In summary, when the log needs to be analyzed, the computer device may read the symbol file set, the executable file, and the log file corresponding to the application program and parse each file, thereby obtaining the mapping relationship between function names and function addresses, the mapping relationship between character strings and function addresses, and the time-related character string data. Based on these mapping relationships, the computer device can determine the function names referenced by the character strings in the log file and construct a log content function table indicating the function information referenced by each thread of the application program at different times. The execution of the application program at different times can thus be represented directly, without manually searching the log and the source code step by step, which improves the efficiency of log analysis.
Please refer to fig. 12, which is a schematic diagram of a computer device according to an exemplary embodiment of the present application. The computer device includes a memory and a processor; the memory is used for storing a computer program, and the computer program, when executed by the processor, implements the log analysis method.
The processor may be a Central Processing Unit (CPU). The Processor may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, or a combination thereof.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods of the embodiments of the present invention. The processor executes various functional applications and data processing of the processor by executing non-transitory software programs, instructions and modules stored in the memory, that is, the method in the above method embodiment is realized.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor, and the like. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be coupled to the processor via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
In an exemplary embodiment, a computer readable storage medium is also provided for storing at least one computer program, which is loaded and executed by a processor to implement all or part of the steps of the above method. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A method of log analysis, the method comprising:
acquiring a symbol file, an executable file and a log file;
analyzing the symbol file to obtain a function address mapping table; the function address mapping table is used for indicating the mapping relation between the function name and the function address;
analyzing the executable file to obtain a character string address mapping table; the character string address mapping table is used for indicating the mapping relation between character string data and function addresses;
analyzing the log file to obtain a log content table; the log content table is used for indicating character string data corresponding to each thread and time;
and constructing a log content function table containing function information quoted by each thread at different time according to the log content table, the character string address mapping table and the function address mapping table so as to represent the behavior condition of the application program.
2. The method according to claim 1, wherein constructing a log content function table containing function information referenced by the respective threads at different times according to the log content table, the string address mapping table, and the function address mapping table comprises:
in the character string address mapping table, inquiring a target function address corresponding to target character string data in the log content table;
in the function address mapping table, inquiring a target function name corresponding to the target function address;
and inserting the mapping relation between the target function name and the target character string data into the log content table to obtain the log content function table.
3. The method according to claim 2, wherein the querying, in the string address mapping table, a target function address corresponding to target string data in the log content table includes:
performing word segmentation processing on the target character string data to obtain a target word segmentation set of the target character string data;
and determining a function address corresponding to the character string data with the highest similarity with the target word segmentation set in the character string address mapping table as the target function address.
4. The method of any of claims 1 to 3, further comprising:
and respectively displaying time axes corresponding to the threads in a log analysis interface, and respectively displaying, in time order on the time axes corresponding to the threads, the function names of the functions referenced by the threads.
5. The method according to claim 4, wherein the respectively displaying, on the time axes corresponding to the threads, the function names of the functions referenced by the threads comprises:
displaying function controls on the time axes corresponding to the threads respectively, with the function name of the referenced function superimposed on each function control;
the method further comprises the following steps:
in response to receiving the specified operation on the function control, displaying operation information corresponding to the function control on the log analysis interface; the operation information at least comprises a local variable corresponding to the function name of the reference function.
6. The method of claim 4, further comprising:
querying an abnormal snapshot information table to obtain an abnormal time point;
and displaying the abnormal time point on the log analysis interface.
7. The method according to claim 6, wherein said querying the abnormal snapshot information table to obtain the abnormal time point comprises:
reading a memory process snapshot, obtaining each process name, process ID, process version number and exception occurrence time stored in the memory process snapshot, and generating the abnormal snapshot information table corresponding to the memory process snapshot.
8. An apparatus for log analysis, the apparatus comprising:
a file acquisition module, configured to acquire a symbol file, an executable file and a log file;
a symbol file analysis module, configured to analyze the symbol file to obtain a function address mapping table; the function address mapping table is used for indicating the mapping relation between function names and function addresses;
an executable file analysis module, configured to analyze the executable file to obtain a character string address mapping table; the character string address mapping table is used for indicating the mapping relation between character string data and function addresses;
a log file analysis module, configured to analyze the log file to obtain a log content table; the log content table is used for indicating character string data corresponding to each thread and time;
and a mapping table building module, configured to build, according to the log content table, the character string address mapping table and the function address mapping table, a log content function table containing information on the functions referenced by each thread at different times, so as to represent the behavior of the application program.
9. A computer device comprising a processor and a memory, the memory having at least one instruction, at least one program, a set of codes, or a set of instructions stored therein, the at least one instruction, at least one program, set of codes, or set of instructions being loaded and executed by the processor to implement the log analysis method according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored therein at least one instruction which is loaded and executed by a processor to implement the log analysis method of any one of claims 1 to 7.
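The lookup chain recited in claims 2 to 4 can be illustrated with a minimal Python sketch. The claims do not fix a data layout, a word segmentation scheme or a similarity measure, so everything named here is an assumption made for illustration only: the `LogEntry` record, the dictionary-based mapping tables, the regular-expression tokenizer and the use of Jaccard similarity are hypothetical choices, not the method disclosed in the application.

```python
import re
from typing import NamedTuple


class LogEntry(NamedTuple):
    """One row of the (hypothetical) log content table."""
    thread_id: int
    timestamp: float
    text: str


def tokenize(text):
    """Word segmentation: split a string into a set of lowercase tokens."""
    return set(re.findall(r"[A-Za-z0-9_]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_function_address(text, string_to_addr):
    """Return the address mapped to the most similar string, or None
    when no string in the table shares any token with the log text."""
    target = tokenize(text)
    best_addr, best_score = None, 0.0
    for candidate, addr in string_to_addr.items():
        score = jaccard(target, tokenize(candidate))
        if score > best_score:
            best_addr, best_score = addr, score
    return best_addr


def build_log_content_function_table(entries, string_to_addr, addr_to_name):
    """Attach a function name (or None) to every log entry."""
    rows = []
    for entry in entries:
        addr = find_function_address(entry.text, string_to_addr)
        name = addr_to_name.get(addr)
        rows.append((entry.thread_id, entry.timestamp, entry.text, name))
    return rows


def per_thread_timelines(rows):
    """Group resolved function names by thread, ordered by time."""
    timelines = {}
    for thread_id, timestamp, _text, name in sorted(rows, key=lambda r: r[1]):
        if name is not None:
            timelines.setdefault(thread_id, []).append((timestamp, name))
    return timelines


# Hypothetical tables standing in for the parsed executable and symbol files.
string_to_addr = {"open file %s failed": 0x1000, "save model %d done": 0x2000}
addr_to_name = {0x1000: "OpenFile", 0x2000: "SaveModel"}
entries = [
    LogEntry(1, 2.0, "save model 7 done"),
    LogEntry(1, 1.0, "open file a.txt failed"),
    LogEntry(2, 1.5, "unrelated"),
]
rows = build_log_content_function_table(entries, string_to_addr, addr_to_name)
timelines = per_thread_timelines(rows)
```

In this sketch a log line that shares no token with any string in the table is left unresolved rather than matched arbitrarily; a practical implementation would probably also apply a minimum similarity threshold, which the claims leave open.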
CN202210039209.3A 2022-01-13 2022-01-13 Log analysis method, device, equipment and storage medium Pending CN114416481A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210039209.3A CN114416481A (en) 2022-01-13 2022-01-13 Log analysis method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210039209.3A CN114416481A (en) 2022-01-13 2022-01-13 Log analysis method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114416481A (en) 2022-04-29

Family

ID=81273777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210039209.3A Pending CN114416481A (en) 2022-01-13 2022-01-13 Log analysis method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114416481A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089563A (en) * 2022-07-28 2023-05-09 荣耀终端有限公司 Log processing method and related device
CN116089563B (en) * 2022-07-28 2024-03-26 荣耀终端有限公司 Log processing method and related device
CN116910000A (en) * 2023-06-30 2023-10-20 荣耀终端有限公司 Log processing method, log storage method and embedded equipment

Similar Documents

Publication Publication Date Title
Fazzini et al. Automatically translating bug reports into test cases for mobile apps
CN112015430A (en) JavaScript code translation method and device, computer equipment and storage medium
US20180060415A1 (en) Language tag management on international data storage
CN111507086B (en) Automatic discovery of translated text locations in localized applications
US10509719B2 (en) Automatic regression identification
CN114416481A (en) Log analysis method, device, equipment and storage medium
US11436133B2 (en) Comparable user interface object identifications
US20130275951A1 (en) Race detection for web applications
CN110688307A (en) JavaScript code detection method, apparatus, device and storage medium
CN113076104A (en) Page generation method, device, equipment and storage medium
CN111459495A (en) Unit test code file generation method, electronic device and storage medium
JP2010067188A (en) Information processor for supporting programming, information processing system, and programming support method and program
CN111796809A (en) Interface document generation method and device, electronic equipment and medium
KR20140050323A (en) Method and apparatus for license verification of binary file
CN107391528B (en) Front-end component dependent information searching method and equipment
CN115033489A (en) Code resource detection method and device, electronic equipment and storage medium
US20220382776A1 (en) Message templatization for log analytics
US20130042224A1 (en) Application analysis device
CN114610364A (en) Application program updating method, application program developing method, application program updating device, application program developing device and computer equipment
Chen et al. Tracking down dynamic feature code changes against Python software evolution
CN112162954A (en) User operation log generation method, user operation log generation device
Admiraal et al. Deriving Modernity Signatures of Codebases with Static Analysis
CN113656044B (en) Android installation package compression method and device, computer equipment and storage medium
CN112748930B (en) Compilation detection method, device, equipment and storage medium
CN114090428A (en) Information processing method, information processing device, computer-readable storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination