CN113342744B - Parallel construction method, device and equipment of call chain and storage medium

Publication number
CN113342744B
Authority
CN
China
Prior art keywords
log data
event
time
request
path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110614040.5A
Other languages
Chinese (zh)
Other versions
CN113342744A (en)
Inventor
饶琛琳
梁玫娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youtejie Information Technology Co ltd
Original Assignee
Beijing Youtejie Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youtejie Information Technology Co ltd filed Critical Beijing Youtejie Information Technology Co ltd
Priority to CN202110614040.5A priority Critical patent/CN113342744B/en
Publication of CN113342744A publication Critical patent/CN113342744A/en
Application granted granted Critical
Publication of CN113342744B publication Critical patent/CN113342744B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Abstract

The embodiment of the invention discloses a parallel construction method, device, equipment and storage medium for a call chain. The method comprises the following steps: acquiring log data corresponding to at least one request, and calibrating timestamp information in the log data; performing rolling time bucket division on the log data according to the timestamp information, and performing service clustering processing on the log data in each time bucket; constructing an event connected graph and acquiring an event critical path according to the log data in each cluster; and merging the event critical paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request. By grouping the data in massive log data, the technical scheme of the embodiment of the invention constructs call chains for a plurality of service requests simultaneously.

Description

Parallel construction method, device and equipment of call chain and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, and in particular to a parallel construction method, device, equipment and storage medium for a call chain.
Background
A call chain tracking system can display the complete call link of a request end to end by analyzing and processing the log data generated during service calls. By performing statistical analysis on the call links from different dimensions, abnormal services can be located, and system performance bottlenecks and the like can be analyzed.
In the prior art, in order to pursue good system performance, a call chain tracking system usually constructs a call chain for only one request at a time; the construction process uses little log data and the scale of computation is small. When the log data is massive and each service interface receives a large number of requests, the existing call chain tracking system cannot meet the requirement of constructing the call chain corresponding to each request at the same time.
Disclosure of Invention
The embodiment of the invention provides a parallel construction method, a device, equipment and a storage medium of a call chain, which are used for realizing the construction of the call chain for a plurality of service requests simultaneously through data grouping in massive log data.
In a first aspect, an embodiment of the present invention provides a parallel construction method for a call chain, including:
acquiring log data corresponding to at least one request, and calibrating timestamp information in the log data;
according to the timestamp information, carrying out rolling time bucket division on the log data, and carrying out service clustering processing on the log data in each time bucket;
according to log data in each cluster, constructing an event connected graph and acquiring an event key path;
and merging the event key paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request.
Optionally, the obtaining log data corresponding to at least one request, and calibrating timestamp information in the log data includes:
receiving at least one request and acquiring log data corresponding to the at least one request;
extracting time information from log data according to a built-in analysis rule of the system, and converting the time information into a preset timestamp format;
and calibrating the timestamp information in the log data from different host sources based on the global public NTP server.
Optionally, according to the timestamp information, performing rolling time bucket division on the log data, and performing service clustering processing on the log data in each time bucket, including:
acquiring a pre-configured bucket duration, and performing rolling time bucket division on the log data according to the timestamp information and the bucket duration;
in each time bucket, performing one-hot encoding on the port value/the service operation type value in the log data, and performing K-means clustering on the log data according to the encoded values.
Optionally, constructing an event connected graph and acquiring an event critical path according to log data in each cluster includes:
extracting a request unique identifier and an event type of an interface from log data in the cluster, and constructing a fully connected graph of the events according to the request unique identifier and the event type;
deleting from the fully connected graph, according to the time shift of the event types, the parts whose time intervals overlap, the parts with other events in between, and the parts inconsistent with the call chain corpus, to obtain the event connected graph;
and taking the path with the maximum sum of event durations in the event connected graph as the event critical path.
Optionally, merging the event critical paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request includes:
taking each request in turn as a target request, and performing correlation calculation on the event critical paths associated with the target request in all time buckets;
and taking the union of the event critical paths that meet the correlation condition to obtain a call chain corresponding to the target request.
Optionally, extracting time information from the log data according to a built-in parsing rule of the system, and converting the time information into a preset timestamp format, including:
matching the log data by using a preset time format, and determining the position of time information in the log data;
and extracting the time information from the located position in the log data according to the separators between field names and field values and the separators between fields defined in the built-in parsing rule of the system, and converting the time information into a preset timestamp format.
Optionally, after obtaining the log data corresponding to the at least one request, the method further includes:
and according to a built-in analysis rule of the system, extracting the request unique identifier from the log data, and converting the request unique identifier into a uniform format in a key-value pair form.
In a second aspect, an embodiment of the present invention further provides a parallel building apparatus for a call chain, including:
the time calibration module is used for acquiring log data corresponding to at least one request and calibrating timestamp information in the log data;
the grouping processing module is used for carrying out rolling time bucket division on the log data according to the timestamp information and carrying out service clustering processing on the log data in each time bucket;
the path acquisition module is used for constructing an event connected graph and acquiring an event key path according to the log data in each cluster;
and the path merging module is used for merging the event key paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the parallel construction method of the call chain provided by any embodiment of the present invention.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the parallel construction method for a call chain provided in any embodiment of the present invention.
In the embodiment of the invention, log data corresponding to at least one request is obtained, and timestamp information in the log data is calibrated; rolling time bucket division is performed on the log data according to the timestamp information, and service clustering processing is performed on the log data in each time bucket; an event connected graph is constructed and an event critical path is acquired according to the log data in each cluster; and the event critical paths in each time bucket are merged according to a preset path merging rule to obtain a call chain corresponding to each request. This solves the problem in the prior art that call chains corresponding to a plurality of requests cannot be constructed at the same time, and enables call chains to be constructed for a plurality of service requests simultaneously through data grouping in massive log data.
Drawings
FIG. 1 is a flowchart of a parallel construction method of a call chain according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a parallel construction method for call chains in the second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a parallel construction apparatus for a call chain in a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device in a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a parallel construction method of a call chain in an embodiment of the present invention, where this embodiment is applicable to a case where a call chain is constructed for multiple requests in mass log data at the same time, and the method may be executed by a parallel construction apparatus of a call chain, where the apparatus may be implemented by hardware and/or software, and may be generally integrated in an electronic device providing a call chain construction service. As shown in fig. 1, the method includes:
Step 110, obtaining log data corresponding to at least one request, and calibrating timestamp information in the log data.
In this embodiment, during a service call, call information between services, such as the time, interface, hierarchy and result, is recorded in logs at instrumentation points; a call chain is the tree-shaped chain obtained by connecting all of these recorded data points. The call chain can restore the complete end-to-end execution of a service call, so that statistical analysis can be performed from different dimensions, abnormal service calls can be identified, and abnormal services can be quickly analyzed and located.
Optionally, acquiring log data corresponding to at least one request, and calibrating timestamp information in the log data, may include: receiving at least one request and acquiring log data corresponding to the at least one request; extracting time information from log data according to a built-in analysis rule of the system, and converting the time information into a preset timestamp format; and calibrating the timestamp information in the log data from different host sources based on the global public NTP server.
In this embodiment, at least one call chain construction request may be received, for example, a call chain construction request for a bank deposit transaction, one for a bank withdrawal transaction, and one for an interbank remittance transaction. The massive log data corresponding to the at least one request, for example, all log data generated by the bank, is then obtained. Massive log data may include log data generated by a plurality of logging systems, and the time formats of different logging systems may be inconsistent. It is therefore necessary to extract the time information from the log data according to the built-in parsing rule of the system and to convert it into a uniform timestamp format, so that the log data can subsequently be divided into time buckets. Then the common fields of each record in the massive log data are recorded: the request unique identifier, the host address, the host local timestamp and the event type of the interface. At the receiving end, the timestamp information in the log data from different host sources is time-shifted into alignment according to the round-trip time (RTT) and the NTP difference, based on the global public NTP server.
If there are multiple pieces of time information in some log data, the time information closest to the current time may be extracted from the log, or one piece of time information may be extracted at random.
Step 120, performing rolling time bucket division on the log data according to the timestamp information, and performing service clustering processing on the log data in each time bucket.
In this embodiment, directly analyzing massive log data to construct the call chain corresponding to each request involves a relatively large amount of computation. Therefore, rolling time bucket division may be performed on all log data according to the calibrated timestamp information in each piece of log data, so that log data generated by one or more service call processes executed in the same period falls into the same time bucket, thereby grouping the data. Then the log data in the same time bucket is clustered according to the service operation type, which further groups the data and reduces the amount of computation. At the same time, log data generated by the same service call process is concentrated in one cluster, which makes it convenient to subsequently construct the call chain corresponding to each service.
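The time bucket division in this step can be sketched as follows. This is a minimal illustration only, assuming that rolling buckets are fixed-width, non-overlapping windows counted from the smallest timestamp; the record layout and the names `bucket_by_time` and `ts` are hypothetical and do not come from the embodiment.

```python
from collections import defaultdict


def bucket_by_time(records, bucket_seconds):
    """Group records into fixed-width buckets counted from the smallest timestamp.

    `records` is a list of dicts with a calibrated numeric "ts" field
    (a hypothetical layout chosen for this sketch).
    """
    if not records:
        return {}
    start = min(r["ts"] for r in records)
    buckets = defaultdict(list)
    for r in records:
        # Integer index of the window that contains this timestamp.
        index = int((r["ts"] - start) // bucket_seconds)
        buckets[index].append(r)
    return dict(buckets)


records = [
    {"ts": 0.0, "msg": "deposit received"},
    {"ts": 599.0, "msg": "deposit committed"},
    {"ts": 600.0, "msg": "withdrawal received"},
    {"ts": 1250.0, "msg": "remittance sent"},
]
# A 10-minute bucket duration, as a user might configure it.
buckets = bucket_by_time(records, bucket_seconds=600)
```

Each bucket can then be processed independently, which is what allows the later clustering and graph construction to run in parallel.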
Step 130, constructing an event connected graph and acquiring an event critical path according to the log data in each cluster.
In this embodiment, because the log data in one cluster may relate to only one service call process, the causal relationship between the events associated with the service can be determined in each cluster according to the request unique identifier, and every two events that have a causal relationship are connected, thereby constructing an event connected graph. The path with the longest execution time is then selected from the event connected graph as its critical path; the event connected graph is in fact a directed acyclic graph that is part of the service call chain.
Step 140, merging the event critical paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request.
In this embodiment, since there may be a case where multiple operations are executed in parallel in one service invocation process, there may be multiple paths for the invocation chain corresponding to the service. Therefore, after the key paths corresponding to the clusters are obtained, the key paths corresponding to all the clusters in all the time buckets can be merged according to the preset path merging rule aiming at each service, so as to obtain a complete call chain corresponding to each request.
In the embodiment of the invention, log data corresponding to at least one request is obtained, and timestamp information in the log data is calibrated; rolling time bucket division is performed on the log data according to the timestamp information, and service clustering processing is performed on the log data in each time bucket; an event connected graph is constructed and an event critical path is acquired according to the log data in each cluster; and the event critical paths in each time bucket are merged according to a preset path merging rule to obtain a call chain corresponding to each request. This solves the problem in the prior art that call chains corresponding to a plurality of requests cannot be constructed at the same time, and enables call chains to be constructed for a plurality of service requests simultaneously through data grouping in massive log data.
Example two
Fig. 2 is a flowchart of a parallel construction method of a call chain in the second embodiment of the present invention, and this embodiment may be combined with various alternatives in the above embodiments. Specifically, referring to fig. 2, the method may include the steps of:
Step 210, receiving at least one request, and obtaining log data corresponding to the at least one request.
Optionally, after obtaining the log data corresponding to the at least one request, the method may further include: and according to a built-in analysis rule of the system, extracting the request unique identifier from the log data, and converting the request unique identifier into a uniform format in a key-value pair form.
In this embodiment, after the massive log data corresponding to the at least one call chain construction request is obtained, the request unique identifier may be extracted from the log data according to the field separators and the field-value separators defined in the built-in parsing rule of the system, and converted into a uniform "field name: field value" format, so that the request unique identifier has the same format throughout the log data.
Step 220, extracting time information from the log data according to a built-in parsing rule of the system, and converting the time information into a preset timestamp format.
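The extraction of the request unique identifier can be sketched as follows. This is an illustrative sketch only: the separators, the candidate field names (`request_id`, `traceId`, `reqid`) and the function names are assumptions made for the example, since the embodiment does not list the contents of the built-in parsing rule.

```python
def parse_fields(line, field_sep, kv_sep):
    """Split a log line into {field name: field value} using the given separators."""
    fields = {}
    for part in line.split(field_sep):
        if kv_sep in part:
            name, value = part.split(kv_sep, 1)
            fields[name.strip()] = value.strip()
    return fields


def unify_request_id(line, field_sep, kv_sep,
                     id_names=("request_id", "traceId", "reqid")):
    """Return the request identifier as a uniform key-value pair, or None.

    `id_names` lists hypothetical field names under which different
    logging systems might store the identifier.
    """
    fields = parse_fields(line, field_sep, kv_sep)
    for name in id_names:
        if name in fields:
            return {"request_id": fields[name]}
    return None


# Two logging systems with different separators and field names.
a = unify_request_id("time=2021-06-02 10:00:00|traceId=abc123|op=deposit", "|", "=")
b = unify_request_id("reqid: abc123; host: 10.0.0.7", ";", ":")
```

Both lines yield the same key-value pair, so records from different logging systems can later be joined on one field.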
In this embodiment, it is considered that a large amount of log data may include log data generated by a plurality of log recording systems, and time formats in different log recording systems may be inconsistent, so that in order to group the log data according to time and reduce the amount of computation, time information in the log data needs to be converted into a uniformly set time stamp format.
Optionally, extracting time information from the log data according to a built-in parsing rule of the system, and converting the time information into a preset timestamp format, may include: matching the log data against preset time formats, and determining the position of the time information in the log data; and extracting the time information from the located position in the log data according to the separators between field names and field values and the separators between fields defined in the built-in parsing rule of the system, and converting the time information into a preset timestamp format.
In this embodiment, some preset common time formats may be used to perform format matching on each piece of log data, so as to determine the position of the time information in it. The common time formats may include: 1) 1998-12-31 (%Y-%m-%d); 2) 98-12-31 (%y-%m-%d); 3) 1998 years, 312 days (%Y years, %j days); 4) Jan 24, 2003 (%b %d, %Y); 5) January 24, 2003 (%B %d, %Y); 6) 1397477611.862 (%s.%3N); and so on. Then, according to the separators between field names and field values and the separators between fields defined in the built-in parsing rule of the system, the time information at the located position in the log data is extracted and converted into a uniform timestamp format of the form "timestamp": "2011-09-12T13:00:42.000Z". After that, information such as the request unique identifier, the host address, the host local timestamp and the event type of the interface is recorded for each record in the massive log data.
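The format matching and conversion described above can be sketched as follows. This is only a sketch under stated assumptions: the regular expressions, the rule table `TIME_RULES` and the function name are hypothetical, only a subset of the listed formats is covered, timestamps are assumed to be in UTC, and the month-name formats rely on an English locale.

```python
import re
from datetime import datetime, timezone

# (regex that locates the time text, strptime format); None marks epoch seconds.
# The month-name rules assume an English (C) locale.
TIME_RULES = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),
    (re.compile(r"\b\d{2}-\d{2}-\d{2}\b"), "%y-%m-%d"),
    (re.compile(r"\b[A-Z][a-z]{2} \d{1,2}, \d{4}\b"), "%b %d, %Y"),
    (re.compile(r"\b[A-Z][a-z]+ \d{1,2}, \d{4}\b"), "%B %d, %Y"),
    (re.compile(r"\b\d{10}\.\d{3}\b"), None),
]


def to_unified_timestamp(line):
    """Locate the first recognizable time in `line` and return it in the
    uniform form 2011-09-12T13:00:42.000Z, or None if nothing matches."""
    for pattern, fmt in TIME_RULES:
        match = pattern.search(line)
        if match is None:
            continue
        text = match.group(0)
        if fmt is None:
            # Epoch seconds with millisecond precision.
            moment = datetime.fromtimestamp(float(text), tz=timezone.utc)
        else:
            moment = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        millis = moment.microsecond // 1000
        return moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"
    return None
```

For example, `to_unified_timestamp("1998-12-31 payment ok")` yields `1998-12-31T00:00:00.000Z`. The rules are tried in order, so more specific formats should come first.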
Step 230, calibrating timestamp information in log data from different host sources based on the global common NTP server.
In this embodiment, so that log data can be grouped by time, the host local timestamp in each piece of log data needs to be calibrated based on the global public NTP server and the network round-trip time between the client and the server. Assuming that the request and response delays are symmetric, the clock offset relative to the NTP server is calculated; after the timestamps are aligned by this offset, the first timestamp recorded by the server falls exactly half a network round-trip time after the client timestamp at which the task was issued.
In this embodiment, a plurality of network round-trip time estimates are used to avoid the delay interference caused by slow Transmission Control Protocol (TCP) requests, and no additional resource investment is required, thereby greatly saving labor cost.
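The offset calculation can be sketched with the standard NTP four-timestamp formulas. This is an assumption about how the calibration might be done, not the embodiment's exact procedure: here t0 and t3 are the client send and receive times on the host clock, t1 and t2 are the server receive and send times, and the sample with the smallest round-trip delay is chosen as one common way of using several estimates. All function names are hypothetical.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Standard NTP estimate, valid when request and response delays are symmetric.

    offset: how far the host clock lags (positive) or leads (negative) the server.
    delay:  network round-trip time, excluding server processing time.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


def best_offset(samples):
    """Pick the offset from the sample with the smallest round-trip delay,
    since it is the least disturbed by slow or asymmetric transmissions."""
    estimates = [ntp_offset_and_delay(*s) for s in samples]
    return min(estimates, key=lambda od: od[1])[0]


def calibrate(local_ts, offset):
    """Shift a host-local timestamp onto the reference clock."""
    return local_ts + offset


# Host clock runs 5 s behind the server; the second probe hit a slow response.
samples = [
    (100.0, 105.02, 105.03, 100.05),
    (200.0, 205.30, 205.31, 200.33),
]
offset = best_offset(samples)
```

With these hypothetical samples the chosen offset is about 5 seconds, so a host-local timestamp of 100.0 is calibrated to about 105.0.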
Step 240, performing rolling time bucket division on the log data according to the timestamp information, and performing service clustering processing on the log data in each time bucket.
In this embodiment, the log data may be grouped according to the calibrated timestamp information in each log data, the log data generated in at least one service invocation process executed at the same time are grouped into the same group, and the log data in the same group is further clustered according to the service, so as to reduce the calculation amount for constructing the service invocation chain.
Optionally, performing rolling time bucket division on the log data according to the timestamp information, and performing service clustering processing on the log data in each time bucket, may include: acquiring a pre-configured bucket duration, and performing rolling time bucket division on the log data according to the timestamp information and the bucket duration; and, in each time bucket, performing one-hot encoding on the port value/the service operation type value in the log data, and performing K-means clustering on the log data according to the encoded values.
In this embodiment, the bucket duration is determined by the user according to actual service statistics. For example, if a normal transaction takes 10 minutes, the bucket duration may be set to 10 minutes; that is, when the log data is divided into time buckets by timestamp, every ten minutes of log data, counted from the smallest timestamp, forms one time bucket. The log data is divided into time buckets because log data that is close in time and falls within one service duration may have been generated during the execution of the same service call, or during the execution of several service calls at the same time. Then, in each time bucket, one-hot encoding is performed on the port value/service operation type value in the log data, and K-means clustering is performed on the log data in the time bucket according to the encoding result, so that each cluster corresponds to one service call process. This makes it convenient to generate the call chain of the service and reduces the amount of computation.
Step 250, constructing an event connected graph and acquiring an event critical path according to the log data in each cluster.
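The encoding and clustering inside one time bucket can be sketched as follows. This is a minimal, dependency-free sketch rather than the embodiment's implementation: the record fields `port` and `op`, the concatenation of the two one-hot vectors, and the deterministic choice of the first k distinct points as initial centroids are all assumptions made for the example.

```python
def one_hot(values):
    """One-hot encode a list of categorical values."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    vectors = []
    for v in values:
        vec = [0.0] * len(categories)
        vec[index[v]] = 1.0
        vectors.append(vec)
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = []
    for p in points:  # deterministic start: the first k distinct points
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


bucket = [
    {"port": 8080, "op": "deposit"},
    {"port": 8080, "op": "deposit"},
    {"port": 9090, "op": "withdraw"},
    {"port": 9090, "op": "withdraw"},
    {"port": 8080, "op": "deposit"},
]
ports = one_hot([r["port"] for r in bucket])
ops = one_hot([r["op"] for r in bucket])
points = [p + o for p, o in zip(ports, ops)]  # concatenate both encodings
labels = kmeans(points, k=2)
```

In this example the deposit records and the withdrawal records end up in separate clusters. In practice the number of clusters k would have to be chosen from the number of services expected in the bucket.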
In this embodiment, in order to restore the service end-to-end execution calling process, for the log data in each cluster, a causal relationship between events in the log data is analyzed to establish an event connectivity graph, and then a critical path is calculated according to the event connectivity graph.
Optionally, constructing an event connected graph and acquiring an event critical path according to log data in each cluster may include: extracting a request unique identifier and an event type of an interface from log data in the cluster, and constructing a fully connected graph of the events according to the request unique identifier and the event type; deleting from the fully connected graph, according to the time shift of the event types, the parts whose time intervals overlap, the parts with other events in between, and the parts inconsistent with the call chain corpus, to obtain the event connected graph; and taking the path with the maximum sum of event durations in the event connected graph as the event critical path.
In this embodiment, the log data may be converted into small log segments. Each log segment is defined by two consecutive events recorded while the same task is executed and is marked as <task, start event, end event>; the duration of a segment is the time interval between the two events. A fully connected graph of the events corresponding to the service is constructed according to the request unique identifier and the event type in the log data. The fully connected graph contains all possible relationships between all events, including both correct and incorrect hypothetical relationships. Therefore, the fully connected graph needs to be pruned according to the causal relationship between the events, so as to obtain the event connected graph corresponding to the service call process. The event type may include a server type, a network type, and a client type.
In this embodiment, when the fully connected graph is pruned, the parts whose time intervals overlap may be deleted according to the time shift of the event types. For example, when event B is executed during the execution of event A, the two overlap in time and have no causal relationship, so the path between A and B in the fully connected graph is deleted. Parts of the fully connected graph with other events in between may also be deleted; for example, if A is connected to B and B is connected to C, the path between A and C needs to be deleted. Parts of the fully connected graph that are inconsistent with the call chain corpus may also be deleted; that is, the call chain corpus is traversed, and any path in the fully connected graph whose direction is opposite to a call chain in the corpus is deleted. The call chain corpus stores the call chains that have already been established.
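The three pruning rules can be sketched as follows. This is an illustrative sketch under assumptions: each event is represented as a hypothetical `(start, end, event_type)` tuple, edges are directed from the earlier event to the later one, and the corpus is reduced to a set of `(caller_type, callee_type)` pairs. The embodiment does not prescribe these representations.

```python
from itertools import permutations


def build_event_graph(events, corpus_edges):
    """Prune a fully connected event graph down to plausible causal edges.

    events:       {name: (start, end, event_type)}
    corpus_edges: set of (caller_type, callee_type) pairs taken from
                  already established call chains.
    """
    # Fully connected graph: every ordered pair is a candidate causal edge.
    edges = set(permutations(events, 2))
    # Rule 1: drop pairs that overlap in time, keeping only edges whose
    # source event has finished before the target event starts.
    edges = {(a, b) for a, b in edges if events[a][1] <= events[b][0]}
    # Rule 2: drop edges that skip over an intermediate event.
    edges = {
        (a, c) for a, c in edges
        if not any((a, b) in edges and (b, c) in edges for b in events)
    }
    # Rule 3: drop edges whose direction is opposite to the corpus.
    edges = {
        (a, b) for a, b in edges
        if (events[b][2], events[a][2]) not in corpus_edges
    }
    return edges


events = {
    "A": (0, 10, "client"),
    "B": (10, 25, "network"),
    "C": (25, 40, "server"),
    "D": (2, 8, "server"),
}
corpus = {("client", "network"), ("network", "server")}
graph = build_event_graph(events, corpus)
```

In this example, A and D overlap and are never linked, the shortcuts from A to C and from D to C are removed because B lies in between, and the edge from D to B is removed because the corpus only contains calls from network to server. Only the edges from A to B and from B to C remain.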
In this embodiment, the path with the longest total event execution time in the event connected graph may be used as the event critical path. The request under analysis can be found in the first event of this path, and the last event is usually the termination of some JavaScript execution. The length of the event critical path is the end-to-end delay of the entire request. If several paths have equal execution time, the first one found is selected as the event critical path.
Step 260, merging the event critical paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request.
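Selecting the critical path amounts to finding the path with the largest sum of event durations in a directed acyclic graph. The sketch below uses a memoized depth-first search, which is one possible way to do it; the input layout and the function name are hypothetical, and the graph is assumed to be acyclic.

```python
from functools import lru_cache


def critical_path(durations, edges):
    """Return (total duration, path) for the heaviest path in an acyclic graph.

    durations: {event name: duration}
    edges:     iterable of (source, target) pairs
    """
    successors = {node: [] for node in durations}
    for a, b in edges:
        successors[a].append(b)

    @lru_cache(maxsize=None)
    def best_from(node):
        # Heaviest path that starts at `node`.
        best_total, best_path = durations[node], (node,)
        for nxt in successors[node]:
            total, path = best_from(nxt)
            if durations[node] + total > best_total:
                best_total = durations[node] + total
                best_path = (node,) + path
        return best_total, best_path

    best_total, best_path = 0, ()
    for node in durations:
        total, path = best_from(node)
        # Strict comparison keeps the first path found when totals are equal.
        if total > best_total:
            best_total, best_path = total, path
    return best_total, list(best_path)


durations = {"A": 10, "B": 15, "C": 15, "D": 6}
edges = [("A", "B"), ("B", "C"), ("D", "C")]
total, path = critical_path(durations, edges)
```

Here the path A, B, C has a total duration of 40, which exceeds the 21 of D, C, so it is selected as the critical path.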
In this embodiment, since there may be a case where multiple operations are executed in parallel in one service invocation process, there may be multiple paths in a service invocation chain. Therefore, for each service, the critical paths corresponding to all clusters in all time buckets need to be merged to obtain a complete call chain corresponding to each request.
Optionally, merging the event critical paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request may include: taking each request in turn as a target request, and performing correlation calculation on the event critical paths associated with the target request in all time buckets; and taking the union of the event critical paths that meet the correlation condition to obtain a call chain corresponding to the target request.
In this embodiment, when the event critical paths are merged to obtain a complete call chain, the target request may be determined first, and all event critical paths associated with the target request are compared. Completely identical event critical paths are merged directly, that is, only one of them is kept. For event critical paths that are not completely identical, including those left after direct merging, each event critical path is taken as a vector, and the correlation coefficient between the event critical paths is calculated. If the correlation coefficient is greater than or equal to the threshold, the correlation condition is considered to be met, and the call chain corresponding to the target request is obtained by taking the union of the event critical paths. If the correlation coefficient is smaller than the threshold, the event critical paths are considered to belong to completely different services, and no path merging is needed. The threshold can be set as required and generally takes a value of 0.3 to 0.5.
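The merging rule can be sketched as follows. The embodiment does not say how a path is turned into a vector or which correlation coefficient is used, so this sketch makes several assumptions: each path becomes a 0/1 indicator vector over all events seen for the target request, the Pearson coefficient is used, the first path serves as the reference, and the merged chain is simplified to an ordered list of events. All names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def merge_paths(paths, threshold=0.3):
    """Merge the critical paths of one target request into a call chain."""
    # Identical paths are merged directly: keep only one copy.
    unique = []
    for p in paths:
        if p not in unique:
            unique.append(p)
    vocabulary = sorted({event for p in unique for event in p})

    def as_vector(path):
        return [1.0 if event in path else 0.0 for event in vocabulary]

    reference = as_vector(unique[0])
    chain = list(unique[0])
    for p in unique[1:]:
        # Union in every path that is sufficiently correlated with the reference.
        if pearson(reference, as_vector(p)) >= threshold:
            chain += [event for event in p if event not in chain]
    return chain


paths = [
    ["gateway", "auth", "ledger"],
    ["gateway", "auth", "ledger"],
    ["gateway", "auth", "notify"],
    ["report", "export"],
]
chain = merge_paths(paths, threshold=0.3)
```

With a threshold of 0.3 the third path (coefficient of about 0.33) is merged into the chain, while the unrelated fourth path has a negative coefficient and is left out. Raising the threshold to 0.5 would keep only the reference path.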
In this embodiment of the invention, log data corresponding to at least one request is obtained, and the timestamp information in the log data is calibrated; the log data is divided into rolling time buckets according to the timestamp information, and service clustering is performed on the log data in each time bucket; an event connected graph is constructed and an event critical path is obtained from the log data in each cluster; and the event critical paths in each time bucket are merged according to a preset path merging rule to obtain the call chain corresponding to each request. This solves the prior-art problem that call chains corresponding to multiple requests cannot be constructed at the same time. When every service interface in massive log data receives a large number of requests, time bucket division at a suitable granularity, combined with clustering, reduces the amount of computation, so that call chains can be constructed for multiple service requests simultaneously.
Example three
Fig. 3 is a schematic structural diagram of a parallel construction apparatus for a call chain in a third embodiment of the present invention, which is applicable to a case where a call chain is simultaneously constructed for multiple requests in massive log data. As shown in fig. 3, the parallel construction apparatus of the call chain includes:
a time calibration module 310, configured to obtain log data corresponding to at least one request and calibrate timestamp information in the log data;
a grouping processing module 320, configured to divide the log data into rolling time buckets according to the timestamp information and perform service clustering on the log data in each time bucket;
a path obtaining module 330, configured to construct an event connected graph and obtain an event critical path according to the log data in each cluster;
and a path merging module 340, configured to merge the event critical paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request.
In this embodiment of the invention, log data corresponding to at least one request is obtained, and the timestamp information in the log data is calibrated; the log data is divided into rolling time buckets according to the timestamp information, and service clustering is performed on the log data in each time bucket; an event connected graph is constructed and an event critical path is obtained from the log data in each cluster; and the event critical paths in each time bucket are merged according to a preset path merging rule to obtain the call chain corresponding to each request. This solves the prior-art problem that call chains corresponding to multiple requests cannot be constructed at the same time, and allows call chains to be constructed for multiple service requests simultaneously by grouping the data in massive log data.
Optionally, the time calibration module 310 is configured to:
receiving at least one request and acquiring log data corresponding to the at least one request;
extracting time information from the log data according to a built-in parsing rule of the system, and converting the time information into a preset timestamp format;
and calibrating the timestamp information in log data from different host sources against a global public NTP server.
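As an illustration of the calibration step, the following Python sketch shifts each record's timestamp onto a common reference clock. It assumes that each host's clock offset relative to the NTP reference has already been measured; the embodiment does not specify how offsets are obtained or applied, and the `host_offsets` table, the field names and the time format are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical per-host clock offsets in seconds (host clock minus NTP reference),
# assumed to have been measured against a public NTP server beforehand.
host_offsets = {"host-a": 1.5, "host-b": -0.25}

def to_epoch_ms(raw_time, fmt="%Y-%m-%d %H:%M:%S.%f"):
    """Convert a raw time string (assumed to be UTC) into epoch milliseconds."""
    parsed = datetime.strptime(raw_time, fmt).replace(tzinfo=timezone.utc)
    return round(parsed.timestamp() * 1000)

def calibrate(record, offsets):
    """Return a copy of the record with a 'ts' field on the reference clock."""
    offset_ms = round(offsets.get(record["host"], 0.0) * 1000)
    return {**record, "ts": to_epoch_ms(record["time"]) - offset_ms}

logs = [
    {"host": "host-a", "time": "2021-06-02 10:00:01.500"},
    {"host": "host-b", "time": "2021-06-02 10:00:00.000"},
]
calibrated = [calibrate(record, host_offsets) for record in logs]
```

After calibration the two records, which appeared 1.5 s apart on their local clocks, are 250 ms apart on the reference clock, with the `host-b` record later.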
Optionally, the grouping processing module 320 is configured to:
acquiring a pre-configured bucket duration, and dividing the log data into rolling time buckets of that duration according to the timestamp information;
in each time bucket, performing one-hot encoding on the port value or the service operation type value in the log data, and performing K-means clustering on the log data according to the encoded values.
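The bucketing and clustering steps might look like the following Python sketch. It reads "rolling time bucket" as fixed-size, non-overlapping buckets, which is an interpretation rather than something the text states. The field names `ts` and `port` are hypothetical, and a small hand-written K-means is used only to keep the example self-contained.

```python
from collections import defaultdict

def bucket_logs(records, bucket_ms):
    """Group records into fixed-size, non-overlapping time buckets keyed by bucket index."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record["ts"] // bucket_ms].append(record)
    return dict(buckets)

def one_hot(records, field):
    """One-hot encode a categorical field over the values seen in one bucket."""
    categories = sorted({r[field] for r in records})
    return [[1.0 if r[field] == c else 0.0 for c in categories] for r in records]

def kmeans(points, k, iterations=10):
    """Plain Lloyd's K-means; centroids start from the first k distinct points."""
    centroids = []
    for point in points:
        if point not in centroids:
            centroids.append(list(point))
        if len(centroids) == k:
            break
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, point in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

records = [
    {"ts": 1000, "port": 8080},
    {"ts": 1500, "port": 9090},
    {"ts": 1900, "port": 8080},
    {"ts": 2100, "port": 9090},
]
buckets = bucket_logs(records, bucket_ms=1000)
labels = kmeans(one_hot(buckets[1], "port"), k=2)
```

Because each bucket is encoded and clustered independently, buckets can be processed in parallel, which is consistent with the stated aim of reducing computation.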
Optionally, the path obtaining module 330 is configured to:
extracting the unique request identifier and the event type of the interface from the log data in the cluster, and constructing a fully connected graph of events according to the unique request identifier and the event type;
according to the time shift of the event type, deleting from the fully connected graph the parts whose time intervals overlap, the parts that have other events in between, and the parts that are inconsistent with the call chain corpus, to obtain the event connected graph;
and taking the path with the maximum sum of event durations in the event connected graph as the event critical path.
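A simplified Python sketch of the graph pruning and critical path selection follows. It assumes the events already belong to one request, have unique names and strictly positive durations, and it omits the call chain corpus check because the text does not describe the corpus format. All names and sample values are hypothetical.

```python
from functools import lru_cache

def build_event_graph(events):
    """Keep an edge a -> b only if b starts after a ends and no other event
    fits entirely between them (overlapping and skip-over links are dropped)."""
    edges = {e["name"]: [] for e in events}
    for a in events:
        for b in events:
            if a is b or a["end"] > b["start"]:
                continue  # overlapping or reversed intervals: no edge
            between = any(
                c is not a and c is not b
                and a["end"] <= c["start"] and c["end"] <= b["start"]
                for c in events
            )
            if not between:
                edges[a["name"]].append(b["name"])
    return edges

def critical_path(events, edges):
    """Return (total duration, path) for the path with the largest duration sum."""
    duration = {e["name"]: e["end"] - e["start"] for e in events}

    @lru_cache(maxsize=None)
    def best_from(name):
        # Best path starting at `name`; the graph is acyclic because every
        # edge points to an event that starts strictly later.
        best = (duration[name], (name,))
        for nxt in edges[name]:
            total, path = best_from(nxt)
            if duration[name] + total > best[0]:
                best = (duration[name] + total, (name,) + path)
        return best

    return max((best_from(name) for name in duration), key=lambda item: item[0])

events = [
    {"name": "gateway", "start": 0, "end": 10},
    {"name": "auth", "start": 10, "end": 15},
    {"name": "db", "start": 12, "end": 30},
    {"name": "render", "start": 30, "end": 40},
]
edges = build_event_graph(events)
total, path = critical_path(events, edges)
```

Here `auth` and `db` overlap and therefore sit on parallel branches; the branch through `db` has the larger duration sum and becomes the critical path.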
Optionally, the path merging module 340 is configured to:
taking each request in turn as a target request, and performing a correlation calculation on the event critical paths associated with the target request across all time buckets;
and taking the union of the event critical paths that satisfy the correlation condition to obtain the call chain corresponding to the target request.
Optionally, the grouping processing module 320 includes:
a time unifying unit, configured to match the log data against a preset time format and determine the position of the timestamp information in the log data;
and to extract the time information from the located position in the log data according to the separators between field names and field values and the separators between fields defined in the built-in parsing rule of the system, and convert the time information into a preset timestamp format.
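The time unification can be sketched in Python as follows. The separators, the time pattern and the target format stand in for the system's built-in parsing rule, which the text does not spell out, so every constant and name here is a hypothetical placeholder.

```python
import re
from datetime import datetime, timezone

# Hypothetical parsing rule: fields separated by "|", names and values by "=".
FIELD_SEP = "|"
KV_SEP = "="
TIME_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def extract_timestamp(line):
    """Locate the field whose value matches the preset time format and return
    (field name, ISO 8601 string), treating the logged time as UTC."""
    for field in line.split(FIELD_SEP):
        name, sep, value = field.partition(KV_SEP)
        match = TIME_PATTERN.search(value) if sep else None
        if match:
            parsed = datetime.strptime(match.group(), TIME_FORMAT)
            return name.strip(), parsed.replace(tzinfo=timezone.utc).isoformat()
    return None

result = extract_timestamp("level=INFO|time=2021-06-02 10:00:01|msg=order created")
```

Matching the time pattern first locates the field, and the separators then delimit the value to convert, mirroring the two steps described above.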
Optionally, the apparatus further includes: an identifier unification module, configured to, after the log data corresponding to at least one request is obtained, extract the unique request identifier from the log data according to a built-in parsing rule of the system, and convert the unique request identifier into a unified key-value pair format.
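One possible shape for the identifier unification is sketched below in Python. The field spellings in `ID_PATTERNS` and the unified key `request_id` are assumptions chosen for illustration; the actual built-in parsing rule is not given in the text.

```python
import re

# Hypothetical spellings under which different services log the request identifier.
ID_PATTERNS = [
    re.compile(r"\btrace[_-]?id[=:]\s*([\w-]+)", re.IGNORECASE),
    re.compile(r"\brequest[_-]?id[=:]\s*([\w-]+)", re.IGNORECASE),
]

def unify_request_id(line):
    """Return the identifier as a key-value pair under one unified key, or None."""
    for pattern in ID_PATTERNS:
        match = pattern.search(line)
        if match:
            return {"request_id": match.group(1)}
    return None

first = unify_request_id("traceId=abc-123 GET /orders")
second = unify_request_id("request_id:42 POST /pay")
```

With a single key in every record, later steps can group events by request without knowing which service produced the log line.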
The parallel construction device of the call chain provided by the embodiment of the invention can execute the parallel construction method of the call chain provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 4 is a schematic structural diagram of an electronic device in a fourth embodiment of the present invention. Fig. 4 illustrates a block diagram of an exemplary device 12 suitable for use in implementing embodiments of the present invention. The device 12 shown in fig. 4 is only an example and should not bring any limitation to the function and scope of use of the embodiments of the present invention.
As shown in FIG. 4, device 12 is in the form of a general purpose computing device. The components of device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with device 12, and/or with any devices (e.g., network card, modem, etc.) that enable device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, implementing a parallel construction method of a call chain provided by an embodiment of the present invention.
That is, the processing unit implements a parallel construction method of a call chain, including:
acquiring log data corresponding to at least one request, and calibrating timestamp information in the log data;
dividing the log data into rolling time buckets according to the timestamp information, and performing service clustering on the log data in each time bucket;
constructing an event connected graph and obtaining an event critical path according to the log data in each cluster;
and merging the event critical paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request.
Example five
The fifth embodiment of the present invention further discloses a computer storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the method for parallel construction of a call chain is implemented, including:
acquiring log data corresponding to at least one request, and calibrating timestamp information in the log data;
according to the timestamp information, carrying out rolling time bucket division on the log data, and carrying out service clustering processing on the log data in each time bucket;
according to log data in each cluster, constructing an event connected graph and acquiring an event key path;
and merging the event key paths in each time bucket according to a preset path merging rule to obtain a call chain corresponding to each request.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (9)

1. A parallel construction method of a call chain is characterized by comprising the following steps:
acquiring log data corresponding to at least one request, and calibrating timestamp information in the log data;
according to the timestamp information, carrying out rolling time bucket division on the log data, and carrying out service clustering processing on the log data in each time bucket;
according to log data in each cluster, constructing an event connected graph and acquiring an event key path;
merging the event key paths in the time buckets according to a preset path merging rule to obtain calling chains corresponding to the requests;
the constructing an event connected graph and acquiring an event key path according to the log data in each cluster comprises the following steps:
extracting a request unique identifier and an event type of an interface from log data in a cluster, and constructing a full connectivity graph of an event according to the request unique identifier and the event type;
deleting a part with an overlapping time interval, a part with other events in the middle and a part inconsistent with the calling chain corpus in the full connected graph according to the time shifting condition of the event type to obtain an event connected graph;
taking the path with the maximum sum of the event durations in the event connected graph as an event key path;
the method for constructing the event connected graph and acquiring the event key path according to the log data in each cluster further comprises the following steps: and connecting the events with the association relation in the log data in each cluster to form the event connection graph.
2. The method of claim 1, wherein obtaining log data corresponding to at least one request and calibrating timestamp information in the log data comprises:
receiving at least one request and acquiring log data corresponding to the at least one request;
extracting time information from the log data according to a built-in analysis rule of the system, and converting the time information into a preset time stamp format;
and calibrating the timestamp information in the log data from different host sources based on a global public Network Time Protocol (NTP) server.
3. The method of claim 1, wherein performing rolling time binning on the log data according to the timestamp information and performing service clustering on the log data in each time bin comprises:
acquiring a pre-configured bucket duration, and dividing the log data into rolling time buckets of the bucket duration according to the timestamp information;
in each time bucket, one-hot coding is carried out on the port value/the service operation type value in the log data, and Kmeans clustering processing is carried out on the log data according to the coding value.
4. The method of claim 1, wherein merging event critical paths in each of the time buckets according to a preset path merging rule to obtain a call chain corresponding to each of the requests comprises:
taking each request as a target request in sequence, and performing correlation calculation on event key paths associated with the target requests in all time buckets;
and carrying out union set processing on the event critical paths meeting the correlation conditions to obtain a call chain corresponding to the target request.
5. The method of claim 2, wherein extracting time information from the log data according to a built-in parsing rule of the system and converting the time information into a preset timestamp format comprises:
matching the log data by using a preset time format, and determining the position of time information in the log data;
and according to the field name and the separator of the field value and the separators among the fields in the built-in parsing rule of the system, extracting time information from the positioning position in the log data, and converting the time information into a preset time stamp format.
6. The method of claim 2, after obtaining log data corresponding to the at least one request, further comprising:
and according to a built-in analysis rule of the system, extracting a request unique identifier from the log data, and converting the request unique identifier into a uniform format in a key-value pair form.
7. An apparatus for parallel construction of call chains, comprising:
the time calibration module is used for acquiring log data corresponding to at least one request and calibrating timestamp information in the log data;
the grouping processing module is used for carrying out rolling time bucket division on the log data according to the timestamp information and carrying out service clustering processing on the log data in each time bucket;
the path acquisition module is used for constructing an event connected graph and acquiring an event key path according to the log data in each cluster;
the path merging module is used for merging the event key paths in the time buckets according to a preset path merging rule to obtain call chains corresponding to the requests;
the path acquisition module is configured to:
extracting a request unique identifier and an event type of an interface from log data in the cluster, and constructing a full connectivity graph of the event according to the request unique identifier and the event type;
deleting a part with an overlapping time interval, a part with other events in the middle and a part inconsistent with the calling chain corpus in the full connected graph according to the time shifting condition of the event type to obtain an event connected graph;
taking the path with the maximum sum of the event durations in the event connected graph as an event key path;
the path obtaining module is further configured to: connect the events having an association relation in the log data in each cluster to form the event connected graph.
8. An electronic device, characterized in that the device comprises:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of parallel construction of a call chain as recited in any of claims 1-6.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method for parallel construction of a call chain according to any one of claims 1 to 6.
CN202110614040.5A 2021-06-02 2021-06-02 Parallel construction method, device and equipment of call chain and storage medium Active CN113342744B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110614040.5A CN113342744B (en) 2021-06-02 2021-06-02 Parallel construction method, device and equipment of call chain and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110614040.5A CN113342744B (en) 2021-06-02 2021-06-02 Parallel construction method, device and equipment of call chain and storage medium

Publications (2)

Publication Number Publication Date
CN113342744A CN113342744A (en) 2021-09-03
CN113342744B true CN113342744B (en) 2022-02-15

Family

ID=77474754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110614040.5A Active CN113342744B (en) 2021-06-02 2021-06-02 Parallel construction method, device and equipment of call chain and storage medium

Country Status (1)

Country Link
CN (1) CN113342744B (en)

Also Published As

Publication number Publication date
CN113342744A (en) 2021-09-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant