CN113114612B - Determination method and device for distributed system call chain - Google Patents

Publication number
CN113114612B
Authority
CN
China
Prior art keywords
node
service
sub
call
log data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010032366.2A
Other languages
Chinese (zh)
Other versions
CN113114612A (en)
Inventor
何宽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Zhenshi Information Technology Co Ltd
Original Assignee
Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Zhenshi Information Technology Co Ltd filed Critical Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority to CN202010032366.2A priority Critical patent/CN113114612B/en
Publication of CN113114612A publication Critical patent/CN113114612A/en
Application granted granted Critical
Publication of CN113114612B publication Critical patent/CN113114612B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/133Protocols for remote procedure calls [RPC]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0677Localisation of faults
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/069Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/51Discovery or management thereof, e.g. service location protocol [SLP] or web services
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a method and a device for determining a distributed system call chain, and relates to the technical field of computers. One embodiment of the method comprises the following steps: after receiving a request for invoking a target service, generating a tracking identifier corresponding to the invocation of the target service; wherein the target service is commonly provided by a plurality of sub-service nodes, each sub-service node comprising at least one method node; when any sub-service node is called, generating log data corresponding to the call; when any method node in any sub-service node is called, generating log data corresponding to the call; and determining the log data generated in the process of calling the target service according to the tracking identification, and determining the call chain information of the target service call by utilizing the sub-service node identification and the method node identification in the determined log data. This embodiment is able to determine call chain information for sub-service granularity and method granularity.

Description

Determination method and device for distributed system call chain
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for determining a distributed system call chain.
Background
In a service-oriented architecture (SOA) or a microservice architecture of a distributed system, the call chain information of a target service call needs to be tracked in order to locate problem points. The prior art lacks a call chain tracking scheme at sub-service granularity (a sub-service being a service or application program that makes up the target service) and at method granularity (a method being a method or function provided in that service or application program).
Disclosure of Invention
In view of this, the embodiment of the invention provides a method and a device for determining a distributed system call chain, which can determine call chain information of sub-service granularity and method granularity.
To achieve the above object, according to one aspect of the present invention, there is provided a method of determining a distributed system call chain.
The method for determining the distributed system call chain comprises the following steps: after receiving a request for invoking a target service, generating a tracking identifier corresponding to this invocation of the target service, wherein the target service is jointly provided by a plurality of sub-service nodes, each sub-service node comprising at least one method node; when any sub-service node is called, generating log data corresponding to the call, wherein, when the sub-service node is not an entry node of the target service, the generated log data contains the tracking identifier, the identifier of the sub-service node and the identifier of the parent node of the sub-service node; when any method node in any sub-service node is called, generating log data corresponding to the call, wherein, when the method node is not an entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node, the identifier of the method node and the identifier of the parent node of the method node; and determining, according to the tracking identifier, the log data generated during this invocation of the target service, and determining the call chain information of this target service call by using the sub-service node identifiers and method node identifiers in the determined log data.
Optionally, when any sub-service node is called, if the sub-service node is an entry node of the target service, the generated log data contains the tracking identifier and the identifier of the sub-service node; when any method node in any sub-service node is called, if the method node is an entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node and the identifier of the method node.
Optionally, the parent node of a sub-service node is a sub-service node, and the parent node of a method node is a method node; the call chain information of this target service call has a tree structure and comprises: call chain information formed by the plurality of sub-service nodes and call chain information formed by the method nodes in each sub-service node; the log data generated by any call further contains the duration information of the call; and the method further comprises: adding the call duration information in the log data generated by calling any sub-service node to that sub-service node in the call chain of this target service call, and adding the call duration information in the log data generated by calling any method node to that method node in the call chain of this target service call.
Optionally, the method further comprises: before determining the log data generated during this target service call according to the tracking identifier, acquiring the log data generated in the current collection period and storing it in a database through a message queue.
Optionally, the method further comprises: acquiring call chain information of multiple target service calls in a current statistical period from the database, and determining the average call duration of each sub-service node and the average call duration of each method node in the current statistical period by using the call chain information; and alarming when the average calling time length of any sub-service node is larger than a preset first time length threshold value or the average calling time length of any method node is larger than a preset second time length threshold value.
Optionally, the method further comprises: and acquiring call chain information of each call of a plurality of services in the current statistical period from the database, and counting the call times of each sub-service node and the call times of each method node related to each call according to the call chain information.
Optionally, the message queue comprises Kafka, and the database comprises the Elasticsearch search engine (ES); the sub-service node comprises one or more of the following: application program nodes, file system nodes, cache system nodes and database nodes.
To achieve the above object, according to another aspect of the present invention, there is provided a determining apparatus of a distributed system call chain.
The device for determining the distributed system call chain according to the embodiment of the invention may comprise: a tracking identifier generating unit, configured to generate, after receiving a request for invoking a target service, a tracking identifier corresponding to this invocation of the target service, wherein the target service is jointly provided by a plurality of sub-service nodes, each sub-service node comprising at least one method node; a sub-service log generating unit, configured to generate, when any sub-service node is called, log data corresponding to the call, wherein, when the sub-service node is not an entry node of the target service, the generated log data contains the tracking identifier, the identifier of the sub-service node and the identifier of the parent node of the sub-service node; a method log generating unit, configured to generate, when any method node in any sub-service node is called, log data corresponding to the call, wherein, when the method node is not an entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node, the identifier of the method node and the identifier of the parent node of the method node; and a call chain tracking unit, configured to determine, according to the tracking identifier, the log data generated during this invocation of the target service, and to determine the call chain information of this target service call by using the sub-service node identifiers and method node identifiers in the determined log data.
To achieve the above object, according to still another aspect of the present invention, there is provided an electronic apparatus.
An electronic apparatus of the present invention includes: one or more processors; and the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors are enabled to realize the method for determining the distributed system call chain.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the method of determining a distributed system call chain provided by the present invention.
According to the technical scheme of the invention, one embodiment of the invention has the following advantages or beneficial effects:
Firstly, after a request for calling a target service sent by a client is received, a globally unique tracking identifier is generated to identify this target service call, and the tracking identifier is stored in the log data generated by each call; when each sub-service node is called, corresponding log data can be generated, and the log data can contain the tracking identifier, the sub-service node identifier and the parent node identifier. Thus, when the call chain of this target service call needs to be tracked, the related log data can be queried according to the tracking identifier, and the whole call chain information can be assembled by using the sub-service node identifiers and parent node identifiers in the log data. In addition, each piece of log data generated during a call also contains call duration information, and when the call chain information is assembled, the call duration information in the log data can be written into the corresponding node. In this way, the average call duration and the number of calls of each sub-service node can be computed from the call chain information of multiple service calls; the former can be used to judge the performance of the sub-service node, the latter can be used to judge its importance, and both can provide data support for subsequent system optimization.
Secondly, on the basis of call chain tracking at sub-service granularity, the invention further implements call chain tracking at method granularity. Specifically, when any method node in a sub-service node is called, corresponding log data can be generated to store the tracking identifier, the sub-service node identifier, the method node identifier, the parent node identifier and the call duration information, so that call chain information at method granularity within each sub-service node can be obtained from the log data when the call chain information is assembled. This call chain information can be used to compute the average call duration and the number of calls of each method node, allowing the performance and importance of the method node to be judged accurately. With this arrangement, the invention can track the calls of a complex service and untangle complex nesting relations among the nodes, can rapidly locate problem nodes and performance bottleneck nodes, and can bring the file system, the cache system and the database into the call chain tracking range as sub-service nodes.
Further effects of the above-described non-conventional alternatives are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of a method for determining a distributed system call chain in an embodiment of the present invention;
FIG. 2 is a schematic diagram of a call chain formed by a sub-service node in an embodiment of the invention;
FIG. 3 is a schematic diagram of a call chain formed by method nodes in an embodiment of the invention;
FIG. 4 is a schematic diagram of the overall architecture of a method for performing the determination of a distributed system call chain in an embodiment of the invention;
FIG. 5 is a schematic diagram of the main parts of a determining device of a distributed system call chain in an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram to which embodiments in accordance with the present invention may be applied;
FIG. 7 is a schematic diagram of an electronic device used to implement a method for determining a distributed system call chain in an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the embodiments of the invention, call chains in the various service call frameworks of a distributed system (such as SOA and microservices) can be tracked in both the sub-service dimension and the method dimension. In general, a call chain is the chain obtained when, in the course of completing one service call, the system records the call information between sub-services or methods as log data and then links all of that log data into a tree structure. After statistical analysis of the call chain information along different dimensions, abnormal sub-service calls or method calls can be identified and the performance bottlenecks of the system can be analyzed. It will be appreciated that the service is a business service implemented by calling a plurality of sub-services; each sub-service may be an application program, or a file system, cache system, or database (a call to a file system, cache system, or database corresponds to an access or query to it), and each sub-service is implemented by calling at least one method or function within it. The terms "sub-service node" and "method node" used below denote the above "sub-service" and "method", respectively, in the distributed system and in the tree-structured call chain.
In addition, in the embodiments of the invention, existing bytecode instrumentation techniques can be used to insert tracking points into the code of each service and sub-service in order to generate the log data. In practical applications, a Java agent can be used to intercept class loading (the ClassLoader, i.e., the bytecode file loader), and a bytecode manipulation tool can be used to operate on and modify the bytecode file (both are known Java technologies), thereby inserting tracking points into the bytecode file. It should also be noted that the embodiments of the invention and the technical features in the embodiments may be combined with each other where they do not conflict.
FIG. 1 is a schematic diagram of the main steps of a method for determining a distributed system call chain in an embodiment of the invention.
As shown in fig. 1, the method for determining a distributed system call chain according to an embodiment of the present invention may specifically be performed according to the following steps:
Step S101: after receiving a request for invoking a target service, generating a tracking identifier corresponding to this invocation of the target service; wherein the target service is jointly provided by a plurality of sub-service nodes, each sub-service node comprising at least one method node.
In this step, the server first receives a request for invoking the target service sent by the client. Since implementing the target service requires invoking the associated sub-service nodes, each sub-service node is invoked in turn according to the calling order. For example, suppose the target service is associated with five sub-service nodes A, B, C, D, and E, where A is the entry node, the return result of A depends on the return results of B and C, and the return result of C depends on the return results of D and E. After the client request is received, it first reaches A; A then calls B and C; B can return data directly, whereas C must interact with D and E before returning to A; finally, A responds to the request sent by the client. Upon receiving the request sent by the client, a globally unique tracking identifier (i.e., a different tracking identifier is generated for each service call) that uniquely identifies this target service call may be generated in order to organize the associated calls. In a specific application, the tracking identifier, and the sub-service node identifier and method node identifier described below, may each be represented by a 60-bit integer; this should be understood as not limiting the data format of the identifiers in the embodiments of the invention.
Further, each sub-service node may include an entry node and at least one method node, and a method node that interacts directly with the entry node may be referred to as an entry method node. The method nodes in each sub-service node implement the function of the sub-service node by interacting in various ways. In practice, interactions between sub-service nodes and between method nodes may be carried out in various ways, such as the HyperText Transfer Protocol (HTTP), remote procedure calls (RPC), sockets, or the Java Message Service (JMS).
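As a minimal sketch of step S101, the following Python snippet generates a tracking identifier when a request arrives. The function names `new_trace_id` and `handle_request` and the service name are hypothetical, and drawing a random 60-bit integer is only one assumed way to obtain an identifier that is unique with very high probability; a production system might instead use a coordinated ID generator.

```python
import secrets

TRACE_ID_BITS = 60  # width mentioned in the text as one possible format


def new_trace_id() -> int:
    """Return a random non-negative integer below 2**TRACE_ID_BITS."""
    return secrets.randbits(TRACE_ID_BITS)


def handle_request(target_service: str) -> dict:
    """Hypothetical entry point: attach a fresh tracking identifier to the call."""
    return {"service": target_service, "trace_id": new_trace_id()}


# Two calls of the same target service receive different tracking identifiers
# (with overwhelming probability, since the IDs are random).
first = handle_request("order-query")
second = handle_request("order-query")
```

Every later log record for the same service call would carry the `trace_id` produced here.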
Step S102: when any sub-service node is called, log data corresponding to the call is generated.
In this step, the instrumentation code may be used to generate log data corresponding to each sub-service node call. Generally, the identifier of a sub-service node may be generated when the sub-service node is first invoked during this target service invocation, and the identifier remains unique during this invocation; that is, any two sub-service nodes in the call chain of this target service invocation have different identifiers. In practical applications, the sub-service node identifiers can be generated sequentially in the order of the calling moments. For example, if the calling order in the above example is A-B-C-D-E, the identifiers of the five sub-service nodes may be set to 1, 2, 3, 4, 5 in order.
As a preferred solution, the log data generated when a certain sub-service node is invoked may include the tracking identifier, the identifier of the sub-service node, and various invocation parameters (for example, the IP address and port information of the sub-service node, where IP stands for Internet Protocol). When the sub-service node is the entry node of the target service, it has no parent node; when the sub-service node is not the entry node of the target service, the generated log data also contains the parent node identifier of the sub-service node. It will be appreciated that the parent node of a sub-service node is the sub-service node that initiated the call to it. For example, in the above example, A is the parent node of B and C (B and C are child nodes of A), and C is the parent node of D and E (D and E are child nodes of C). In particular, the generated log data can also contain the duration of the call, which can be used for subsequent problem point identification and system performance bottleneck analysis.
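The log record described above can be sketched as a small data structure. This is an illustration only: the class name `SubServiceLog`, the field names (which borrow `chain_id` from the ChainID label in the figures), the example durations, and the default IP and port are all assumptions rather than the format used in the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubServiceLog:
    trace_id: int                    # tracking identifier of the service call
    chain_id: int                    # sub-service node identifier
    parent_chain_id: Optional[int]   # None when the node is the entry node
    duration_ms: float               # call duration (made-up values below)
    ip: str = "127.0.0.1"            # call parameters, placeholder values
    port: int = 8080


# Example from the text: A calls B and C, and C calls D and E.
# Identifiers 1..5 are assigned in calling order A-B-C-D-E.
TRACE = 42
logs = [
    SubServiceLog(TRACE, 1, None, 120.0),  # A, entry node, no parent
    SubServiceLog(TRACE, 2, 1, 15.0),      # B, called by A
    SubServiceLog(TRACE, 3, 1, 90.0),      # C, called by A
    SubServiceLog(TRACE, 4, 3, 30.0),      # D, called by C
    SubServiceLog(TRACE, 5, 3, 40.0),      # E, called by C
]

# The parent identifier is enough to recover who called whom.
children_of_c = [log.chain_id for log in logs if log.parent_chain_id == 3]
```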
Step S103: when any method node in any sub-service node is called, log data corresponding to the call is generated.
In this step, log data may be generated for each call of a method node in each sub-service node. Specifically, when a method node in a certain sub-service node is called for the first time during the target service call, the identifier of the method node is generated. Method node identifiers are unique within the internal invocation of the sub-service node; that is, each method node has a unique identifier while the method nodes within the sub-service node are being invoked. In practical applications, the method node identifiers can be generated sequentially in the order of the calling moments. For example, suppose the sub-service node B in the above example includes an entry node a and method nodes b, c, d, e, f, g, h (where b and c are entry method nodes), a depends on the returned results of b and c, b depends on the returned results of d and e, c depends on the returned results of f and g, d depends on the returned result of h, and the calling order of the method nodes is b-c-d-e-f-g-h; then the identifiers of b, c, d, e, f, g, h may be set to 1, 2, 3, 4, 5, 6, and 7 in order.
In some embodiments, if a method node is an entry method node of a sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node, and the identifier of the method node. If the method node is not an entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node, the identifier of the method node and the identifier of the parent node of the method node. It will be appreciated that the parent node of a method node is the method node that initiated the call to it, and that the log data generated by calling a method node may also include call parameters (such as IP address and port information) and call duration information. In addition, the tracking identifier, the sub-service node identifier or the method node identifier in the log data may be determined from the related identifier information passed along when sub-service nodes or method nodes interact. In practice, the tracking identifier, the sub-service node identifier, and the method node identifier may be stored in the thread-local storage (i.e., ThreadLocal) of the currently running thread.
In an alternative implementation, the method node identifier may also be given a hierarchical format, in which the method node identifier contains its parent node identifier. For example, in the above example, the identifier of b may be set to 1, that of c to 2, that of d to 1-1, that of e to 1-2, that of f to 2-1, that of g to 2-2, and that of h to 1-1-1. With this arrangement, the parent node identifier, and even the ancestor node identifiers, can be determined from the identifier of a method node alone. For example, g is identified as 2-2, so removing the last value gives its parent node identifier, 2; h is identified as 1-1-1, so removing the last value gives its parent node identifier, 1-1, and removing the last two values gives its ancestor node identifier, 1.
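The idea of keeping the identifiers in thread-local storage and emitting one record per method call can be sketched as follows. This is an illustrative Python analogue, not the bytecode-level Java mechanism the text refers to: `threading.local` stands in for ThreadLocal, and the names `start_sub_service`, `traced`, `LOG`, and the record fields are hypothetical. Identifiers are assigned in the order in which calls start, and a simple chain b -> d -> h is used instead of the full example.

```python
import threading
import time
from functools import wraps

_ctx = threading.local()   # per-thread tracing context
LOG = []                   # stand-in for the real log sink


def start_sub_service(trace_id, chain_id):
    """Initialise the context when a sub-service node starts handling a call."""
    _ctx.trace_id = trace_id
    _ctx.chain_id = chain_id
    _ctx.next_event_id = 1
    _ctx.stack = []        # identifiers of the methods currently executing


def traced(func):
    """Emit one log record for every call of the wrapped method."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        event_id = _ctx.next_event_id
        _ctx.next_event_id += 1
        # The caller is whatever method is on top of the stack, if any.
        parent = _ctx.stack[-1] if _ctx.stack else None
        _ctx.stack.append(event_id)
        started = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            _ctx.stack.pop()
            LOG.append({
                "trace_id": _ctx.trace_id,
                "chain_id": _ctx.chain_id,
                "event_id": event_id,
                "parent_event_id": parent,  # None for an entry method node
                "method": func.__name__,
                "duration_ms": (time.perf_counter() - started) * 1000,
            })
    return wrapper


@traced
def h():
    return "h"


@traced
def d():
    return h()


@traced
def b():
    return d()


start_sub_service(trace_id=42, chain_id=2)
result = b()
by_method = {record["method"]: record for record in LOG}
```

Records are appended when a method returns, so the innermost call is logged first; the parent identifiers, not the log order, carry the call structure.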
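The hierarchical identifier scheme can be illustrated with two small helper functions. The names `parent_of` and `ancestors_of` are hypothetical, and the sketch assumes identifiers are strings whose components are joined with hyphens, as in the example above.

```python
def parent_of(event_id: str):
    """Drop the last component; top-level identifiers have no parent."""
    head, sep, _ = event_id.rpartition("-")
    return head if sep else None


def ancestors_of(event_id: str):
    """Return all ancestor identifiers, nearest first."""
    result = []
    current = parent_of(event_id)
    while current is not None:
        result.append(current)
        current = parent_of(current)
    return result


parent_g = parent_of("2-2")          # g is identified as 2-2
parent_h = parent_of("1-1-1")        # h is identified as 1-1-1
ancestors_h = ancestors_of("1-1-1")  # parent first, then the ancestor
```

With this format, a log record does not need a separate parent field, because the parent can always be recomputed from the identifier itself.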
Through the setting, tracking of each sub-service node call and method node call in the target service call process can be realized through the generated log data, and the log data can be used for subsequent call times statistics, problem point detection and performance bottleneck analysis. In addition, the present invention does not limit the execution order between step S103 and step S102.
Step S104: determining, according to the tracking identifier, the log data generated during this invocation of the target service, and determining the call chain information of this target service call by using the sub-service node identifiers and method node identifiers in the determined log data.
After log data is generated in steps S102 and S103, it may be collected using Logstash (a log collection tool) and sent to a message queue (e.g., Kafka), from which it may be stored in a database (e.g., the Elasticsearch search engine, ES) for subsequent queries and statistics. It is understood that the log data may be collected according to a preset collection period (for example, 1 hour), and that the message queue decouples log data collection from storage.
In step S104, the associated log data may be queried using the tracking identifier of a given service call. Each sub-service node and each method node may then be connected according to the sub-service node identifiers (including the sub-service node identifier and parent node identifier in the log data generated when a sub-service node is called, and the sub-service node identifier in the log data generated when a method node is called) and the method node identifiers (including the method node identifier and parent node identifier in the log data generated when a method node is called), so as to obtain the call chain information of that service call. The call chain information is tree-structured data, which may include call chain information formed by sub-service nodes (i.e., call chain information of the sub-service dimension) and call chain information formed by the method nodes in each sub-service node (i.e., call chain information of the method dimension).
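The decoupling of collection and storage through a queue can be sketched with in-memory stand-ins. This is not Kafka or Elasticsearch client code: `queue.Queue` merely plays the role of the message queue, a dictionary indexed by tracking identifier plays the role of the database, and the function names `collect` and `store_all` are hypothetical.

```python
import queue
from collections import defaultdict

message_queue = queue.Queue()   # stand-in for the message queue
database = defaultdict(list)    # stand-in for the database, keyed by trace_id


def collect(period_logs):
    """Collector side: push every record gathered in this collection period."""
    for record in period_logs:
        message_queue.put(record)


def store_all():
    """Storage side: drain the queue and index the records by trace_id."""
    stored = 0
    while True:
        try:
            record = message_queue.get_nowait()
        except queue.Empty:
            return stored
        database[record["trace_id"]].append(record)
        stored += 1


collect([
    {"trace_id": 42, "chain_id": 1},
    {"trace_id": 42, "chain_id": 2},
    {"trace_id": 7, "chain_id": 1},
])
stored_count = store_all()
```

Because the collector only writes to the queue and the storage side only reads from it, either side can be slowed down, restarted, or replaced without changing the other.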
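Assembling the tree from log records can be sketched as follows for the sub-service dimension. The record layout and the function names `build_call_chain` and `render` are assumptions made for illustration; the method dimension could be handled the same way by grouping records per sub-service node and linking method node identifiers to their parents.

```python
from collections import defaultdict


def build_call_chain(records, trace_id):
    """Return (root, children) for the sub-service records of one trace."""
    children = defaultdict(list)
    root = None
    for rec in records:
        if rec["trace_id"] != trace_id:
            continue  # record belongs to a different service call
        if rec["parent_chain_id"] is None:
            root = rec["chain_id"]
        else:
            children[rec["parent_chain_id"]].append(rec["chain_id"])
    return root, dict(children)


def render(node, children, depth=0):
    """Flatten the tree into indented lines, parents before children."""
    lines = ["  " * depth + str(node)]
    for child in children.get(node, []):
        lines.extend(render(child, children, depth + 1))
    return lines


records = [
    {"trace_id": 42, "chain_id": 1, "parent_chain_id": None},  # A
    {"trace_id": 42, "chain_id": 2, "parent_chain_id": 1},     # B
    {"trace_id": 7, "chain_id": 1, "parent_chain_id": None},   # other call
    {"trace_id": 42, "chain_id": 3, "parent_chain_id": 1},     # C
    {"trace_id": 42, "chain_id": 4, "parent_chain_id": 3},     # D
    {"trace_id": 42, "chain_id": 5, "parent_chain_id": 3},     # E
]
root, children = build_call_chain(records, trace_id=42)
tree_lines = render(root, children)
```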
Fig. 2 is a schematic diagram of a call chain formed by a sub-service node in the embodiment of the present invention, and fig. 3 is a schematic diagram of a call chain formed by a method node in the embodiment of the present invention, where the two schematic diagrams respectively show a call chain structure of a sub-service dimension and a call chain structure of a method dimension in the above example. In fig. 2 and 3, the arrow points from the parent node to the child node, indicating that the parent node initiates a call to the child node, the ChainID indicates the child service node identity, and the EventID indicates the method node identity.
Preferably, in the embodiment of the present invention, the call duration information in the log data can also be added to the corresponding node of the call chain for subsequent statistical analysis. Several statistical analysis methods based on call chain information will be described below.
1. The call durations of the sub-service nodes in the call chain information of a given service call are added to obtain the total duration of that service call. If the total duration is greater than a preset threshold, the service is considered abnormal and an alarm is raised.
2. From the call chain information of a given service call, the sub-service nodes whose call duration is greater than a preset first duration threshold and the method nodes whose call duration is greater than a preset second duration threshold are determined, and an alarm is raised with the determined sub-service nodes and method nodes as problem nodes.
3. The call chain information of multiple calls of the target service in the current statistical period (for example, one week) is acquired from a database and used to determine the average call duration of each sub-service node and of each method node in that period. Specifically, for any sub-service node or method node, its call duration in each piece of call chain information is determined first, and the average (for example, the arithmetic mean or the geometric mean) of these call durations is then calculated to obtain the average call duration. If the average call duration of any sub-service node is greater than the first duration threshold, or the average call duration of any method node is greater than the second duration threshold, that sub-service node or method node is a performance bottleneck node of the target service, and an alarm is raised.
4. The call chain information of each call of a plurality of services in the current statistical period is acquired from a database, and the number of calls of each sub-service node and of each method node involved in these calls is counted according to the call chain information. Specifically, for each sub-service node or method node, its number of calls in each call chain is counted, and these counts are then added to obtain its total number of calls in the current statistical period. The number of calls can indicate the importance of the sub-service nodes and method nodes, and resources can be expanded for nodes with a larger number of calls according to actual conditions.
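As a non-limiting illustration, analyses 1, 2 and 4 above might be sketched in Python as follows. The flat tuple layout of a call chain and all function names are assumptions made for this sketch and are not part of the described embodiment.

```python
from collections import Counter

# Assumed layout: a call chain is a list of (kind, node_id, duration_ms) tuples,
# where kind is either "service" (sub-service node) or "method" (method node).


def total_duration(chain):
    """Analysis 1: add the call durations of the sub-service nodes of one chain."""
    return sum(d for kind, _, d in chain if kind == "service")


def problem_nodes(chain, first_threshold_ms, second_threshold_ms):
    """Analysis 2: nodes whose duration exceeds the threshold for their kind."""
    limits = {"service": first_threshold_ms, "method": second_threshold_ms}
    return [node_id for kind, node_id, d in chain if d > limits[kind]]


def call_counts(chains):
    """Analysis 4: total number of calls per node across many call chains."""
    counts = Counter()
    for chain in chains:
        counts.update((kind, node_id) for kind, node_id, _ in chain)
    return counts
```

An alarm would then be raised when `total_duration` exceeds the preset threshold or when `problem_nodes` returns a non-empty list; how the alarm is delivered is left open here.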
FIG. 4 is a schematic diagram of the overall architecture for performing the method for determining a distributed system call chain in an embodiment of the invention. In FIG. 4, the log generating module may be an Agent (an agent end that generates logs), the log collecting module may be Logstash, the message queue may be Kafka, and the database may be Elasticsearch (ES); these examples do not limit the possible configurations of the above modules. It should be noted that the method for determining a distributed system call chain provided by the embodiment of the invention can be applied to distributed architectures such as a service-oriented architecture (SOA) and a micro-service architecture, and can also be applied to other distributed service call architectures.
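The data flow of FIG. 4 might be sketched as follows, purely for illustration. The sketch uses only the Python standard library: `queue.Queue` stands in for the message queue and an in-memory class stands in for the database, so no Kafka or ES client API is shown. The names `InMemoryStore`, `collect` and `drain` are hypothetical.

```python
import queue
from collections import defaultdict


class InMemoryStore:
    """Stand-in for the database: indexes log records by tracking identifier."""

    def __init__(self):
        self._by_trace = defaultdict(list)

    def save(self, record):
        self._by_trace[record["trace_id"]].append(record)

    def find_by_trace(self, trace_id):
        return list(self._by_trace.get(trace_id, []))


def collect(agent_logs, message_queue):
    """Log collecting module: forward the logs produced by an agent to the queue."""
    for record in agent_logs:
        message_queue.put(record)


def drain(message_queue, store):
    """Consumer side: move every queued record into the store, return the count."""
    moved = 0
    while True:
        try:
            record = message_queue.get_nowait()
        except queue.Empty:
            return moved
        store.save(record)
        moved += 1
```

Indexing by tracking identifier in the store reflects the later query step, in which all log data of one target service call is looked up by that identifier.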
In the technical scheme of the embodiment of the invention, after a request for calling the target service sent by a client is received, a globally unique tracking identifier is generated to identify this target service call, and the tracking identifier is stored in the log data generated by each call; when each sub-service node is called, corresponding log data can be generated, which can contain the tracking identifier, the sub-service node identifier and the parent node identifier. Thus, when the call chain of the target service call needs to be tracked, the related log data can be queried according to the tracking identifier, and the whole call chain information can be organized by using the sub-service node identifiers and the parent node identifiers in the log data. In addition, each piece of log data generated during a call also contains call duration information, and when the call chain information is organized, the call duration information in the log data can be written into the corresponding node. In this way, the average call duration and the number of calls of each sub-service node can be counted by using the call chain information of multiple service calls; the former can be used to judge the performance of the sub-service node, the latter can be used to judge its importance, and both can provide data support for subsequent system optimization. On the basis of call chain tracking at sub-service granularity, the invention further realizes call chain tracking at method granularity.
Specifically, when any method node in a sub-service node is called, corresponding log data can be generated to store the tracking identifier, the sub-service node identifier, the method node identifier, the parent node identifier and the call duration information, so that when the call chain information is organized, call chain information at method granularity within each sub-service node can be obtained from the log data. This call chain information can be used to count the average call duration and the number of calls of each method node, which allows the performance and importance of each method node to be judged accurately. With this arrangement, the invention can track the calls of complex services and sort out complex nesting relationships among nodes, can rapidly locate problem nodes and performance bottleneck nodes, and can bring file systems, cache systems and databases into the scope of call chain tracking as sub-service nodes.
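By way of example only, organizing the call chain at sub-service granularity from log data might look like the following Python sketch. The record fields and names (`LogRecord`, `ChainNode`, `build_call_chain`) are hypothetical, and the sketch assumes that each sub-service node appears at most once in a given call.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LogRecord:
    trace_id: str             # tracking identifier shared by one target service call
    node_id: str              # identifier of the called sub-service node
    parent_id: Optional[str]  # None for the entry node of the target service
    duration_ms: float        # call duration information stored in the log data


@dataclass
class ChainNode:
    node_id: str
    duration_ms: float
    children: list["ChainNode"] = field(default_factory=list)


def build_call_chain(records: list[LogRecord], trace_id: str) -> Optional[ChainNode]:
    """Select the records of one call and link them into a tree by parent identifier."""
    selected = [r for r in records if r.trace_id == trace_id]
    nodes = {r.node_id: ChainNode(r.node_id, r.duration_ms) for r in selected}
    root = None
    for r in selected:
        if r.parent_id is None:
            root = nodes[r.node_id]  # entry node: no parent identifier
        elif r.parent_id in nodes:
            nodes[r.parent_id].children.append(nodes[r.node_id])
    return root
```

The returned root is the entry node of the target service; the duration written into each node is what the statistical analyses described earlier would read.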
It should be noted that, for convenience of description, the foregoing method embodiments are expressed as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, and some steps may actually be performed in another order or simultaneously. Moreover, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the actions and modules referred to are not necessarily required to practice the invention.
In order to facilitate better implementation of the above-described aspects of embodiments of the present invention, the following provides related devices for implementing the above-described aspects.
Referring to FIG. 5, an apparatus 500 for determining a distributed system call chain according to an embodiment of the present invention may include: a tracking identifier generating unit 501, a sub-service log generating unit 502, a method log generating unit 503, and a call chain tracking unit 504.
The tracking identifier generating unit 501 is operable to: after receiving a request for calling a target service, generate a tracking identifier corresponding to this call of the target service; wherein the target service is jointly provided by a plurality of sub-service nodes, each sub-service node comprising at least one method node. The sub-service log generating unit 502 may be configured to: when any sub-service node is called, generate log data corresponding to the call; wherein, when the sub-service node is not the entry node of the target service, the generated log data contains the tracking identifier, the identifier of the sub-service node and the identifier of the parent node of the sub-service node. The method log generating unit 503 may be configured to: when any method node in any sub-service node is called, generate log data corresponding to the call; wherein, when the method node is not the entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node, the identifier of the method node and the identifier of the parent node of the method node. The call chain tracking unit 504 may be configured to determine, according to the tracking identifier, the log data generated in the process of calling the target service, and to determine the call chain information of the target service call by using the sub-service node identifiers and the method node identifiers in the determined log data.
In the embodiment of the invention, when any sub-service node is called, if the sub-service node is the entry node of the target service, the generated log data contains the tracking identifier and the identifier of the sub-service node; when any method node in any sub-service node is called, if the method node is the entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node and the identifier of the method node.
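The difference between the log data of entry nodes and non-entry nodes might be illustrated as follows. This is only a sketch: the dictionary keys, the use of `uuid4` to obtain a unique tracking identifier, and the function names are assumptions rather than features of the embodiment.

```python
import uuid
from typing import Optional


def new_trace_id() -> str:
    """Generate a tracking identifier for one call of the target service."""
    return uuid.uuid4().hex


def make_log(trace_id: str, service_id: str,
             parent_service_id: Optional[str] = None,
             method_id: Optional[str] = None,
             parent_method_id: Optional[str] = None,
             duration_ms: float = 0.0) -> dict:
    """Build one log record; the parent identifier is omitted for entry nodes."""
    record = {"trace_id": trace_id, "service_id": service_id,
              "duration_ms": duration_ms}
    if method_id is not None:
        # Log data of a method node inside the sub-service node.
        record["method_id"] = method_id
        if parent_method_id is not None:  # not the entry method node
            record["parent_method_id"] = parent_method_id
    elif parent_service_id is not None:   # not the entry node of the target service
        record["parent_service_id"] = parent_service_id
    return record
```

For instance, `make_log(tid, "order-service")` yields a record without any parent identifier, while `make_log(tid, "stock-service", parent_service_id="order-service")` records the calling sub-service node; the service names here are invented for the example.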
In a specific application, the parent node of a sub-service node is a sub-service node, and the parent node of a method node is a method node; the call chain information of the target service call has a tree structure and comprises: call chain information formed by the plurality of sub-service nodes and call chain information formed by the method nodes in each sub-service node; the log data generated by any call further contains the duration information of the call; and the apparatus 500 may further include an information adding unit for adding the call duration information in the log data generated by calling any sub-service node to that sub-service node in the call chain of the target service call, and adding the call duration information in the log data generated by calling any method node to that method node in the call chain of the target service call.
In practical applications, the apparatus 500 may further comprise a log storage unit for: before the log data generated in the process of calling the target service is determined according to the tracking identifier, acquiring the log data generated in the current acquisition period, and storing the generated log data in a database through a message queue.
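One possible way to hold the two levels of the tree together with the call durations is sketched below in Python. The nested-dictionary layout and the function names are assumptions for illustration; the sketch expects the log records of a single call, with the same hypothetical keys in every record.

```python
def build_tree(records, id_key, parent_key):
    """Link records into nested dicts; the record without a parent becomes the root."""
    nodes = {r[id_key]: {"id": r[id_key], "duration_ms": r["duration_ms"],
                         "children": []} for r in records}
    root = None
    for r in records:
        node = nodes[r[id_key]]
        parent_id = r.get(parent_key)
        if parent_id is None:
            root = node
        elif parent_id in nodes:
            nodes[parent_id]["children"].append(node)
    return root, nodes


def build_two_level_chain(logs):
    """Sub-service tree in which every node also carries its own method tree."""
    service_logs = [r for r in logs if "method_id" not in r]
    root, services = build_tree(service_logs, "service_id", "parent_service_id")
    for service_id, node in services.items():
        method_logs = [r for r in logs
                       if "method_id" in r and r["service_id"] == service_id]
        # None when no method-level log data exists for this sub-service node.
        node["methods"], _ = build_tree(method_logs, "method_id", "parent_method_id")
    return root
```

Keeping the method tree inside its sub-service node mirrors the statement that the parent of a method node is always a method node of the same sub-service node.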
In some embodiments, the apparatus 500 may further comprise a statistical analysis unit for: acquiring call chain information of multiple calls of the target service in the current statistical period from the database, and determining the average call duration of each sub-service node and the average call duration of each method node in the current statistical period by using the call chain information; and raising an alarm when the average call duration of any sub-service node is greater than a preset first duration threshold or the average call duration of any method node is greater than a preset second duration threshold.
In an alternative implementation, the statistical analysis unit may be further configured to: acquire call chain information of each call of a plurality of services in the current statistical period from the database, and count, according to the call chain information, the number of calls of each sub-service node and of each method node involved in these calls.
Furthermore, in an embodiment of the present invention, the message queue comprises Kafka and the database comprises the Elasticsearch (ES) search engine; the sub-service nodes comprise one or more of the following: application program nodes, file system nodes, cache system nodes and database nodes.
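The averaging and threshold comparison performed by the statistical analysis unit might, for example, be sketched as follows. The tuple layout of a call chain and the function names are assumptions; `statistics.fmean` and `statistics.geometric_mean` are standard-library functions available from Python 3.8, and the geometric mean requires positive durations.

```python
from collections import defaultdict
from statistics import fmean, geometric_mean

# Assumed layout: a call chain is a list of (kind, node_id, duration_ms) tuples,
# where kind is "service" for sub-service nodes and "method" for method nodes.


def average_durations(chains, use_geometric=False):
    """Average call duration per node over the call chains of one statistical period."""
    samples = defaultdict(list)
    for chain in chains:
        for kind, node_id, duration_ms in chain:
            samples[(kind, node_id)].append(duration_ms)
    mean = geometric_mean if use_geometric else fmean
    return {key: mean(values) for key, values in samples.items()}


def bottleneck_nodes(averages, first_threshold_ms, second_threshold_ms):
    """Nodes whose average duration exceeds the threshold for their kind."""
    limits = {"service": first_threshold_ms, "method": second_threshold_ms}
    return sorted(key for key, avg in averages.items() if avg > limits[key[0]])
```

A non-empty result from `bottleneck_nodes` corresponds to the condition under which an alarm is raised.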
Fig. 6 illustrates an exemplary system architecture 600 of a determination method of a distributed system call chain or a determination apparatus of a distributed system call chain to which an embodiment of the present invention may be applied.
As shown in fig. 6, a system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605 (this architecture is merely an example, and the components contained in a particular architecture may be tailored to the application specific case). The network 604 is used as a medium to provide communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 605 via the network 604 using the terminal devices 601, 602, 603 to receive or send messages, etc. Various client applications, such as a call chain tracking application (by way of example only), may be installed on the terminal devices 601, 602, 603.
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 605 may be a server providing various services, such as a background server (by way of example only) providing support for call chain tracking applications operated by users using the terminal devices 601, 602, 603. The background server may process the received request and feed back the processing result (determined call chain information, for example only) to the terminal devices 601, 602, 603.
It should be noted that, the method for determining a distributed system call chain provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the determining device of the distributed system call chain is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The invention also provides an electronic device. The electronic device of the embodiment of the invention comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for determining a distributed system call chain provided by the invention.
Referring now to FIG. 7, there is illustrated a schematic diagram of a computer system 700 suitable for use in implementing an electronic device of an embodiment of the present invention. The electronic device shown in fig. 7 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the invention.
As shown in FIG. 7, the computer system 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the computer system 700. The CPU 701, the ROM 702 and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 710 as needed, so that a computer program read from it can be installed into the storage section 708 as needed.
In particular, the processes described in the main step diagrams above may be implemented as computer software programs according to the disclosed embodiments of the invention. For example, embodiments of the present invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the main step diagrams. In the above-described embodiment, the computer program can be downloaded and installed from a network through the communication section 709 and/or installed from the removable medium 711. The above-described functions defined in the system of the present invention are performed when the computer program is executed by the central processing unit 701.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, a computer readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present invention may be implemented in software or in hardware. The described units may also be provided in a processor, which may for example be described as: a processor comprising a tracking identifier generating unit, a sub-service log generating unit, a method log generating unit and a call chain tracking unit. In some cases the names of these units do not limit the units themselves; for example, the tracking identifier generating unit may also be described as "a unit that transmits tracking identifiers to the sub-service log generating unit and the method log generating unit".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the device described in the above embodiments, or may exist alone without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to perform steps comprising: after receiving a request for calling a target service, generating a tracking identifier corresponding to this call of the target service; wherein the target service is jointly provided by a plurality of sub-service nodes, each sub-service node comprising at least one method node; when any sub-service node is called, generating log data corresponding to the call; wherein, when the sub-service node is not the entry node of the target service, the generated log data contains the tracking identifier, the identifier of the sub-service node and the identifier of the parent node of the sub-service node; when any method node in any sub-service node is called, generating log data corresponding to the call; wherein, when the method node is not the entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node, the identifier of the method node and the identifier of the parent node of the method node; and determining, according to the tracking identifier, the log data generated in the process of calling the target service, and determining the call chain information of the target service call by using the sub-service node identifiers and the method node identifiers in the determined log data.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for determining a distributed system call chain, applied to a service-oriented architecture or a micro-service architecture, characterized by comprising:
after receiving a request for calling a target service, generating a tracking identifier corresponding to this call of the target service; wherein the target service is jointly provided by a plurality of sub-service nodes, each sub-service node comprising at least one method node; the sub-service node refers to an application program forming the target service, and the method node refers to a method or a function provided in the sub-service node;
when any sub-service node is called, generating log data corresponding to the call; wherein, when the sub-service node is not the entry node of the target service, the generated log data contains the tracking identifier, the identifier of the sub-service node and the identifier of the parent node of the sub-service node;
when any method node in any sub-service node is called, generating log data corresponding to the call; wherein, when the method node is not the entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node, the identifier of the method node and the identifier of the parent node of the method node; and
determining, according to the tracking identifier, the log data generated in the process of calling the target service, and determining the call chain information of the target service call by using the sub-service node identifiers and the method node identifiers in the determined log data.
2. The determination method according to claim 1, wherein,
when any sub-service node is called, if the sub-service node is the entry node of the target service, the generated log data contains the tracking identifier and the identifier of the sub-service node;
when any method node in any sub-service node is called, if the method node is the entry method node of the sub-service node, the generated log data contains the tracking identifier, the identifier of the sub-service node and the identifier of the method node.
3. The determination method according to claim 2, wherein,
the parent node of a sub-service node is a sub-service node, and the parent node of a method node is a method node;
the call chain information of the target service call has a tree structure and comprises: call chain information formed by the plurality of sub-service nodes and call chain information formed by the method nodes in each sub-service node;
the log data generated by any call further contains the duration information of the call;
and the method further comprises: adding the call duration information in the log data generated by calling any sub-service node to that sub-service node in the call chain of the target service call, and adding the call duration information in the log data generated by calling any method node to that method node in the call chain of the target service call.
4. The determination method according to claim 3, further comprising:
before the log data generated in the process of calling the target service is determined according to the tracking identifier: acquiring log data generated in the current acquisition period, and storing the generated log data in a database through a message queue.
5. The determination method according to claim 4, wherein the method further comprises:
acquiring call chain information of multiple calls of the target service in the current statistical period from the database, and determining the average call duration of each sub-service node and the average call duration of each method node in the current statistical period by using the call chain information;
and raising an alarm when the average call duration of any sub-service node is greater than a preset first duration threshold or the average call duration of any method node is greater than a preset second duration threshold.
6. The determination method according to claim 4, wherein the method further comprises:
acquiring call chain information of each call of a plurality of services in the current statistical period from the database, and counting, according to the call chain information, the number of calls of each sub-service node and of each method node involved in these calls.
7. The determination method according to claim 4, 5 or 6, wherein,
the message queue comprises Kafka, and the database comprises the Elasticsearch (ES) search engine;
the sub-service nodes comprise one or more of the following: application program nodes, file system nodes, cache system nodes and database nodes.
8. A determining device of a distributed system call chain is applied to a service-oriented architecture or a micro-service architecture; it is characterized in that the method comprises the steps of, comprising the following steps:
a tracking identifier generating unit configured to: after receiving a request for invoking a target service, generating a tracking identifier corresponding to the invocation of the target service; wherein the target service is commonly provided by a plurality of sub-service nodes, each sub-service node comprises at least one method node; the sub-service node refers to an application program forming a target service, and the method node refers to a method or a function provided in the sub-service node;
a sub-service log generating unit, configured to: when any sub-service node is called, generating log data corresponding to the call; wherein: when the child service node is not an entry node of the target service, the generated log data contains the tracking identifier, the identifier of the child service node and the parent node identifier of the child service node;
a method log generating unit for: when any method node in any sub-service node is called, generating log data corresponding to the call; when the method node is not an entry method node of the child service node, the generated log data contains the tracking identifier, the identifier of the child service node, the identifier of the method node and the identifier of a father node of the method node; and
a call chain tracking unit configured to: determine, according to the tracking identifier, the log data generated during the call of the target service, and determine the call chain information of the call of the target service by using the sub-service node identifiers and the method node identifiers in the determined log data.
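The behaviour of the call chain tracking unit can be illustrated with a minimal Python sketch at the sub-service level. It assumes each log record is a dictionary with the hypothetical keys `trace_id`, `node_id` and `parent_id`, where `parent_id` is `None` for the entry node. The field names and the functions `build_call_chain` and `render` are illustrative assumptions and are not taken from the claims. The sketch also assumes well-formed logs without cyclic parent references.

```python
from collections import defaultdict


def build_call_chain(logs, trace_id):
    """Select the records of one trace and link each node to its parent."""
    children = defaultdict(list)
    roots = []
    for record in logs:
        if record["trace_id"] != trace_id:
            continue  # Log data from another call of the service.
        parent = record.get("parent_id")
        if parent is None:
            roots.append(record["node_id"])  # Entry node: no parent identifier.
        else:
            children[parent].append(record["node_id"])
    return roots, children


def render(roots, children):
    """Return the call chain as indented lines, one node per line."""
    lines = []

    def walk(node, depth):
        lines.append("  " * depth + node)
        for child in children.get(node, []):
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines


# Hypothetical log data from two calls, identified by t1 and t2.
logs = [
    {"trace_id": "t1", "node_id": "gateway", "parent_id": None},
    {"trace_id": "t1", "node_id": "order", "parent_id": "gateway"},
    {"trace_id": "t2", "node_id": "gateway", "parent_id": None},
    {"trace_id": "t1", "node_id": "cache", "parent_id": "order"},
    {"trace_id": "t1", "node_id": "db", "parent_id": "order"},
]
roots, children = build_call_chain(logs, "t1")
chain = render(roots, children)
```

The same linking could be applied to method-level log data by using the method node identifier and the parent method node identifier within each sub-service node.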
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the determination method of any one of claims 1-7.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the determination method according to any one of claims 1-7.
CN202010032366.2A 2020-01-13 2020-01-13 Determination method and device for distributed system call chain Active CN113114612B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010032366.2A CN113114612B (en) 2020-01-13 2020-01-13 Determination method and device for distributed system call chain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010032366.2A CN113114612B (en) 2020-01-13 2020-01-13 Determination method and device for distributed system call chain

Publications (2)

Publication Number Publication Date
CN113114612A (en) 2021-07-13
CN113114612B (en) 2023-06-27

Family

ID=76709070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010032366.2A Active CN113114612B (en) 2020-01-13 2020-01-13 Determination method and device for distributed system call chain

Country Status (1)

Country Link
CN (1) CN113114612B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107580018A (en) * 2017-07-28 2018-01-12 北京北信源软件股份有限公司 The tracking and device of a kind of distributed system
CN108038145A (en) * 2017-11-23 2018-05-15 携程旅游网络技术(上海)有限公司 Distributed Services tracking, system, storage medium and electronic equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10320878B2 (en) * 2013-10-14 2019-06-11 Medidata Solutions, Inc. System and method for preserving causality of audits
CN109726016A (en) * 2017-10-30 2019-05-07 阿里巴巴集团控股有限公司 A kind of link tracing methods, devices and systems for distributed system
CN109992465B (en) * 2017-12-29 2023-05-16 中国电信股份有限公司 Service tracking method, device and computer readable storage medium
CN108600045A (en) * 2018-04-05 2018-09-28 厦门快商通信息技术有限公司 A kind of service link monitoring method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Distributed system service chain tracing and monitoring; Zheng Bangfeng; Industrial Technology Innovation (02); pp. 60-64 *

Also Published As

Publication number Publication date
CN113114612A (en) 2021-07-13

Similar Documents

Publication Publication Date Title
CN107491488B (en) Page data acquisition method and device
CN109145023B (en) Method and apparatus for processing data
US11188443B2 (en) Method, apparatus and system for processing log data
CN113987074A (en) Distributed service full-link monitoring method and device, electronic equipment and storage medium
CN111309550A (en) Data acquisition method, system, equipment and storage medium of application program
CN111190888A (en) Method and device for managing graph database cluster
CN110928934A (en) Data processing method and device for business analysis
CN114090366A (en) Method, device and system for monitoring data
US20220138074A1 (en) Method, electronic device and computer program product for processing data
CN111158637A (en) Block chain-based random number generation method, equipment and storage medium
CN115357761A (en) Link tracking method and device, electronic equipment and storage medium
CN113596078A (en) Service problem positioning method and device
CN108764866B (en) Method and equipment for allocating resources and drawing resources
US20230269304A1 (en) Method and apparatus for processing notification trigger message
CN108011936B (en) Method and device for pushing information
US11704157B2 (en) Method and apparatus for comparing acquired cloud resource use information to thresholds to recommend a target cloud resource instance
CN110737655B (en) Method and device for reporting data
CN113760677A (en) Abnormal link analysis method, device, equipment and storage medium
CN113114612B (en) Determination method and device for distributed system call chain
CN112948138A (en) Method and device for processing message
CN114465919B (en) Network service testing method, system, electronic equipment and storage medium
CN108390770B (en) Information generation method and device and server
CN112579406A (en) Log call chain generation method and device
CN108933802B (en) Method and apparatus for monitoring operation
CN115220131A (en) Meteorological data quality inspection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant