CN115242613B

CN115242613B - Target node determining method and device

Info

Publication number: CN115242613B
Application number: CN202210926465.4A
Authority: CN
Inventors: 朱震宇
Original assignee: Zhejiang eCommerce Bank Co Ltd
Current assignee: Zhejiang eCommerce Bank Co Ltd
Priority date: 2022-08-03
Filing date: 2022-08-03
Publication date: 2024-03-15
Anticipated expiration: 2042-08-03
Also published as: CN118055004A; CN115242613A

Abstract

Embodiments of this specification provide a target node determining method and device. The method comprises the following steps: acquiring an operation log of an application service system, wherein the application service system comprises a preset number of services, and calling relations exist among the services; generating a service call knowledge graph according to the service call information in the operation log; and determining a target node in the service call knowledge graph according to a node access strategy and the service call knowledge graph, wherein the node access strategy is determined according to service call abnormal information. Because the service call knowledge graph is generated from the running logs of the services and contains the service call abnormal information, the target node can be determined according to a node access strategy derived from that abnormal information.

Description

Target node determining method and device

Technical Field

The embodiment of the specification relates to the technical field of data processing, in particular to a target node determining method.

Background

With the spread of Internet micro-service architectures, the number of enterprise applications keeps growing, and these applications interact with one another as micro-services to form service links. Executing a particular business task involves a plurality of application services. In a complex service link scenario the number of application service nodes increases; when an online abnormality occurs, each node has to be checked step by step, so emergency response is inefficient, labor consumption is high, and the root cause of the abnormality cannot be located in time. There is thus a need for a more efficient solution to the above-mentioned problems.

Disclosure of Invention

In view of this, the present embodiments provide a target node determining method. One or more embodiments of the present specification relate to a target node determining apparatus, a computing device, a computer-readable storage medium, and a computer program that solve the technical drawbacks of the prior art.

According to a first aspect of embodiments of the present disclosure, there is provided a target node determining method, including:

acquiring an operation log of an application service system, wherein the application service system comprises a preset number of services, and calling relations exist among the services;

generating a service call knowledge graph according to the service call information in the operation log, wherein the service call knowledge graph comprises nodes corresponding to the service and service call abnormal information;

and determining a target node in the service call knowledge graph according to a node access strategy and the service call knowledge graph, wherein the node access strategy is determined according to the service call abnormal information.

According to a second aspect of embodiments of the present specification, there is provided a target node determining apparatus, comprising:

an acquisition module configured to acquire an operation log of an application service system, wherein the application service system comprises a preset number of services, and calling relations exist among the services;

the generation module is configured to generate a service call knowledge graph according to the service call information in the operation log, wherein the service call knowledge graph comprises nodes corresponding to the service and service call abnormal information;

and the access module is configured to determine a target node in the service call knowledge graph according to a node access strategy and the service call knowledge graph, wherein the node access strategy is determined according to the service call abnormal information.

According to a third aspect of embodiments of the present specification, there is provided a computing device comprising:

a memory and a processor;

the memory is configured to store computer-executable instructions that, when executed by the processor, perform the steps of the above-described target node determination method.

According to a fourth aspect of embodiments of the present specification, there is provided a computer readable storage medium storing computer executable instructions which, when executed by a processor, implement the steps of the above-described target node determination method.

According to a fifth aspect of embodiments of the present specification, there is provided a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the above-described target node determination method.

Embodiments of this specification provide a target node determining method and device. The target node determining method comprises the following steps: acquiring an operation log of an application service system, wherein the application service system comprises a preset number of services, and calling relations exist among the services; generating a service call knowledge graph according to the service call information in the operation log, wherein the service call knowledge graph comprises nodes corresponding to the services and service call abnormal information; and determining a target node in the service call knowledge graph according to a node access strategy and the service call knowledge graph, wherein the node access strategy is determined according to the service call abnormal information. Because the service call knowledge graph is generated from the running logs of the services and contains the service call abnormal information, the target node can be determined according to a node access strategy derived from that abnormal information.

Drawings

Fig. 1 is a schematic view of a scenario of a target node determining method according to an embodiment of the present disclosure;

FIG. 2 is a flow chart of a method for determining a target node according to one embodiment of the present disclosure;

FIG. 3a is a schematic diagram of a call relationship tree diagram of a target node determining method according to an embodiment of the present disclosure;

fig. 3b is a schematic diagram of a service invocation knowledge graph of a target node determining method according to an embodiment of the present disclosure;

FIG. 3c is a schematic diagram of another service invocation knowledge graph of a target node determination method according to an embodiment of the present disclosure;

FIG. 4 is a process flow diagram of a method for determining a target node according to one embodiment of the present disclosure;

FIG. 5 is a schematic diagram of a method for determining a target node according to an embodiment of the present disclosure;

fig. 6 is a schematic structural diagram of a target node determining apparatus according to an embodiment of the present disclosure;

FIG. 7 is a block diagram of a computing device provided in one embodiment of the present description.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description can, however, be implemented in many ways other than those described herein, and those skilled in the art may generalize it similarly without departing from its spirit; this description is therefore not limited by the specific implementations disclosed below.

The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.

It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.

First, terms related to one or more embodiments of the present specification will be explained.

Micro-services: refers to breaking up an application into multiple smaller services or instances and running on different clusters/machines.

Root Cause Analysis (RCA): a structured problem-handling method that works step by step to find the root cause of a problem and resolve it, rather than attending only to the symptoms of the problem.

Service dependency: in a distributed software system, services provided by the system are provided based on a combination and collaboration between different sub-services. Service dependencies are often used to describe the call relationships of one service to other services in a system in order to complete a response to a request for that service.

Data cleaning: refers to the last procedure to find and correct identifiable errors in a data file, including checking for data consistency, processing invalid and missing values, etc.

Python: a computer language.

Java: a computer language.

Ruby: a computer language.

At present, root cause analysis in a micro-service architecture system constructs a specific analysis graph structure based on manual experience, collects global abnormal events when an abnormality occurs, and deduces and locates the root cause through the relations in that structure. For example, in the money transfer scenario of a financial enterprise, a plurality of application systems spanning a plurality of service links are used to keep the money transfer service running stably, and operations staff monitor service operation through link monitoring logs. When the transfer failure rate rises, the operations staff have to intervene, check one by one the service links that may have caused the abnormality, and analyze the abnormal state of each application, which increases labor consumption. Moreover, because the nodes of a link are complex, manual investigation may miss some of them, so its effectiveness and accuracy cannot meet requirements.

Conventional root cause analysis therefore relies heavily on expert experience: the relation structure among abnormal indicators and the deduction rules are built by hand, so any bias in that experience directly affects the abnormality localization result. When abnormal information is missing or several abnormal results appear, a more complex deduction strategy is needed, and the priority of the abnormal results cannot be evaluated well. In summary, this root cause analysis method is not suitable for abnormality localization when links are complex.

Based on this, in the present specification, a target node determining method is provided, and the present specification relates to a target node determining apparatus, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.

Referring to fig. 1, fig. 1 is a schematic view of a scenario of a target node determining method according to an embodiment of the present disclosure. As shown in the drawing, a micro-service architecture system includes application service instances A1, A2, A3, A4, A5, A6, A7, and A8, a database, and containers 1, 2, and 3, where containers 1, 2, and 3 all run on machines. When an error occurs in the system, an operation log of the system is obtained, a service call knowledge graph is generated according to the service call information in the operation log, and a target node in the service call knowledge graph is determined according to a node access strategy and the service call knowledge graph. The node access strategy comprises a weight probability, namely the probability of moving from one node to another node, and the weight probability is extracted according to historical characteristics. For example, where the choice is between an application-service-to-database relation and an application-service-to-application-service relation, the weight of the application-service-to-database relation is 0.6 and the weight of the application-service-to-application-service relation is 0.4. Where the choice is between an application-service-to-container relation and an application-service-to-application-service relation, the weights are 0.7 and 0.3 respectively. Where both relations are application-service-to-application-service, the weights are both 0.5.
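The example weights above can be read as a lookup from the pair of candidate node types to a weight split. A minimal sketch in Python, assuming a hypothetical table `TYPE_PAIR_WEIGHTS`, a hypothetical helper `candidate_weights`, and the type labels `"app"`, `"database"`, and `"container"` (none of these names appear in the specification; only the numeric weights do):

```python
# Hypothetical lookup table built from the example weights in the text.
# Each key is the set of node types of the two candidate next nodes that
# an application service can move to; each value maps a candidate type
# to its weight.
TYPE_PAIR_WEIGHTS = {
    frozenset({"database", "app"}): {"database": 0.6, "app": 0.4},
    frozenset({"container", "app"}): {"container": 0.7, "app": 0.3},
    frozenset({"app"}): {"app": 0.5},  # both candidates are application services
}


def candidate_weights(type_a, type_b):
    """Return (weight_a, weight_b) for two candidate node types."""
    table = TYPE_PAIR_WEIGHTS[frozenset({type_a, type_b})]
    return table[type_a], table[type_b]
```

For instance, `candidate_weights("database", "app")` yields the 0.6 / 0.4 split described for the database scenario. How such weights are actually extracted from historical characteristics is not specified here, so the table is fixed by hand.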

And generating a service call knowledge graph through the running log of the service, wherein the service call knowledge graph contains service call abnormal information, and determining a target node according to a node access strategy determined by the service call abnormal information in the service call knowledge graph.

Referring to fig. 2, fig. 2 shows a flowchart of a method for determining a target node according to an embodiment of the present disclosure, which specifically includes the following steps.

Step 202: and acquiring an operation log of an application service system, wherein the application service system comprises a preset number of services, and calling relations exist among the services.

The application service system can be a system based on a micro-service architecture, and can also be an application program, and the operation log can be an operation log such as the occupancy rate of a memory, the occupancy rate of a processor or the record of calling other services recorded during service operation; the preset number can be the number of services in the micro-service architecture or the number of services in the application program; the service may be an application service such as: the commodity inquiry service, commodity ordering service, and the like may also be a database, a container, and the like, specifically, for example, a python-based product page service, a java-based reviews service, a ruby-based details service, and the like.

In practical applications, services in a micro-service architecture system depend on one another, and there may be hundreds or thousands of associated services. When one or more services have problems, the root cause is therefore difficult to determine, because a large number of related services may be involved. Since the running logs of the services are stored in the micro-service architecture system, the running logs can be obtained and then analyzed to determine the root cause of the abnormality.

For example, if an error condition occurs in the system based on the micro-service architecture, index data such as service call, error log, application CPU, application memory, application thread number and the like in the system based on the micro-service architecture is obtained.

According to the embodiment of the specification, the running log of the system is obtained according to the error condition, and subsequent analysis can be carried out according to the running log, so that the accuracy of determining the target node is improved.

In one implementation manner, the obtaining the running log of the application service system includes:

acquiring an initial log of the application service system;

and under the condition that missing data exists in the initial log, carrying out the deficiency supplementing processing on the initial log to obtain an operation log.

The initial log may be an unmodified log, i.e., a log extracted directly from the system. Missing data in the initial log means that the log lacks data for some period of time or for some part; for example, the log of service A between three and four o'clock is missing. The deficiency supplementing processing may predict the missing data from the data immediately before and after it; for example, if the log of service A between three and four o'clock is missing, the missing data is supplemented from the data of the preceding period (two to three o'clock) and the data of the following period (four to five o'clock).

In practical application, an acquisition application link is configured; index data such as application service calls, error logs, application CPU, application memory, and application thread counts are acquired from an application monitoring platform according to the application service logs; and the index data is cleaned and its gaps are filled to form an application service call link. That is, after the operation log is acquired it needs to undergo data cleaning; because the log may contain missing data, deficiency supplementing processing is then performed on the data in the log.

For example, if the initial log lacks the processor occupancy of service A between three and four o'clock, the data of the preceding period (two to three o'clock) and of the following period (four to five o'clock) can be found in the initial log, and the data for three to four o'clock is supplemented from them. Specifically, if the processor occupancy is thirty percent from two to three o'clock and fifty percent from four to five o'clock, the two values may be averaged to obtain the processor occupancy of service A from three to four o'clock: thirty percent plus fifty percent equals eighty percent, and eighty percent divided by two gives forty percent.

Any method for processing the missing value may be used for the missing processing of the data, and the embodiments of the present invention are not limited.
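The averaging example can be sketched as follows. This is only one possible missing-value method, shown under the assumptions that the metric is stored as a list with one value per hour and that a missing hour is represented by `None`; the function name `fill_missing_with_neighbor_mean` is hypothetical.

```python
def fill_missing_with_neighbor_mean(series):
    """Fill None entries with the mean of the nearest known values
    before and after the gap (assumed representation: one value per period).

    If only one side has a known value, that value is used; if neither
    side has one, the entry stays None.
    """
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        # Nearest known value before the gap, if any.
        prev_val = next((v for v in reversed(series[:i]) if v is not None), None)
        # Nearest known value after the gap, if any.
        next_val = next((v for v in series[i + 1:] if v is not None), None)
        neighbors = [v for v in (prev_val, next_val) if v is not None]
        if neighbors:
            filled[i] = sum(neighbors) / len(neighbors)
    return filled


# Processor occupancy of service A: 2-3 o'clock, 3-4 o'clock (missing), 4-5 o'clock.
cpu_occupancy = [30.0, None, 50.0]
print(fill_missing_with_neighbor_mean(cpu_occupancy))
```

With the values from the text, the missing three-to-four o'clock entry is filled with (30 + 50) / 2 = 40.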

According to the embodiment of the specification, the missing data is subjected to the deficiency supplementing processing according to the context, so that the target node can be continuously searched under the condition that the data is incomplete, and the usability of the scheme is improved.

Step 204: and generating a service call knowledge graph according to the service call information in the operation log, wherein the service call knowledge graph comprises nodes corresponding to the service and service call abnormal information.

Since the running log is acquired in step 202, a service call knowledge graph may be generated according to the service call information in the running log.

The service call information may be information generated by calls between services; for example, the service call information is that service A calls service B. The service call knowledge graph may be a knowledge graph generated according to the calls between services, that is, the service call knowledge graph shows the call relations between services. The service call exception information may be exception information arising between services; for example, if the processor occupancy is too high when service A calls service B, the service call exception information is determined to be an excessively high processor occupancy.

In practical application, the service call exception information may be of multiple types, including access errors, memory occupancy, processor occupancy, and the like. Based on the obtained application service call link, identical nodes in the tree-like link structure are merged, the link is converted into a knowledge graph structure, and abnormal data such as the number of application service errors is retained on the edges.

For example, if the running log corresponding to service A shows that service A needs to call service B, a service call relation exists between service A and service B. If the processor occupancy is too high when service A calls service B, the service call exception information is the excessively high processor occupancy. A service call knowledge graph is then generated from the call relation between service A and service B, and the exception information indicating the excessively high processor occupancy is retained on the path between service A and service B in the service call knowledge graph.
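One way to picture a graph that keeps exception information on its edges is a mapping from (caller, callee) pairs to the anomalies observed on that call. The sketch below assumes a hypothetical record format with the keys `"caller"`, `"callee"`, and `"anomaly"`; the specification does not define a log schema, so these names and the function `build_call_graph` are illustrative only.

```python
from collections import defaultdict


def build_call_graph(call_records):
    """Aggregate call records into edges that carry anomaly information.

    call_records: iterable of dicts with the assumed keys "caller",
    "callee", and "anomaly" (None when the call showed no anomaly).
    Returns {(caller, callee): [anomaly, ...]}.
    """
    edges = defaultdict(list)
    for record in call_records:
        # Looking up the key creates the edge even if it has no anomalies.
        anomalies = edges[(record["caller"], record["callee"])]
        if record.get("anomaly") is not None:
            anomalies.append(record["anomaly"])
    return dict(edges)


records = [
    {"caller": "A", "callee": "B", "anomaly": "processor occupancy too high"},
    {"caller": "A", "callee": "C", "anomaly": None},
]
graph = build_call_graph(records)
```

Here the edge from service A to service B retains the processor-occupancy anomaly, while the edge from service A to service C exists but carries no anomaly.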

According to the embodiment of the specification, the service call knowledge graph is generated according to the service call information, the service call abnormal information is reserved, and the target node can be conveniently searched according to the service call abnormal information.

In one possible implementation manner, the generating a service call knowledge graph according to the service call information in the running log includes:

determining service calling relations among the services according to the service calling information in the operation log;

generating a service call tree diagram according to the service and the service call relation between the services;

and processing the service call relation in the service call tree diagram to generate a service call knowledge graph.

The service calling relationship is a service dependency relationship between services, for example, the service calling relationship is that the service A needs to call the service B, and the service calling relationship exists between the service A and the service B; the service invocation tree graph may be a tree graph including invocation relationships between services; the service call knowledge graph can be a knowledge graph obtained by processing paths in a tree graph based on the service call tree graph.

In practical application, to make the call relations between services easier to process, a service call relationship tree diagram may be generated first. For example, referring to fig. 3a, fig. 3a shows a schematic diagram of a call relationship tree in which services 1 to 6 are nodes. Service 1 depends on service 2 and service 3. Service 2 depends on service 4 and service 5, and service 5 depends on service 6; service 3 depends on service 5 and service 6, service 5 depends on service 6, and service 6 depends on service 4. The service call relationship tree diagram thus contains several identical dependency relations, which need to be processed to simplify the generated knowledge graph.

According to the embodiment of the specification, the service call tree diagram is generated according to the service call information and is then simplified to obtain the service call knowledge graph, which facilitates the subsequent traversal search on the service call knowledge graph.

Specifically, the processing of the service call relationships in the service call tree graph may be simplified processing, and specific embodiments are described below.

The processing the service call relation in the service call tree diagram to generate a service call knowledge graph comprises the following steps:

determining the same service calling relation in the service calling tree diagram, and merging the same service calling relation to obtain an initial knowledge graph;

and deleting the service call relation without the service call abnormal information in the initial knowledge graph to generate a service call knowledge graph.

The same service call relation can be understood as an identical call relation that appears in different branches of the tree diagram. For example, in the above embodiment service 2 depends on service 4 and service 5, and service 5 depends on service 6; service 3 depends on service 5 and service 6, and service 5 depends on service 6. Two identical call relations therefore exist, namely service 5 depends on service 6. A service call relation without service call abnormal information can be understood as a call relation between services for which no abnormal information exists; for example, service 7 depends on service 8, but no abnormal information has occurred.

In practical application, call relations with identical paths in the service call knowledge graph can be merged, and call relations that carry no abnormal information can be deleted, so that the service call knowledge graph is simplified.

Continuing the above example, referring to fig. 3b, fig. 3b shows a schematic diagram of a service call knowledge graph generated from the service call tree diagram in the above embodiment. Service 2 depends on service 4 and service 5, and service 5 depends on service 6; service 3 depends on service 5 and service 6, and service 5 depends on service 6. The same call relation, service 5 depends on service 6, therefore appears twice. Likewise, because service 3 depends on service 5 and service 6, service 5 depends on service 6, and service 6 depends on service 4, the call relation service 6 depends on service 4 also appears twice. The identical relations in the service call knowledge graph need to be merged. Referring to fig. 3c, fig. 3c shows a schematic diagram of another service call knowledge graph: the two paths in which service 5 depends on service 6 and the two paths in which service 6 depends on service 4 are merged respectively, which yields the service call knowledge graph in fig. 3c.

According to the embodiment of the specification, paths in the service call knowledge graph are merged according to identical call relations, and call relations without abnormal information are deleted so that unnecessary paths are avoided in the subsequent search step, which improves search efficiency.
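The two simplification steps, merging identical call relations and deleting relations without abnormal information, can be sketched as below. The sketch assumes each tree edge is a hypothetical tuple `(caller, callee, error_count)` and that error counts of duplicate edges are summed when merging; the specification does not state how abnormal data on merged edges is combined, so the summation, the error counts, and the name `merge_and_prune` are assumptions.

```python
from collections import defaultdict


def merge_and_prune(tree_edges):
    """Merge identical call relations, then drop relations without anomalies.

    tree_edges: iterable of (caller, callee, error_count) tuples, one per
    edge occurrence in the call tree (assumed format).
    Returns {(caller, callee): total_error_count} for edges with errors.
    """
    merged = defaultdict(int)
    for caller, callee, error_count in tree_edges:
        # Step 1: identical (caller, callee) pairs collapse into one edge.
        merged[(caller, callee)] += error_count
    # Step 2: keep only edges that carry abnormal information.
    return {edge: errors for edge, errors in merged.items() if errors > 0}


# Edge occurrences following the dependency list given for the call tree;
# the error counts are made up for illustration.
tree_edges = [
    (1, 2, 3), (1, 3, 5),
    (2, 4, 0), (2, 5, 2), (5, 6, 4),
    (3, 5, 1), (3, 6, 0), (5, 6, 4), (6, 4, 7),
]
knowledge_graph = merge_and_prune(tree_edges)
```

In this illustration the two occurrences of "service 5 depends on service 6" become a single edge, and the error-free edges (2, 4) and (3, 6) are removed.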

Further, before determining the target node in the service invocation knowledge graph according to the node access policy and the service invocation knowledge graph, the method further includes:

and generating a node access strategy according to the service call abnormality information.

The above-mentioned node access policy may be a policy specifying how to query for abnormal nodes in the service call knowledge graph.

In one implementation manner, the generating a node access policy according to the service call anomaly information includes:

and generating a node access strategy according to the service call abnormal information corresponding to the service call relation.

Wherein, the service call relationship can be understood as the call relationship in the above embodiment; the service call exception information corresponding to the service call relationship may be understood as exception information generated during the call process of the service, for example, in the case that the service a calls the service B, the processor occupancy rate is too high.

In practical application, because the exception information represents the situation that the service call has an exception problem, a node access policy can be generated according to the exception information, and the node access policy is used for searching for an exception node. In addition, when there are at least two service call relationships, there is a problem of access selection, that is, the node access policy may be a policy for accessing one of the nodes when there are at least two service call relationships.

For example, the running log contains a record that service A calls service B and also a record that service A calls service C. It can then be determined that a service call relation exists between service A and service B and between service A and service C, and the node access policy is generated according to the service call abnormal information corresponding to service A calling service B and the service call abnormal information corresponding to service A calling service C.

According to the embodiment of the specification, the node access strategy is generated according to the service call abnormal information corresponding to the service call relation, and when the scene matching rule is prepared, the positioning scene adaptation of each service link can be performed by properly adjusting the node weight initial strategy, so that the method has strong universality.

In one implementation manner, the generating the node access policy according to the service call exception information corresponding to the service call relationship includes:

generating an abnormal information weight according to the service call abnormal information corresponding to the service call relation;

and determining access probability according to the ratio of the abnormal information weights, and generating a node access strategy according to the access probability.

The abnormal information weight may be different weights given to policies according to different node types when selecting the next node in the graph node migration process (access process), for example, service a is an application service, service B is a container, service C is a database, the abnormal information weight of service a to service B is 0.6, and the abnormal information weight of service a to service C is 0.8; the access probability may be a probability that one service accesses another service, for example, the anomaly information weight of service a to service B is 0.6 and the anomaly information weight of service a to service C is 0.8, and then the probability that service a accesses service B is three-sevenths and the probability that service a accesses service C is four-sevenths.

Following the above example, the running log contains a record of service A calling service B and a record of service A calling service C. It can then be determined that a service call relationship exists between service A and service B and between service A and service C. The service call abnormal information corresponding to service A calling service B is determined: the number of errors is 80. The service call abnormal information corresponding to service A calling service C is determined: the number of errors is 20. The abnormal information weight of service A to service B generated according to the number of errors is therefore 80, and the abnormal information weight of service A to service C is 20. It can then be determined that the probability of service A accessing service B is eighty percent and the probability of service A accessing service C is twenty percent. The generated node access policy is: where service A has two access paths, one being service A accessing service B and the other being service A accessing service C, the probability of service A accessing service B is eighty percent and the probability of service A accessing service C is twenty percent.

In another implementation, the access weight may also be determined according to the type of node and the number of errors, as described below.

For another example, service A is an application service, service B is a container, and service C is a database, and the running log contains a record of service A calling service B and a record of service A calling service C. It can then be determined that a service call relationship exists between service A and service B and between service A and service C. The service call abnormal information and type weight corresponding to service A calling service B are determined: the number of errors is 80, and the type weight of service A to service B is 0.6. The service call abnormal information and type weight corresponding to service A calling service C are determined: the number of errors is 20, and the type weight of service A to service C is 0.8. The abnormal information weight of service A to service B generated according to the number of errors and the type weight is 80 multiplied by 0.6, which equals 48, and the abnormal information weight of service A to service C is 20 multiplied by 0.8, which equals 16. It can then be determined that the probability of service A accessing service B is seventy-five percent and the probability of service A accessing service C is twenty-five percent. The generated node access policy is: where service A has two access paths, one being service A accessing service B and the other being service A accessing service C, the probability of service A accessing service B is seventy-five percent and the probability of service A accessing service C is twenty-five percent.
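The two weight calculations above can be illustrated with a minimal Python sketch. The function name access_probabilities and the uniform fallback for the case of zero total weight are assumptions made for illustration and are not part of the embodiment; only the numbers are taken from the examples above.

```python
def access_probabilities(error_counts, type_weights=None):
    """Normalize per-edge abnormal information weights into access probabilities.

    error_counts: {callee: number of errors on the call from the current node}
    type_weights: optional {callee: type weight}; hypothetical parameter name.
    """
    weights = {}
    for callee, errors in error_counts.items():
        # Without type weights, the abnormal information weight is the error count.
        factor = 1.0 if type_weights is None else type_weights[callee]
        weights[callee] = errors * factor
    total = sum(weights.values())
    if total == 0:
        # Assumption: with no recorded errors, fall back to a uniform choice.
        return {callee: 1.0 / len(weights) for callee in weights}
    return {callee: w / total for callee, w in weights.items()}


# Service A calls service B (80 errors) and service C (20 errors).
by_errors = access_probabilities({"B": 80, "C": 20})
# Same error counts, with type weights 0.6 (container) and 0.8 (database).
by_type = access_probabilities({"B": 80, "C": 20}, {"B": 0.6, "C": 0.8})
print(by_errors)
print(by_type)
```

With the error counts alone, the sketch yields probabilities of about 0.8 and 0.2; with the type weights applied, the weights become 48 and 16 and the probabilities about 0.75 and 0.25, matching the two examples.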

It should be noted that the above weight calculation method is only an exemplary description, and the anomaly information weight may be determined by other calculation methods, which is not limited in the embodiments of the present disclosure.

According to the embodiment of the specification, the range of abnormality location can be enlarged by only supplementing the corresponding service node, if abnormality of a physical machine needs to be located, only the physical machine node and the edge data need to be supplemented, and a large amount of historical data training is not needed.

Step 206: and determining a target node in the service call knowledge graph according to a node access strategy and the service call knowledge graph, wherein the node access strategy is determined according to the service call abnormal information.

Because the node access policy has been generated in the preceding step, the target node in the service call knowledge graph can now be determined according to the node access policy and the service call knowledge graph.

The node access policy may be understood as the node access policy in the above embodiment, and the target node may be understood as the abnormal node that is found.

In practical application, the nodes in the service call knowledge graph are accessed through the node access strategy, and the final stay node is the target node, namely the abnormal node.

In one implementation manner, the determining, according to a node access policy and the service invocation knowledge graph, a target node in the service invocation knowledge graph includes:

and according to the preset access times and the preset access steps, performing node access in the service call knowledge graph by using a node access strategy, and determining a target node in the service call knowledge graph.

The preset access times can be understood as the number of times the service call knowledge graph is accessed. For example, if the service call knowledge graph is accessed 100 times, that is, the target node is determined 100 times, the preset access times are 100. The preset number of access steps can be understood as the number of node-to-node steps taken within one access of the service call knowledge graph. For example, if 100 steps are taken in the service call knowledge graph to determine the target node, the preset number of access steps is 100.

In practical application, because the node access policy is based on access probabilities, the nodes in the service call knowledge graph need to be accessed multiple times to obtain an accurate result. That is, the node walk is performed multiple times: each walk starts from a node that has no upstream node and proceeds according to the weighted walk policy until the most downstream abnormal node is found. The walk is then repeated, and the abnormal node found in each walk is recorded.

It should be noted that the larger the preset access times, the more accurate the obtained result, so the preset access times can be set according to actual requirements, and the embodiments of the present specification do not limit them. Similarly, the larger the preset number of access steps, the more accurate the obtained result, so the preset number of access steps can also be set according to actual requirements, and the embodiments of the present specification do not limit it.

Specifically, according to the preset access times and the preset access steps, performing node access in the service call knowledge graph by using a node access policy, and determining a target node in the service call knowledge graph includes:

s2, determining a current node in the service call knowledge graph;

s4, determining an associated node which has a service calling relation with the current node in the service calling knowledge graph;

s6, accessing the associated node according to the access probability of the current node and the associated node, and determining the current access step number;

s8, judging whether the current access step number is greater than or equal to the preset access step number;

if yes, ending the current access, and continuing to execute step S2 for the next access;

if not, determining the associated node as the current node, continuing to execute step S4 until the preset access condition is met, and determining the current access times;

judging whether the current access times are larger than or equal to the preset access times,

if yes, ending;

if not, the current access times are increased by 1, and the step S2 is continuously executed.

The current node may be any node in the service call knowledge graph and can be understood as the node from which the access is initiated. For example, if service A accesses service B, the current node is service A. An associated node may be a node that the current node can access. For example, if there is an access path between service A and service B and an access path between service A and service C, the associated nodes of service A include service B and service C.

In practical application, a double-loop execution logic may be set: the inner loop walks the graph up to the preset number of access steps to determine a target node, and each time a target node is determined in this way, the count of access times is increased by one, until the preset access times are reached.
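The double-loop logic of steps S2 to S8 can be sketched in Python as follows. This is only a sketch under stated assumptions: the graph is represented as a dictionary of caller to {callee: access probability}, edges are followed only from caller to callee, a walk stops early when the current node has no associated node, and the names walk_once and locate are hypothetical. The probabilities are those of the service 1 to service 6 example in this description.

```python
import random
from collections import Counter


def walk_once(graph, start, max_steps, rng):
    """One access: walk from `start` for at most `max_steps` steps."""
    current = start                          # S2: determine the current node
    for _ in range(max_steps):               # S8: bounded by the preset access steps
        neighbors = graph.get(current)       # S4: associated nodes of the current node
        if not neighbors:
            break                            # assumption: most downstream node reached
        targets = list(neighbors)
        probs = [neighbors[t] for t in targets]
        # S6: pick the next node according to the access probabilities
        current = rng.choices(targets, weights=probs, k=1)[0]
    return current                           # node where the walk finally stays


def locate(graph, start, access_times, max_steps, seed=0):
    """Outer loop: repeat the walk `access_times` times and count the results."""
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(access_times):
        hits[walk_once(graph, start, max_steps, rng)] += 1
    return hits


graph = {
    "service 1": {"service 2": 0.8, "service 3": 0.2},
    "service 2": {"service 4": 0.75, "service 5": 0.25},
    "service 3": {"service 5": 0.5, "service 6": 0.5},
    "service 5": {"service 6": 1.0},
    "service 6": {"service 4": 1.0},
    "service 4": {},
}
hits = locate(graph, "service 1", access_times=100, max_steps=100)
print(hits)
```

Because every caller-to-callee path in this assumed graph ends at service 4, all 100 walks of the sketch stop there. A graph that also allows access in the reverse direction, as in the examples where service 4 accesses service 2, would require the cyclic-access handling described later.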

For example, the preset access times are 100 and the preset number of access steps is 100. Service 1 depends on service 2 and service 3; service 2 depends on service 4 or service 5, and service 5 depends on service 6; service 3 depends on service 5 and service 6, service 5 depends on service 6, and service 6 depends on service 4. The access probability of service 1 to service 2 is eighty percent, that of service 1 to service 3 is twenty percent, that of service 2 to service 5 is twenty-five percent, that of service 2 to service 4 is seventy-five percent, that of service 3 to service 5 is fifty percent, and that of service 3 to service 6 is fifty percent. In addition, service 6 only has an access path to service 4, so its access probability is one, which can be understood as there being only one node that it can access; likewise, service 5 only has an access path to service 6, so its access probability is one. Then, in the first access, node access starts from service 1. First step: service 1 chooses to access service 2. Second step: service 2 chooses to access service 4. Third step: service 4 chooses to access service 2, and so on until step 100. If the walk stays at service 4 at step 100, service 4 is determined as the target node.

For another example, the preset access times are 100 and the preset number of access steps is 100. Service 1 depends on service 2 and service 3; service 2 depends on service 4 or service 5, and service 5 depends on service 6; service 3 depends on service 5 and service 6, service 5 depends on service 6, and service 6 depends on service 4. The access probability of service 1 to service 2 is eighty percent, that of service 1 to service 3 is twenty percent, that of service 2 to service 5 is twenty-five percent, that of service 2 to service 4 is seventy-five percent, that of service 3 to service 5 is fifty percent, and that of service 3 to service 6 is fifty percent. In addition, service 6 only has an access path to service 4, so its access probability is one, and service 5 only has an access path to service 6, so its access probability is one. Then, in the first access, node access starts from service 1. First step: service 1 chooses to access service 2. Second step: service 2 chooses to access service 4. Third step: service 4 chooses to access service 2, and so on. If, at step 25, service 4 is found to be the most downstream node, service 4 is determined as the target node.

It should be noted that two or more target nodes may be determined, in which case cyclic access may occur; in the case of cyclic access, the target nodes may be determined according to the preset access condition described below.

In one possible implementation manner, the preset access condition includes:

the number of consecutive accesses between the current node and the associated node of the current node is greater than or equal to a preset count threshold.

The preset count threshold may be a threshold smaller than 10, for example, 5.

In practical application, if the access cycles between two nodes multiple times, both nodes may be target nodes, and the access can then be ended in advance.

For example, in the first access, node access starts from service 1. First step: service 1 chooses to access service 2. Second step: service 2 chooses to access service 4. Third step: service 4 chooses to access service 2. The fourth, fifth, sixth and seventh steps all cycle between service 4 and service 2, so the access can be ended in advance, and service 4 and service 2 are determined to be target nodes.
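The preset access condition can be checked with a small helper such as the following Python sketch. The function name cycling_pair and the representation of a walk as a list of visited nodes are assumptions made for illustration; the embodiment does not prescribe how the consecutive accesses are counted.

```python
def cycling_pair(path, threshold):
    """Return the two nodes the walk is bouncing between, or None.

    path: nodes visited so far, in order (hypothetical representation).
    threshold: preset count threshold of consecutive accesses between two nodes.
    """
    if threshold < 1 or len(path) < threshold + 1:
        return None
    tail = path[-(threshold + 1):]       # covers `threshold` consecutive moves
    a, b = tail[-2], tail[-1]
    if a == b:
        return None
    # Every node in the tail must alternate between a and b, ending at b.
    last = len(tail) - 1
    expected = [b if (last - i) % 2 == 0 else a for i in range(len(tail))]
    return (a, b) if tail == expected else None


# Steps one to seven of the example: 1 -> 2 -> 4 -> 2 -> 4 -> 2 -> 4 -> 2
path = ["service 1", "service 2", "service 4", "service 2",
        "service 4", "service 2", "service 4", "service 2"]
pair = cycling_pair(path, threshold=5)
print(pair)
```

In this sketch the walk of the example is reported as cycling between service 4 and service 2 once at least five consecutive accesses have occurred between them, at which point both nodes can be taken as target nodes and the access ended.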

In one implementation manner, after the determining the target node in the service invocation knowledge graph, the method further includes:

determining the occurrence number of each target node under the condition that the number of the target nodes is at least two;

and sequencing according to the occurrence times to generate an abnormal positioning result, and displaying the abnormal positioning result and the service calling knowledge graph.

In practical application, the abnormality positioning results are sorted: the number of occurrences of each abnormal node is counted, and the nodes are sorted by the number of occurrences, which gives the priority of the suspected abnormality root causes, the node with the largest number of occurrences being the most likely root cause. A graph result output is then constructed by assembling the abnormality propagation path graph and the root cause positioning result, and the result is finally output.

For example, the preset access times are 100, that is, 100 accesses need to be completed in the service call knowledge graph. After the 100 accesses are completed, the target node results can be counted. If, over the 100 accesses, service 4 is determined to be the target node 56 times and service 6 is determined to be the target node 44 times, the results are sorted as service 4 (56 times) followed by service 6 (44 times) and displayed together with the service call knowledge graph.
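The counting and sorting of target nodes can be sketched in Python with the standard library's collections.Counter. The function name rank_targets is hypothetical, and the input list simply reproduces the counts of the example above.

```python
from collections import Counter


def rank_targets(targets):
    """Count how often each node was the target and sort by count, descending."""
    return Counter(targets).most_common()


# 100 accesses: service 4 was the target 56 times, service 6 was the target 44 times.
results = ["service 4"] * 56 + ["service 6"] * 44
ranking = rank_targets(results)
print(ranking)  # [('service 4', 56), ('service 6', 44)]
```

The first entry of the ranking is the most likely abnormality root cause in the sense described above.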

According to the embodiment of the specification, the result of the target node is sequenced to obtain the abnormal positioning result, and the abnormal positioning result and the service call knowledge graph are displayed, so that a worker can intuitively see the result of the abnormal node determination.

The embodiment of the specification provides a target node determining method and device, wherein the target node determining method includes: acquiring an operation log of an application service system, wherein the application service system comprises a preset number of services, and calling relations exist among the services; generating a service call knowledge graph according to the service call information in the operation log, wherein the service call knowledge graph comprises nodes corresponding to the services and service call abnormal information; and determining a target node in the service call knowledge graph according to a node access policy and the service call knowledge graph, wherein the node access policy is determined according to the service call abnormal information. A service call knowledge graph containing service call abnormal information is generated from the running log of the services, and the target node is determined in the service call knowledge graph according to the node access policy determined from the service call abnormal information.

The method for determining the target node is further described below with reference to fig. 4 by taking an application of the method for determining the target node provided in the present specification to a server as an example. Fig. 4 is a flowchart of a process of a method for determining a target node according to an embodiment of the present disclosure, which specifically includes the following steps.

Step 402: acquiring an initial log of the application service system, and performing deficiency supplementing processing on the initial log to obtain an operation log under the condition that the missing data exists in the initial log.

For example, in the initial log, the data of service 1 lacks the processor occupancy rate for the period from three o'clock to four o'clock. The data for the preceding period, from two o'clock to three o'clock, and for the following period, from four o'clock to five o'clock, can then be found in the initial log, and deficiency supplementing processing is performed on the three-to-four o'clock data according to them. Specifically, if the processor occupancy rate is thirty percent from two o'clock to three o'clock and fifty percent from four o'clock to five o'clock, the two values may be added and averaged, and the average is taken as the processor occupancy rate of service 1 from three o'clock to four o'clock: thirty percent plus fifty percent equals eighty percent, and eighty percent divided by two gives forty percent.
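The averaging described above can be sketched in Python as follows. The function name fill_missing and the use of None to mark a missing period are assumptions made for illustration; other deficiency supplementing methods are equally possible.

```python
def fill_missing(series):
    """Fill each missing value (None) with the mean of its two neighbours.

    A value is filled only when both the preceding and the following period
    are present; otherwise it is left as None.
    """
    filled = list(series)
    for i, value in enumerate(series):
        if value is None and 0 < i < len(series) - 1:
            before, after = series[i - 1], series[i + 1]
            if before is not None and after is not None:
                filled[i] = (before + after) / 2
    return filled


# Processor occupancy of service 1 (percent) for 2-3, 3-4 and 4-5 o'clock.
cpu = [30.0, None, 50.0]
print(fill_missing(cpu))  # [30.0, 40.0, 50.0]
```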

Step 404: and determining service calling relations among the services according to the service calling information in the operation log.

For example, service 2 depends on service 4 or service 5, and service 5 depends on service 6; service 3 depends on service 5 and service 6, service 5 depends on service 6, and service 6 depends on service 4. The call relationship in which service 5 depends on service 6 therefore appears more than once in the determined service call relationships.

Step 406: and generating a service call tree diagram according to the service and the service call relation among the services, processing the service call relation in the service call tree diagram, and generating a service call knowledge graph.

For example, there are two identical call relationships in which service 6 depends on service 4. Identical call relationships need to be merged to obtain the service call knowledge graph: the two paths from service 5 to service 6 can be merged, and the two paths from service 6 to service 4 can be merged, so that the service call knowledge graph is obtained.
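The merging of identical call relationships can be sketched in Python as follows. The function name merge_call_paths is hypothetical, and the list of root-to-leaf paths is an assumed enumeration of the service call tree of the example; the embodiment does not specify how the tree is stored.

```python
def merge_call_paths(paths):
    """Merge root-to-leaf call paths into a graph with one edge per call relation."""
    graph = {}
    for path in paths:
        for caller, callee in zip(path, path[1:]):
            # A set keeps each caller -> callee relation only once.
            graph.setdefault(caller, set()).add(callee)
            graph.setdefault(callee, set())
    return graph


# Assumed enumeration of the call tree: 5 -> 6 and 6 -> 4 appear in several paths.
paths = [
    [1, 2, 4],
    [1, 2, 5, 6, 4],
    [1, 3, 5, 6, 4],
    [1, 3, 6, 4],
]
graph = merge_call_paths(paths)
print(graph)
```

After merging, service 5 has a single edge to service 6 and service 6 has a single edge to service 4, regardless of how many times those relations occur in the tree.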

Step 408: and generating a node access strategy according to the service call abnormal information corresponding to the service call relation.

Following the above example, the running log contains a record of service 1 calling service 2 and a record of service 1 calling service 3. It can then be determined that a service call relationship exists between service 1 and service 2 and between service 1 and service 3. The service call abnormal information corresponding to service 1 calling service 2 is determined: the number of errors is 80. The service call abnormal information corresponding to service 1 calling service 3 is determined: the number of errors is 20. The abnormal information weight of service 1 to service 2 generated according to the number of errors is therefore 80, and the abnormal information weight of service 1 to service 3 is 20. It can then be determined that the probability of service 1 accessing service 2 is eighty percent and the probability of service 1 accessing service 3 is twenty percent. The generated node access policy is: where service 1 has two access paths, one being service 1 accessing service 2 and the other being service 1 accessing service 3, the probability of service 1 accessing service 2 is eighty percent and the probability of service 1 accessing service 3 is twenty percent.

For another example, service 1 is an application service, service 2 is a container, and service 3 is a database, and the running log contains a record of service 1 calling service 2 and a record of service 1 calling service 3. It can then be determined that a service call relationship exists between service 1 and service 2 and between service 1 and service 3. The service call abnormal information and type weight corresponding to service 1 calling service 2 are determined: the number of errors is 80, and the type weight of service 1 to service 2 is 0.6. The service call abnormal information and type weight corresponding to service 1 calling service 3 are determined: the number of errors is 20, and the type weight of service 1 to service 3 is 0.8. The abnormal information weight of service 1 to service 2 generated according to the number of errors and the type weight is 80 multiplied by 0.6, which equals 48, and the abnormal information weight of service 1 to service 3 is 20 multiplied by 0.8, which equals 16. It can then be determined that the probability of service 1 accessing service 2 is seventy-five percent and the probability of service 1 accessing service 3 is twenty-five percent. The generated node access policy is: where service 1 has two access paths, one being service 1 accessing service 2 and the other being service 1 accessing service 3, the probability of service 1 accessing service 2 is seventy-five percent and the probability of service 1 accessing service 3 is twenty-five percent.

Step 410: and performing node access in the service call knowledge graph by using a node access strategy.

For example, the preset number of accesses is 100, the preset number of access steps is 100, service 1 depends on service 2 and service 3, service 2 depends on service 4 or service 5, and service 5 depends on service 6; service 3 depends on service 5 and service 6, service 5 depends on service 6, and service 6 depends on service 4. The access probability of service 1 to service 2 is eighty percent, the probability of service 1 to service 3 is twenty percent, the probability of service 2 to service 5 is twenty five percent, the probability of service 2 to service 4 is seventy five percent, the probability of service 3 to service 5 is fifty percent, the probability of service 3 to service 6 is fifty percent, and in addition, service 6 only has an access path to service 4, the access probability is one, service 5 only has an access path to service 6, and the access probability is one. Then in the first access times, it is determined that node access is performed from service 1, and the first step is: service 1 chooses to access service 2.

Step 412: judging whether the preset number of access steps is reached. If yes, go to step 414. If not, go to step 410.

For example, it is determined whether the preset number of access steps, 100, is reached. If not, the walk continues with the second step, in which service 2 chooses to access service 4, and the third step, in which service 4 chooses to access service 2, and so on.

Step 414: and determining a target node in the service calling knowledge graph.

For example, if the walk stays at service 4 when step 100 is reached, service 4 is determined to be the target node.

Step 416: judging whether the preset access times are reached. If yes, go to step 418. If not, go to step 410.

For example, after determining that the service 4 is the target node, it is determined whether the preset number of accesses is reached, and if so, step 418 is performed.

Step 418: and under the condition that the number of the target nodes is at least two, determining the occurrence number of each target node, sorting according to the occurrence number to generate an abnormal positioning result, and displaying the abnormal positioning result and the service call knowledge graph.

The target node results are counted. If, over the 100 accesses, service 4 is determined to be the target node 56 times and service 6 is determined to be the target node 44 times, the results are sorted as service 4 (56 times) followed by service 6 (44 times) and displayed together with the service call knowledge graph.

A service call knowledge graph containing service call abnormal information is generated from the running log of the services, and the target node is determined in the service call knowledge graph according to the node access policy determined from the service call abnormal information.

In practical application, referring to fig. 5, the target node determining method in the embodiments of the present disclosure may be implemented as an engineering configuration that includes a unified link configuration module 502, a weight assignment module 504, a propagation map construction module 506, a policy running module 508, and a root cause result assembly module 510. The unified link configuration module 502 is configured to obtain an operation log of the application service system and generate a call link according to the operation log; the weight assignment module 504 is configured to set the access weights between nodes; the propagation map construction module 506 is configured to generate a knowledge graph according to the call link and the set weights; the policy running module 508 is configured to perform walk access on the knowledge graph according to a preset policy; and the root cause result assembly module 510 is configured to combine the obtained knowledge graph and the abnormal node recommendation result for display. Further, the policy running module 508 may be composed of pluggable units, including a policy model loading unit 512, a decision script loading unit 514, a node walk running unit 516, and an application feature weight unit 518, where the node walk running unit 516 may locate root causes to obtain a fault propagation graph 520 and a root cause recommendation 522.
The application feature weight unit 518 may assign weights to the various services, for example, by determining the weights according to physical machine features, container features, database features, cache features, message features, and transport protocol features. The physical machine features, container features, and database features may be acquired through event collection, that is, abnormal event monitoring; the cache features, message features, and transport protocol features may be acquired through index analysis, that is, index abnormality detection.

Corresponding to the method embodiment, the present disclosure further provides an embodiment of a target node determining device, and fig. 6 shows a schematic structural diagram of the target node determining device provided in one embodiment of the present disclosure. As shown in fig. 6, the apparatus includes:

the data acquisition module 602 is configured to acquire a running log of an application service system, wherein the application service system comprises a preset number of services, and call relations exist among the services;

a generating module 604, configured to generate a service call knowledge graph according to the service call information in the running log, where the service call knowledge graph includes a node corresponding to the service and service call exception information;

and an access module 606 configured to determine a target node in the service invocation knowledge graph according to a node access policy and the service invocation knowledge graph, wherein the node access policy is determined according to the service invocation anomaly information.

Further, the generating module 604 is further configured to:

Further, the access module 606 is further configured to:

s2, determining a current node in the service call knowledge graph;

if yes, the process is ended, and the step S2 is continuously executed,

if yes, ending;

Further, the access module 606 is further configured to:

Further, the data acquisition module 602 is further configured to:

acquiring an initial log of the application service system;

The embodiment of the specification provides a target node determining method and device, wherein the target node determining device includes: a data acquisition module configured to acquire an operation log of an application service system, wherein the application service system comprises a preset number of services, and calling relations exist among the services; a generation module configured to generate a service call knowledge graph according to the service call information in the operation log, wherein the service call knowledge graph comprises nodes corresponding to the services and service call abnormal information; and an access module configured to determine a target node in the service call knowledge graph according to a node access policy and the service call knowledge graph, wherein the node access policy is determined according to the service call abnormal information. A service call knowledge graph containing service call abnormal information is generated from the running log of the services, and the target node is determined in the service call knowledge graph according to the node access policy determined from the service call abnormal information.

The above is an exemplary scheme of a target node determining apparatus of the present embodiment. It should be noted that, the technical solution of the target node determining apparatus and the technical solution of the target node determining method belong to the same concept, and details of the technical solution of the target node determining apparatus, which are not described in detail, can be referred to the description of the technical solution of the target node determining method.

Fig. 7 illustrates a block diagram of a computing device 700 provided in accordance with one embodiment of the present description. The components of computing device 700 include, but are not limited to, memory 710 and processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.

Computing device 700 also includes access device 740, access device 740 enabling computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 740 may include one or more of any type of network interface, wired or wireless (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.

In one embodiment of the present description, the above-described components of computing device 700, as well as other components not shown in FIG. 7, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 7 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.

Computing device 700 may be any type of stationary or mobile computing device including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.

The processor 720 is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the above-described target node determining method.

The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that the technical solution of the computing device and the technical solution of the target node determining method belong to the same concept; for details of the technical solution of the computing device that are not described in detail, reference may be made to the description of the technical solution of the target node determining method.

An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the above-described target node determination method.

The foregoing is a schematic illustration of a computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the target node determining method belong to the same concept; for details of the technical solution of the storage medium that are not described in detail, reference may be made to the description of the technical solution of the target node determining method.

An embodiment of the present specification further provides a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the above-described target node determining method.

The foregoing is a schematic illustration of a computer program of this embodiment. It should be noted that the technical solution of the computer program and the technical solution of the target node determining method belong to the same concept; for details of the technical solution of the computer program that are not described in detail, reference may be made to the description of the technical solution of the target node determining method.

The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The computer instructions include computer program code that may be in source code form, object code form, executable file form, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content covered by the computer-readable medium may be appropriately expanded or reduced according to the requirements of legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.

It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combined actions, but those skilled in the art should understand that the embodiments are not limited by the order of actions described, as some steps may be performed in another order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the actions and modules referred to are not necessarily all required by the embodiments described in the specification.

In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.

The preferred embodiments of the present specification disclosed above are merely intended to help clarify the present specification. The alternative embodiments do not describe every detail exhaustively, nor do they limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain their principles and practical application, and thereby to enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims

1. A target node determination method, comprising:

determining a target node in the service call knowledge graph according to a node access strategy and the service call knowledge graph, wherein the node access strategy is determined according to the service call abnormal information;

before determining the target node in the service calling knowledge graph according to the node access strategy and the service calling knowledge graph, the method further comprises the following steps:

2. The method of claim 1, the generating a service invocation knowledge graph from service invocation information in the running log, comprising:

3. The method of claim 2, wherein the processing the service call relationships in the service call tree graph to generate a service call knowledge graph comprises:

4. The method of claim 1, the determining a target node in the service invocation knowledge graph based on a node access policy and the service invocation knowledge graph, comprising:

5. The method according to claim 4, wherein the determining the target node in the service invocation knowledge graph by using the node access policy to access the node in the service invocation knowledge graph according to the preset number of accesses and the preset number of access steps includes:

S2, determining a current node in the service call knowledge graph;

if yes, the process is ended, and the step S2 is continuously executed,

If yes, ending;
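As an illustration only, the access procedure of claim 5 can be read as a bounded random walk over the service call knowledge graph. The following minimal sketch assumes that reading. The function name `find_target_node`, the dictionary layout of `policy`, the parameters `num_walks` (the preset number of accesses) and `num_steps` (the preset number of access steps), and the rule of selecting the node at which most walks stop are all hypothetical and are not stated in the claim text.

```python
import random
from collections import Counter


def find_target_node(policy, start, num_walks, num_steps, seed=None):
    """Run `num_walks` walks of at most `num_steps` steps each.

    `policy` maps a node to {successor: access probability}.
    Returns (node where most walks stopped, per-node stop counts).
    The selection rule is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    stop_counts = Counter()
    for _ in range(num_walks):
        current = start  # determine the current node for this walk
        for _ in range(num_steps):
            successors = policy.get(current, {})
            if not successors:
                break  # no callee left to access, so this walk ends early
            nodes = list(successors)
            probs = [successors[n] for n in nodes]
            # Pick the next node according to the access probabilities.
            current = rng.choices(nodes, weights=probs, k=1)[0]
        stop_counts[current] += 1
    target = stop_counts.most_common(1)[0][0]
    return target, stop_counts


# Hypothetical three-service chain with deterministic transitions.
chain = {"gateway": {"payment": 1.0}, "payment": {"db": 1.0}}
target, counts = find_target_node(chain, "gateway", num_walks=5, num_steps=3)
print(target, dict(counts))
```

Bounding both the number of walks and the number of steps per walk keeps the search cost fixed regardless of how large the service call graph grows.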

6. The method of claim 5, the preset access condition comprising:

7. The method of claim 4, after the determining the target node in the service invocation knowledge graph, further comprising:

8. The method of claim 1, the obtaining a running log of an application service system, comprising:

acquiring an initial log of the application service system;

9. A target node determining apparatus comprising:

the access module is configured to determine a target node in the service call knowledge graph according to a node access strategy and the service call knowledge graph, wherein the node access strategy is determined according to the service call abnormal information;

the determining module is configured to determine service calling relations among the services according to the service calling information in the running log;

the weight generation module is configured to generate an abnormal information weight according to the service call abnormal information corresponding to the service call relation;

and the strategy generation module is configured to determine the access probability according to the ratio of the abnormal information weights and generate a node access strategy according to the access probability.
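The weight generation and strategy generation modules of claim 9 can be illustrated with a short sketch that turns abnormal information weights into access probabilities by taking each weight's share of the total for the calling service. The function name `build_access_policy`, the nested-dictionary layout, the example weights, and the uniform fallback used when a caller has no recorded anomalies are assumptions made for illustration; the claim only states that the access probability is determined from the ratio of the abnormal information weights.

```python
def build_access_policy(call_weights):
    """Convert {caller: {callee: abnormal_weight}} into
    {caller: {callee: access_probability}}.

    Each probability is the callee's weight divided by the sum of the
    weights of all callees of the same caller.
    """
    policy = {}
    for caller, callees in call_weights.items():
        total = sum(callees.values())
        if total > 0:
            policy[caller] = {c: w / total for c, w in callees.items()}
        elif callees:
            # Assumption: with no anomalies recorded, access callees uniformly.
            policy[caller] = {c: 1 / len(callees) for c in callees}
        else:
            policy[caller] = {}
    return policy


# Hypothetical weights: "payment" showed three times as many anomalies as "order".
weights = {"gateway": {"order": 1, "payment": 3}}
print(build_access_policy(weights))
```

Under these assumed weights, a call relation with more abnormal information receives a proportionally higher access probability, so node accesses concentrate on the parts of the graph most likely to contain the root cause.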

10. A computing device, comprising:

a memory and a processor;

the memory is configured to store computer executable instructions, the processor being configured to execute the computer executable instructions, which when executed by the processor, implement the steps of the target node determination method according to any one of claims 1 to 8.

11. A computer readable storage medium storing computer executable instructions which when executed by a processor perform the steps of the target node determining method of any one of claims 1 to 8.