WO2017076188A1 - Method and device for processing service call information - Google Patents

Method and device for processing service call information

Publication number
WO2017076188A1
Authority
WO
WIPO (PCT)
Prior art keywords
service
call
service call
topology
node
Prior art date
Application number
PCT/CN2016/103173
Other languages
English (en)
French (fr)
Inventor
夏玉才
常二鹏
王杰
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Priority to ES16861456T, patent ES2808966T3 (es)
Priority to EP16861456.8A, patent EP3373516B1 (en)
Priority to KR1020187015528A, patent KR102146173B1 (ko)
Priority to PL16861456T, patent PL3373516T3 (pl)
Priority to JP2018522941A, patent JP6706321B2 (ja)
Priority to AU2016351091A, patent AU2016351091B2 (en)
Priority to MYPI2018701755A, patent MY197612A (en)
Priority to SG11201803696QA, patent SG11201803696QA (en)
Publication of WO2017076188A1, patent WO2017076188A1 (zh)
Priority to US15/969,364, patent US10671474B2 (en)
Priority to PH12018500934A, patent PH12018500934A1 (en)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5041 Network service management, e.g. ensuring proper service fulfilment according to agreements characterised by the time relationship between creation and deployment of a service
    • H04L41/5054 Automatic deployment of services triggered by the service manager, e.g. service implementation by automatic configuration of network components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003 Managing SLA; Interaction between SLA and QoS
    • H04L41/5009 Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF]
    • H04L41/5012 Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF] determining service availability, e.g. which services are available at a certain point in time
    • H04L41/5016 Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF] determining service availability, e.g. which services are available at a certain point in time based on statistics of service availability, e.g. in percentage or over a given time

Definitions

  • The present application relates to the field of computers, and in particular to a technique for processing service call information.
  • Locating a problem by reading logs and tracing the service call path is cumbersome and time-consuming, and its timeliness and accuracy are low; moreover, monitoring of service calls often reveals a problem only after it has occurred and cannot avoid it or warn of it in advance.
  • An object of the present application is to provide a method and a device for processing service call information, so as to solve the problem of locating faults in service operation in a distributed system and the problem of monitoring and early warning of service operation.
  • The present application provides a method for processing service call information, which solves the problem of locating faults in service operation in a distributed system and the problem of monitoring and early warning of service operation.
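  • The method includes:
  • acquiring one or more service call chains in a distributed service system, wherein each service call chain includes one or more service nodes that are called sequentially;
  • constructing a corresponding service call model according to the service call chains;
  • processing the service call chains according to the service call model.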
  • According to another aspect, the present application provides a device for processing service call information, which likewise solves the problem of locating faults in service operation in a distributed system and the problem of monitoring and early warning of service operation.
  • The device includes:
  • a service call chain acquisition apparatus, configured to acquire one or more service call chains in the distributed service system, wherein each service call chain includes one or more service nodes that are called sequentially;
  • a service call model construction apparatus, configured to construct a corresponding service call model according to the service call chains;
  • a processing apparatus, configured to process the service call chains according to the service call model.
  • The present application acquires service call chains that carry the call order information of the service nodes in a distributed service system, and aggregates the service call chains having the same service node call order into a service call model. The call information of each service node is then analyzed on the basis of the service call model, so that service calls can be monitored and operational problems diagnosed. Because analysis and monitoring draw on the large volume of service node data, the efficiency of locating problems in the distributed service system is improved and the reliability of the distributed service system is increased.
  • FIG. 1 shows a flow chart of a method for processing service call information in accordance with an aspect of the present application;
  • FIG. 2 shows a flow chart of step S1 in a method for processing service call information according to another preferred embodiment of the present application;
  • FIG. 3 shows a flow chart of step S3 in a method for processing service call information according to still another preferred embodiment of the present application;
  • FIG. 4 shows a flow chart of step S32 in a method for processing service call information according to still another preferred embodiment of the present application;
  • FIG. 5 shows a schematic diagram of a device for processing service call information according to another aspect of the present application;
  • FIG. 6 shows a schematic diagram of a service call chain acquisition apparatus in a device for processing service call information according to another preferred embodiment of the present application;
  • FIG. 7 shows a schematic diagram of a processing apparatus in a device for processing service call information according to still another preferred embodiment of the present application;
  • FIG. 8 shows a schematic diagram of a monitoring unit in a device for processing service call information according to still another preferred embodiment of the present application;
  • FIG. 9 shows a schematic diagram of a service call in accordance with yet another preferred embodiment of the present application.
  • The terminal, the device of the service network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • The memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer readable media include both permanent and non-persistent, removable and non-removable media.
  • Information storage can be implemented by any method or technology.
  • The information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage,
  • As defined herein, computer readable media do not include transitory computer readable media, such as modulated data signals and carrier waves.
  • FIG. 1 shows a flow chart of a method for processing service call information in accordance with an aspect of the present application. The method includes step S1, step S2, and step S3.
  • In step S1, the device 1 acquires one or more service call chains in the distributed service system, wherein each service call chain includes one or more service nodes that are called sequentially; in step S2, the device 1 constructs a corresponding service call model according to the service call chains; in step S3, the device 1 processes the service call chains according to the service call model.
  • Specifically, in step S1 the device 1 acquires one or more service call chains in the distributed service system, wherein each service call chain includes one or more service nodes that are called in sequence.
  • The distributed service system includes, but is not limited to, a service-oriented architecture or a software system built on a distributed system.
  • A service node includes, but is not limited to, a service or function available to be called in the distributed service system. For example, the service nodes involved when a purchase is made on an e-commerce platform include fetching the user name, fetching the user's associated account, getting the payment page, getting security verification, checking the account balance, and so on.
  • A service call chain refers to the service nodes involved in completing one service call in the distributed service system, together with the order in which they are called. For example, in the service call diagram shown in FIG. 9, the circles A, B, C, and D each denote a service node.
  • The entry of the service call chain shown is service node A. Completing A requires calling B and then calling C, and completing C requires calling D, so the service nodes in the service call chain of FIG. 9 have a calling order: A → B → C → D. To make this order easy for a computer to recognize, the service nodes in the chain can be labeled according to the order in which they are called.
  • The initial entry node is labeled A0. B, called next, is labeled B0.1: the 0 stands for A, and the number after the "." indicates that B is the first service node called after A. C, called after that, is labeled C0.2: the 0 again stands for A, and the 2 after the "." indicates that C is the second service node called after A. D is called in order to complete C, so D is labeled D0.2.1.
  • The call link represented by the topology diagram in FIG. 9 can therefore be expressed as A0, B0.1, C0.2, D0.2.1.
  • This labeling method is only an example.
  • When each node is called, the number representing the order and topology is recorded in the log; for example, the field identifying the call is X, and several further fields are recorded after the X field.
  • Acquiring service call chains that contain the service nodes involved in a call and the order in which they are called clearly shows the process of each service call and yields its topology and characteristics.
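  • The labeling scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CALLS` dictionary is a hypothetical encoding of FIG. 9 (each node mapped to the nodes it calls, in call order), and `label_chain` is a name introduced here, not part of the application.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of FIG. 9: each node maps to the nodes it calls, in call order.
CALLS: Dict[str, List[str]] = {"A": ["B", "C"], "B": [], "C": ["D"], "D": []}


def label_chain(calls: Dict[str, List[str]], entry: str) -> List[Tuple[str, str]]:
    """Return (node, label) pairs in call order.

    The entry node gets the label "0"; the k-th node called by a node
    labelled p gets the label "p.k".
    """
    labelled: List[Tuple[str, str]] = []

    def visit(node: str, label: str) -> None:
        labelled.append((node, label))
        for k, callee in enumerate(calls.get(node, []), start=1):
            visit(callee, f"{label}.{k}")

    visit(entry, "0")
    return labelled


chain = label_chain(CALLS, "A")
print(", ".join(node + label for node, label in chain))  # A0, B0.1, C0.2, D0.2.1
```

  • The label alone encodes both the call order and the parent of each node, which is why a flat log record per node is enough to rebuild the topology later.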
  • Next, in step S2, the device 1 constructs a corresponding service call model according to the service call chains.
  • The service call model refers to the service call chains having the same node call order, constructed according to the topology of the service call chains. When the sample of acquired service call chains is sufficiently large, the same service call chain appears repeatedly on the platform or application system of a single service provider. For example, if users nationwide make 30,000 service calls in one day involving the service nodes shown in FIG. 9, all with the same call order and topology, then abstracting these chains into one service call model is the construction process.
  • Constructing a service call model corresponding to the service call chains makes analysis of the chains over a big-data sample clearer, and one constructed model can represent a class of service calls, which facilitates subsequent data analysis based on the model.
  • Next, in step S3, the device 1 processes the service call chains according to the service call model.
  • That is, the data of the service call chains that have the same service nodes, call order, and topology as the service call model is analyzed according to that model.
  • The call data of each service node, for example the completion time of each node call and whether the call succeeded or failed, differs from call to call. When the data sample is large enough, however, the calling pattern of each service node can be observed: for example, calls to a certain service node normally complete within 0.1 seconds, or the feedback information of a certain service node is normally a success result.
  • The calling patterns obtained by analyzing the service call model and the service call chain data with the same service nodes, call order, and topology can be used to monitor whether calls in the distributed system are normal and to locate problems.
  • For example, suppose calls to a service node normally complete within 0.1 seconds. If the call time of that service node then exceeds 0.1 seconds many times, for example more than ten or more than 50 times, it can be detected that there is a problem with calls to that service node.
  • FIG. 2 shows a flow chart of step S1 in a method for processing service call information in accordance with another preferred embodiment of the present application. Step S1 includes step S11 and step S12.
  • In step S11, the device 1 acquires service call log information in the distributed system; in step S12, the device 1 extracts one or more service call chains from the service call log information, wherein each service call chain includes one or more service nodes that are called sequentially.
  • Specifically, in step S11 the device 1 acquires service call log information in the distributed system.
  • The service call log information records, each time a service node is called, a mark identifying the service call, the order information, and any other information from which the order and topology of each service call can be determined. A node may be called more than once within a given time range. For example, the node labeled D0.2.1 in the service call shown in FIG. 9 may well take part in two or more call processes, each entering at the starting point and reaching the node after the second service node called after the first, so that each occurrence is labeled 0.2.1. Each service call therefore needs to be marked when each node is called, and the mark recorded in the log.
  • For example, the log records the field identifying the call shown in FIG. 9 as X, that is, X identifies the one service call that enters at A and proceeds to D; the field carrying this mark is then read when the log information is obtained.
  • Likewise, as exemplified above, the number representing the order and topology is recorded in the log when each node is called, so that when the log information is acquired, the field indicating the calling order and topology of the service node within the service call chain is read.
  • By obtaining the log information of the service calls and associating the nodes that belong to the same service call, the service call chain is obtained.
  • In step S12, the device 1 extracts one or more service call chains from the service call log information, wherein each service call chain includes one or more service nodes that are called in sequence. That is, using the mark, the order information, and the other order and topology information recorded in the log for each service node, the relevant call order and topology information is extracted and associated per service call, generating one service call chain for each service call.
  • For example, the obtained call log information is: "alipay, com.alipay.chashier.xxx, 0x0boc123, 0.2.1, AE001"
  • Fields in the log are separated by commas: the first field is the system name alipay, the second field is the interface method, the third field is the mark identifying a service call, the fourth field is the order and topology at the time of the call, and the fifth field is the return code "AE001" representing the result of the call execution.
  • Further fields are omitted. The third field of all logs is searched according to the mark recording the service call, finding all service call nodes whose records contain "0x0boc123".
  • The nodes corresponding to the log records found are then sorted by call order and topology according to the fourth field, which is recorded using the labeling method exemplified above, finally forming a service call chain in a format such as A0, B0.1, C0.2, D0.2.1.
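  • The extraction in step S12 can be illustrated with a minimal Python sketch, assuming the five comma-separated fields described above. The function names and the sample records are hypothetical (the interface names `svc.A` to `svc.D` and the second mark `0xother` are invented for illustration); only the mark `0x0boc123` and the return code `AE001` come from the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[str, str, str]  # (order-and-topology label, interface method, return code)


def topo_key(label: str) -> Tuple[int, ...]:
    """Turn "0.2.1" into (0, 2, 1) so that labels sort numerically, not as text."""
    return tuple(int(part) for part in label.split("."))


def extract_chains(lines: List[str]) -> Dict[str, List[Record]]:
    """Group log records by call mark (third field) and order each group
    by the order-and-topology label (fourth field)."""
    grouped: Dict[str, List[Record]] = defaultdict(list)
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 5:
            continue  # skip malformed records
        _system, method, mark, label, code = fields[:5]
        grouped[mark].append((label, method, code))
    return {
        mark: sorted(records, key=lambda record: topo_key(record[0]))
        for mark, records in grouped.items()
    }


# Hypothetical records, deliberately out of order.
logs = [
    "alipay, svc.D, 0x0boc123, 0.2.1, AE001",
    "alipay, svc.B, 0x0boc123, 0.1, AE001",
    "alipay, svc.A, 0x0boc123, 0, AE001",
    "alipay, svc.C, 0x0boc123, 0.2, AE001",
    "alipay, svc.A, 0xother, 0, AE001",
]
chains = extract_chains(logs)
print([label for label, _method, _code in chains["0x0boc123"]])
```

  • Sorting on the numeric tuple rather than the raw string keeps a label such as 0.10 after 0.9, which plain text ordering would get wrong.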
  • The device 1 constructs a corresponding service call model by performing aggregation processing on the service call chains, wherein the service call model includes one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains.
  • The chains aggregated into one service call model all have the same service call topology. For example, continuing the example above, among the acquired service call chains 30,000 are A0, B0.1, C0.2, D0.2.1 and 20,000 are A0, B0.1.
  • A further service call chain is listed separately as service call model 003. That is, a service call model represents all service call chains that contain the same service nodes with the same call topology and order, so that the service node call data in those service call chains can be analyzed and monitored through the service call model.
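  • As a rough illustration of this aggregation, the sketch below groups chains by their topology and numbers the resulting models by frequency. The function name `build_models` and the rule for assigning the identifiers "001", "002", and so on are assumptions made here; the application does not say how model numbers are assigned. The counts are scaled down from the 30,000 and 20,000 of the example.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple

Topology = Tuple[str, ...]


def build_models(chains: Iterable[Sequence[str]]) -> Dict[str, Tuple[Topology, int]]:
    """Aggregate chains with identical topology into one model each.

    Returns {model id: (topology, number of chains aggregated)}, with ids
    assigned in descending order of chain count (an assumption).
    """
    counts = Counter(tuple(chain) for chain in chains)
    return {
        f"{index:03d}": (topology, count)
        for index, (topology, count) in enumerate(counts.most_common(), start=1)
    }


full = ["A0", "B0.1", "C0.2", "D0.2.1"]
short = ["A0", "B0.1"]
models = build_models([full] * 3 + [short] * 2)
for model_id, (topology, count) in models.items():
    print(model_id, ", ".join(topology), count)
```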
  • In step S3, the device 1 processes the corresponding service call chains according to the service call topology.
  • That is, data analysis is performed on the service call links that share the service call topology of the aggregated service call model.
  • For example, the analysis may show that the probability of a call to one of the service nodes returning an error is one ten-thousandth within an hour. According to this result, other service call links with the same service call topology as the service call model are monitored, and an error is reported when the error probability of this service node exceeds that level within an hour. Processing the service call chains according to the service call topology thus allows the system to be monitored effectively over a large data sample, improving its early warning performance and stability.
  • The method further comprises a step S4 (not shown), in which the device 1 performs a cleaning operation on the service call topology.
  • The cleaning operation filters out information about called objects that are not important.
  • For example, one of the service nodes, C, additionally queries some information when it is called. In a first service call link, cache C1 is queried, database C2 is queried when nothing is found there, and the data is then put into cache C3; in a second service call link, cache C1 is queried and answers directly.
  • Before the cleaning operation these may appear as two different links, because C also calls the C1, C2, and C3 nodes. Such query nodes usually execute inside one system and do not feed back an execution result of their own; after an error, the result of the call is usually fed back through the C node, so they can be cleaned and ignored.
  • Similarly, the nodes of middleware routing queries can be cleaned without affecting the model. Cleaning highlights the calls of the key service nodes and makes the service call topology more accurate.
  • Then, in step S3, the device 1 processes the corresponding service call chains according to the cleaned service call topology.
  • The service call chains that have the same service call topology after the cleaning operation are aggregated according to their log information to construct a service call model, in the same way as described above.
  • The cleaning operation comprises at least one of the following:
  • Deleting a predetermined service node in the service call topology, that is, filtering out called service nodes that are not part of a remote service, such as the calling node of a middleware service node's routing query.
  • Deleting a service node in the service call topology that does not feed back result information, that is, filtering out calls that are executed within the system and return no execution result, such as querying a cache or calling a database.
  • Deleting an occasional service node in the service call topology, wherein the cumulative number of occurrences of the occasional service node in the service call chains corresponding to the service call topology is less than or equal to predetermined occasional-use threshold information. That is, a threshold on the number of calls is set and, by reading the number of calls from the log information or other history information, service nodes that are not normally used in completing a service call chain, or are used only very occasionally, are deleted. This reduces the burden of data analysis and processing on the system and yields a more accurate service call model and data.
  • Deleting the service nodes in the service call topology other than the commonly used service nodes, wherein the cumulative number of occurrences of a commonly used service node in the service call chains corresponding to the service call topology is greater than or equal to predetermined common-use threshold information. That is, a threshold on the number of calls is set and, by reading the number of calls from the log information or other history information, the called service nodes that are primary or need to be maintained are kept and other infrequently called service nodes are deleted, so that data analysis and monitoring can be targeted.
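  • The first three cleaning options above can be sketched as a single filter. This is a minimal illustration, not the application's implementation: the function and parameter names are hypothetical, node "R" stands in for a middleware routing query, and the per-node occurrence count is taken over the chains passed in rather than read from logs or history.

```python
from collections import Counter
from typing import List, Set


def clean_topology(
    chains: List[List[str]],
    predetermined: Set[str],
    no_feedback: Set[str],
    occasional_threshold: int,
) -> List[List[str]]:
    """Remove predetermined nodes, nodes that feed back no result, and
    occasional nodes (cumulative occurrences <= occasional_threshold)."""
    occurrences = Counter(node for chain in chains for node in chain)
    occasional = {
        node for node, count in occurrences.items() if count <= occasional_threshold
    }
    drop = predetermined | no_feedback | occasional
    return [[node for node in chain if node not in drop] for chain in chains]


# Two links that differ only in the cache and database queries made by C.
raw = [
    ["A", "R", "B", "C", "C1", "C2", "C3", "D"],
    ["A", "B", "C", "C1", "D"],
]
cleaned = clean_topology(
    raw,
    predetermined={"R"},
    no_feedback={"C1", "C2", "C3"},
    occasional_threshold=0,  # 0 disables the occasional-node rule here
)
print(cleaned)
```

  • After cleaning, both links reduce to the same topology and can be aggregated into one service call model.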
  • The device 1 combines the call feedback information of the service nodes in the service call chains when aggregating the service call chains to construct the corresponding service call model, wherein the service call model includes one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains.
  • The call feedback information refers to the return code returned after each service node is called, which represents the execution result of that service node. It takes different values for different execution results according to preset rules, such as successful or failed execution and certain business scenarios.
  • Each service node corresponds to several possible pieces of call feedback information, and the call feedback information at the rear of a service call link usually affects the call feedback information at the front of the link, where it may be aggregated.
  • For example, if the A service node shown in FIG. 9 returns a system exception, the cause may be a network fault at the C service node or the D service node, or it may be a database exception. Each service call link is therefore a combination of call feedback information. Accordingly, when service call links with the same topology are aggregated into a service call model, the combinations of call feedback information of the different service call links are all recorded and used in the subsequent data analysis based on the service call model, so that problems can be located clearly.
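  • One way to picture a model that keeps the feedback combinations is a mapping from topology to a counter of return-code combinations. The sketch below assumes this structure; the function name and the return codes `OK`, `SYS_EXCEPTION`, and `NETWORK_FAULT` are hypothetical placeholders, not codes defined by the application.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Iterable, List, Tuple

Topology = Tuple[str, ...]
FeedbackCombo = Tuple[str, ...]


def aggregate_with_feedback(
    chains: Iterable[List[Tuple[str, str]]],
) -> DefaultDict[Topology, Counter]:
    """Group chains by topology and count each combination of return codes.

    Each chain is a list of (labelled node, return code) pairs in call order.
    """
    models: DefaultDict[Topology, Counter] = defaultdict(Counter)
    for chain in chains:
        topology = tuple(node for node, _code in chain)
        combo = tuple(code for _node, code in chain)
        models[topology][combo] += 1
    return models


ok_chain = [("A0", "OK"), ("B0.1", "OK"), ("C0.2", "OK"), ("D0.2.1", "OK")]
# A reports a system exception whose underlying cause is a network fault at D.
bad_chain = [
    ("A0", "SYS_EXCEPTION"),
    ("B0.1", "OK"),
    ("C0.2", "SYS_EXCEPTION"),
    ("D0.2.1", "NETWORK_FAULT"),
]
feedback_models = aggregate_with_feedback([ok_chain, ok_chain, bad_chain])
topology = ("A0", "B0.1", "C0.2", "D0.2.1")
for combo, count in feedback_models[topology].items():
    print(count, combo)
```

  • Keeping the whole combination, rather than only the code returned at the entry node, is what lets a front-end exception be traced to the node at the rear of the link that caused it.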
  • step S3 shows a flow chart of step S3 in a method for processing service call information in accordance with yet another preferred embodiment of the present application. Step S31 and step S32 are included.
  • the device 1 performs statistical processing on the one or more service call chains corresponding to the service call topology in step S31 according to the call feedback information of the service node in each service call chain; in step S32 The device 1 monitors and processes the service call chain corresponding to the service calling topology according to the corresponding statistical result.
  • step S31 the device 1 performs statistical processing on the one or more service call chains corresponding to the service call topology according to the call feedback information of the service node in each of the service call chains. That is, the call feedback information in the service call chain with the same service call topology corresponding to the service call model is analyzed and counted, for example, according to the topological relationship of the service call model generated by the service call link, and the same service call topology is used.
  • the service invocation links are tagged with the service invocation model, that is, each service invocation model has a specific tag, and is combined with the invoking feedback information in each service invocation link to give each service invocation link
  • the unique tag and then the call feedback information in the service call link, the call feedback information of the service node in each service call link is uniformly analyzed according to a fixed time ring ratio or year-on-year.
  • In step S32, the device 1 monitors and processes the service call chains corresponding to the service call topology according to the corresponding statistical result. That is, when the data sample is large enough, the call feedback information of the same node in the service call topology is compared over fixed time windows and the data is monitored. For example, if the data is in a normal state in more than 95% of the statistical samples, then an abnormality falling in the remaining 5% will be caught by the system. Because the call feedback information is specific to each service call node, once an abnormality has been detected the service call chains of the specific service call model can be located, and the unique tag of each chain then identifies the specific service call chain and the service node it called.
  • FIG. 4 shows a flowchart of step S32 in a method for processing service call information in accordance with still another preferred embodiment of the present application. Step S32 includes step S321 and step S322.
  • In step S321, the device 1 compares, based on a predetermined time period comparison rule, the corresponding statistical result with the call feedback information of each service node in the service call chains corresponding to the service call topology; in step S322, when a comparison difference exceeds predetermined fluctuation threshold information, the device 1 generates alarm information about the service call chain corresponding to that comparison difference, wherein the alarm information identifies the service node corresponding to the comparison difference.
  • In step S321, the device 1 compares the corresponding statistical result with the call feedback information of each service node in the service call chains corresponding to the service call topology, based on the predetermined time period comparison rule. That is, according to a given time comparison rule, the call feedback information of all service call chains with the same service call topology is analyzed. For example, for service call chains with the same service call topology, the call feedback information from 8:00 to 9:00 a.m. is compared with that from 9:00 to 10:00 a.m. (period-over-period analysis), or the data from 8:00 to 9:00 a.m. on March 2 is compared with the call feedback information from 8:00 to 9:00 a.m. on March 3 (same-period analysis). In this way the normal operating data range of different scenarios can be obtained, for example, that payment failures in which the balance check finds an insufficient balance occur about one hundred times per day. The state of service calls in the system is then monitored according to the analysis results.
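  • The two comparison rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `relative_change` and `compare_windows`, the invented failure counts, and the choice of relative change as the comparison measure are hypothetical and are not taken from the application.

```python
from typing import Dict


def relative_change(current: int, reference: int) -> float:
    """Relative change of `current` against `reference` (0.2 means +20%)."""
    if reference == 0:
        # No baseline to divide by: report infinity if anything happened at all.
        return float("inf") if current else 0.0
    return (current - reference) / reference


def compare_windows(current: Dict[str, int], reference: Dict[str, int]) -> Dict[str, float]:
    """Compare per-node failure counts of two time windows of one topology."""
    nodes = sorted(set(current) | set(reference))
    return {
        node: relative_change(current.get(node, 0), reference.get(node, 0))
        for node in nodes
    }


# Hypothetical failure counts per service node label (invented numbers).
mar3_08_to_09 = {"C0.2": 10, "D0.2.1": 3}
mar3_09_to_10 = {"C0.2": 12, "D0.2.1": 3}
mar2_08_to_09 = {"C0.2": 8, "D0.2.1": 3}

# Period-over-period: 09:00-10:00 against the preceding hour.
sequential = compare_windows(mar3_09_to_10, mar3_08_to_09)
# Same-period: 08:00-09:00 on March 3 against 08:00-09:00 on March 2.
same_period = compare_windows(mar3_08_to_09, mar2_08_to_09)
```

  • Keying the counts by node label keeps each comparison difference attached to a specific service node, which is what lets a later alarm point at that node.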
  • In step S322, when a comparison difference exceeds predetermined fluctuation threshold information, the device 1 generates alarm information about the service call chain corresponding to that comparison difference, wherein the alarm information identifies the service node corresponding to the comparison difference. That is, after the service call chains with the same topology have been analyzed according to the call feedback information, the data of service call chains with that topology is monitored in daily operation against the normal operating data range obtained for each scenario, and a threshold is set on the basis of that normal range.
  • Continuing the example above, if payment failures caused by an insufficient balance normally occur about one hundred times per day, the threshold for the number of error feedbacks from the balance-checking service node is set to one hundred, or to one hundred and twenty. When the daily number of error feedbacks from the balance-checking service node exceeds the set threshold, an alarm is raised, and the faulty node is traced by means of the globally unique tag formed from the service call chain information and the call feedback information.
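  • The threshold alarm in this example can be sketched as follows. This is a minimal sketch, not the application's implementation: the `Alarm` record, the function `check_thresholds`, the node names, and the counts are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Alarm(NamedTuple):
    node: str       # service node the alarm points at
    count: int      # observed number of error feedbacks in the day
    threshold: int  # configured limit for that node


def check_thresholds(daily_errors: Dict[str, int], thresholds: Dict[str, int]) -> List[Alarm]:
    """Return one alarm per node whose daily error count exceeds its threshold."""
    alarms: List[Alarm] = []
    for node, count in daily_errors.items():
        limit = thresholds.get(node)
        # Nodes without a configured threshold are not monitored in this sketch.
        if limit is not None and count > limit:
            alarms.append(Alarm(node, count, limit))
    return alarms


# Hypothetical daily error-feedback counts and a threshold of 120 for the
# balance-checking node, following the "one hundred and twenty" example.
errors_today = {"check_balance": 150, "get_payment_page": 2}
limits = {"check_balance": 120}
alarms = check_thresholds(errors_today, limits)
```

  • Each alarm carries the node name, so the caller can go on to look up the tagged service call chains that passed through that node.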
  • Those skilled in the art should understand that the above monitoring and early-warning methods are merely examples. Other existing or future monitoring and early-warning methods, where applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
  • FIG. 5 shows a schematic diagram of an apparatus for processing service call information in accordance with another aspect of the present application.
  • The device 1 includes a service call chain acquisition device 11, a service call model construction device 12, and a processing device 13.
  • The service call chain acquisition device 11 in the device 1 acquires one or more service call chains in the distributed service system, wherein each service call chain includes one or more service nodes that are called in sequence; the service call model construction device 12 constructs a corresponding service call model according to the service call chains; and the processing device 13 processes the service call chains according to the service call model.
  • The service call chain acquisition device 11 acquires one or more service call chains in the distributed service system, wherein each service call chain includes one or more service nodes that are called in sequence.
  • The distributed service system includes, but is not limited to, a service-oriented architecture or a software system built on a distributed system.
  • The service node includes, but is not limited to, a service or a function available for calling in the distributed service system. For example, the service nodes involved in purchasing a product on an e-commerce platform include fetching the user name, fetching the user's associated account, fetching the payment page, obtaining security verification, and checking the account balance.
  • The service call chain refers to the service nodes involved in completing one service call in the distributed service system, together with their order. For example, the service call diagram shown in FIG. 9 is a service call chain with service node A as its entry, in which A, B, C, and D, each marked by a circle, each denote a service node.
  • The entry of the illustrated service call chain is service node A. Completing A requires calling B and then C, and completing C requires calling D. The service nodes in the chain of FIG. 9 are therefore called in a definite order, namely A→B→C→D.
  • To make this order machine-readable, the nodes can be labeled according to the order in which they are called. The entry node is labeled A0. B, called next, is labeled B0.1: the 0 stands for A, and the 1 after the "." marks the first service node called after A. C is labeled C0.2: the 0 stands for A, and the 2 marks the second service node called after A. D is called in order to complete C, so D is labeled D0.2.1: the 0 stands for A, the 2 after the first "." stands for C, and the 1 after the second "." marks the first node called after C.
  • The call chain represented by the topology in FIG. 9 can therefore be written as A0, B0.1, C0.2, D0.2.1. This labeling scheme is only an example. The numbers representing order and topology can be written to the log when each node is called. For example, a field X in the log identifies the call, and the fields following X record the digits and dots that express the calling order and topology of the service node, so that the service call chain can be obtained from the service call records in the log.
  • Obtaining service call chains that contain the service nodes involved in a call and their calling order shows the process of the service call clearly and yields the topology and characteristics of each service call.
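  • The labeling scheme described above can be reproduced with a short recursive walk over the call tree. The following Python sketch is illustrative only: the function name `label_call_tree` and the dictionary representation of FIG. 9 are assumptions made for the example, not part of the application.

```python
from typing import Dict, List


def label_call_tree(entry: str, calls: Dict[str, List[str]]) -> List[str]:
    """Return labels such as A0, B0.1, C0.2, D0.2.1 for a call tree.

    `calls` maps each service node to the nodes it calls, in call order.
    """
    labels: List[str] = []

    def visit(node: str, path: str) -> None:
        labels.append(f"{node}{path}")
        # The n-th callee of a node gets ".n" appended to the caller's path.
        for position, callee in enumerate(calls.get(node, []), start=1):
            visit(callee, f"{path}.{position}")

    # The entry node always carries the path "0".
    visit(entry, "0")
    return labels


# FIG. 9 as described in the text: A calls B and then C, and C calls D.
fig9 = {"A": ["B", "C"], "C": ["D"]}
fig9_labels = label_call_tree("A", fig9)
print(fig9_labels)  # ['A0', 'B0.1', 'C0.2', 'D0.2.1']
```

  • Because each label contains the full path from the entry node, the chain can be rebuilt from unordered log records by sorting on the numeric path.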
  • The service call model construction device 12 constructs a corresponding service call model according to the service call chains.
  • The service call model refers to the service call chains with the same node calling order, grouped according to the topology of the service call chains. For example, when the sample of acquired service call chains is large enough, identical service call chains will appear within the platform or application system of the same service provider. If, say, thirty thousand service calls made by users nationwide in one day all involve the service nodes shown in FIG. 9, with the same calling order and topology, then these thirty thousand service calls correspond to one service call model, namely A0, B0.1, C0.2, D0.2.1 from the example above.
  • Construction consists of summarizing the service call chains that share the same topology and calling order into one service call model. Constructing the service call model makes the analysis of service call chains over large data samples clearer, and each model represents one kind of service call, which facilitates the subsequent analysis of the data in each model.
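  • Aggregation by topology can be sketched as a simple grouping step. The sketch below assumes each chain is already available as its ordered list of labels; the function name `build_call_models` and the sample chains are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Chain = List[str]  # e.g. ["A0", "B0.1", "C0.2", "D0.2.1"]


def build_call_models(chains: List[Chain]) -> Dict[Tuple[str, ...], List[Chain]]:
    """Group chains so that each distinct topology becomes one model."""
    models: Dict[Tuple[str, ...], List[Chain]] = defaultdict(list)
    for chain in chains:
        # The ordered tuple of labels serves as the topology signature.
        models[tuple(chain)].append(chain)
    return dict(models)


# Invented sample: three calls with the FIG. 9 topology, two with a shorter one.
sample_chains = (
    [["A0", "B0.1", "C0.2", "D0.2.1"]] * 3
    + [["A0", "B0.1"]] * 2
)
sample_models = build_call_models(sample_chains)
```

  • Here `sample_models` has two entries, one per topology, and the size of each entry is the number of calls that followed that topology.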
  • The processing device 13 processes the service call chains according to the service call model. That is, the data of the service call chains that have the same service node calls and topological order as the service call model is analyzed according to that model. The call data of each service node, such as the completion time of each call and whether the call succeeded, varies from case to case, but when the data sample is large enough a calling pattern can be observed for each service node.
  • For example, calls to a certain service node normally complete within 0.1 seconds, or the feedback of a certain service node normally shows no more than ten call failures. The calling patterns derived from the service call model and from the chain data with the same service node calls and topological order can be used to monitor whether calls in the distributed system are normal and to locate problems.
  • Continuing the example, if calls to the service node normally complete within 0.1 seconds, and in some period more than 50 of 100 calls take more than ten times 0.1 seconds, a problem with calls to that service node can be detected.
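  • The latency rule in this example (a 0.1-second baseline, a tenfold slowdown, and more than half of the calls affected) can be written as a small check. The function name `latency_is_abnormal` and its parameter names are assumptions for illustration; the default values come from the example in the text.

```python
from typing import Sequence


def latency_is_abnormal(
    durations: Sequence[float],
    baseline: float = 0.1,  # normal completion time in seconds
    factor: float = 10.0,   # how many times slower counts as "slow"
    share: float = 0.5,     # fraction of slow calls that signals a problem
) -> bool:
    """Return True when more than `share` of the calls exceed baseline * factor."""
    if not durations:
        return False
    slow_calls = sum(1 for duration in durations if duration > baseline * factor)
    return slow_calls / len(durations) > share


# Invented samples of 100 call durations (seconds) for one service node.
mostly_slow = [2.0] * 60 + [0.05] * 40
mostly_fast = [2.0] * 10 + [0.05] * 90
```

  • With these samples, `latency_is_abnormal(mostly_slow)` is true because 60 of 100 calls exceed one second, while `latency_is_abnormal(mostly_fast)` is false.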
  • The service call chain acquisition device 11 includes a call information acquiring unit 111 and a service call chain generating unit 112.
  • The call information acquiring unit 111 acquires service call log information in the distributed system; the service call chain generating unit 112 extracts one or more service call chains from the service call log information, wherein each service call chain includes one or more service nodes that are called in sequence.
  • The call information acquiring unit 111 acquires service call log information in the distributed system.
  • The service call log information records, each time a service node is called, a mark, order information, and other information from which the order and topology of each service call can be determined. A node is generally called more than once within a given time range. If the node labeled D0.2.1 in the service call of FIG. 9 is labeled only by the calling-order scheme above, there is some probability that in two or more separate calls it is reached in the same position and receives the same label 0.2.1. Each service call therefore needs its own mark, written to the log whenever a node is called. For example, a field X in the log identifies the call shown in FIG. 9 as the service call marked X, entered at A and running through to D, so that this mark field can be read when the log information is acquired.
  • As described above, the numbers representing order and topology are written to the log when each node is called, so the fields expressing the calling order and topology of the service nodes can also be read when the log information is acquired. Once the log information of a service call has been obtained, the nodes in that call are associated with one another to obtain the service call chain.
  • The service call chain generating unit 112 extracts one or more service call chains from the service call log information, wherein each service call chain includes one or more service nodes that are called in sequence. That is, according to the mark, the order information, and the other information in each service node's log entries from which the order and topology of each service call can be determined, the relevant order and topology information is extracted and associated per service call, generating a service call chain for each call.
  • For example, the obtained call log information is: "alipay, com.alipay.chashier.xxx, 0x0boc123, 0.2.1, AE001"
  • The log fields are separated by commas. The first field is the system name alipay, the second field is the interface method, the third field is the token identifying one service call, the fourth field is the order and topology at the time of the call, and the fifth field is the return code "AE001" representing the result of the call. Further fields are omitted.
  • That is, the third field of every log entry is searched for the mark of the service call, and all service call nodes whose entries contain "0x0boc123" are found. The nodes in the matching log records are then ordered by calling sequence and topology according to the fourth field, which is recorded with the labeling scheme exemplified above, finally forming a service call chain of the form A0, B0.1, C0.2, D0.2.1.
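  • The extraction step can be sketched as parsing each log line, grouping by the token in the third field, and sorting by the numeric path in the fourth field. In the sketch below only the first sample line comes from the text; the other lines, the `com.example.*` method names, and the names `CallRecord`, `parse_line`, `order_key`, and `extract_chains` are invented for illustration. It also assumes that every line has at least five comma-separated fields.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class CallRecord(NamedTuple):
    system: str       # field 1: system name
    method: str       # field 2: interface method
    token: str        # field 3: mark identifying one service call
    order: str        # field 4: calling order and topology, e.g. "0.2.1"
    return_code: str  # field 5: call feedback (return code)


def parse_line(line: str) -> CallRecord:
    """Split one comma-separated log line into its first five fields."""
    fields = [field.strip() for field in line.split(",")]
    return CallRecord(*fields[:5])


def order_key(order: str) -> Tuple[int, ...]:
    """Turn "0.2.1" into (0, 2, 1) so that sorting follows the call order."""
    return tuple(int(part) for part in order.split("."))


def extract_chains(lines: List[str]) -> Dict[str, List[CallRecord]]:
    """Group records by token and order each group by its topology field."""
    chains: Dict[str, List[CallRecord]] = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        chains[record.token].append(record)
    for records in chains.values():
        records.sort(key=lambda record: order_key(record.order))
    return dict(chains)


log_lines = [
    "alipay, com.alipay.chashier.xxx, 0x0boc123, 0.2.1, AE001",  # from the text
    "alipay, com.example.c, 0x0boc123, 0.2, AE001",              # hypothetical
    "alipay, com.example.a, 0x0boc123, 0, AE001",                # hypothetical
    "alipay, com.example.b, 0x0boc123, 0.1, SUCC",               # hypothetical
    "alipay, com.example.a, 0xffff0001, 0, SUCC",                # another call
]
chains_by_token = extract_chains(log_lines)
orders = [record.order for record in chains_by_token["0x0boc123"]]
```

  • Sorting on integer tuples rather than on the raw string keeps a path such as 0.10 after 0.2, which plain string comparison would get wrong.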
  • The service call model construction device 12 constructs a corresponding service call model by performing aggregation processing on the service call chains, wherein the service call model includes one or more service call topologies, each obtained by aggregating one or more of the service call chains.
  • The service call chains in a service call model all share the same service call topology; that is, the model is aggregated from service call chains with the same service call topology. For example, in the above example, among the acquired service call chains there are 30,000 service call chains of the form A0, B0.1, C0.2, D0.2.1, and 20,000 service call chains are A0, B0.1.
  • service call chain is listed separately as service call model 003. That is, a service call model represents all service call chains that contain the same service nodes with the same calling topology and order, so that the service node call data in those chains can be analyzed and monitored through the model.
  • The processing device 13 processes the corresponding service call chains according to the service call topology.
  • That is, data analysis is performed on the service call chains that share a service call topology in the aggregated service call model. For example, for the service call model A0, B0.1, C0.2, D0.2.1, the data of each of its 30,000 service call chains is analyzed, yielding, say, that calls to one of the service nodes fail with a probability of one in ten thousand per hour.
  • Other service call chains with the same service call topology as the model are then monitored on the basis of this result, so that an error is reported when that service node shows an error probability above one in ten thousand per hour. Processing service call chains according to the service call topology thus lets the system be monitored effectively over large data samples and improves its early-warning capability and stability.
  • The apparatus further includes a cleaning device 14 (not shown) that performs a cleaning operation on the service call topology.
  • The cleaning operation filters out information about called objects that are not important. For example, one of the service nodes, C, performs some additional queries when it is called: in a first service call chain it queries the cache C1, queries the database C2 when the cache lookup returns nothing, and then puts the data into the cache C3; in a second service call chain it queries the cache C1 directly.
  • Before the cleaning operation these may appear as two different chains, because C also calls the C1, C2, and C3 nodes. Such query nodes are usually executed inside a single system and do not feed back an execution result of their own; when an error occurs, it is normally reflected in the call result fed back by the C node, so they can be cleaned and ignored.
  • Similarly, nodes such as middleware routing queries can be cleaned without affecting the model, which highlights the calls to the key service nodes and makes the service call topology more accurate.
  • The processing device 13 then processes the corresponding service call chains according to the cleaned service call topology.
  • That is, the service call chains that have the same service call topology after the cleaning operation are aggregated according to their log information to construct a service call model, in the same way as described above.
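  • The effect of cleaning on the C1, C2, C3 example can be sketched as follows. The call-map representation and the function name `clean_calls` are assumptions for illustration, and the sketch further assumes that the ignored nodes are leaves, that is, they do not themselves call nodes that must be kept.

```python
from typing import Dict, List, Set


def clean_calls(calls: Dict[str, List[str]], ignored: Set[str]) -> Dict[str, List[str]]:
    """Remove ignored nodes from a call map (caller -> ordered callees)."""
    cleaned: Dict[str, List[str]] = {}
    for caller, callees in calls.items():
        if caller in ignored:
            continue  # an ignored node contributes nothing to the topology
        kept = [callee for callee in callees if callee not in ignored]
        if kept:
            cleaned[caller] = kept
    return cleaned


# First chain: C queries cache C1, database C2, writes cache C3, then calls D.
first_chain = {"A": ["B", "C"], "C": ["C1", "C2", "C3", "D"]}
# Second chain: C only queries cache C1 before calling D.
second_chain = {"A": ["B", "C"], "C": ["C1", "D"]}

auxiliary_nodes = {"C1", "C2", "C3"}
cleaned_first = clean_calls(first_chain, auxiliary_nodes)
cleaned_second = clean_calls(second_chain, auxiliary_nodes)
```

  • After cleaning, both call maps reduce to A calling B and C, and C calling D, so the two chains fall into the same service call model.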
  • The cleaning operation comprises at least one of the following:
  • Deleting predetermined service nodes in the service call topology, that is, filtering out designated called nodes such as middleware routing-query service nodes.
  • Deleting service nodes in the service call topology that do not feed back result information, that is, filtering out calls that are executed inside a system and return no execution result, such as querying the cache or calling the database.
  • Deleting occasional service nodes in the service call topology, wherein an occasional service node is one whose cumulative number of occurrences in the service call chains corresponding to the service call topology is less than or equal to predetermined occasional-use threshold information. That is, a threshold is set on the number of times a node is called.
  • Deleting the service nodes in the service call topology other than common service nodes, wherein a common service node is one whose cumulative number of occurrences in the service call chains corresponding to the service call topology is greater than or equal to predetermined common threshold information. That is, a call-count threshold is set, the call counts are read from the log information or other history information, the service nodes that are the main ones or that need to be maintained are kept, and other infrequently called service nodes are deleted, so that data analysis and monitoring are targeted.
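  • The two count-based cleaning rules can be sketched with a counter over node occurrences. The function names, the representation of chains as lists of node names, and the sample data are hypothetical; the comparison operators follow the "less than or equal to" and "greater than or equal to" wording of the text.

```python
from collections import Counter
from typing import List


def count_occurrences(chains: List[List[str]]) -> Counter:
    """Cumulative number of times each node appears across all chains."""
    counts: Counter = Counter()
    for chain in chains:
        counts.update(chain)
    return counts


def drop_occasional(chains: List[List[str]], occasional_threshold: int) -> List[List[str]]:
    """Delete nodes whose cumulative count is <= the occasional-use threshold."""
    counts = count_occurrences(chains)
    return [[node for node in chain if counts[node] > occasional_threshold]
            for chain in chains]


def keep_common(chains: List[List[str]], common_threshold: int) -> List[List[str]]:
    """Keep only nodes whose cumulative count is >= the common threshold."""
    counts = count_occurrences(chains)
    return [[node for node in chain if counts[node] >= common_threshold]
            for chain in chains]


# Invented sample: node E appears in only one of six chains.
history = [["A", "B", "C"]] * 5 + [["A", "B", "C", "E"]]
without_occasional = drop_occasional(history, occasional_threshold=1)
only_common = keep_common(history, common_threshold=6)
```

  • Both calls remove E from the last chain here, so all six chains end up with the same nodes and can be aggregated into one model.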
  • The service call model construction device 12 performs aggregation processing on the service call chains in combination with the call feedback information of the service nodes in the service call chains to construct a corresponding service call model, wherein the service call model includes one or more service call topologies, each obtained by aggregating one or more of the service call chains.
  • The call feedback information refers to the return code produced after each service node is called, which represents the execution result of that service node. It encodes, according to preset rules, the different possible execution results, such as whether execution succeeded or failed, as well as particular business scenarios.
  • For example, the call to node B succeeds and SUCC is fed back; the call feedback information of node D is AE00, that of node C is AE01, and call feedback information AE10 corresponds to the call to node A, which requires the call to node C to be executed, which in turn requires the call to node D to be executed first. The call feedback information can be recorded in the log, so that when the topology information of a given service call is queried, the call feedback information of each service node can be looked up as well.
  • Usually the call feedback information at the tail of a service call chain affects the call feedback information at its head, but it may also be aggregated at the head. For example, as shown in FIG. 9, the A service node returns a system exception, which may be because the network of the C service node or the D service node is unreachable, or because of a database exception. Each service call chain therefore carries a combination of call feedback information. When service call chains with the same topology are aggregated into a service call model, the different combinations of call feedback information found in those chains are all recorded, so that problems can be located clearly in the subsequent data analysis based on the service call model.
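  • Recording the feedback combinations of one topology can be sketched with a counter keyed by the per-node return codes. The codes SUCC, AE00, AE01, and AE10 are taken from the example in the text; the assumption that a fully successful call returns SUCC at every node, the function names, and the sample counts are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def feedback_combination(labels: List[str], feedback: Dict[str, str]) -> Tuple[str, ...]:
    """Pair each node label with its return code, in topology order."""
    return tuple(f"{label}:{feedback[label]}" for label in labels)


def count_combinations(labels: List[str], calls: List[Dict[str, str]]) -> Counter:
    """Count how often each feedback combination occurs within one topology."""
    return Counter(feedback_combination(labels, feedback) for feedback in calls)


topology = ["A0", "B0.1", "C0.2", "D0.2.1"]
# Assumed all-success case (the text only states SUCC for node B).
all_ok = {"A0": "SUCC", "B0.1": "SUCC", "C0.2": "SUCC", "D0.2.1": "SUCC"}
# Failure case using the codes from the example in the text.
failure = {"A0": "AE10", "B0.1": "SUCC", "C0.2": "AE01", "D0.2.1": "AE00"}

combination_counts = count_combinations(topology, [all_ok] * 98 + [failure] * 2)
failure_key = feedback_combination(topology, failure)
```

  • Keeping the whole combination, rather than only the code returned at the entry node, preserves which downstream node first reported a non-success code.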
  • FIG. 7 shows a schematic diagram of a processing device in an apparatus for processing service call information according to still another preferred embodiment of the present application.
  • The processing device 13 includes an information analysis unit 131 and a monitoring unit 132.
  • The information analysis unit 131 performs statistical processing on the one or more service call chains corresponding to the service call topology according to the call feedback information of the service nodes in each of the service call chains; the monitoring unit 132 monitors and processes the service call chains corresponding to the service call topology according to the corresponding statistical result.
  • The information analysis unit 131 performs statistical processing on the one or more service call chains corresponding to the service call topology according to the call feedback information of the service nodes in each of the service call chains. That is, the call feedback information in the service call chains that share the service call topology of a service call model is analyzed and counted. For example, according to the topology of the service call model generated from the service call chains, the chains with the same service call topology are tagged with that model's tag (each service call model has its own tag). This tag is combined with the call feedback information in each service call chain to give each chain a unique tag. The call feedback information of the service nodes in each service call chain is then analyzed uniformly over fixed time windows, either period-over-period (against the preceding window) or same-period (against the same window of an earlier cycle).
  • The monitoring unit 132 monitors and processes the service call chains corresponding to the service call topology according to the corresponding statistical result. That is, when the data sample is large enough, the call feedback information of the same node in the service call topology is compared over fixed time windows and the data is monitored. For example, if the data is in a normal state in more than 95% of the statistical samples, then an abnormality falling in the remaining 5% will be caught by the system. Because the call feedback information is specific to each service call node, once an abnormality has been detected the service call chains of the specific service call model can be located, and the unique tag of each chain then identifies the specific service call chain and the service node it called.
  • FIG. 8 shows a schematic diagram of a monitoring unit in an apparatus for processing service call information according to still another preferred embodiment of the present application.
  • The monitoring unit 132 includes a comparison subunit 1321 and an alarm subunit 1322.
  • The comparison subunit 1321 compares, based on a predetermined time period comparison rule, the corresponding statistical result with the call feedback information of each service node in the service call chains corresponding to the service call topology; when a comparison difference exceeds predetermined fluctuation threshold information, the alarm subunit 1322 generates alarm information about the service call chain corresponding to that comparison difference, wherein the alarm information identifies the service node corresponding to the comparison difference.
  • The comparison subunit 1321 compares the corresponding statistical result with the call feedback information of each service node in the service call chains corresponding to the service call topology, based on a predetermined time period comparison rule. That is, according to a given time comparison rule, the call feedback information of all service call chains with the same service call topology is analyzed. For example, for service call chains with the same service call topology, the call feedback information from 8:00 to 9:00 a.m. is compared with that from 9:00 to 10:00 a.m. (period-over-period analysis), or the data from 8:00 to 9:00 a.m. on March 2 is compared with the call feedback information from 8:00 to 9:00 a.m. on March 3 (same-period analysis). In this way the normal operating data range of different scenarios can be obtained, for example, that payment failures in which the balance check finds an insufficient balance occur about one hundred times per day. The state of service calls in the system is then monitored according to the analysis results.
  • When a comparison difference exceeds predetermined fluctuation threshold information, the alarm subunit 1322 generates alarm information about the service call chain corresponding to that comparison difference. That is, after the service call chains with the same topology have been analyzed according to the call feedback information, the data of service call chains with that topology is monitored in daily operation against the normal operating data range obtained for each scenario, and a threshold is set on the basis of that normal range.
  • For example, if payment failures caused by an insufficient balance normally occur about one hundred times per day, the threshold for the number of error feedbacks from the balance-checking service node is set to one hundred, or to one hundred and twenty. When the daily number of error feedbacks from the balance-checking service node exceeds the set threshold, an alarm is raised, and the faulty node is traced by means of the globally unique tag formed from the service call chain information and the call feedback information.
  • Those skilled in the art should understand that the above monitoring and early-warning methods are merely examples. Other existing or future monitoring and early-warning methods, where applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
  • The present application can be implemented in software and/or in a combination of software and hardware, for example, using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device.
  • The software program of the present application can be executed by a processor to implement the steps or functions described above.
  • The software programs of the present application (including related data structures) can be stored in a computer-readable recording medium, such as RAM, a magnetic or optical drive, a floppy disk, or a similar device.
  • Some of the steps or functions of the present application may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform the various steps or functions.
  • A portion of the present application can be applied as a computer program product, such as computer program instructions which, when executed by a computer, can invoke or provide a method and/or technical solution in accordance with the present application through the operation of that computer.
  • The program instructions that invoke the method of the present application may be stored in a fixed or removable recording medium, and/or transmitted by a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of the computer device that runs according to the program instructions.
  • An embodiment in accordance with the present application includes a device that includes a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the device to operate based on the aforementioned methods and/or technical solutions in accordance with various embodiments of the present application.

Abstract

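The purpose of the present application is to provide a method and a device for processing service call information. Specifically, one or more service call chains in a distributed service system are acquired, wherein each service call chain includes one or more service nodes that are called in sequence; a corresponding service call model is constructed according to the service call chains; and the service call chains are processed according to the service call model. Compared with the prior art, the present application acquires service call chains that carry the calling-order information of service nodes in a distributed service system, and builds the service call chains that have the same service node calling order into a service call model. The call information of each service node is then analyzed on the basis of the service call model, which solves the problems of routine monitoring of service calls and rapid location of operational problems. Analysis and monitoring draw on the large volume of service node data, which improves the efficiency of problem location in the distributed service system and increases its reliability.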

Description

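Method and device for processing service call information
The present application claims priority to Chinese Patent Application No. 201510734236.2, filed on November 3, 2015 and entitled "Method and device for processing service call information", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computers, and in particular to a technique for processing service call information.
Background
With the development of the Internet, network-based applications such as service platforms and online stores have grown steadily in scale and have adopted distributed service systems. The services of more and more applications are increasingly interconnected and interdependent, which makes the calling relationships in a distributed system intricate. Because internal and external conditions differ from one run to the next, the services called by the same business operation also differ from run to run. As a result, problems that arise during business operation are often hard to locate and monitor. The prior art generally locates problems by means of logs and by tracing service call paths, and performs monitoring by watching for failed service calls during business operation.
However, locating problems through logs and traced service call paths is cumbersome, time-consuming, and of low accuracy, and monitoring failed service calls usually takes effect only after a problem has occurred, so problems cannot be effectively avoided in advance or warned of early.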
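Summary
An objective of the present application is to provide a method and a device for processing service call information, so as to solve the problems of locating faults in business operation and of monitoring and early warning for business operation in a distributed system.
To achieve the above objective, according to one aspect of the present application, a method for processing service call information is provided, which solves the problems of locating faults in business operation and of monitoring and early warning for business operation in a distributed system. The method includes:
acquiring one or more service call chains in a distributed service system, wherein each service call chain includes one or more service nodes that are called in sequence;
constructing a corresponding service call model according to the service call chains;
processing the service call chains according to the service call model.
According to another aspect of the present application, a device for processing service call information is provided, which solves the problems of locating faults in business operation and of monitoring and early warning for business operation in a distributed system. The device includes:
a service call chain acquisition device, configured to acquire one or more service call chains in a distributed service system, wherein each service call chain includes one or more service nodes that are called in sequence;
a service call model construction device, configured to construct a corresponding service call model according to the service call chains;
a processing device, configured to process the service call chains according to the service call model.
Compared with the prior art, the present application acquires service call chains that carry the calling-order information of service nodes in a distributed service system, and builds the service call chains that have the same service node calling order into a service call model. The call information of each service node is then analyzed on the basis of the service call model, which solves the problems of routine monitoring of service calls and rapid location of operational problems. Analysis and monitoring draw on the large volume of service node data, which improves the efficiency of problem location in the distributed service system and increases its reliability.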
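Brief Description of the Drawings
Other features, objectives, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
FIG. 1 shows a flowchart of a method for processing service call information according to one aspect of the present application;
FIG. 2 shows a flowchart of step S1 in a method for processing service call information according to another preferred embodiment of the present application;
FIG. 3 shows a flowchart of step S3 in a method for processing service call information according to yet another preferred embodiment of the present application;
FIG. 4 shows a flowchart of step S32 in a method for processing service call information according to still another preferred embodiment of the present application;
FIG. 5 shows a schematic diagram of a device for processing service call information according to another aspect of the present application;
FIG. 6 shows a schematic diagram of the service call chain acquisition device in a device for processing service call information according to another preferred embodiment of the present application;
FIG. 7 shows a schematic diagram of the processing device in a device for processing service call information according to yet another preferred embodiment of the present application;
FIG. 8 shows a schematic diagram of the monitoring unit in a device for processing service call information according to still another preferred embodiment of the present application;
FIG. 9 shows a schematic diagram of a service call according to still another preferred embodiment of the present application.
The same or similar reference numerals in the drawings denote the same or similar components.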
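Detailed Description
The present application is described in further detail below with reference to the accompanying drawings.
In a typical configuration of the present application, the terminal, the devices of the service network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-permanent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.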
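FIG. 1 shows a flowchart of a method for processing service call information according to one aspect of the present application. The method includes step S1, step S2, and step S3.
In step S1, the device 1 acquires one or more service call chains in a distributed service system, wherein each service call chain includes one or more service nodes that are called in sequence; in step S2, the device 1 constructs a corresponding service call model according to the service call chains; in step S3, the device 1 processes the service call chains according to the service call model.
Specifically, in step S1 the device 1 acquires one or more service call chains in the distributed service system, wherein each service call chain includes one or more service nodes that are called in sequence. Here, the distributed service system includes, but is not limited to, a service-oriented architecture or a software system built on a distributed system. The service node includes, but is not limited to, a service or a function available for calling in the distributed service system; for example, the service nodes involved in purchasing a product on an e-commerce platform include fetching the user name, fetching the user's associated account, fetching the payment page, obtaining security verification, and checking the account balance.
The service call chain refers to the service nodes involved in completing one service call in the distributed service system, together with their order. For example, the service call diagram shown in FIG. 9 is a service call chain with service node A as its entry, in which A, B, C, and D, each marked by a circle, each denote a service node. The entry of the illustrated chain is service node A; completing A requires calling B and then C, and completing C requires calling D. The service nodes in the chain of FIG. 9 are therefore called in a definite order, namely A→B→C→D.
To make this order machine-readable, the nodes can be labeled according to the order in which they are called. The entry node is labeled A0. B, called next, is labeled B0.1: the 0 stands for A, and the 1 after the "." marks the first service node called after A. C is labeled C0.2: the 0 stands for A, and the 2 marks the second service node called after A. D is called in order to complete C, so D is labeled D0.2.1: the 0 stands for A, the 2 after the first "." stands for C, and the 1 after the second "." marks the first node called after C. The call chain represented by the topology in FIG. 9 can therefore be written as A0, B0.1, C0.2, D0.2.1. This labeling scheme is only an example. The numbers representing order and topology can be written to the log when each node is called; for example, a field X in the log identifies the call, and the fields following X record the digits and dots that express the calling order and topology of the service node, so that the service call chain can be obtained from the service call records in the log. Obtaining service call chains that contain the service nodes involved in a call and their calling order shows the process of the service call clearly and yields the topology and characteristics of each service call.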
本领域技术人员应能理解上述获取服务调用链以及标注服务调用链的方式仅为举例,其他现有的或今后可能出现的获取服务调用链以及标注服务调用链的方式如可适用于本申请,也应包含在本申请保护范围以内,并在此以引用方式包含于此。
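The dotted labeling described for FIG. 9 can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the function name `label_calls` and the dictionary representation of the call tree are assumptions, and the sketch assumes each node appears only once in a single service call.

```python
def label_calls(tree, root):
    """Assign dotted sequence labels to every node reachable from root.

    tree maps a node name to the ordered list of nodes it calls directly
    (a hypothetical representation chosen for this sketch).
    """
    labels = {root: "0"}          # the entry node is always labeled 0
    stack = [root]
    while stack:
        caller = stack.pop()
        # The i-th node called by `caller` gets the caller's label plus ".i".
        for index, callee in enumerate(tree.get(caller, []), start=1):
            labels[callee] = f"{labels[caller]}.{index}"
            stack.append(callee)
    return labels


def seq_key(label):
    """Sort key that orders labels numerically, segment by segment."""
    return [int(part) for part in label.split(".")]


# FIG. 9: A calls B and then C; C calls D.
figure9 = {"A": ["B", "C"], "C": ["D"]}
labels = label_calls(figure9, "A")
chain = ",".join(
    f"{node}{labels[node]}"
    for node in sorted(labels, key=lambda n: seq_key(labels[n]))
)
print(chain)  # A0,B0.1,C0.2,D0.2.1
```

Sorting on the numeric segments rather than on the raw string keeps labels such as 0.10 after 0.2.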
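Next, in step S2, device 1 builds a corresponding service call model from the service call chains. A service call model represents service call chains that have the same node call order, and is built from the topology of the chains. When the sample of obtained chains is large enough, identical service call chains recur within the platform or application system of one service provider. For example, if thirty thousand service calls made by users nationwide in one day all involve the service nodes shown in FIG. 9, with the same call order and topology, then these thirty thousand calls correspond to one service call model, namely A0,B0.1,C0.2,D0.2.1 from the example above. Summarizing the chains that share the same topology and node call order into one service call model is the building process. Building the model makes analysis of service call chains over large data samples clearer, and each model represents one kind of service call, which facilitates later analysis of the data under each model.
Next, in step S3, device 1 processes the service call chains according to the service call model. That is, the data of the service call chains that have the same service nodes and topological order as the model is analyzed against the model. The call data of each service node, such as call completion time and call success or failure, varies from case to case, but with a large enough sample the normal pattern of each node can be observed. For example, calls to a certain service node normally complete within 0.1 second, or the feedback of a certain service node normally shows no more than ten call failures. The patterns derived from the model and its matching chains can be used to monitor whether calls in the distributed system are normal and to locate problems. Continuing the example, if calls to the node normally complete within 0.1 second but, in some period, more than 50 out of 100 calls take more than ten times 0.1 second, a problem with calls to that node can be detected.
Those skilled in the art should understand that the above manner of processing service call chains is merely an example; other existing or future manners of processing service call chains, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.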
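The latency example above (more than 50 of 100 calls taking over ten times the normal 0.1 second) can be expressed as a small check. The sketch below is only one possible reading; the function name and its parameters `normal_seconds`, `factor`, and `min_share` are hypothetical and simply encode the numbers from the example.

```python
def node_is_abnormal(durations, normal_seconds=0.1, factor=10, min_share=0.5):
    """Return True when too many calls are far slower than normal.

    durations: call completion times, in seconds, for one service node
    within one observation period (hypothetical input format).
    """
    limit = normal_seconds * factor
    slow = sum(1 for d in durations if d > limit)
    # "More than 50 out of 100" generalizes to "more than half of the calls".
    return slow > len(durations) * min_share


healthy_period = [0.05] * 100
degraded_period = [0.05] * 40 + [1.5] * 60

print(node_is_abnormal(healthy_period))
print(node_is_abnormal(degraded_period))
```

In practice the normal value would come from statistics over the chains of one service call model rather than from a constant.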
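FIG. 2 shows a flowchart of step S1 in a method for processing service call information according to another preferred embodiment of the present application. Step S1 includes step S11 and step S12.
In step S11, device 1 obtains service call log information in the distributed system; in step S12, device 1 extracts one or more service call chains from the service call log information, where each service call chain includes one or more service nodes that are called in sequence.
Specifically, in step S11, device 1 obtains service call log information in the distributed system. The log information records, for each call of each service node, a call marker, order information, and any other information from which the order and topology within each service call can be determined. A node may be called more than once in a given time range. If nodes were labeled only by call order, as with D0.2.1 in FIG. 9, then in two or more separate service calls node D could be reached the same way (from the entry node, after the first service node, following the second service node) and receive the same label 0.2.1 each time. Each service call therefore needs to be marked when a node is called, and the marker recorded in the log. For example, a field X in the log identifies the call shown in FIG. 9, that is, one service call from entry A through D is marked X, and the marker field is read when the log information is obtained. As another example, continuing from above, the digits representing order and topology are written to the log whenever a node is called, and the field indicating call order and topology is read when the log information is obtained. With this log information, the nodes in a service call can be correlated to obtain the service call chain.
Those skilled in the art should understand that the above manner of recording service call logs is merely an example; other existing or future manners of recording service call logs, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
Next, in step S12, device 1 extracts one or more service call chains from the service call log information, where each service call chain includes one or more service nodes that are called in sequence. That is, based on the call marker, order information, and other order and topology information recorded in the log for each call of each service node, the order and topology information is extracted per service call and correlated, generating the service call chain of each service call. For example, suppose the obtained log entry is "alipay,com.alipay.chashier.xxx,0x0boc123,0.2.1,AE001…". The entry is split on commas: the first field is the system name alipay, the second is the interface method, the third is the marker identifying one service call, the fourth is the order and topology at call time, and the fifth is the return code "AE001" representing the call result; further fields may follow and are elided as "…". Using the call marker, the third field of all log entries is searched for "0x0boc123". The nodes of the matching entries are then sorted by call order and topology according to the fourth field, which follows the labeling method described above, finally forming a service call chain in a format such as A0,B0.1,C0.2,D0.2.1.
Those skilled in the art should understand that the above manner of extracting service call chains from log information is merely an example; other existing or future manners of extracting service call chains from log information, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.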
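The extraction in step S12 can be sketched as grouping log entries by the call marker and sorting each group by the order field. The sketch assumes the five comma-separated fields described above; the sample entries, the system name `shop`, the single-letter interface names, and the markers `0x01` and `0x02` are invented for illustration and are not taken from a real log.

```python
from collections import defaultdict


def seq_key(seq):
    """Order dotted labels numerically, e.g. 0 < 0.1 < 0.2 < 0.2.1."""
    return [int(part) for part in seq.split(".")]


def extract_chains(lines):
    """Group log entries by call marker and sort each group by call order."""
    by_call = defaultdict(list)
    for line in lines:
        # Fields: system, interface method, call marker, order, return code.
        _system, method, call_id, seq, code = line.strip().split(",")[:5]
        by_call[call_id].append((seq, method, code))
    return {
        call_id: sorted(records, key=lambda r: seq_key(r[0]))
        for call_id, records in by_call.items()
    }


# Hypothetical, interleaved log entries from two separate service calls.
sample_log = [
    "shop,D,0x01,0.2.1,AE00",
    "shop,A,0x01,0,AE10",
    "shop,B,0x02,0.1,SUCC",
    "shop,C,0x01,0.2,AE01",
    "shop,B,0x01,0.1,SUCC",
    "shop,A,0x02,0,SUCC",
]

chains = extract_chains(sample_log)
rendered = {
    call_id: ",".join(method + seq for seq, method, _code in records)
    for call_id, records in chains.items()
}
print(rendered["0x01"])  # A0,B0.1,C0.2,D0.2.1
print(rendered["0x02"])  # A0,B0.1
```

Because the call marker separates the two calls, node B can carry the same label 0.1 in both without ambiguity.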
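Preferably, in step S2, device 1 builds the corresponding service call model by aggregating the service call chains, where the service call model includes one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains. The chains under a service call model all have the same service call topology; that is, the model is aggregated from chains with the same topology. For example, continuing from above, suppose that among the obtained service call chains thirty thousand are A0,B0.1,C0.2,D0.2.1, twenty thousand are A0,B0.1,C0.2, and one is A0,B0.1. The thirty thousand chains can be aggregated into service call model 001, the twenty thousand chains into service call model 002, and the single chain listed on its own as service call model 003. A service call model thus represents all service call chains that contain the same service nodes with the same call topology and order, so the call data of the service nodes in those chains can be analyzed and monitored through the model.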
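As a rough sketch of this aggregation, chains can be counted by topology and each distinct topology given a model number. The numbering by descending frequency and the dictionary layout are assumptions made for this illustration; the source only states that chains with the same topology are aggregated into one model.

```python
from collections import Counter


def build_models(chains):
    """Aggregate chains with identical topology into numbered models."""
    counts = Counter(tuple(chain) for chain in chains)
    return {
        f"{number:03d}": {"topology": topology, "chains": count}
        for number, (topology, count) in enumerate(counts.most_common(), start=1)
    }


# The counts from the example above.
observed = (
    [["A0", "B0.1", "C0.2", "D0.2.1"]] * 30000
    + [["A0", "B0.1", "C0.2"]] * 20000
    + [["A0", "B0.1"]]
)

models = build_models(observed)
for model_id, model in models.items():
    print(model_id, ",".join(model["topology"]), model["chains"])
```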
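In step S3, device 1 then processes the corresponding service call chains according to the service call topology. That is, the service call topology in the aggregated model is used as the basis for analyzing the data of the chains that share that topology. Continuing the example, the model A0,B0.1,C0.2,D0.2.1 is used to analyze the data of every service node in all thirty thousand chains. Suppose the analysis shows that the probability of a call error at one service node is one in ten thousand per hour. Other chains with the same topology as the model are then monitored against this result, and an error is reported when the error probability at that node rises above one in ten thousand per hour. Processing the chains according to the service call topology therefore allows the system to be monitored effectively over large data samples and improves its early-warning performance and stability.
Preferably, the method further includes step S4 (not shown), in which device 1 performs a cleaning operation on the service call topology. The cleaning operation filters out unimportant call-object information. Continuing the example, suppose two service calls have the same node topology as the chain in FIG. 9, but service node C performs additional lookups when called. In the first chain, C queries cache C1, misses, queries database C2, and then writes the data to cache C3; in the second chain, C queries cache C1 and gets the data directly. Before cleaning, these may count as two different chains, because C also calls nodes such as C1, C2, and C3. Such lookup nodes usually run within a single system and do not feed back an execution result of their own; when they fail, the result is normally reflected on node C. They can therefore be cleaned away. Nodes such as middleware routing lookups likewise do not affect the model and can be cleaned away, which highlights the calls to key service nodes and makes the service call topology more accurate.
Those skilled in the art should understand that the above manner of cleaning service call topologies is merely an example; other existing or future manners of cleaning service call topologies, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
Further, in step S3, device 1 processes the corresponding service call chains according to the cleaned service call topology. That is, chains that have the same service call topology after cleaning are aggregated according to their log information to build the service call model; the aggregation and building are the same as in the method described above.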
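More preferably, the cleaning operation includes at least any one of the following:
Deleting predetermined service nodes from the service call topology, that is, filtering out called nodes that are not remote services, such as the call nodes for middleware routing lookups.
Deleting service nodes that do not feed back call result information from the service call topology, that is, filtering out calls that run within a system and return no execution result, such as querying and calling a cache or calling a database.
Deleting occasionally used service nodes from the service call topology, where the cumulative number of occurrences of an occasionally used service node in the service call chains corresponding to the topology is less than or equal to a predetermined occasional-use threshold. That is, a threshold on the number of calls is set, the call counts are read from log information or other historical records, and service nodes that are rarely used, or used only by chance, when completing a full service call chain are deleted. This reduces the load on the system during data analysis and yields a more accurate service call model and data.
Deleting occasionally used service nodes from the service call topology, where the cumulative number of occurrences of an occasionally used service node in the service call chains corresponding to the topology is less than or equal to a predetermined occasional-use threshold, and the cumulative number of occurrences of the other service nodes in those chains is greater than or equal to a predetermined common-use threshold. That is, thresholds on the number of calls are set, the call counts are read from log information or other historical records, and the main service nodes, or those that need focused maintenance, are selected while the other, infrequently called nodes are deleted, so that data analysis and monitoring can be targeted.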
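The cleaning operations listed above can be sketched as simple filters over chains. This is an illustrative sketch under stated assumptions: chains are modeled as tuples of node names, the sets of predetermined and no-feedback nodes are supplied by hand, and the node E and the threshold values are invented for the example.

```python
from collections import Counter


def occasional_nodes(chains, occasional_threshold):
    """Nodes whose cumulative occurrences are at or below the threshold."""
    counts = Counter(node for chain in chains for node in chain)
    return {node for node, n in counts.items() if n <= occasional_threshold}


def clean_chain(chain, predetermined=frozenset(), no_feedback=frozenset(),
                occasional=frozenset()):
    """Drop predetermined, no-feedback, and occasionally used nodes."""
    drop = set(predetermined) | set(no_feedback) | set(occasional)
    return tuple(node for node in chain if node not in drop)


# The cache and database lookups made by node C in the example above.
lookups = {"C1", "C2", "C3"}
first = ("A", "B", "C", "C1", "C2", "C3", "D")   # cache miss, database, cache write
second = ("A", "B", "C", "C1", "D")              # cache hit
print(clean_chain(first, no_feedback=lookups) == clean_chain(second, no_feedback=lookups))

# A hypothetical node E that appears once among six chains.
history = [("A", "B", "C", "D")] * 5 + [("A", "B", "C", "E", "D")]
rare = occasional_nodes(history, occasional_threshold=1)
print(clean_chain(history[-1], occasional=rare))
```

After cleaning, the two chains that differed only in cache and database lookups collapse to the same topology and can be aggregated into one model.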
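Preferably, in step S2, device 1 aggregates the service call chains in combination with the call feedback information of the service nodes in the chains to build the corresponding service call model, where the service call model includes one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains. Call feedback information is the return code produced after each service node is called, representing the node's execution result. It encodes different execution outcomes as identifiable information formed according to preset rules, such as success or failure and a particular business scenario. For example, in the chain shown in FIG. 9, node B returns SUCC when its call succeeds, node D returns feedback AE00, node C returns AE01, and node A returns AE10, indicating that the call to A needs to execute node C after the call to B succeeds, and that the call to C first requires calling D. Call feedback information can be recorded in the log so that it can be queried together with the topology information of a given call. Each service node corresponds to several possible feedback values, and feedback at the end of a chain usually affects feedback at the front, where it may be aggregated. For example, when service node A in FIG. 9 returns a system exception, the cause may be that service node C or D is unreachable over the network, or that a database is abnormal; each service call chain is therefore a combination of call feedback information. Accordingly, when chains with the same topology are aggregated into a service call model, each chain records its full combination of call feedback information so that problems can be located clearly, and this combination is used in later data analysis based on the model.
Those skilled in the art should understand that the above manner of recording and presenting call feedback information is merely an example; other existing or future manners of recording and presenting call feedback information, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.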
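One way to picture a chain as "a combination of call feedback information" is to join the model number with the per-node return codes. The marker format below (`model:node=code|...`) is purely an assumption for illustration; the source does not specify how the combination is encoded.

```python
def chain_marker(model_id, feedback):
    """Combine a model marker with the chain's feedback combination.

    feedback: ordered (node, return code) pairs for one service call chain.
    The resulting string format is hypothetical.
    """
    codes = "|".join(f"{node}={code}" for node, code in feedback)
    return f"{model_id}:{codes}"


# Return codes from the FIG. 9 example above, under model 001.
figure9_feedback = [("A", "AE10"), ("B", "SUCC"), ("C", "AE01"), ("D", "AE00")]
marker = chain_marker("001", figure9_feedback)
print(marker)  # 001:A=AE10|B=SUCC|C=AE01|D=AE00
```

Two chains with the same topology but different return codes then receive different markers, which keeps distinct failure scenarios apart during later statistics.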
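FIG. 3 shows a flowchart of step S3 in a method for processing service call information according to yet another preferred embodiment of the present application. Step S3 includes step S31 and step S32.
In step S31, for the one or more service call chains corresponding to the service call topology, device 1 performs statistical processing based on the call feedback information of the service nodes in each chain; in step S32, device 1 monitors the service call chains corresponding to the service call topology according to the corresponding statistical results.
Specifically, in step S31, for the one or more service call chains corresponding to the service call topology, device 1 performs statistical processing based on the call feedback information of the service nodes in each chain. That is, the call feedback information in all chains that share the service call topology of a model is analyzed and counted. For example, based on the topology of the models generated from the chains, every chain with the same topology is tagged with the marker of its model (each model has its own marker). The model marker is combined with the feedback combination of each chain to give each chain a unique marker. The feedback information of the service nodes in each chain is then counted and analyzed uniformly over fixed time intervals, either period-over-period (against the preceding interval) or same-period (against the same interval on another day).
Those skilled in the art should understand that the above manner of analyzing call feedback information is merely an example; other existing or future manners of analyzing call feedback information, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.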
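The statistical processing in step S31 can be sketched as counting feedback codes per time interval, model, and node. The record layout `(hour, model_id, node, code)` and the sample values are assumptions for this sketch; a real system would derive them from the log fields discussed earlier.

```python
from collections import Counter


def count_feedback(records):
    """Count return codes per (hour, model, node, code) bucket."""
    stats = Counter()
    for hour, model_id, node, code in records:
        stats[(hour, model_id, node, code)] += 1
    return stats


# Hypothetical records for model 001 in two consecutive hours.
records = (
    [(8, "001", "D", "AE00")] * 98 + [(8, "001", "D", "FAIL")] * 2
    + [(9, "001", "D", "AE00")] * 90 + [(9, "001", "D", "FAIL")] * 10
)

stats = count_feedback(records)
print(stats[(8, "001", "D", "FAIL")], stats[(9, "001", "D", "FAIL")])
```

Keeping the hour in the key is what makes the later interval-to-interval comparisons possible.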
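Next, in step S32, device 1 monitors the service call chains corresponding to the service call topology according to the corresponding statistical results. That is, with a large enough sample, the call feedback information of the same node in a service call topology is compared across fixed time intervals and the data is monitored. For example, if the data is in a normal state in more than ninety-five percent of the statistical samples, then an abnormal share of five percent appearing in the system is detected by the monitoring. Because every service node has call feedback information, an anomaly can be traced to the chains of the specific service call model and, continuing the example above, through the unique marker of each chain, to the specific chain and the service node it called.
Those skilled in the art should understand that the above manner of monitoring service calls according to analysis results is merely an example; other existing or future manners of monitoring service calls according to analysis results, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.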
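FIG. 4 shows a flowchart of step S32 in a method for processing service call information according to still another preferred embodiment of the present application. Step S32 includes step S321 and step S322.
In step S321, device 1 compares, based on a predetermined time-period comparison rule, the corresponding statistical results with the call feedback information of each service node in the service call chains corresponding to the service call topology; in step S322, when a comparison difference exceeds a predetermined fluctuation threshold, device 1 generates alarm information about the service call chain corresponding to the comparison difference, where the alarm information is located at the service node corresponding to the comparison difference.
Specifically, in step S321, device 1 compares, based on a predetermined time-period comparison rule, the corresponding statistical results with the call feedback information of each service node in the service call chains corresponding to the service call topology. That is, the call feedback information of all chains with the same service call topology is analyzed under a given time-comparison rule. For example, for chains with the same topology, the data from 8 a.m. to 9 a.m. is compared with the call feedback information from 9 a.m. to 10 a.m. (a period-over-period comparison); or the data from 8 a.m. to 9 a.m. on March 2 is compared with the call feedback information from 8 a.m. to 9 a.m. on March 3 (a same-period comparison). This yields the data range of normal operation for different scenarios, for example, that about one hundred completed payment attempts per day fail because of insufficient balance. The state of service calls in the system is then monitored on the basis of these results.
Those skilled in the art should understand that the above manner of obtaining and comparing statistical results is merely an example; other existing or future manners of obtaining and comparing statistical results, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
Next, in step S322, when a comparison difference exceeds a predetermined fluctuation threshold, device 1 generates alarm information about the service call chain corresponding to the comparison difference, where the alarm information is located at the service node corresponding to the comparison difference. That is, after chains with the same topology model are analyzed using call feedback information, the resulting normal data ranges for different scenarios are used to monitor the data of chains with the same topology in daily operation, and a threshold is set relative to the normal range. Continuing the example in which about one hundred payments per day fail because of insufficient balance, the threshold for the number of error responses from the balance-check service node may be set to at most one hundred, or one hundred and twenty. When the daily number of error responses from that node exceeds the threshold, an alarm is raised, and the faulty node is traced using the globally unique marker formed from the chain information and the call feedback information.
Those skilled in the art should understand that the above manner of monitoring and early warning is merely an example; other existing or future manners of monitoring and early warning, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.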
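Steps S321 and S322 can be sketched as comparing the counts of two periods and raising an alarm when the difference exceeds the fluctuation threshold. The names `check_balance` and `INSUFFICIENT`, the dictionary keys, and the threshold of 20 are illustrative assumptions; the baseline of one hundred failures per day follows the example above.

```python
def find_alerts(baseline, current, fluctuation_threshold):
    """Compare two periods and report nodes whose counts moved too far.

    baseline, current: dicts mapping (model_id, node, code) to a count for
    the reference period and the monitored period respectively.
    """
    alerts = []
    for key in sorted(set(baseline) | set(current)):
        diff = current.get(key, 0) - baseline.get(key, 0)
        if abs(diff) > fluctuation_threshold:
            model_id, node, code = key
            # The alert names the model and node so the fault can be located.
            alerts.append({"model": model_id, "node": node,
                           "code": code, "diff": diff})
    return alerts


yesterday = {("001", "check_balance", "INSUFFICIENT"): 100,
             ("001", "pay_page", "SUCC"): 5000}
today = {("001", "check_balance", "INSUFFICIENT"): 130,
         ("001", "pay_page", "SUCC"): 5010}

alerts = find_alerts(yesterday, today, fluctuation_threshold=20)
print(alerts)
```

A single absolute threshold is the simplest choice; a relative threshold per node would avoid alarms on high-volume nodes whose counts naturally vary more.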
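FIG. 5 shows a schematic diagram of a device for processing service call information according to another aspect of the present application. Device 1 includes a service call chain obtaining apparatus 11, a service call model building apparatus 12, and a processing apparatus 13.
In device 1, the service call chain obtaining apparatus 11 obtains one or more service call chains in a distributed service system, where each service call chain includes one or more service nodes that are called in sequence; the service call model building apparatus 12 builds a corresponding service call model from the service call chains; the processing apparatus 13 processes the service call chains according to the service call model.
Specifically, the service call chain obtaining apparatus 11 obtains one or more service call chains in a distributed service system, where each service call chain includes one or more service nodes that are called in sequence. Here, the distributed service system includes, but is not limited to, a service-oriented architecture or a software system built on a distributed system. A service node includes, but is not limited to, a service or function available for calling in the distributed service system. For example, when a product is purchased on an e-commerce platform, the service nodes include retrieving the user name, retrieving the user's associated account, retrieving the payment page, obtaining security verification, and checking the account balance. A service call chain refers to the service nodes involved in completing one service call in the distributed service system, together with their order. For example, the service call diagram in FIG. 9 is a service call chain whose entry is service node A; the circled A, B, C, and D each denote a service node. Completing A requires calling B first and then C, and completing C requires calling D, so the service nodes in the chain have a call order, namely A→B→C→D. To make the call order easy for a program to recognize, the nodes can be labeled by the order in which they are called. The entry node is labeled A0. B, called next, is labeled B0.1: the 0 stands for A, and the number after the "." indicates that B is the first service node called after A. C is labeled C0.2: the 0 stands for A, and the 2 indicates the second service node called after A. D is called in order to complete C, so D is labeled D0.2.1: the 0 stands for A, the 2 after the first "." stands for C, and the 1 after the second "." marks D as the first node called after C. The call chain of the topology in FIG. 9 can therefore be written as A0,B0.1,C0.2,D0.2.1. This labeling is only an example. The digits representing order and topology can be written to the log whenever a node is called. For example, a field X in the log identifies the current call, and several fields after X record the digits and dots that represent node call order and topology, so the service call chain can be obtained from the call records in the log. Obtaining service call chains that contain the service nodes involved and their call order shows the process of each service call clearly and yields the topology and characteristics of each call.
Those skilled in the art should understand that the above manner of obtaining and labeling service call chains is merely an example; other existing or future manners of obtaining and labeling service call chains, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.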
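Next, the service call model building apparatus 12 builds a corresponding service call model from the service call chains. A service call model represents service call chains that have the same node call order, and is built from the topology of the chains. When the sample of obtained chains is large enough, identical service call chains recur within the platform or application system of one service provider. For example, if thirty thousand service calls made by users nationwide in one day all involve the service nodes shown in FIG. 9, with the same call order and topology, then these thirty thousand calls correspond to one service call model, namely A0,B0.1,C0.2,D0.2.1 from the example above. Summarizing the chains that share the same topology and node call order into one service call model is the building process. Building the model makes analysis of service call chains over large data samples clearer, and each model represents one kind of service call, which facilitates later analysis of the data under each model.
Next, the processing apparatus 13 processes the service call chains according to the service call model. That is, the data of the service call chains that have the same service nodes and topological order as the model is analyzed against the model. The call data of each service node, such as call completion time and call success or failure, varies from case to case, but with a large enough sample the normal pattern of each node can be observed. For example, calls to a certain service node normally complete within 0.1 second, or the feedback of a certain service node normally shows no more than ten call failures. The patterns derived from the model and its matching chains can be used to monitor whether calls in the distributed system are normal and to locate problems. Continuing the example, if calls to the node normally complete within 0.1 second but, in some period, more than 50 out of 100 calls take more than ten times 0.1 second, a problem with calls to that node can be detected.
Those skilled in the art should understand that the above manner of processing service call chains is merely an example; other existing or future manners of processing service call chains, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.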
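FIG. 6 shows a schematic diagram of the service call chain obtaining apparatus in a device for processing service call information according to another preferred embodiment of the present application. The service call chain obtaining apparatus 11 includes a call information obtaining unit 111 and a service call chain generating unit 112.
The call information obtaining unit 111 obtains service call log information in the distributed system; the service call chain generating unit 112 extracts one or more service call chains from the service call log information, where each service call chain includes one or more service nodes that are called in sequence.
Specifically, the call information obtaining unit 111 obtains service call log information in the distributed system. The log information records, for each call of each service node, a call marker, order information, and any other information from which the order and topology within each service call can be determined. A node may be called more than once in a given time range. If nodes were labeled only by call order, as with D0.2.1 in FIG. 9, then in two or more separate service calls node D could be reached the same way (from the entry node, after the first service node, following the second service node) and receive the same label 0.2.1 each time. Each service call therefore needs to be marked when a node is called, and the marker recorded in the log. For example, a field X in the log identifies the call shown in FIG. 9, that is, one service call from entry A through D is marked X, and the marker field is read when the log information is obtained. As another example, continuing from above, the digits representing order and topology are written to the log whenever a node is called, and the field indicating call order and topology is read when the log information is obtained. With this log information, the nodes in a service call can be correlated to obtain the service call chain.
Those skilled in the art should understand that the above manner of recording service call logs is merely an example; other existing or future manners of recording service call logs, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
Next, the service call chain generating unit 112 extracts one or more service call chains from the service call log information, where each service call chain includes one or more service nodes that are called in sequence. That is, based on the call marker, order information, and other order and topology information recorded in the log for each call of each service node, the order and topology information is extracted per service call and correlated, generating the service call chain of each service call. For example, suppose the obtained log entry is "alipay,com.alipay.chashier.xxx,0x0boc123,0.2.1,AE001…". The entry is split on commas: the first field is the system name alipay, the second is the interface method, the third is the marker identifying one service call, the fourth is the order and topology at call time, and the fifth is the return code "AE001" representing the call result; further fields may follow and are elided as "…". Using the call marker, the third field of all log entries is searched for "0x0boc123". The nodes of the matching entries are then sorted by call order and topology according to the fourth field, which follows the labeling method described above, finally forming a service call chain in a format such as A0,B0.1,C0.2,D0.2.1.
Those skilled in the art should understand that the above manner of extracting service call chains from log information is merely an example; other existing or future manners of extracting service call chains from log information, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.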
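Preferably, the service call model building apparatus 12 builds the corresponding service call model by aggregating the service call chains, where the service call model includes one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains. The chains under a service call model all have the same service call topology; that is, the model is aggregated from chains with the same topology. For example, continuing from above, suppose that among the obtained service call chains thirty thousand are A0,B0.1,C0.2,D0.2.1, twenty thousand are A0,B0.1,C0.2, and one is A0,B0.1. The thirty thousand chains can be aggregated into service call model 001, the twenty thousand chains into service call model 002, and the single chain listed on its own as service call model 003. A service call model thus represents all service call chains that contain the same service nodes with the same call topology and order, so the call data of the service nodes in those chains can be analyzed and monitored through the model.
The processing apparatus 13 then processes the corresponding service call chains according to the service call topology. That is, the service call topology in the aggregated model is used as the basis for analyzing the data of the chains that share that topology. Continuing the example, the model A0,B0.1,C0.2,D0.2.1 is used to analyze the data of every service node in all thirty thousand chains. Suppose the analysis shows that the probability of a call error at one service node is one in ten thousand per hour. Other chains with the same topology as the model are then monitored against this result, and an error is reported when the error probability at that node rises above one in ten thousand per hour. Processing the chains according to the service call topology therefore allows the system to be monitored effectively over large data samples and improves its early-warning performance and stability.
Preferably, the device further includes a cleaning apparatus 14 (not shown), which performs a cleaning operation on the service call topology. The cleaning operation filters out unimportant call-object information. Continuing the example, suppose two service calls have the same node topology as the chain in FIG. 9, but service node C performs additional lookups when called. In the first chain, C queries cache C1, misses, queries database C2, and then writes the data to cache C3; in the second chain, C queries cache C1 and gets the data directly. Before cleaning, these may count as two different chains, because C also calls nodes such as C1, C2, and C3. Such lookup nodes usually run within a single system and do not feed back an execution result of their own; when they fail, the result is normally reflected on node C. They can therefore be cleaned away. Nodes such as middleware routing lookups likewise do not affect the model and can be cleaned away, which highlights the calls to key service nodes and makes the service call topology more accurate.
Those skilled in the art should understand that the above manner of cleaning service call topologies is merely an example; other existing or future manners of cleaning service call topologies, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
Further, the processing apparatus 13 processes the corresponding service call chains according to the cleaned service call topology. That is, chains that have the same service call topology after cleaning are aggregated according to their log information to build the service call model; the aggregation and building are the same as in the method described above.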
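More preferably, the cleaning operation includes at least any one of the following:
Deleting predetermined service nodes from the service call topology, that is, filtering out called nodes such as the call nodes for middleware routing lookups.
Deleting service nodes that do not feed back call result information from the service call topology, that is, filtering out calls that run within a system and return no execution result, such as querying and calling a cache or calling a database.
Deleting occasionally used service nodes from the service call topology, where the cumulative number of occurrences of an occasionally used service node in the service call chains corresponding to the topology is less than or equal to a predetermined occasional-use threshold. That is, a threshold on the number of calls is set, the call counts are read from log information or other historical records, and service nodes that are rarely used, or used only by chance, when completing a full service call chain are deleted. This reduces the load on the system during data analysis and yields a more accurate service call model and data.
Deleting occasionally used service nodes from the service call topology, where the cumulative number of occurrences of an occasionally used service node in the service call chains corresponding to the topology is less than or equal to a predetermined occasional-use threshold, and the cumulative number of occurrences of the other service nodes in those chains is greater than or equal to a predetermined common-use threshold. That is, thresholds on the number of calls are set, the call counts are read from log information or other historical records, and the main service nodes, or those that need focused maintenance, are selected while the other, infrequently called nodes are deleted, so that data analysis and monitoring can be targeted.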
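Preferably, the service call model building apparatus 12 aggregates the service call chains in combination with the call feedback information of the service nodes in the chains to build the corresponding service call model, where the service call model includes one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains. Call feedback information is the return code produced after each service node is called, representing the node's execution result. It encodes different execution outcomes as identifiable information formed according to preset rules, such as success or failure and a particular business scenario. For example, in the chain shown in FIG. 9, node B returns SUCC when its call succeeds, node D returns feedback AE00, node C returns AE01, and node A returns AE10, indicating that the call to A needs to execute node C after the call to B succeeds, and that the call to C first requires calling D. Call feedback information can be recorded in the log so that it can be queried together with the topology information of a given call. Each service node corresponds to several possible feedback values, and feedback at the end of a chain usually affects feedback at the front, where it may be aggregated. For example, when service node A in FIG. 9 returns a system exception, the cause may be that service node C or D is unreachable over the network, or that a database is abnormal; each service call chain is therefore a combination of call feedback information. Accordingly, when chains with the same topology are aggregated into a service call model, each chain records its full combination of call feedback information so that problems can be located clearly, and this combination is used in later data analysis based on the model.
Those skilled in the art should understand that the above manner of recording and presenting call feedback information is merely an example; other existing or future manners of recording and presenting call feedback information, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.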
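FIG. 7 shows a schematic diagram of the processing apparatus in a device for processing service call information according to yet another preferred embodiment of the present application. The processing apparatus 13 includes an information analysis unit 131 and a monitoring unit 132.
For the one or more service call chains corresponding to the service call topology, the information analysis unit 131 performs statistical processing based on the call feedback information of the service nodes in each chain; the monitoring unit 132 monitors the service call chains corresponding to the service call topology according to the corresponding statistical results.
Specifically, for the one or more service call chains corresponding to the service call topology, the information analysis unit 131 performs statistical processing based on the call feedback information of the service nodes in each chain. That is, the call feedback information in all chains that share the service call topology of a model is analyzed and counted. For example, based on the topology of the models generated from the chains, every chain with the same topology is tagged with the marker of its model (each model has its own marker). The model marker is combined with the feedback combination of each chain to give each chain a unique marker. The feedback information of the service nodes in each chain is then counted and analyzed uniformly over fixed time intervals, either period-over-period (against the preceding interval) or same-period (against the same interval on another day).
Those skilled in the art should understand that the above manner of analyzing call feedback information is merely an example; other existing or future manners of analyzing call feedback information, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
Next, the monitoring unit 132 monitors the service call chains corresponding to the service call topology according to the corresponding statistical results. That is, with a large enough sample, the call feedback information of the same node in a service call topology is compared across fixed time intervals and the data is monitored. For example, if the data is in a normal state in more than ninety-five percent of the statistical samples, then an abnormal share of five percent appearing in the system is detected by the monitoring. Because every service node has call feedback information, an anomaly can be traced to the chains of the specific service call model and, continuing the example above, through the unique marker of each chain, to the specific chain and the service node it called.
Those skilled in the art should understand that the above manner of monitoring service calls according to analysis results is merely an example; other existing or future manners of monitoring service calls according to analysis results, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.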
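FIG. 8 shows a schematic diagram of the monitoring unit in a device for processing service call information according to still another preferred embodiment of the present application. The monitoring unit 132 includes a comparison subunit 1321 and an alarm subunit 1322.
The comparison subunit 1321 compares, based on a predetermined time-period comparison rule, the corresponding statistical results with the call feedback information of each service node in the service call chains corresponding to the service call topology; when a comparison difference exceeds a predetermined fluctuation threshold, the alarm subunit 1322 generates alarm information about the service call chain corresponding to the comparison difference, where the alarm information is located at the service node corresponding to the comparison difference.
Specifically, the comparison subunit 1321 compares, based on a predetermined time-period comparison rule, the corresponding statistical results with the call feedback information of each service node in the service call chains corresponding to the service call topology. That is, the call feedback information of all chains with the same service call topology is analyzed under a given time-comparison rule. For example, for chains with the same topology, the data from 8 a.m. to 9 a.m. is compared with the call feedback information from 9 a.m. to 10 a.m. (a period-over-period comparison); or the data from 8 a.m. to 9 a.m. on March 2 is compared with the call feedback information from 8 a.m. to 9 a.m. on March 3 (a same-period comparison). This yields the data range of normal operation for different scenarios, for example, that about one hundred completed payment attempts per day fail because of insufficient balance. The state of service calls in the system is then monitored on the basis of these results.
Those skilled in the art should understand that the above manner of obtaining and comparing statistical results is merely an example; other existing or future manners of obtaining and comparing statistical results, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.
Next, when a comparison difference exceeds a predetermined fluctuation threshold, the alarm subunit 1322 generates alarm information about the service call chain corresponding to the comparison difference, where the alarm information is located at the service node corresponding to the comparison difference. That is, after chains with the same topology model are analyzed using call feedback information, the resulting normal data ranges for different scenarios are used to monitor the data of chains with the same topology in daily operation, and a threshold is set relative to the normal range. Continuing the example in which about one hundred payments per day fail because of insufficient balance, the threshold for the number of error responses from the balance-check service node may be set to at most one hundred, or one hundred and twenty. When the daily number of error responses from that node exceeds the threshold, an alarm is raised, and the faulty node is traced using the globally unique marker formed from the chain information and the call feedback information.
Those skilled in the art should understand that the above manner of monitoring and early warning is merely an example; other existing or future manners of monitoring and early warning, if applicable to the present application, also fall within its scope of protection and are incorporated herein by reference.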
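Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. If such modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to cover them as well.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one embodiment, the software program of the present application may be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present application (including related data structures) may be stored in a computer-readable recording medium, for example, RAM, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of the present application may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform the steps or functions.
In addition, part of the present application may be applied as a computer program product, for example, computer program instructions which, when executed by a computer, can invoke or provide the method and/or technical solution according to the present application through the operation of that computer. The program instructions that invoke the method of the present application may be stored in a fixed or removable recording medium, and/or transmitted through a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of a computer device that runs according to the program instructions. Here, one embodiment of the present application includes an apparatus that includes a memory for storing computer program instructions and a processor for executing the program instructions, where, when the computer program instructions are executed by the processor, the apparatus is triggered to run the methods and/or technical solutions based on the foregoing embodiments of the present application.
It is apparent to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments, and that it can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-limiting. The scope of the present application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced by the present application. No reference numeral in the claims should be construed as limiting the claim concerned. Furthermore, the word "include" does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or apparatuses recited in an apparatus claim may also be implemented by a single unit or apparatus through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.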

Claims (16)

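  1. A method for processing service call information, wherein the method comprises:
    a. obtaining one or more service call chains in a distributed service system, wherein each service call chain comprises one or more service nodes that are called in sequence;
    b. building a corresponding service call model from the service call chains;
    c. processing the service call chains according to the service call model.
  2. The method according to claim 1, wherein step a comprises:
    obtaining service call log information in the distributed system;
    extracting one or more service call chains from the service call log information, wherein each service call chain comprises one or more service nodes that are called in sequence.
  3. The method according to claim 1 or 2, wherein step b comprises:
    building the corresponding service call model by aggregating the service call chains, wherein the service call model comprises one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains;
    wherein step c comprises:
    processing the corresponding service call chains according to the service call topology.
  4. The method according to claim 3, wherein the method further comprises:
    performing a cleaning operation on the service call topology;
    wherein step c comprises:
    processing the corresponding service call chains according to the cleaned service call topology.
  5. The method according to claim 4, wherein the cleaning operation comprises at least any one of the following:
    deleting predetermined service nodes from the service call topology;
    deleting service nodes that do not feed back call result information from the service call topology;
    deleting occasionally used service nodes from the service call topology, wherein the cumulative number of occurrences of an occasionally used service node in the service call chains corresponding to the service call topology is less than or equal to a predetermined occasional-use threshold;
    deleting occasionally used service nodes from the service call topology, wherein the cumulative number of occurrences of an occasionally used service node in the service call chains corresponding to the service call topology is less than or equal to a predetermined occasional-use threshold, and the cumulative number of occurrences, in the service call chains corresponding to the service call topology, of the service nodes in the topology other than the occasionally used service nodes is greater than or equal to a predetermined common-use threshold.
  6. The method according to any one of claims 3 to 5, wherein step b comprises:
    aggregating the service call chains in combination with the call feedback information of the service nodes in the service call chains to build the corresponding service call model, wherein the service call model comprises one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains.
  7. The method according to claim 6, wherein step c comprises:
    c1. for the one or more service call chains corresponding to the service call topology, performing statistical processing based on the call feedback information of the service nodes in each service call chain;
    c2. monitoring the service call chains corresponding to the service call topology according to the corresponding statistical results.
  8. The method according to claim 7, wherein step c2 comprises:
    comparing, based on a predetermined time-period comparison rule, the corresponding statistical results with the call feedback information of each service node in the service call chains corresponding to the service call topology;
    when a comparison difference exceeds a predetermined fluctuation threshold, generating alarm information about the service call chain corresponding to the comparison difference, wherein the alarm information is located at the service node corresponding to the comparison difference.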
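  9. A device for processing service call information, wherein the device comprises:
    a service call chain obtaining apparatus, configured to obtain one or more service call chains in a distributed service system, wherein each service call chain comprises one or more service nodes that are called in sequence;
    a service call model building apparatus, configured to build a corresponding service call model from the service call chains;
    a processing apparatus, configured to process the service call chains according to the service call model.
  10. The device according to claim 9, wherein the service call chain obtaining apparatus comprises:
    a call information obtaining unit, configured to obtain service call log information in the distributed system;
    a service call chain generating unit, configured to extract one or more service call chains from the service call log information, wherein each service call chain comprises one or more service nodes that are called in sequence.
  11. The device according to claim 9 or 10, wherein the service call model building apparatus is configured to:
    build the corresponding service call model by aggregating the service call chains, wherein the service call model comprises one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains;
    wherein the processing apparatus is configured to:
    process the corresponding service call chains according to the service call topology.
  12. The device according to claim 11, wherein the device further comprises:
    a cleaning apparatus, configured to perform a cleaning operation on the service call topology;
    wherein the processing apparatus is configured to:
    process the corresponding service call chains according to the cleaned service call topology.
  13. The device according to claim 12, wherein the cleaning operation comprises at least any one of the following:
    deleting predetermined service nodes from the service call topology;
    deleting service nodes that do not feed back call result information from the service call topology;
    deleting occasionally used service nodes from the service call topology, wherein the cumulative number of occurrences of an occasionally used service node in the service call chains corresponding to the service call topology is less than or equal to a predetermined occasional-use threshold;
    deleting occasionally used service nodes from the service call topology, wherein the cumulative number of occurrences of an occasionally used service node in the service call chains corresponding to the service call topology is less than or equal to a predetermined occasional-use threshold, and the cumulative number of occurrences, in the service call chains corresponding to the service call topology, of the service nodes in the topology other than the occasionally used service nodes is greater than or equal to a predetermined common-use threshold.
  14. The device according to any one of claims 11 to 13, wherein the service call model building apparatus is configured to:
    aggregate the service call chains in combination with the call feedback information of the service nodes in the service call chains to build the corresponding service call model, wherein the service call model comprises one or more service call topologies, and each service call topology is obtained by aggregating one or more of the service call chains.
  15. The device according to claim 14, wherein the processing apparatus comprises:
    an information analysis unit, configured to, for the one or more service call chains corresponding to the service call topology, perform statistical processing based on the call feedback information of the service nodes in each service call chain;
    a monitoring unit, configured to monitor the service call chains corresponding to the service call topology according to the corresponding statistical results.
  16. The device according to claim 15, wherein the monitoring unit comprises:
    a comparison subunit, configured to compare, based on a predetermined time-period comparison rule, the corresponding statistical results with the call feedback information of each service node in the service call chains corresponding to the service call topology;
    an alarm subunit, configured to, when a comparison difference exceeds a predetermined fluctuation threshold, generate alarm information about the service call chain corresponding to the comparison difference, wherein the alarm information is located at the service node corresponding to the comparison difference.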
PCT/CN2016/103173 2015-11-03 2016-10-25 Method and device for processing service call information WO2017076188A1 (zh)

Priority Applications (10)

Application Number Priority Date Filing Date Title
ES16861456T ES2808966T3 (es) 2015-11-03 2016-10-25 Método y dispositivo para procesar información de llamadas de servicio
EP16861456.8A EP3373516B1 (en) 2015-11-03 2016-10-25 Method and device for processing service calling information
KR1020187015528A KR102146173B1 (ko) 2015-11-03 2016-10-25 서비스 호출 정보 처리 방법 및 디바이스
PL16861456T PL3373516T3 (pl) 2015-11-03 2016-10-25 Sposób i urządzenie do przetwarzania informacji o wywołaniu usługi
JP2018522941A JP6706321B2 (ja) 2015-11-03 2016-10-25 サービス呼び出し情報処理の方法及びデバイス
AU2016351091A AU2016351091B2 (en) 2015-11-03 2016-10-25 Method and device for processing service calling information
MYPI2018701755A MY197612A (en) 2015-11-03 2016-10-25 Method and device for processing service calling information
SG11201803696QA SG11201803696QA (en) 2015-11-03 2016-10-25 Service call information processing method and device
US15/969,364 US10671474B2 (en) 2015-11-03 2018-05-02 Monitoring node usage in a distributed system
PH12018500934A PH12018500934A1 (en) 2015-11-03 2018-05-02 Service call information processing method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510734236.2 2015-11-03
CN201510734236.2A CN106656536B (zh) 2015-11-03 2015-11-03 Method and device for processing service call information

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/969,364 Continuation US10671474B2 (en) 2015-11-03 2018-05-02 Monitoring node usage in a distributed system

Publications (1)

Publication Number Publication Date
WO2017076188A1 true WO2017076188A1 (zh) 2017-05-11

Family

ID=58661593

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/103173 WO2017076188A1 (zh) 2015-11-03 2016-10-25 Method and device for processing service call information

Country Status (12)

Country Link
US (1) US10671474B2 (zh)
EP (1) EP3373516B1 (zh)
JP (1) JP6706321B2 (zh)
KR (1) KR102146173B1 (zh)
CN (1) CN106656536B (zh)
AU (1) AU2016351091B2 (zh)
ES (1) ES2808966T3 (zh)
MY (1) MY197612A (zh)
PH (1) PH12018500934A1 (zh)
PL (1) PL3373516T3 (zh)
SG (2) SG11201803696QA (zh)
WO (1) WO2017076188A1 (zh)

Also Published As

Publication number Publication date
ES2808966T3 (es) 2021-03-02
US20180253350A1 (en) 2018-09-06
CN106656536A (zh) 2017-05-10
SG10201909213PA (en) 2019-11-28
US10671474B2 (en) 2020-06-02
EP3373516A4 (en) 2018-10-17
KR20180078296A (ko) 2018-07-09
EP3373516A1 (en) 2018-09-12
SG11201803696QA (en) 2018-06-28
JP2019502191A (ja) 2019-01-24
PL3373516T3 (pl) 2020-11-30
MY197612A (en) 2023-06-28
AU2016351091B2 (en) 2019-10-10
AU2016351091A1 (en) 2018-05-24
PH12018500934A1 (en) 2018-11-12
JP6706321B2 (ja) 2020-06-03
CN106656536B (zh) 2020-02-18
EP3373516B1 (en) 2020-05-20
KR102146173B1 (ko) 2020-08-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16861456

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 12018500934

Country of ref document: PH

WWE Wipo information: entry into national phase

Ref document number: 11201803696Q

Country of ref document: SG

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2018522941

Country of ref document: JP

ENP Entry into the national phase

Ref document number: 2016351091

Country of ref document: AU

Date of ref document: 20161025

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20187015528

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2016861456

Country of ref document: EP