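WO2022111278A1 - Method, apparatus, device, and storage medium for diagnosing concurrent request timeouts - Google Patents

Method, apparatus, device, and storage medium for diagnosing concurrent request timeouts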

Info

Publication number
WO2022111278A1
WO2022111278A1 PCT/CN2021/129625 CN2021129625W WO2022111278A1 WO 2022111278 A1 WO2022111278 A1 WO 2022111278A1 CN 2021129625 W CN2021129625 W CN 2021129625W WO 2022111278 A1 WO2022111278 A1 WO 2022111278A1
Authority
WO
WIPO (PCT)
Prior art keywords
function
time
consuming
timeout
request
Prior art date
Application number
PCT/CN2021/129625
Other languages
English (en)
French (fr)
Inventor
陈吉
毛伟
周杰
卢道和
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2022111278A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/448 Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4488 Object-oriented
    • G06F9/449 Object-oriented method invocation or resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G06F11/0754 Error or fault detection not based on redundancy by exceeding limits
    • G06F11/0757 Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs

Definitions

  • the present invention is based on Chinese patent application No. 202011379546.4, filed on November 30, 2020, and claims priority to that Chinese patent application.
  • the entire content of the Chinese patent application is hereby incorporated into the present invention by reference.
  • the embodiments of the present application relate to, but are not limited to, the information technology of financial technology (Fintech), and in particular, relate to a method, apparatus, device, and storage medium for diagnosing a concurrent request timeout.
  • the diagnosis of concurrent request timeouts in related technologies includes two methods.
  • One method is as follows: a monitoring platform collects indicators related to the central processing unit (CPU), the Java Virtual Machine (JVM), the number of threads, the middleware backlog, and so on; a log platform collects error logs and developer-defined key performance indicators; the Arthas platform collects time-consuming methods; and a big data platform comprehensively analyzes the monitoring indicators, key performance indicators, and time-consuming methods to obtain a scenario route, searches the database for a result set according to that route, and returns the highest-scoring performance bottleneck analysis result in the result set together with the corresponding solution suggestion.
  • Another method is as follows: by analyzing the logs generated during the interface call, obtain the ratio of the time consumed by each function call module in the interface function to the total time of the current call, and identify which function module in the interface function takes the most time to call, thereby locating the problem that causes the interface call to time out.
  • these two concurrent request timeout diagnosis methods can only track the calling link of the first-level method, and often cannot accurately locate the deep-level cause, resulting in inaccurate diagnosis results.
  • embodiments of the present application provide a method, apparatus, device, and storage medium for diagnosing a concurrent request timeout.
  • an embodiment of the present application provides a method for diagnosing a concurrent request timeout, the method comprising: when an interface function times out under concurrent requests, determining a timeout rate of the interface function, where the number of levels of the interface function is greater than 1; acquiring a first time-consuming set, where the first time-consuming set includes the time consumed by the interface function in calling the first-level functions; determining, according to the timeout rate and the first time-consuming set, a first function set to be weighted among the first-level functions, where the first function set to be weighted is composed of the first-level functions whose time-consuming percentage ranking is greater than or equal to the timeout rate; acquiring a second time-consuming set, where the second time-consuming set includes the time consumed by each function in the first function set to be weighted in calling a preset hierarchical function; determining, according to the second time-consuming set, the calling link that causes the interface function to time out; and determining the end function of the calling link as the function that causes the interface function to time out.
  • an embodiment of the present application provides an apparatus for diagnosing a concurrent request timeout.
  • the apparatus includes: a first determining module, configured to determine a timeout rate of the interface function when the interface function times out under concurrent requests, where the number of levels of the interface function is greater than 1; a first acquisition module, configured to acquire a first time-consuming set, where the first time-consuming set includes the time consumed by the interface function in calling the first-level functions; a second determining module, configured to determine, according to the timeout rate and the first time-consuming set, a first function set to be weighted among the first-level functions, where the first function set to be weighted is composed of the first-level functions whose time-consuming percentage ranking is greater than or equal to the timeout rate;
  • a second acquisition module, configured to acquire a second time-consuming set, where the second time-consuming set includes the time consumed by each function in the first function set to be weighted in calling a preset hierarchical function; a third determining module, configured to determine, according to the second time-consuming set, the calling link that causes the interface function to time out; and a fourth determining module, configured to determine the end function of the calling link as the function that causes the interface function to time out.
  • an embodiment of the present application provides a computer device, including a memory and a processor, where the memory stores a computer program that can be executed on the processor, and the processor implements the steps in the above method when executing the program.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the steps in the foregoing method.
  • in the embodiments of the present application, when the interface function times out under concurrent requests, the timeout rate of the interface function is determined, the time consumed by the interface function in calling the first-level functions is obtained, and the first-level functions of the interface function to be examined further are determined.
  • in this way, when an interface stress test times out, the time consumption of the preset-level functions is counted in addition to that of the first-level functions, and the collected time-consuming information is used to accurately locate, within the interface function's call link, the function most likely to cause the interface to time out, and to list all function call links that may cause the timeout.
  • this avoids the problem in the related technology that only the call links of the first-level methods in the interface function can be counted, so that functions in deeper calls cannot be located.
  • FIG. 1 is a schematic flowchart of the implementation of a method for diagnosing a concurrent request timeout according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of the implementation of a method for diagnosing a concurrent request timeout according to an embodiment of the present application
  • FIG. 3 is a schematic flowchart of an implementation of a method for diagnosing a concurrent request timeout according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of the composition and structure of a diagnostic apparatus for concurrent request timeout according to an embodiment of the present application
  • FIG. 5A is a schematic diagram of the average time-consuming percentage of function calls in the low-pressure mode of a method for diagnosing concurrent request timeout according to an embodiment of the present application;
  • FIG. 5B is a schematic diagram of the average time-consuming percentage of function calls in the high-pressure mode of a method for diagnosing concurrent request timeout according to an embodiment of the present application;
  • FIG. 5C is a schematic diagram of function call link analysis in the diagnostic method for concurrent request timeout according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of the composition and structure of a diagnostic apparatus for concurrent request timeout according to an embodiment of the present application
  • FIG. 7 is a schematic diagram of a hardware entity of a computer device in an embodiment of the present application.
  • the terms "first/second/third" are used only to distinguish similar objects and do not represent a specific ordering of the objects; it can be understood that "first/second/third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described herein can be performed in an order other than that shown or described herein.
  • FIG. 1 is a schematic flowchart of an implementation of the method for diagnosing a timeout of a concurrent request according to an embodiment of the present application. As shown in FIG. 1 , the method includes:
  • Step S101 in the case of concurrent request interface function timeout, determine the timeout rate of the interface function, and the number of levels of the interface function is greater than 1;
  • the interface function has an interface implementation class, and the interface implementation class has sub-functions, so the number of levels of the interface function is greater than 1.
  • the timeout rate may be determined by the request sending module.
  • the request sending module may calculate the timeout rate of the request timeout.
  • Each HTTP request has a self-defined timeout period; if a request takes longer than this period, it is considered to have timed out.
  • the request sending module then records the number of request timeouts per second and the total number of requests sent per second, and calculates the timeout rate from these two values.
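  • as an illustrative sketch of this calculation, the Python snippet below divides the timed-out requests by the requests sent over the sampled seconds. The function name `timeout_rate` and the per-second counters are hypothetical and are not taken from the application.

```python
def timeout_rate(timed_out_per_second, sent_per_second):
    """Return the fraction of sent requests that timed out over the sampled seconds."""
    total_sent = sum(sent_per_second)
    if total_sent == 0:
        # No requests were sent, so no request can have timed out.
        return 0.0
    return sum(timed_out_per_second) / total_sent


# Hypothetical per-second counters recorded by a request sending module.
timed_out = [10, 15, 25]
sent = [50, 50, 100]
rate = timeout_rate(timed_out, sent)  # 50 timeouts out of 200 requests
```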
  • Step S102 obtaining a first time-consuming set, wherein the first time-consuming set includes the time-consuming of calling the first-level function by the interface function;
  • the request sending module records the time taken by the interface function to call the first-level function.
  • FIG. 5A is a schematic diagram of the time-consuming ratio of function calls in an embodiment of the present application.
  • the interface function f1 calls four functions: f1.1(), f1.2(), f1.3(), and f1.4().
  • the time-consuming of the first-level function is the time consumed by the function f1 in calling these four functions.
  • Step S103, according to the timeout rate and the first time-consuming set, determine the first function set to be weighted in the first-level function, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage ranking is greater than or equal to the timeout rate;
  • when concurrent requests are sent to the interface, the request timeout rate of the interface function may be 0; in that case no screening is performed, and the first function set to be weighted includes all of the first-level functions.
  • for example, the request timeout rate of the interface function may be 0, as shown in FIG. 5A: the function f1 internally calls four functions f1.1(), f1.2(), f1.3(), and f1.4(), and the average percentages of each function call's time consumption in the total time consumption of the upper-layer function are 20%, 20%, 30%, and 30%, respectively.
  • the first set of functions to be weighted includes f1.1(), f1.2(), f1.3(), and f1.4().
  • Step S104 obtaining a second time-consuming set, wherein the second time-consuming set includes the time-consuming of each function in the first function set to be weighted calling a preset level function;
  • FIG. 5C is a schematic diagram of the function call link analysis in the method for diagnosing concurrent request timeout according to the embodiment of the present application, as shown in FIG.
  • the second time-consuming set includes f1.2.n(), f1.3.n(), f1.2.1.n(), f1.3.1.n(), and f1.3.3.n().
  • Step S105 determine the calling link that causes the interface function to time out
  • Step S106 Determine the end function of the calling link as the function that causes the interface function to time out.
  • the timeout link is link 1: f1()->f1.2()->f1.2.1()->f1.2.1.1(), and f1.2.1.1() is the function that causes the call to the interface function to time out.
  • a second time-consuming set is obtained, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted in calling a preset hierarchical function; according to the second time-consuming set, the calling link that causes the interface function to time out is determined.
  • in this way, when the interface stress test times out, the time consumption of the preset-level functions is counted in addition to that of the first-level functions, and the function call chain of the interface can be accurately located through the collected time-consuming information.
  • FIG. 2 is a schematic flowchart of an implementation of the method for diagnosing a timeout of a concurrent request according to an embodiment of the present application. As shown in FIG. 2 , the method includes:
  • Step S201 in the case of concurrent request interface function timeout, determine the timeout rate of the interface function, and the number of levels of the interface function is greater than 1;
  • Step S202 obtaining a first time-consuming set, wherein the first time-consuming set includes the time-consuming time when the interface function calls the first-level function;
  • Step S203, according to the timeout rate and the first time-consuming set, determine the first function set to be weighted in the first-level function, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage ranking is greater than or equal to the timeout rate;
  • Step S204 obtaining a second time-consuming set, wherein the second time-consuming set includes the time-consuming of each function in the first function set to be weighted calling a preset level function;
  • Step S205 determine the calling link that causes the interface function to time out
  • Step S206 determining a time-consuming percentage set according to each second time-consuming in the second time-consuming set and the total time-consuming of the upper-layer function corresponding to the second time-consuming;
  • the function f1.2() takes 80ms to call the function f1.2.1(), and the function f1.2() takes 100ms in total, so the percentage of time spent by the function f1.2.1() can be determined is 80%.
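  • the percentage in this example is simply the callee's time divided by the total time of the upper-layer function. A minimal Python sketch, with the hypothetical helper name `time_share`, is:

```python
def time_share(child_ms, parent_total_ms):
    """Percentage of the upper-layer function's total time spent in one callee."""
    if parent_total_ms <= 0:
        raise ValueError("parent_total_ms must be positive")
    return 100.0 * child_ms / parent_total_ms


# Figures from the text: f1.2() spends 80 ms of its 100 ms total calling f1.2.1().
share = time_share(80, 100)
```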
  • Step S207, determine the weight value of each function in the second function set to be weighted, wherein each function in the second function set to be weighted is a function in the preset hierarchical functions whose time-consuming percentage change is non-zero;
  • from the time-consuming percentage set, the change in a function's time-consuming percentage between different concurrent request states can be determined, and the function's weight value can be determined from that change.
  • Step S208 according to the weight value of each function, determine the function call link branch with timeout
  • the weight value of each link can be determined by multiplying the weight values of the functions on the function call link; the probability that each link times out is determined according to the size of its weight value; and the link with the highest probability is determined to be the function call link branch that has timed out.
  • Step S209 according to the weight value of the branch of the function call link, determine the call link that causes the interface function to time out;
  • the weight value of each link can be determined by multiplying the weight values of the functions on the function call link; the probability that each link times out is determined according to the size of its weight value; and the link with the highest probability is determined to be the function call link branch that has timed out.
  • Step S210 determining the end function of the calling link as the function that causes the interface function to time out.
  • the step S209 determines the call link that causes the interface function to time out, including:
  • Step S2091 multiplying the weight value corresponding to each function in the function call link branch to obtain the corresponding link weight value
  • Step S2092 arranging the link weight values in a specific order
  • the specific order can be sorted from large to small, or can be sorted from small to large.
  • Step S2093 Determine the function call link branch corresponding to the link weight value that satisfies the preset condition as the call link that causes the interface function to time out.
  • the preset condition may be that, after the link weight values are sorted in descending order, the link weight value is ranked first.
  • the link weight values may be sorted in descending order, and link 1 is determined as the calling link of the timeout function.
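  • steps S2091 to S2093 can be sketched in Python as follows: multiply the function weights along each branch, sort the link weights in descending order, and take the first branch. The weight values and the branch list below are hypothetical, since the text does not give concrete weights; only the function names follow the examples in the description.

```python
from math import prod


def rank_links(links, weights):
    """Return (link_weight, link) pairs sorted from largest to smallest link weight."""
    scored = [(prod(weights[name] for name in link), link) for link in links]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical per-function weight values (not taken from the application).
weights = {
    "f1.2()": 0.15, "f1.2.1()": 0.8, "f1.2.1.1()": 0.9,
    "f1.3()": 0.05, "f1.3.1()": 0.6, "f1.3.3()": 0.4,
}
# Candidate function call link branches below the interface function f1().
links = [
    ["f1.2()", "f1.2.1()", "f1.2.1.1()"],  # link 1
    ["f1.3()", "f1.3.1()"],
    ["f1.3()", "f1.3.3()"],
]

ranked = rank_links(links, weights)
timeout_link = ranked[0][1]          # branch with the largest link weight
timeout_function = timeout_link[-1]  # end function of that branch
```

  • with these assumed weights, link 1 has the largest product, so its end function f1.2.1.1() is reported, matching the link 1 example in the description.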
  • the weight value of each function in the second function set to be weighted is determined according to the time-consuming percentage set, so that the weight value of each function call link branch can be determined by analyzing real-time time-consuming data.
  • no manual intervention in the ordering is needed, and the result is more accurate, which avoids the improperly set manual parameter values, and the resulting low accuracy, that heavy manual intervention causes in the related technology.
  • the request mode of the concurrent request may include multiple request modes, for example, may include a first request mode and a second request mode.
  • the first request mode may be a low pressure request mode
  • the second request mode may be a high pressure request mode.
  • the first time-consuming set includes a first sub-time consumption and a second sub-time consumption, wherein the first sub-time consumption is the time consumed by the interface function in calling the first-level function in the first request mode, and the second sub-time consumption is the time consumed by the interface function in calling the first-level function in the second request mode.
  • Fig. 3 is the implementation flow schematic diagram of the diagnosis method of the concurrent request timeout of the embodiment of the present application, as shown in Fig. 3, this method comprises:
  • Step S301 in the case of concurrently requesting the interface function timeout in the first request mode, determine a first timeout rate, and the number of levels of the interface function is greater than 1;
  • the first request mode is a low pressure mode
  • the low pressure mode is a mode in which the request sending module sends a smaller number of concurrent request threads.
  • Step S302 in the case where the concurrent request interface function times out in the second request mode, determine a second time-out rate
  • the second request mode is a high-pressure mode
  • the high-pressure mode is a mode in which the request sending module sends a large number of concurrent request threads.
  • for example, when the number of concurrent request threads is greater than or equal to 10^2, the number of concurrent request threads is considered large.
  • Step S303 obtaining a first time-consuming set, wherein the first time-consuming set includes the time-consuming of calling the first-level function by the interface function;
  • Step S304 determining a first time-consuming percentage set according to the first timeout rate and the first sub-time-consuming
  • Step S305 determining a second time-consuming percentage set according to the second timeout rate and the second sub-time-consuming
  • the average time-consuming percentage of function f1.2() has increased from 20% to 35%, an increase of 15 percentage points; the average time-consuming percentage of function f1.3() has increased from 30% to 35%, an increase of 5 percentage points. The time-consuming percentages of both f1.2() and f1.3() have therefore increased noticeably, so the cause of the interface call timeout may lie in the calling process of these two functions.
  • Step S307 determine the first function set to be weighted in the first-level function
  • the first function set to be weighted is composed of functions whose time-consuming percentage ordering is greater than or equal to the timeout rate in the first-level functions; for example, comparing FIG. 5A and FIG. 5B, the first function set to be weighted includes two functions: f1.2() and f1.3().
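  • the comparison between the low-pressure and high-pressure percentages can be sketched in Python as below. The percentages are those given for FIG. 5A and FIG. 5B; the helper names and the rule "keep functions whose share increased" are assumptions made for illustration, chosen because they reproduce the f1.2() and f1.3() result of the example.

```python
# Average time-consuming percentages of the first-level functions (from the text).
low_pressure = {"f1.1()": 20, "f1.2()": 20, "f1.3()": 30, "f1.4()": 30}
high_pressure = {"f1.1()": 10, "f1.2()": 35, "f1.3()": 35, "f1.4()": 20}


def percentage_changes(low, high):
    """Change in percentage points for each function, high pressure minus low pressure."""
    return {name: high[name] - low[name] for name in low}


def functions_to_weight(changes):
    """Assumed rule: keep the functions whose time-consuming percentage increased."""
    return [name for name, delta in changes.items() if delta > 0]


changes = percentage_changes(low_pressure, high_pressure)
selected = functions_to_weight(changes)
```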
  • Step S308 obtaining a second time-consuming set, wherein the second time-consuming set includes the time-consuming of each function in the first function set to be weighted calling a preset level function;
  • Step S309 according to the second time-consuming set, determine the calling link that causes the interface function to time out;
  • Step S310 determining the end function of the calling link as the function that causes the interface function to time out.
  • the step S304, determining the first time-consuming percentage set according to the first timeout rate and the first sub-time-consuming, includes:
  • Step S3041, determine a first request timeout function set according to the first timeout rate and the first sub-time consumption, where the first request timeout function set is the request timeout function set of the first-level function in the first request mode;
  • the step S3041, determining a first request timeout function set according to the first timeout rate and the first sub-time-consuming, includes: sorting the time-consuming percentages corresponding to each time consumption in the first sub-time-consuming; determining, as request timeout functions, the functions whose time-consuming percentages rank at or before the preset order, wherein the preset order is the product of the first timeout rate and the total number of first-level functions called by the interface function; and determining the first request timeout function set from the request timeout functions.
  • for example, the first timeout rate is 25%, and the function f1 internally calls four functions f1.1(), f1.2(), f1.3(), and f1.4().
  • the first sub-time-consuming is the time consumed by f1 in calling the four functions, and the time-consuming percentages corresponding to the functions are 20%, 20%, 20%, and 40%, respectively, or 40%, 20%, 20%, 20% after sorting from largest to smallest.
  • the preset order is 25% × 4 = 1, so the function f1.4() corresponding to the time-consuming percentage ranked first is determined as the request timeout function, and the first request timeout function set includes f1.4().
  • Step S3042 determining the average time-consuming percentage of the call time of each function in the first request timeout function set to the total time-consuming of the function call at the previous layer, to obtain a first time-consuming percentage set;
  • the step S305, determining a second time-consuming percentage set according to the second timeout rate and the second sub-time-consuming, includes:
  • Step S3051, determining a second request timeout function set according to the second timeout rate and the second sub-time-consuming, wherein the second sub-time-consuming is the time consumed by the interface function in calling the first-level function in the second request mode, and the second request timeout function set is the request timeout function set of the first-level function in the second request mode;
  • the step S3051, determining a second request timeout function set according to the second timeout rate and the second sub-time-consuming, includes: sorting the time-consuming percentages corresponding to each time consumption in the second sub-time-consuming; determining, as request timeout functions, the functions whose time-consuming percentages rank at or before the preset order, wherein the preset order is the product of the second timeout rate and the total number of first-level functions called by the interface function; and determining the second request timeout function set from the request timeout functions.
  • for example, the second timeout rate is 50%, and the function f1 internally calls four functions f1.1(), f1.2(), f1.3(), and f1.4().
  • the second sub-time-consuming is the time consumed by f1 in calling the four functions, and the time-consuming percentages corresponding to the functions are 10%, 20%, 30%, and 40%, respectively, or 40%, 30%, 20%, 10% after sorting from largest to smallest.
  • the preset order is 50% × 4 = 2, so the functions f1.3() and f1.4() corresponding to the time-consuming percentages ranked second or higher are determined as request timeout functions, and the second request timeout function set includes f1.3() and f1.4().
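  • the selection rule of steps S3041 and S3051 can be sketched in Python as follows. The function name `request_timeout_functions` is hypothetical. Taking the timeout rate as an integer percentage, rounding the preset order down when the product is not a whole number, and skipping the screening when the rate is 0 are all assumptions of this sketch.

```python
def request_timeout_functions(shares, timeout_rate_percent):
    """Select the functions whose time-consuming percentage ranks within the preset order.

    shares maps a function name to its time-consuming percentage;
    timeout_rate_percent is the timeout rate as an integer percentage.
    """
    if timeout_rate_percent == 0:
        # Assumption: a timeout rate of 0 means no screening is performed.
        return list(shares)
    # Preset order = timeout rate x number of first-level functions (rounded down).
    preset_order = timeout_rate_percent * len(shares) // 100
    ranked = sorted(shares, key=shares.get, reverse=True)  # largest percentage first
    return ranked[:preset_order]


# Low-pressure example from the text: 25% x 4 = 1, so only the top function is kept.
first_set = request_timeout_functions(
    {"f1.1()": 20, "f1.2()": 20, "f1.3()": 20, "f1.4()": 40}, 25)

# High-pressure example from the text: 50% x 4 = 2, so the top two functions are kept.
second_set = request_timeout_functions(
    {"f1.1()": 10, "f1.2()": 20, "f1.3()": 30, "f1.4()": 40}, 50)
```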
  • Step S3052 Determine the percentage average of the time-consuming of each function in the second request timeout function set to the total time-consuming of calling functions of the previous layer, and obtain a second time-consuming percentage set.
  • Example 1: the first request mode is the low pressure mode, and the number of threads is 10.
  • the function f1 internally calls four functions f1.1(), f1.2(), f1.3(), and f1.4(), and the average percentages of each function call's time consumption in the total time consumption of the upper-layer function are 20%, 20%, 30%, and 30%, respectively.
  • the first time-consuming percentage set is [M1, Ua], where M1 represents 10 threads (low pressure), and Ua represents the array [20%, 20%, 30%, 30%].
  • the second request mode is the high pressure mode, the number of threads is 100, and the second timeout rate is 50%; the function f1 internally calls four functions f1.1(), f1.2(), f1.3(), and f1.4(), and the average percentages of each function call's time consumption in the total time consumption of the upper-layer function are 10%, 35%, 35%, and 20%, respectively.
  • the second time-consuming percentage set is [M2, Ub], where M2 represents 100 threads (high pressure), and Ub represents the array [35%, 35%].
  • in the case where the concurrent request interface function times out in the first request mode, a first timeout rate is determined; in the case where the concurrent request interface function times out in the second request mode, a second timeout rate is determined; a first time-consuming percentage set is determined according to the first timeout rate and the first sub-time-consuming; a second time-consuming percentage set is determined according to the second timeout rate and the second sub-time-consuming; a percentage change amount is determined according to the first time-consuming percentage set and the second time-consuming percentage set; and the first function set to be weighted in the first-level function is determined according to the percentage change amount.
  • An embodiment of the present application provides a method for diagnosing a concurrent request timeout.
  • the second time-consuming set includes a third sub-time-consuming and a fourth sub-time-consuming, wherein the third sub-time-consuming is the time consumed by the first function set to be weighted in calling a preset level function in the first request mode, and the fourth sub-time-consuming is the time consumed by the first function set to be weighted in calling a preset level function in the second request mode.
  • the time-consuming percentage set includes a third time-consuming percentage set and a fourth time-consuming percentage set, wherein the third time-consuming percentage set is the average percentage of the call time of each function in the third request timeout function set in the total time of the upper-layer function call, and the fourth time-consuming percentage set is the average percentage of the call time of each function in the fourth request timeout function set in the total time of the upper-layer function call.
  • the second function set to be weighted includes a first sub-function set and a second sub-function set, and the method includes:
  • Step S401 in the case of concurrent request interface function timeout, determine the timeout rate of the interface function, and the number of levels of the interface function is greater than 1;
  • Step S402 obtaining a first time-consuming set, wherein the first time-consuming set includes the time-consuming of calling the first-level function by the interface function;
  • Step S403, according to the timeout rate and the first time-consuming set, determine the first function set to be weighted in the first-level function, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage ranking is greater than or equal to the timeout rate;
  • Step S404 obtaining a second time-consuming set, wherein the second time-consuming set includes the time-consuming of each function in the first function set to be weighted calling a preset level function;
  • the second time-consuming set includes the third sub-time-consuming and the fourth sub-time-consuming;
  • Step S406, determine the third request timeout function set according to the third sub-time-consuming and the total time-consuming of the first function to be weighted, wherein the third sub-time-consuming is the time consumed by the first function set to be weighted in calling the preset hierarchical function in the first request mode, and the third request timeout function set is the request timeout function set of the next-level function of the first function to be weighted in the first request mode;
  • the third sub-time-consuming includes the time-consuming information of functions f1.2.1(), f1.3.1(), and f1.3.3() under low pressure.
  • the total time consumption of the first function to be weighted includes the time-consuming information of functions f1.2() and f1.3() under low pressure.
  • according to the time-consuming information in the third sub-time-consuming and in the total time-consuming of the first function to be weighted under low pressure, the request timeout function set of the next-level functions called by the first function to be weighted in the low-pressure concurrent request mode, that is, the third request timeout function set, can be determined.
  • Step S407, determining the average percentage of the call time of each function in the third request timeout function set in the total time of the upper-layer function call, to obtain a third time-consuming percentage set;
  • Step S408 according to the fourth sub-time-consuming and the total time-consuming of the first function to be weighted, determining the fourth request timeout function set, wherein the fourth sub-time-consuming is the time consumed, in the second request mode, by the first function set to be weighted calling the preset-level functions, and the fourth request timeout function set is the request timeout function set of the next-level functions of the first function to be weighted in the second request mode;
  • the fourth sub-time-consuming includes the time-consuming information of functions f1.2.1(), f1.3.1() and f1.3.3() under high pressure.
  • the total time consumption of the first function to be weighted includes the time-consuming information of functions f1.2() and f1.3() under high pressure.
  • from the time-consuming information in the fourth sub-time-consuming and in the total time-consuming of the first function to be weighted under high pressure, the request timeout function set of the next-level functions called by the first function to be weighted in the high-pressure concurrent request mode, that is, the fourth request timeout function set, can be determined.
  • Step S409 determining the average time-consuming percentage of the invocation time of each function in the fourth request timeout function set to the total time-consuming of the first-layer function invocation, to obtain a fourth time-consuming percentage set;
  • the ratio of the time consumed by function f1.2.1() to the time consumed by function f1.2() can be used to determine the average percentage of the total time-consuming of the first-layer function f1.2() that is accounted for by function f1.2.1().
  • Step S410 comparing the third time-consuming percentage set and the fourth time-consuming percentage set, and determining the percentage change of each function in the first sub-function set; wherein, the number of levels of the preset level function is 2;
  • the percentage change of a function can be determined by Equation (1), in which:
  • $C_{fn\text{-}high}$ represents the average time-consuming percentage of calling function fn() in the sender's high-pressure mode;
  • $T_{high}$ represents the average time-consuming of calling function f() in the sender's high-pressure mode;
  • $C_{fn\text{-}low}$ represents the average time-consuming percentage of calling function fn() in the sender's low-pressure mode;
  • $T_{low}$ represents the average time-consuming of calling function f() in the sender's low-pressure mode;
  • $P_{fn\text{-}incre}$ represents the time-consuming increase percentage of calling function fn().
  • Step S411 determining the weight value of each function in the first sub-function set according to the percentage change, wherein each function in the first sub-function set is a function, among the next-level functions of the first function to be weighted, whose percentage change is non-zero;
  • determining the weight value of each function in the first sub-function set according to the percentage change including:
  • the weight value of the function can be calculated by formula (2),
  • $W_{fn} = \dfrac{P_{fn\text{-}incre}}{P_{f1\text{-}incre} + P_{f2\text{-}incre} + \cdots + P_{f3\text{-}incre}}$ (2);
  • $W_{fn}$ represents the call link branch weight value of the function fn().
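  • The normalization in formula (2) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names `ga`, `gb`, `gc` and their percentage changes are hypothetical, and the sketch assumes that only functions whose percentage increased receive a weight.

```python
def branch_weights(pct_change):
    """Turn per-function percentage changes into branch weights that sum to 1."""
    # Assumption: only functions whose percentage grew are weighted.
    growing = {name: p for name, p in pct_change.items() if p > 0}
    total = sum(growing.values())
    if total == 0:
        return {}
    # Each weight is the function's share of the total increase.
    return {name: p / total for name, p in growing.items()}


# Hypothetical percentage changes (in percentage points) of three sibling functions.
changes = {"ga": 12.0, "gb": 0.0, "gc": 4.0}
weights = branch_weights(changes)
```

  • With these made-up inputs, `ga` receives three quarters of the weight and `gc` one quarter, while `gb`, whose percentage did not change, is left out of the weighted set.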
  • Step S412 determining a timed-out function call link branch according to the weight value of each function, which includes: determining the timed-out function call link branch according to the first function set to be weighted and the first sub-function set, wherein each function is a function, among the preset-level functions, whose time-consuming percentage change is non-zero;
  • the first sub-function set may consist of functions of the function layer where f1.2.1() is located.
  • the branch of the function call chain for the timeout shown can be:
  • Step S413 according to the weight value of the branch of the function call link, determine the call link that causes the interface function to time out;
  • Step S414 determining the end function of the calling link as the function that causes the interface function to time out.
  • the method further includes:
  • Step S415 according to the third sub-time-consuming and the total time-consuming of the upper-layer function corresponding to the third sub-time-consuming, determining a fifth request timeout function set, wherein the fifth request timeout function set is the request timeout function set of the next-level functions called by the functions in the first sub-function set in the first request mode; the number of levels of the preset level function is 3;
  • in the low-pressure concurrent request mode, the third sub-time-consuming includes the time consumed by the functions of the layer where function f1.2.1.1() is located, and the total time-consuming of the upper-layer function is the total time consumed by the functions of the layer where function f1.2.1() is located.
  • Step S416 determining the average time-consuming percentage of the call time of each function in the fifth request timeout function set to the total time-consuming of the second-layer function call, to obtain a fifth time-consuming percentage set;
  • the fifth time-consuming percentage set includes the time-consuming percentages of functions f1.2.1.1(), f1.2.1.2() and f1.3.3.1().
  • Step S417 determining a sixth request timeout function set according to the fourth sub-time-consuming, wherein the sixth request timeout function set is the request timeout function set of the next-level functions called by the functions in the first sub-function set in the second request mode;
  • in the high-pressure concurrent request mode, the fourth sub-time-consuming includes the time consumed by the functions of the layer where function f1.2.1.1() is located, and the total time-consuming of the upper-layer function is the total time consumed by the functions of the layer where function f1.2.1() is located.
  • Step S418, determining the average time-consuming percentage of the invocation time of each function in the sixth request timeout function set to the total time-consuming of the second-layer function invocation, to obtain a sixth time-consuming percentage set;
  • the sixth time-consuming percentage set includes the time-consuming percentages of functions f1.2.1.1(), f1.2.1.2() and f1.3.3.1().
  • Step S419 according to the fifth time-consuming percentage set and the sixth time-consuming percentage set, determine the percentage change of each function in the first sub-function set;
  • Step S420 determining the weight value of each function in the second sub-function set according to the percentage change, wherein each function in the second sub-function set is a function, among the next-level functions called by the functions in the first sub-function set, whose percentage change is non-zero;
  • Step S421 according to the first function set to be weighted, the first sub-function set, and the second sub-function set, determine a function call link branch that has timed out.
  • the functions in the first function set to be weighted are f1.2() and f1.3(); the functions in the first sub-function set are f1.2.1(), f1.3.1() and f1.3.3(); and the functions in the second sub-function set are f1.2.1.1(), f1.2.1.2() and f1.3.3.1(). From this, it can be determined that there are four function call links that may cause the interface to time out.
  • the method further includes: determining the link timeout probability according to the link weight of the function call link.
  • $W_{chain} = W_1 \times W_2 \times \cdots \times W_n$ (3);
  • $W_{chain}$ is the weight value of the link;
  • $W_n$ is the weight value of the n-th function call link branch on the link.
  • the weight of the entire link represents the probability that the link contains the cause of the interface timeout. Sort the link weights in descending order, and output the links that may cause timeouts and the function positions that may cause interface timeouts.
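  • The ranking described here can be illustrated with a short sketch: each link's weight is the product of its branch weights, and links are reported in descending order. The link names and weight values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import prod


def rank_links(links):
    """links maps a link name to the list of branch weights along that link."""
    # Link weight = product of the branch weights on the link.
    scored = {name: prod(ws) for name, ws in links.items()}
    # Highest weight first: the most likely cause of the timeout.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical branch weights for three call links.
links = {
    "link A": [0.75, 0.5, 0.5],
    "link B": [0.75, 0.5],
    "link C": [0.25, 1.0],
}
ranking = rank_links(links)
```

  • In this made-up case "link B" ranks first, so the function at its end would be the first candidate to inspect.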
  • in the embodiments of the present application, on the one hand, the weight value of each function in the first sub-function set is determined according to the percentage change, and the timed-out function call chain branches are determined according to the first function set to be weighted and the first sub-function set; in this way, the links and functions that may cause the interface timeout can be located down to the function layer corresponding to the first sub-function set.
  • on the other hand, the weight value of each function in the second sub-function set is determined according to the percentage change, and the timed-out function call chain branches are determined according to the first function set to be weighted, the first sub-function set and the second sub-function set; in this way, the links and functions that may cause the interface timeout can be located down to the function layer corresponding to the second sub-function set.
  • the weighting method can thus accurately locate, at deeper levels of the interface function call chain, the function most likely to cause the interface timeout.
  • under normal circumstances, the time consumed by the interface in calling each function increases linearly as the number of concurrent requests increases, and the proportion of each call's time in the total time consumed by its upper-layer function does not change significantly.
  • in interface requests that time out, by contrast, the proportion of a call's time in the total time consumed by its upper-layer function differs markedly from that in requests with normal performance: the percentage increases significantly.
  • a function call whose time consumption grows in this way is very likely to be what causes the interface to time out when processing the request. Therefore, there is a need for a method that can accurately locate, within the interface function call chain, the function most likely to cause the interface timeout, and that provides the function call chains that may cause the timeout, as reference and guidance for the personnel looking for the cause of a performance bottleneck.
  • one method is as follows: a monitoring platform collects relevant indicators such as those of the central processing unit (Central Processing Unit, CPU), the Java virtual machine (Java Virtual Machine, JVM), the number of threads, and the amount of middleware backlog; a log platform collects error logs and development key performance indicators; the Arthas platform collects time-consuming methods; a big data platform comprehensively analyzes the monitoring indicators, key performance indicators, time-consuming methods and other data to obtain a scenario route, searches the result set in the database according to the obtained scenario route, and obtains the performance bottleneck analysis result with the highest score in the result set together with the corresponding suggested solution.
  • another method is as follows: by analyzing the logs generated during the interface call, the ratio of the time consumed by each function module called in the interface function to the time of the current call is obtained, which shows which function module in the interface function takes the most time to call and thereby locates the problem that causes the interface call to time out.
  • the first method has two problems. First, the results are not accurate enough.
  • the key performance indicators printed in the logs of different subsystems are not necessarily the same, and differing indicators may cause inconsistencies in scenario route matching, which reduces the accuracy of the performance bottleneck analysis results.
  • moreover, the scoring standard for matching the result set is set according to the test requirements. Setting the test requirements involves many manual settings, and an improper setting can easily make the performance bottleneck analysis results inaccurate. Second, the detailed location of the abnormal function cannot be determined. When the Arthas platform collects time-consuming methods, the analysis can only track the call link of the first-level methods; it cannot locate the methods called inside a first-level method, or the methods called inside those methods.
  • the second method has three problems. First, the time-consuming statistics rely heavily on logs, and in practice developers are unlikely to print time-consuming logs for every function module called in an interface; without these logs the statistics cannot be produced. Second, only the call links of the first-level methods in the interface are counted, so functions called at deeper levels cannot be located. Third, this technique computes the time-consuming proportion of internally called modules over all requests to the interface function. When the timeout rate is low (for example 5%), the time-consuming proportion of the abnormal module call may not stand out, and the accuracy of the result is poor.
  • an embodiment of the present application provides a method for diagnosing a concurrent request timeout.
  • when an interface stress test times out, the method can accurately locate, within the interface function call link, the detailed position of the function most likely to cause the interface timeout, and gives all the function call links that may cause the timeout;
  • the weight value of each function call link branch is derived from the analysis of real-time time-consuming data and requires no manual intervention, so the result is highly accurate;
  • the overall design is simple and does not depend on many external systems, specific indicators or log data, so it has good universality and is suitable for various systems.
  • FIG. 4 is a schematic diagram of the composition and structure of the diagnosing device for concurrent request timeout according to an embodiment of the present application. As shown in FIG. 4 , the device includes:
  • the request sending module 41 is configured to start a specified number of concurrent threads and send requests to the interface of the system under test 45 at the specified pressure. It also collects the timeout rate (s%) of the requests and sends the timeout rate to the time-consuming data analysis module 43 in real time.
  • the Arthas module 42 is configured to monitor and count the time-consuming of each function call inside the specified function, and send the time-consuming data to the time-consuming data analysis module 43 for analysis.
  • the time-consuming data analysis module 43 is configured to sort the time-consuming data sent by the Arthas module 42 in descending order to obtain a time-consuming ranking and, according to the timeout rate s% sent by the request sending module 41, to screen out the top s% of requests in the time-consuming ranking to obtain a time-consuming data set, which stores the function call data of the timed-out requests. When the pressure on the sender is low, or when the interface does not time out, the requests are not filtered.
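  • The screening step can be sketched as below. This is an illustrative assumption about how the "top s%" cut might be taken: the function name `slowest_requests` is hypothetical, the request durations are made up, and rounding the cut-off up to a whole request is a choice of this sketch rather than something the text specifies.

```python
import math


def slowest_requests(durations_ms, timeout_rate_pct):
    """Return the slowest timeout_rate_pct percent of requests, slowest first."""
    ordered = sorted(durations_ms, reverse=True)  # time-consuming ranking
    if timeout_rate_pct <= 0:
        # Low pressure or no timeouts: the requests are not filtered.
        return ordered
    # Assumption: round up so that at least one request is kept.
    keep = math.ceil(len(ordered) * timeout_rate_pct / 100)
    return ordered[:keep]


# Hypothetical request durations in milliseconds.
durations = [120, 80, 950, 60, 1100, 70, 90, 100, 85, 75]
timed_out = slowest_requests(durations, 20)
unfiltered = slowest_requests(durations, 0)
```

  • With a 20% timeout rate and ten requests, the two slowest requests form the time-consuming data set; with a rate of zero every request is retained.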
  • the time-consuming data analysis module 43 is further configured to analyze the time-consuming data set, compute for each function call in the timed-out requests the percentage of its upper-level function's call time that it accounts for, and send [M3, Uc] as a set of data to the decision-making and control module 44, wherein M3 represents the real-time number of concurrent threads of the request sender (requester pressure), and Uc represents the average time-consuming percentage of each function call.
  • the decision-making and control module 44 is configured to, on receiving the [M3, Uc] data from the time-consuming data analysis module 43, compare each function's time-consuming percentage under low pressure and high pressure and select the function calls whose time-consuming percentage increases; the cause of the timeout most likely lies in the function calls whose percentage increases the most. According to the growth in the time-consuming percentage, a weight value is assigned to each function call link branch whose time-consuming percentage increases. The larger the weight value of a function call link branch, the more likely that branch is to cause the timeout.
  • it is further configured to send to the Arthas module 42, for each selected timed-out function call link branch, a function call time-consuming monitoring instruction for the function concerned, so that time-consuming statistics are collected for deeper function calls.
  • it is further configured to repeat the foregoing two processes until the function call analysis of the specified number of layers is completed, and then to calculate the weight of each function call link from the weight values assigned to its branches and arrange the weights in descending order. The link with the largest weight is the one most likely to cause the timeout, and the cause of the timeout is most likely the function at its end. In this way, the function that causes the request interface to time out can be located.
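  • The repeated drill-down can be sketched as a level-by-level loop. This is only an outline under stated assumptions: `measure` is a hypothetical callback standing in for the statistics that the Arthas module would return (low-pressure and high-pressure percentages per callee), the numbers in `table` are invented apart from the first-level percentages quoted in the worked example, and the weighting simply normalizes the increases.

```python
def drill_down(root, max_depth, measure):
    """Assign branch weights level by level, following only growing callees."""
    weights = {}
    frontier = [root]
    for _ in range(max_depth):
        next_frontier = []
        for fn in frontier:
            stats = measure(fn)  # {callee: (low_pct, high_pct)}
            growth = {c: hi - lo for c, (lo, hi) in stats.items() if hi - lo > 0}
            total = sum(growth.values())
            for callee, g in growth.items():
                weights[callee] = g / total   # branch weight of this callee
                next_frontier.append(callee)  # monitor it one level deeper
        frontier = next_frontier
    return weights


# Hypothetical measurements keyed by caller.
table = {
    "f1": {"f1.1": (20, 10), "f1.2": (20, 35), "f1.3": (30, 35), "f1.4": (30, 20)},
    "f1.2": {"f1.2.1": (50, 60)},
    "f1.3": {},
}
result = drill_down("f1", 2, lambda fn: table.get(fn, {}))
```

  • Callees whose share does not grow are dropped from the frontier, so deeper monitoring is requested only for branches that remain suspect.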
  • the embodiment of the present application provides a method for diagnosing concurrent request timeout, the method comprising:
  • Step S501 the request sending module sends a request to the interface of the system under test with the first request mode, and determines the first timeout rate;
  • Step S502 the request sending module sends the first timeout rate to the time-consuming data analysis module
  • the first request mode is a low pressure mode
  • the low pressure mode is a mode in which the request sending module sends a smaller number of concurrent request threads.
  • the request sending module may calculate a timeout rate of the request timeout.
  • each HTTP request has a user-defined timeout period; a request that exceeds this period is considered to have timed out.
  • the request sending module records the timeout period in the request header of the HTTP request, and obtains the timeout rate from the number of requests that time out per second and the total number of requests sent per second. For example, the request sending module first starts a small number of threads, for example 10, sends requests to the interface of the system under test in low-pressure mode, and calculates the request timeout rate s% within the cycle time.
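  • As a small illustration of the rate calculation, the sketch below aggregates per-second counters over one cycle. The function name `timeout_rate_pct` and the sample counts are hypothetical; the text only states that the rate is derived from timeouts per second and requests sent per second.

```python
def timeout_rate_pct(timeouts_per_sec, sent_per_sec):
    """Timeout rate over a cycle, as a percentage of all requests sent."""
    total_sent = sum(sent_per_sec)
    if total_sent == 0:
        return 0.0  # nothing was sent, so nothing timed out
    return 100.0 * sum(timeouts_per_sec) / total_sent


# Hypothetical counters for a three-second cycle.
rate = timeout_rate_pct([1, 0, 2], [20, 20, 20])
```

  • Here three of sixty requests timed out, giving a rate of 5%.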
  • Step S503 the time-consuming data analysis module receives the first timeout rate sent by the request sending module
  • Step S504 the time-consuming data analysis module determines the first request timeout function set according to the first timeout rate and the time-consuming function call;
  • the elements of the first request timeout function set are the requests in the top S1% of the time-consuming ranking;
  • the time-consuming ranking arranges the requests in descending order of time consumed.
  • Step S505 the time-consuming data analysis module determines the average time-consuming percentage of the call time of each function in the first request timeout function set to the total time-consuming of upper-layer function calls, and obtains the first time-consuming percentage;
  • the content of the first time-consuming percentage is [M4, Ud], where M4 represents the real-time concurrent thread number (first requester pressure) of the first request sender, and Ud represents the average time-consuming percentage of each function call.
  • the time-consuming data analysis module filters out the requests with the top S1% of the interface request time-consuming according to the current first timeout rate S1%. Calculate the average percentage of the time-consuming of each function call in this part of the request to the total time-consuming of the upper-layer function call in a period of time. Record the average time-consuming percentage of each function call corresponding to M4 equal to 10 threads (low pressure).
  • FIG. 5A is a schematic diagram of the average time-consuming percentage of function calls in the low-pressure mode of the method for diagnosing concurrent request timeout according to an embodiment of the present application. As shown in FIG. 5A, the function f1 internally calls four functions f1.1(), f1.2(), f1.3() and f1.4(), and the average percentages of the upper-layer function's total time-consuming accounted for by these calls are 20%, 20%, 30% and 30%, respectively.
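  • The averaging behind these percentages can be sketched as follows. The per-request timings are invented so that the result reproduces the 20%, 20%, 30%, 30% split quoted above; the function name `average_call_share` is hypothetical, and the sketch assumes that every sampled request records every callee.

```python
def average_call_share(samples):
    """samples: list of (parent_total_ms, {callee: callee_ms}), one per request."""
    sums = {}
    for parent_total, children in samples:
        for name, ms in children.items():
            # Share of the upper-layer function's total time, in percent.
            sums[name] = sums.get(name, 0.0) + 100.0 * ms / parent_total
    # Average the per-request shares over all sampled requests.
    return {name: s / len(samples) for name, s in sums.items()}


# Two hypothetical timed-out requests of f1 under low pressure.
samples = [
    (100, {"f1.1": 20, "f1.2": 20, "f1.3": 30, "f1.4": 30}),
    (200, {"f1.1": 40, "f1.2": 40, "f1.3": 60, "f1.4": 60}),
]
shares = average_call_share(samples)
```

  • Averaging per-request shares, rather than dividing summed times, keeps a single very slow request from dominating the result; which of the two the method intends is not stated in the text.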
  • Step S506 the time-consuming data analysis module sends the first time-consuming percentage to the decision-making and control module
  • Step S507 the decision-making and control module receives the first time-consuming percentage sent by the time-consuming data analysis module
  • Step S508 the request sending module sends a request to the interface of the system under test in the second request mode, and determines the second timeout rate;
  • the second request mode is a high-pressure mode
  • the high-pressure mode is a mode in which the request sending module sends a large number of concurrent request threads.
  • the number of concurrent threads in the second request mode is larger, for example greater than or equal to 10^2 (100).
  • the decision and control module controls the request sending module to gradually increase the number of concurrent request threads (eg. 100 threads) to send the second request to the system under test interface in a high stress mode.
  • Step S509 the request sending module sends the second timeout rate to the time-consuming data analysis module
  • Step S510 the time-consuming data analysis module receives the second timeout rate sent by the request sending module
  • the decision-making and control module controls the request sending module to gradually increase the number of concurrent request threads, for example, 100 threads, and sends requests to the system-under-test interface in a high-stress mode. At this time, there are many timeouts for the interface requests. The second timeout rate S2% within the acquisition period is obtained.
  • Step S511 the time-consuming data analysis module determines a second request timeout function set according to the second timeout rate
  • the elements of the second request timeout function set are the requests in the top S2% of the time-consuming ranking;
  • the time-consuming ranking arranges the requests in descending order of time consumed.
  • Step S512 the time-consuming data analysis module determines the average time-consuming percentage of the calling time of each function in the second request timeout function set to the total time-consuming of upper-layer function calls, and obtains the second time-consuming percentage;
  • the content of the second time-consuming percentage is [M5, Ue], where M5 represents the real-time concurrent thread number (second requester pressure) of the second request sender, and Ue represents the average time-consuming percentage of each function call.
  • the time-consuming data analysis module filters out the requests with the top S2% of the interface request time-consuming. Calculate the average percentage of the time-consuming of each function call in this part of the request to the total time-consuming of the upper-layer function call in a period of time. Record the average time-consuming percentage of each function call corresponding to M5 equal to 100 threads (high pressure).
  • FIG. 5B is a schematic diagram of the average time-consuming percentage of function calls in the high-pressure mode of the method for diagnosing concurrent request timeout according to an embodiment of the present application. As shown in FIG. 5B, the function f1 internally calls four functions f1.1(), f1.2(), f1.3() and f1.4(), and the average percentages of the upper-layer function's total time-consuming accounted for by these calls are 10%, 35%, 35% and 20%, respectively.
  • Step S513 the time-consuming data analysis module sends the second time-consuming percentage to the decision-making and control module;
  • Step S514 the decision-making and control module receives the second time-consuming percentage sent by the time-consuming data analysis module
  • Step S515 the decision-making and control module compares the first time-consuming percentage and the second time-consuming percentage, and determines the function whose function call time-consuming percentage average value changes as the function to be weighted;
  • the function to be weighted is a function to be assigned a weight value.
  • the decision-making and control module obtains the function call time-consuming percentage data from the time-consuming data analysis module, and calculates how each function's time-consuming percentage changes between the low-pressure mode and the high-pressure mode in which the request sending module sends requests to the interface of the system under test.
  • the average time-consuming percentage of function f1.2() increases from 20% to 35%, an increase of 15 percentage points; that of function f1.3() increases from 30% to 35%, an increase of 5 percentage points. The time-consuming percentages of f1.2() and f1.3() have thus increased significantly, and the cause of the interface call timeout can be considered likely to lie in the calling process of these two functions.
  • the decision-making and control module assigns weights to these two functions, whose average call time-consuming percentage has increased.
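  • The comparison in this example can be reproduced with a short sketch using the percentages given for FIG. 5A and FIG. 5B. The function name `functions_to_weight` is hypothetical, and treating any positive difference as "changed" is an assumption of the sketch.

```python
# Average call percentages from the low-pressure and high-pressure examples.
low = {"f1.1": 20, "f1.2": 20, "f1.3": 30, "f1.4": 30}
high = {"f1.1": 10, "f1.2": 35, "f1.3": 35, "f1.4": 20}


def functions_to_weight(low_pct, high_pct):
    """Keep the functions whose share of the parent's time grew under pressure."""
    increases = {name: high_pct[name] - low_pct[name] for name in low_pct}
    return {name: inc for name, inc in increases.items() if inc > 0}


candidates = functions_to_weight(low, high)
```

  • Only f1.2() and f1.3() remain, with increases of 15 and 5 percentage points, matching the figures in the text.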
  • the step S516, determining the weight of the function to be weighted includes:
  • Step S517 the decision-making and control module determines the variation of the time-consuming percentage average value of the function to be weighted
  • the variation of the average time-consuming percentage of the function to be weighted can be calculated by formula (8),
  • $C_{fn\text{-}high}$ represents the average time-consuming percentage of calling function fn() in the sender's high-pressure mode;
  • $T_{high}$ represents the average time-consuming of calling function f() in the sender's high-pressure mode;
  • $C_{fn\text{-}low}$ represents the average time-consuming percentage of calling function fn() in the sender's low-pressure mode;
  • $T_{low}$ represents the average time-consuming of calling function f() in the sender's low-pressure mode;
  • $P_{fn\text{-}incre}$ represents the time-consuming increase percentage of calling function fn().
  • Step S518, according to the time-consuming increase percentage, determining the weight value of the function call link branch;
  • $W_{fn} = \dfrac{P_{fn\text{-}incre}}{P_{f1\text{-}incre} + P_{f2\text{-}incre} + \cdots + P_{f3\text{-}incre}}$ (11);
  • $W_{fn}$ represents the call link branch weight value of the function fn().
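  • As a worked illustration of formula (11), suppose the increases quoted in the example for f1.2() and f1.3(), 15 and 5 percentage points, are used directly as the $P$ values. This is an assumption, since formula (8) is not reproduced in the text and may scale these values differently.

```latex
W_{f1.2} = \frac{15}{15 + 5} = 0.75, \qquad
W_{f1.3} = \frac{5}{15 + 5} = 0.25
```

  • Under that assumption the branch through f1.2() carries three times the weight of the branch through f1.3(), and the two weights sum to 1.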
  • Step S519 the decision-making and control module sends the time-consuming statistical instruction to the function to be weighted to the Arthas module;
  • Arthas is an open source diagnostic tool for the Java language, which helps developers and testers view the system running status and the real-time running status of the JVM (Java Virtual Machine), generate CPU heat maps, and monitor method execution calls from a global perspective.
  • Arthas supports JDK6+, supports Linux/Mac/Windows, adopts command line interactive mode, and provides rich auto-completion functions, which can facilitate problem location and diagnosis.
  • time-consuming statistics instruction is used to perform calling time-consuming statistics on the internal function of the function to be weighted.
  • steps S505 to S519 are repeated to analyze the call time consumption of the second-level functions.
  • the number of call levels to be traced is configured by the user. Assuming that the number of levels is set to 3, the result may be as shown in FIG. 5C.
  • FIG. 5C is a schematic diagram of the analysis of function call links according to an embodiment of the present application. As shown in FIG. 5C , there are four types of function call links that may cause the interface to time out.
  • f1.3.1.1() is not marked with a weight value, because the difference between the time it consumes in the request sender's high-pressure mode and low-pressure mode is less than 5%; therefore no function call link branch weight value is assigned to the call branch f1.3.1.1().
  • the function f1.3.1() calls only the function f1.3.1.1(). Once that call is ruled out as a cause of the timeout, the problem can only lie in the code of f1.3.1() itself, so f1.3.1() is the end function of link 3.
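  • The way call links and their end functions fall out of the weighted call tree can be sketched as below. The tree uses the function names from this example, but every weight below the first level is hypothetical, the first-level weights assume the quoted increases are used directly, and `enumerate_links` is an illustrative name rather than part of the described apparatus.

```python
def enumerate_links(tree, root):
    """tree maps a function to {weighted callee: branch weight}."""
    results = []

    def walk(node, path, weight):
        children = tree.get(node, {})
        if not children:
            # No weighted callee left: this node is the end function of a link.
            results.append((path, weight))
            return
        for child, w in children.items():
            walk(child, path + [child], weight * w)

    walk(root, [root], 1.0)
    return results


# f1.3.1.1() is omitted because it received no weight.
tree = {
    "f1": {"f1.2": 0.75, "f1.3": 0.25},
    "f1.2": {"f1.2.1": 1.0},
    "f1.2.1": {"f1.2.1.1": 0.5, "f1.2.1.2": 0.5},
    "f1.3": {"f1.3.1": 0.5, "f1.3.3": 0.5},
    "f1.3.3": {"f1.3.3.1": 1.0},
}
found = enumerate_links(tree, "f1")
end_functions = [path[-1] for path, _ in found]
```

  • With this tree the walk yields four links, and the link through f1.3() that stops at f1.3.1() ends there precisely because its only callee carries no weight.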
  • Step S520 the decision and control module determines the link timeout probability according to the link weight of the function call link.
  • the link weight value is calculated according to formula (14),
  • $W_{chain} = W_1 \times W_2 \times \cdots \times W_n$ (14);
  • $W_{chain}$ is the weight value of the link;
  • $W_n$ is the weight value of the n-th function call link branch on the link.
  • the weight of the entire link represents the probability that the link contains the cause of the interface timeout. Sort the link weights in descending order, and output the links that may cause timeouts and the function positions that may cause interface timeouts.
  • the embodiments of the present application provide an apparatus for diagnosing concurrent request timeout.
  • each module included in the apparatus, each unit included in each module, and each subunit included in each unit can be realized by a processor in the device, and can of course also be realized by a logic circuit. In implementation, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
  • FIG. 6 is a schematic diagram of the composition and structure of an apparatus for diagnosing concurrent request timeout according to an embodiment of the present application.
  • the apparatus 600 includes a first determination module 601, a first acquisition module 602, a second determination module 603, a second acquisition module 604, a third determination module 605 and a fourth determination module 606, wherein:
  • the first determination module 601 is configured to determine the timeout rate of the interface function when concurrent requests to the interface function time out, wherein the number of levels of the interface function is greater than 1;
  • the first obtaining module 602 is configured to obtain a first time-consuming set, wherein the first time-consuming set includes the time-consuming for the interface function to call the first-level function;
  • the second determining module 603 is configured to determine, according to the timeout rate and the first time-consuming set, the first function set to be weighted among the first-level functions, wherein the first function set to be weighted consists of the first-level functions whose position in the time-consuming percentage ranking is greater than or equal to the timeout rate;
  • the second acquisition module 604 is configured to acquire a second time-consuming set, wherein the second time-consuming set includes the time-consuming of each function in the first function set to be weighted calling a preset level function;
  • the third determining module 605 is configured to determine the calling link that causes the interface function to time out according to the second time-consuming set;
  • the fourth determination module 606 is configured
  • the third determination module includes a first determination unit, a second determination unit, a third determination unit, and a fourth determination unit, wherein: the first determination unit is configured to determine the time-consuming percentage set according to each second time-consuming in the second time-consuming set and the total time-consuming of the upper-layer function corresponding to that second time-consuming; the second determination unit is configured to determine, according to the time-consuming percentage set, the weight value of each function in the second function set to be weighted, wherein each such function is a function, among the preset-level functions, whose time-consuming percentage change is non-zero; the third determination unit is configured to determine the timed-out function call link branch according to the weight value of each function; the fourth determination unit is configured to determine the call link that causes the interface function to time out according to the weight value of the function call link branch.
  • the fourth determination unit includes a product subunit, a sorting subunit, and a first determination subunit, wherein: the product subunit is configured to multiply the weight values corresponding to the functions in the function call link branch to obtain the corresponding link weight value; the sorting subunit is configured to arrange the link weight values in a specific order; the first determination subunit is configured to determine the function call link branch corresponding to the link weight value that meets the preset condition as the call link that causes the interface function to time out.
  • the first determination module 601 includes a fifth determination unit and a sixth determination unit, wherein: the fifth determination unit is configured to determine the first timeout rate when concurrent requests to the interface function made in the first request mode time out; and the sixth determination unit is configured to determine the second timeout rate when concurrent requests to the interface function made in the second request mode time out.
  • the second determination module 603 includes a seventh determination unit, an eighth determination unit, a ninth determination unit, and a tenth determination unit, wherein: the seventh determination unit is configured to determine the first time-consuming percentage set according to the first timeout rate and the first sub-time-consuming; the eighth determination unit is configured to determine the second time-consuming percentage set according to the second timeout rate and the second sub-time-consuming; the ninth determination unit is configured to determine the percentage change according to the first time-consuming percentage set and the second time-consuming percentage set; the tenth determination unit is configured to determine the first function set to be weighted among the first-level functions according to the percentage change.
  • the seventh determination unit includes a second determination subunit and a third determination subunit, wherein: the second determination subunit is configured to determine a first request timeout function set according to the first timeout rate and the first sub-time-consuming, wherein the first request timeout function set is the request timeout function set of the first-level functions in the first request mode; the third determination subunit is configured to determine, for each function in the first request timeout function set, the average percentage of its call time in the total call time of the upper-layer function, to obtain a first time-consuming percentage set.
  • the eighth determination unit includes a fourth determination subunit and a fifth determination subunit, wherein: the fourth determination subunit is configured to determine a second request timeout function set according to the second timeout rate and the second sub-time-consuming, wherein the second sub-time-consuming is the time consumed by the first-level functions called by the interface function in the second request mode, and the second request timeout function set is the request timeout function set of the first-level functions in the second request mode; the fifth determination subunit is configured to determine, for each function in the second request timeout function set, the average percentage of its call time in the total call time of the upper-layer function, to obtain a second time-consuming percentage set.
  • the first determination unit includes a sixth determination subunit, a seventh determination subunit, an eighth determination subunit, and a ninth determination subunit, wherein: the sixth determination subunit is configured to determine a third request timeout function set according to the third sub-time-consuming and the total time-consuming of the first functions to be weighted, wherein the third sub-time-consuming is the time consumed in the first request mode by the first function set to be weighted calling the preset-level functions, and the third request timeout function set is the request timeout function set of the next-level functions called by the first functions to be weighted in the first request mode; the seventh determination subunit is configured to determine, for each function in the third request timeout function set, the average percentage of its call time in the total call time of the first-layer function, to obtain a third time-consuming percentage set; the eighth determination subunit is configured to determine a fourth request timeout function set according to the fourth sub-time-consuming and the total time-consuming of the first functions to be weighted, wherein the fourth sub-time-consuming is the time consumed in the second request mode by the first function set to be weighted calling the preset-level functions, and the fourth request timeout function set is the request timeout function set of the next-level functions of the first functions to be weighted in the second request mode; the ninth determination subunit is configured to determine, for each function in the fourth request timeout function set, the average percentage of its call time in the total call time of the first-layer function, to obtain a fourth time-consuming percentage set.
  • the second determination unit includes a tenth determination subunit, an eleventh determination subunit, and a twelfth determination subunit, wherein: the tenth determination subunit is configured to compare the third time-consuming percentage set with the fourth time-consuming percentage set to determine the percentage change of each function in the first sub-function set; the eleventh determination subunit is configured to determine the weight value of each function in the first sub-function set according to the percentage change, wherein each function in the first sub-function set is a function, among the next-level functions of the first functions to be weighted, whose percentage change is non-zero; the twelfth determination subunit is configured to determine the timed-out function call link branch according to the first function set to be weighted and the first sub-function set.
  • the second determination unit further includes a thirteenth determination subunit, a fourteenth determination subunit, and a fifteenth determination subunit, wherein: the thirteenth determination subunit is configured to determine the percentage change of each function in the first sub-function set according to the fifth time-consuming percentage set and the sixth time-consuming percentage set; the fourteenth determination subunit is configured to determine the weight value of each function in the second sub-function set according to the percentage change, wherein each function in the second sub-function set is a function, among the next-level functions called by the functions in the first sub-function set, whose percentage change is non-zero; the fifteenth determination subunit is configured to determine the timed-out function call link branch according to the first function set to be weighted, the first sub-function set, and the second sub-function set.
  • the second determination unit further includes a sixteenth determination subunit, a seventeenth determination subunit, an eighteenth determination subunit, and a nineteenth determination subunit, wherein: the sixteenth determination subunit is configured to determine a fifth request timeout function set according to the third sub-time-consuming and the total time-consuming of the upper-layer function corresponding to the third sub-time-consuming, wherein the fifth request timeout function set is the request timeout function set of the next-level functions called by the functions in the first sub-function set in the first request mode; the seventeenth determination subunit is configured to determine, for each function in the fifth request timeout function set, the average percentage of its call time in the total call time of the second-layer function, to obtain a fifth time-consuming percentage set; the eighteenth determination subunit is configured to determine a sixth request timeout function set according to the fourth sub-time-consuming, wherein the sixth request timeout function set is the request timeout function set of the next-level functions called by the functions in the first sub-function set in the second request mode; the nineteenth determination subunit is configured to determine, for each function in the sixth request timeout function set, the average percentage of its call time in the total call time of the second-layer function, to obtain a sixth time-consuming percentage set.
  • If the above-mentioned method for diagnosing concurrent request timeout is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solutions of the embodiments of the present application may be embodied, in essence or in the part that contributes to the related art, in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disc, or any other medium that can store program code.
  • the embodiments of the present application are not limited to any specific combination of hardware and software.
  • an embodiment of the present application provides a computer device, including a memory and a processor.
  • the memory stores a computer program that can be executed on the processor, and the processor implements the steps in the above method when executing the program.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the steps in the above method. It should be pointed out here that the descriptions of the above storage medium and device embodiments are similar to the descriptions of the above method embodiments, and have similar beneficial effects to the method embodiments. For technical details not disclosed in the embodiments of the storage medium and device of the present application, please refer to the description of the method embodiments of the present application to understand.
  • FIG. 7 is a schematic diagram of a hardware entity of the computer device in the embodiment of the application.
  • the hardware entity of the computer device 700 includes: a processor 701 , a communication interface 702 and a memory 703 , wherein
  • the processor 701 generally controls the overall operation of the computer device 700 .
  • the communication interface 702 enables the computer device to communicate with other terminals or servers through a network.
  • the memory 703 is configured to store instructions and applications executable by the processor 701, and may also cache data to be processed or already processed by the processor 701 and the modules in the computer device 700 (e.g., image data, audio data, voice communication data, and video communication data); it can be implemented as flash memory (FLASH) or random access memory (RAM).
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are illustrative.
  • the division of the units is a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Telephonic Communication Services (AREA)
  • Debugging And Monitoring (AREA)

Abstract

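The present application discloses a method, apparatus, device, and storage medium for diagnosing concurrent request timeout. The method includes: when concurrent requests to an interface function time out, determining the timeout rate of the interface function, the number of levels of the interface function being greater than 1; obtaining a first time-consuming set, wherein the first time-consuming set includes the time consumed by the interface function calling the first-level functions; determining, according to the timeout rate and the first time-consuming set, a first function set to be weighted among the first-level functions, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage rank is greater than or equal to the timeout rate; obtaining a second time-consuming set, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels; determining, according to the second time-consuming set, the call link that causes the interface function to time out; and determining the end function of the call link as the function that causes the interface function to time out.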

Description

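Method, apparatus, device, and storage medium for diagnosing concurrent request timeout
Cross-Reference to Related Applications
The present application is based on, and claims priority to, Chinese patent application No. 202011379546.4 filed on November 30, 2020, the entire contents of which are incorporated herein by reference.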
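Technical Field
The embodiments of the present application relate to, but are not limited to, information technology in financial technology (Fintech), and in particular to a method, apparatus, device, and storage medium for diagnosing concurrent request timeout.
Background
With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually shifting to financial technology (Fintech). However, because of the security and real-time requirements of the financial industry, Fintech also places higher demands on technology. In the Fintech field, the related art offers two methods for diagnosing concurrent request timeout. In the first method, a monitoring platform collects indicators related to the central processing unit (CPU), the Java Virtual Machine (JVM), the number of threads, the middleware backlog, and so on; a log platform collects error logs and development key performance indicators; the Arthas platform collects time-consuming methods; and a big data platform comprehensively analyzes the monitoring indicators, key performance indicators, time-consuming methods, and other data to derive a scenario route, looks up a result set in a database according to that route, and obtains the performance bottleneck analysis result with the highest score in the result set together with the corresponding suggested solution. In the second method, the logs generated during the interface call are analyzed to obtain the ratio of the time consumed by each function call module in the interface function to the duration of the call, which identifies the function module that takes the most time and thereby locates the problem that causes the interface call to time out. However, both methods can trace only the call link of first-level methods and often cannot pinpoint deeper causes, which makes the diagnosis inaccurate.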
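Summary
In view of this, the embodiments of the present application provide a method, apparatus, device, and storage medium for diagnosing concurrent request timeout.
The technical solutions of the embodiments of the present application are implemented as follows:
In one aspect, an embodiment of the present application provides a method for diagnosing concurrent request timeout, the method including: when concurrent requests to an interface function time out, determining the timeout rate of the interface function, the number of levels of the interface function being greater than 1; obtaining a first time-consuming set, wherein the first time-consuming set includes the time consumed by the interface function calling the first-level functions; determining, according to the timeout rate and the first time-consuming set, a first function set to be weighted among the first-level functions, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage rank is greater than or equal to the timeout rate; obtaining a second time-consuming set, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels; determining, according to the second time-consuming set, the call link that causes the interface function to time out; and determining the end function of the call link as the function that causes the interface function to time out.
In another aspect, an embodiment of the present application provides an apparatus for diagnosing concurrent request timeout, the apparatus including: a first determination module, configured to determine the timeout rate of the interface function when concurrent requests to the interface function time out, the number of levels of the interface function being greater than 1; a first acquisition module, configured to obtain a first time-consuming set, wherein the first time-consuming set includes the time consumed by the interface function calling the first-level functions; a second determination module, configured to determine, according to the timeout rate and the first time-consuming set, a first function set to be weighted among the first-level functions, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage rank is greater than or equal to the timeout rate; a second acquisition module, configured to obtain a second time-consuming set, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels; a third determination module, configured to determine, according to the second time-consuming set, the call link that causes the interface function to time out; and a fourth determination module, configured to determine the end function of the call link as the function that causes the interface function to time out.
In yet another aspect, an embodiment of the present application provides a computer device including a memory and a processor, wherein the memory stores a computer program executable on the processor, and the processor implements the steps of the above method when executing the program.
In still another aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the above method.
In the method for diagnosing concurrent request timeout provided by the embodiments of the present application, when concurrent requests to an interface function time out, the timeout rate of the interface function is determined, the time consumed by the interface function calling the first-level functions is obtained, and the first-level functions whose time-consuming percentage rank is greater than or equal to the timeout rate are determined; a second time-consuming set is obtained, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels; and the call link that causes the interface function to time out is determined according to the second time-consuming set. In this way, when an interface stress test times out, the time consumed by functions of the preset levels can be collected in addition to that of the first-level functions. From the collected timing information, the function position in the interface's call link that is most likely to cause the timeout can be located accurately, and all function call links that may cause the timeout can be reported. This avoids the problem in the related art that only the call link of first-level methods in the interface function can be measured, so that function positions in deeper calls cannot be located.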
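Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a method for diagnosing concurrent request timeout according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a method for diagnosing concurrent request timeout according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a method for diagnosing concurrent request timeout according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the composition and structure of an apparatus for diagnosing concurrent request timeout according to an embodiment of the present application;
FIG. 5A is a schematic diagram of the average function call time-consuming percentages in the low-pressure mode of the method according to an embodiment of the present application;
FIG. 5B is a schematic diagram of the average function call time-consuming percentages in the high-pressure mode of the method according to an embodiment of the present application;
FIG. 5C is a schematic diagram of function call link analysis in the method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of the composition and structure of an apparatus for diagnosing concurrent request timeout according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a hardware entity of a computer device according to an embodiment of the present application.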
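Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described in detail below with reference to the drawings and embodiments. The described embodiments should not be regarded as limiting the present application; all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
In the following description, "some embodiments" describes a subset of all possible embodiments. It is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and that they may be combined with one another where no conflict arises.
Where descriptions such as "first/second" appear in the application documents, the following explanation applies. In the following description, the terms "first/second/third" are used only to distinguish similar objects and do not represent a particular ordering of those objects. It is understood that, where permitted, "first/second/third" may be interchanged in specific order or sequence, so that the embodiments of the present application described here can be practiced in an order other than that illustrated or described here.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by a person skilled in the technical field of the present application. The terms used herein serve only to describe the embodiments of the present application and are not intended to limit the present application.
The technical solutions of the present application are described in detail below with reference to the drawings and embodiments.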
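An embodiment of the present application provides a method for diagnosing concurrent request timeout. FIG. 1 is a schematic flowchart of the method according to an embodiment of the present application. As shown in FIG. 1, the method includes:
Step S101: when concurrent requests to an interface function time out, determine the timeout rate of the interface function, the number of levels of the interface function being greater than 1.
Here, the interface function has an interface implementation class, and the implementation class contains sub-functions, so the number of levels of the interface function is greater than 1.
Here, the timeout rate may be determined by the request sending module. In implementation, the request sending module can calculate the timeout rate of the requests. Each HTTP request has a user-defined timeout, and a request that exceeds it is considered timed out. The request sending module then calculates the timeout rate from the number of requests that time out per second and the total number of requests sent per second, as recorded in the request headers of the HTTP requests.
In some embodiments, the timeout rate is obtained by dividing the number of requests that time out per second by the total number of requests sent per second, that is, timeout rate = number of requests timed out per second / total number of requests sent per second.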
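The timeout rate described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `timeout_rate`, the millisecond durations, and the 800 ms threshold are assumptions made for the example. Only the rule itself (a request that exceeds its user-defined timeout counts as timed out, and the rate is the number of timed-out requests divided by the number of requests sent in the same window) comes from the text.

```python
def timeout_rate(durations_ms, timeout_ms):
    """Fraction of requests in one sampling window that exceeded the timeout."""
    if not durations_ms:
        raise ValueError("at least one request is required")
    # A request counts as timed out when it runs longer than the custom timeout.
    timed_out = sum(1 for d in durations_ms if d > timeout_ms)
    return timed_out / len(durations_ms)


# Hypothetical one-second window: four requests, 800 ms user-defined timeout.
rate = timeout_rate([100, 250, 900, 1200], timeout_ms=800)
print(f"timeout rate: {rate:.0%}")
```

Two of the four hypothetical requests exceed the threshold, so the sketch reports a rate of one half.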
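Step S102: obtain a first time-consuming set, wherein the first time-consuming set includes the time consumed by the interface function calling the first-level functions.
In some embodiments, the request sending module records the time consumed by the interface function calling the first-level functions.
For example, FIG. 5A is a schematic diagram of the proportions of function call time in an embodiment of the present application. As shown in FIG. 5A, the interface function f1 calls four functions, f1.1(), f1.2(), f1.3(), and f1.4(); the time consumed by the first-level functions is the time f1 spends calling these four functions.
Step S103: determine, according to the timeout rate and the first time-consuming set, a first function set to be weighted among the first-level functions, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage rank is greater than or equal to the timeout rate.
In some embodiments, when concurrent requests are sent to the interface, the request timeout rate of the interface function may be 0. When the interface timeout rate is 0, no filtering is performed, and the first function set to be weighted includes all of the first-level functions. For example, as shown in FIG. 5A, the function f1 internally calls four functions, f1.1(), f1.2(), f1.3(), and f1.4(), whose call times account on average for 20%, 20%, 30%, and 30% of the total time of the upper-layer function, respectively. The first function set to be weighted then includes f1.1(), f1.2(), f1.3(), and f1.4().
Step S104: obtain a second time-consuming set, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels.
For example, suppose the first function set to be weighted includes f1.2() and f1.3(). FIG. 5C is a schematic diagram of function call link analysis in the method according to an embodiment of the present application. As shown in FIG. 5C, when the preset number of levels is 3, the second time-consuming set includes the time consumed by f1.2.n(), f1.3.n(), f1.2.1.n(), f1.3.1.n(), and f1.3.3.n().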
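Step S105: determine, according to the second time-consuming set, the call link that causes the interface function to time out.
In implementation, the function call links that may cause the interface to time out can be determined from the second time-consuming set.
For example, as shown in FIG. 5C, there are four function call links that may cause the interface to time out:
Link 1: f1()->f1.2()->f1.2.1()->f1.2.1.1();
Link 2: f1()->f1.2()->f1.2.1()->f1.2.1.2();
Link 3: f1()->f1.3()->f1.3.1();
Link 4: f1()->f1.3()->f1.3.3()->f1.3.3.1().
Step S106: determine the end function of the call link as the function that causes the interface function to time out.
For example, as shown in FIG. 5C, when the timed-out link is determined to be link 1, f1()->f1.2()->f1.2.1()->f1.2.1.1(), the function f1.2.1.1() is the function that causes the call to the interface function to time out.
In the embodiments of the present application, a second time-consuming set is obtained, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels, and the call link that causes the interface function to time out is determined according to the second time-consuming set. In this way, when an interface stress test times out, the time consumed by functions of the preset levels can be collected in addition to that of the first-level functions. From the collected timing information, the function position in the interface's call link that is most likely to cause the timeout can be located accurately, and all function call links that may cause the timeout can be reported. This avoids the problem in the related art that only the call link of first-level methods in the interface function can be measured, so that function positions in deeper calls cannot be located.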
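An embodiment of the present application provides a method for diagnosing concurrent request timeout. FIG. 2 is a schematic flowchart of the method according to an embodiment of the present application. As shown in FIG. 2, the method includes:
Step S201: when concurrent requests to an interface function time out, determine the timeout rate of the interface function, the number of levels of the interface function being greater than 1.
Step S202: obtain a first time-consuming set, wherein the first time-consuming set includes the time consumed by the interface function calling the first-level functions.
Step S203: determine, according to the timeout rate and the first time-consuming set, a first function set to be weighted among the first-level functions, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage rank is greater than or equal to the timeout rate.
Step S204: obtain a second time-consuming set, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels.
Step S205: determine, according to the second time-consuming set, the call link that causes the interface function to time out.
Step S206: determine the time-consuming percentage set according to each second time-consuming in the second time-consuming set and the total time-consuming of the upper-layer function corresponding to that second time-consuming.
For example, as shown in FIG. 5C, if the call from f1.2() to f1.2.1() takes 80 ms and f1.2() takes 100 ms in total, the time-consuming percentage of f1.2.1() is 80%.
Step S207: determine, according to the time-consuming percentage set, the weight value of each function in the second function set to be weighted, wherein each such function is a function, among the preset-level functions, whose time-consuming percentage change is non-zero.
In implementation, when the time-consuming percentage sets under different concurrent request states are known, the change in a function's time-consuming percentage between those states can be determined, and the weight value of the function can be determined from that change.
Step S208: determine the timed-out function call link branch according to the weight value of each function.
In implementation, the weight value of each link can be determined by multiplying the weight values of the functions on the function call link; the probability that each link times out is determined from the size of its weight value; and the link with the highest probability is determined as the timed-out function call link branch.
Step S209: determine the call link that causes the interface function to time out according to the weight value of the function call link branch.
In implementation, the weight value of each link can be determined by multiplying the weight values of the functions on the function call link; the probability that each link times out is determined from the size of its weight value; and the link with the highest probability is determined as the timed-out function call link branch.
Step S210: determine the end function of the call link as the function that causes the interface function to time out.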
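In some embodiments, step S209, determining the call link that causes the interface function to time out according to the weight value of the function call link branch, includes:
Step S2091: multiply the weight values corresponding to the functions in the function call link branch to obtain the corresponding link weight value.
Step S2092: arrange the link weight values in a specific order.
Here, the specific order may be descending or ascending.
Step S2093: determine the function call link branch corresponding to the link weight value that meets the preset condition as the call link that causes the interface function to time out.
Here, the preset condition may be that the link weight value ranks first after the link weight values are sorted in descending order.
For example, as shown in FIG. 5C, there are four function call links that may cause the interface to time out:
Link 1: f1()->f1.2()->f1.2.1()->f1.2.1.1();
Link 2: f1()->f1.2()->f1.2.1()->f1.2.1.2();
Link 3: f1()->f1.3()->f1.3.1();
Link 4: f1()->f1.3()->f1.3.3()->f1.3.3.1().
The link weights of link 1 to link 4 are calculated as follows: link 1 has weight 0.613 × 1 × 0.8 = 0.4904; link 2 has weight 0.613 × 1 × 0.2 = 0.1226; link 3 has weight 0.387 × 0.64 = 0.24768; link 4 has weight 0.387 × 0.36 × 1 = 0.13932. The link weight values can then be sorted in descending order, and link 1 is determined as the call link of the timed-out function.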
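Steps S2091 to S2093 can be sketched in a few lines. The sketch below is illustrative only: the function name `rank_links` and the dictionary layout are assumptions, while the branch weights are the ones quoted in the example above.

```python
from math import prod


def rank_links(links):
    """Multiply the branch weights on each link and sort links by weight, largest first."""
    weights = {name: prod(branch_weights) for name, branch_weights in links.items()}
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


# Branch weights along each link, taken from the example for FIG. 5C.
links = {
    "link 1": [0.613, 1, 0.8],
    "link 2": [0.613, 1, 0.2],
    "link 3": [0.387, 0.64],
    "link 4": [0.387, 0.36, 1],
}
ranked = rank_links(links)
for name, weight in ranked:
    print(f"{name}: {weight:.4f}")
```

With the preset condition "ranks first in descending order", the first entry of `ranked` is the link reported as causing the timeout, which is link 1 for these weights.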
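In the embodiments of the present application, the weight value of each function in the second function set to be weighted is determined according to the time-consuming percentage set. In this way, the weight values of the function call link branches can be determined by analyzing real-time timing data, without manual intervention, which makes the result more accurate. This avoids the problem in the related art that heavy manual intervention leads to improperly set manual parameter values and hence to less accurate results.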
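An embodiment of the present application provides a method for diagnosing concurrent request timeout. The concurrent requests may be made in multiple request modes, for example a first request mode and a second request mode. In implementation, the first request mode may be a low-pressure request mode and the second request mode a high-pressure request mode. This embodiment is described taking the first and second request modes as an example. In this case, the first time-consuming set includes a first sub-time-consuming and a second sub-time-consuming, wherein the first sub-time-consuming is the time consumed by the interface function calling the first-level functions in the first request mode, and the second sub-time-consuming is the time consumed by the interface function calling the first-level functions in the second request mode.
FIG. 3 is a schematic flowchart of the method for diagnosing concurrent request timeout according to an embodiment of the present application. As shown in FIG. 3, the method includes:
Step S301: when concurrent requests to the interface function made in the first request mode time out, determine a first timeout rate, the number of levels of the interface function being greater than 1.
Here, the first request mode is the low-pressure mode, in which the request sending module sends requests with a small number of concurrent threads.
Step S302: when concurrent requests to the interface function made in the second request mode time out, determine a second timeout rate.
Here, the second request mode is the high-pressure mode, in which the request sending module sends requests with a large number of concurrent threads. In some embodiments, the number of concurrent request threads is considered large when it is greater than or equal to $10^2$.
Step S303: obtain a first time-consuming set, wherein the first time-consuming set includes the time consumed by the interface function calling the first-level functions.
Step S304: determine a first time-consuming percentage set according to the first timeout rate and the first sub-time-consuming.
Step S305: determine a second time-consuming percentage set according to the second timeout rate and the second sub-time-consuming.
Step S306: determine the percentage change according to the first time-consuming percentage set and the second time-consuming percentage set.
For example, comparing FIG. 5A with FIG. 5B, the average time-consuming percentage of f1.2() increases from 20% to 35%, a rise of 15 percentage points, and that of f1.3() increases from 30% to 35%, a rise of 5 percentage points. The time-consuming percentages of f1.2() and f1.3() have clearly increased, so the cause of the interface call timeout may lie in the calls to these two functions.
Step S307: determine the first function set to be weighted among the first-level functions according to the percentage change.
Here, the first function set to be weighted consists of the first-level functions whose time-consuming percentage rank is greater than or equal to the timeout rate. For example, comparing FIG. 5A with FIG. 5B, the first function set to be weighted includes two functions: f1.2() and f1.3().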
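The comparison in steps S306 and S307 can be illustrated with the figures quoted for FIG. 5A and FIG. 5B. The sketch below is an assumption-laden illustration rather than the claimed procedure: the names `percentage_changes` and `functions_to_weight` are hypothetical, percentages are kept as whole percentage points, and only functions whose share increased are selected, following the worked example.

```python
def percentage_changes(low, high):
    """Change, in percentage points, of each function's share between the two modes."""
    return {name: high[name] - low[name] for name in low}


def functions_to_weight(low, high):
    """Functions whose share of the caller's time grew under high pressure."""
    return [name for name, delta in percentage_changes(low, high).items() if delta > 0]


# Average shares from FIG. 5A (low pressure) and FIG. 5B (high pressure).
low_pressure = {"f1.1()": 20, "f1.2()": 20, "f1.3()": 30, "f1.4()": 30}
high_pressure = {"f1.1()": 10, "f1.2()": 35, "f1.3()": 35, "f1.4()": 20}

changes = percentage_changes(low_pressure, high_pressure)
selected = functions_to_weight(low_pressure, high_pressure)
print(changes, selected)
```

For these inputs the selected functions are f1.2() and f1.3(), matching the first function set to be weighted in the example.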
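Step S308: obtain a second time-consuming set, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels.
Step S309: determine, according to the second time-consuming set, the call link that causes the interface function to time out.
Step S310: determine the end function of the call link as the function that causes the interface function to time out.
In some embodiments, step S304, determining the first time-consuming percentage set according to the first timeout rate and the first sub-time-consuming, includes:
Step S3041: determine a first request timeout function set according to the first timeout rate and the first sub-time-consuming, wherein the first request timeout function set is the request timeout function set of the first-level functions in the first request mode.
In some embodiments, step S3041 includes: sorting the time-consuming percentages corresponding to the individual times in the first sub-time-consuming; determining as request timeout functions the functions whose time-consuming percentage has a rank less than or equal to a preset rank, wherein the preset rank is the product of the first timeout rate and the total number of first-level functions called by the interface function; and determining the first request timeout function set from the request timeout functions.
For example, suppose the first timeout rate is 25% and the function f1 internally calls four functions, f1.1(), f1.2(), f1.3(), and f1.4(). The first sub-time-consuming is the time f1 spends calling these four functions, and the corresponding time-consuming percentages are 20%, 20%, 20%, and 40%. Sorted in descending order, they are 40%, 20%, 20%, and 20%. The function f1.4(), whose percentage ranks first, is determined as the request timeout function, so the first request timeout function set includes f1.4().
Step S3042: determine, for each function in the first request timeout function set, the average percentage of its call time in the total call time of the upper-layer function, to obtain the first time-consuming percentage set.
Step S305, determining the second time-consuming percentage set according to the second timeout rate and the second sub-time-consuming, includes:
Step S3051: determine a second request timeout function set according to the second timeout rate and the second sub-time-consuming, wherein the second sub-time-consuming is the time consumed by the first-level functions called by the interface function in the second request mode, and the second request timeout function set is the request timeout function set of the first-level functions in the second request mode.
In some embodiments, step S3051 includes: sorting the time-consuming percentages corresponding to the individual times in the second sub-time-consuming; determining as request timeout functions the functions whose time-consuming percentage has a rank less than or equal to a preset rank, wherein the preset rank is the product of the second timeout rate and the total number of first-level functions called by the interface function; and determining the second request timeout function set from the request timeout functions.
For example, suppose the second timeout rate is 50% and the function f1 internally calls four functions, f1.1(), f1.2(), f1.3(), and f1.4(). The second sub-time-consuming is the time f1 spends calling these four functions, and the corresponding time-consuming percentages are 10%, 20%, 30%, and 40%. Sorted in descending order, they are 40%, 30%, 20%, and 10%. The functions f1.3() and f1.4(), whose percentages rank second or better, are determined as request timeout functions, so the second request timeout function set includes f1.3() and f1.4().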
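The ranking rule of steps S3041 and S3051 can be sketched as below. The function name `select_timeout_functions` is hypothetical, and two details are assumptions that the text does not settle: a fractional preset rank is rounded down, and a timeout rate of 0 returns every function unfiltered (as described for the zero-timeout case under step S103).

```python
import math


def select_timeout_functions(percentages, timeout_rate):
    """Pick the functions whose percentage rank is within timeout_rate * function count."""
    if timeout_rate == 0:
        # No filtering when nothing timed out.
        return list(percentages)
    # Preset rank = timeout rate x number of first-level functions (rounded down: assumption).
    preset_rank = math.floor(timeout_rate * len(percentages) + 1e-9)
    ranked = sorted(percentages, key=percentages.get, reverse=True)
    return ranked[:preset_rank]


# The two worked examples from the text (percentages in whole points).
first = select_timeout_functions(
    {"f1.1()": 20, "f1.2()": 20, "f1.3()": 20, "f1.4()": 40}, timeout_rate=0.25
)
second = select_timeout_functions(
    {"f1.1()": 10, "f1.2()": 20, "f1.3()": 30, "f1.4()": 40}, timeout_rate=0.50
)
print(first, second)
```

The first call keeps only f1.4() (preset rank 1), and the second keeps f1.4() and f1.3() (preset rank 2), in line with the two examples.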
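Step S3052: determine, for each function in the second request timeout function set, the average percentage of its call time in the total call time of the upper-layer function, to obtain the second time-consuming percentage set.
Example 1: the first request mode is the low-pressure mode with 10 threads. As shown in FIG. 5A, when the first timeout rate is 0, the function f1 internally calls four functions, f1.1(), f1.2(), f1.3(), and f1.4(), whose call times account on average for 20%, 20%, 30%, and 30% of the total time of the upper-layer function, respectively. The first time-consuming percentage set is [M1, Ua], where M1 denotes 10 threads (low pressure) and Ua denotes the array [20%, 20%, 30%, 30%].
Example 2: the second request mode is the high-pressure mode with 100 threads, and the second timeout rate is 50%. As shown in FIG. 5B, the function f1 internally calls four functions, f1.1(), f1.2(), f1.3(), and f1.4(), whose call times account on average for 10%, 35%, 35%, and 20% of the total time of the upper-layer function, respectively. The second time-consuming percentage set is then [M2, Ub], where M2 denotes 100 threads (high pressure) and Ub denotes the array [35%, 35%].
In the embodiments of the present application, a first timeout rate is determined when concurrent requests to the interface function made in the first request mode time out; a second timeout rate is determined when concurrent requests to the interface function made in the second request mode time out; a first time-consuming percentage set is determined according to the first timeout rate and the first sub-time-consuming; a second time-consuming percentage set is determined according to the second timeout rate and the second sub-time-consuming; the percentage change is determined according to the first and second time-consuming percentage sets; and the first function set to be weighted among the first-level functions is determined according to the percentage change. In this way, the functions in which an interface request timeout may occur can be determined from the percentage change of each function between request modes, without relying on many external systems such as specific indicators and log data. The overall design is therefore simple and general, and it remains applicable to systems whose log data is insufficient.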
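An embodiment of the present application provides a method for diagnosing concurrent request timeout, in which: the second time-consuming set includes a third sub-time-consuming and a fourth sub-time-consuming, the third sub-time-consuming being the time consumed in the first request mode by the first function set to be weighted calling the preset-level functions, and the fourth sub-time-consuming being the time consumed in the second request mode by the first function set to be weighted calling the preset-level functions; the time-consuming percentage set includes a third time-consuming percentage set and a fourth time-consuming percentage set, the third time-consuming percentage set being, for each function in the third request timeout function set, the average percentage of its call time in the total call time of the first-layer function, and the fourth time-consuming percentage set being, for each function in the fourth request timeout function set, the average percentage of its call time in the total call time of the first-layer function; and the second function set to be weighted includes a first sub-function set and a second sub-function set. The method includes:
Step S401: when concurrent requests to an interface function time out, determine the timeout rate of the interface function, the number of levels of the interface function being greater than 1.
Step S402: obtain a first time-consuming set, wherein the first time-consuming set includes the time consumed by the interface function calling the first-level functions.
Step S403: determine, according to the timeout rate and the first time-consuming set, a first function set to be weighted among the first-level functions, wherein the first function set to be weighted consists of the first-level functions whose time-consuming percentage rank is greater than or equal to the timeout rate.
Step S404: obtain a second time-consuming set, wherein the second time-consuming set includes the time consumed by each function in the first function set to be weighted calling functions of the preset levels.
Here, the second time-consuming set includes the third sub-time-consuming and the fourth sub-time-consuming.
Step S405: determine, according to the second time-consuming set, the call link that causes the interface function to time out.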
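Step S406: determine a third request timeout function set according to the third sub-time-consuming and the total time-consuming of the first functions to be weighted, wherein the third sub-time-consuming is the time consumed in the first request mode by the first function set to be weighted calling the preset-level functions, and the third request timeout function set is the request timeout function set of the next-level functions called by the first functions to be weighted in the first request mode.
For example, under low-pressure concurrent requests, as shown in FIG. 5C, the third sub-time-consuming includes the timing information of functions such as f1.2.1(), f1.3.1(), and f1.3.3() under low pressure. The total time-consuming of the first functions to be weighted includes the timing information of f1.2() and f1.3() under low pressure. From the timing information in the third sub-time-consuming and the timing information in the total time-consuming of the first functions to be weighted under low pressure, the request timeout function set of the next-level functions called by the first functions to be weighted in the low-pressure concurrent request mode can be determined, that is, the third request timeout function set.
Step S407: determine, for each function in the third request timeout function set, the average percentage of its call time in the total call time of the first-layer function, to obtain the third time-consuming percentage set.
For example, as shown in FIG. 5C, under low pressure, the ratio of the time consumed by f1.2.1() to that of f1.2() gives the average percentage of the total time of the first-layer function f1.2() that is taken by f1.2.1().
Step S408: determine a fourth request timeout function set according to the fourth sub-time-consuming and the total time-consuming of the first functions to be weighted, wherein the fourth sub-time-consuming is the time consumed in the second request mode by the first function set to be weighted calling the preset-level functions, and the fourth request timeout function set is the request timeout function set of the next-level functions of the first functions to be weighted in the second request mode.
For example, under high-pressure concurrent requests, as shown in FIG. 5C, the fourth sub-time-consuming is the timing information of functions such as f1.2.1(), f1.3.1(), and f1.3.3() under high pressure. The total time-consuming of the first functions to be weighted includes the timing information of f1.2() and f1.3() under high pressure. From the timing information in the fourth sub-time-consuming and the timing information in the total time-consuming of the first functions to be weighted under high pressure, the request timeout function set of the next-level functions called by the first functions to be weighted in the high-pressure concurrent request mode can be determined, that is, the fourth request timeout function set.
Step S409: determine, for each function in the fourth request timeout function set, the average percentage of its call time in the total call time of the first-layer function, to obtain the fourth time-consuming percentage set.
For example, under high pressure, the ratio of the time consumed by f1.2.1() to that of f1.2() gives the average percentage of the total time of the first-layer function f1.2() that is taken by f1.2.1().
Step S410: compare the third time-consuming percentage set with the fourth time-consuming percentage set, and determine the percentage change of each function in the first sub-function set, wherein the number of preset levels is 2.
For example, when the number of preset levels is 2, as shown in FIG. 5C, the third and fourth time-consuming percentage sets are compared to obtain the percentage change of the functions in the function layer where f1.2.1() is located.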
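In some embodiments, the percentage change of a function may be determined by formula (1):
$P_{f_n\text{-incre}} = \dfrac{C_{f_n\text{-high}} \times T_{high} - C_{f_n\text{-low}} \times T_{low}}{C_{f_n\text{-low}} \times T_{low}}$   (1)
where $C_{f_n\text{-high}}$ is the average time-consuming percentage of the function call $f_n()$ in the high-pressure mode; $T_{high}$ is the average time consumed by the function $f()$ in the sender's high-pressure mode; $C_{f_n\text{-low}}$ is the time-consuming proportion of the function call $f_n()$ in the sender's low-pressure mode; $T_{low}$ is the average time consumed by the function $f()$ in the sender's low-pressure mode; and $P_{f_n\text{-incre}}$ is the percentage by which the average time consumed by the function call $f_n()$ in the request sender's high-pressure mode increases relative to the low-pressure mode.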
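Step S411: determine the weight value of each function in the first sub-function set according to the percentage change, wherein each function in the first sub-function set is a function, among the next-level functions of the first functions to be weighted, whose percentage change is non-zero.
In some embodiments, determining the weight value of each function in the first sub-function set according to the percentage change includes:
calculating the weight value of a function by formula (2):
$W_{f_n} = \dfrac{P_{f_n\text{-incre}}}{P_{f_1\text{-incre}} + P_{f_2\text{-incre}} + \dots + P_{f_m\text{-incre}}}$   (2)
where $W_{f_n}$ is the call link branch weight value of the function $f_n()$, and the denominator sums the increases of the $m$ functions being weighted at the same level.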
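Step S412: determining the timed-out function call link branch according to the weight value of each function includes: determining the timed-out function call link branch according to the first function set to be weighted and the first sub-function set, wherein each function is a function, among the preset-level functions, whose time-consuming percentage change is non-zero.
For example, as shown in FIG. 5C, the first sub-function set may be the functions in the function layer where f1.2.1() is located. The timed-out function call link branches may be:
Link 1: f1()->f1.2()->f1.2.1();
Link 2: f1()->f1.2()->f1.2.1();
Link 3: f1()->f1.3()->f1.3.1();
Link 4: f1()->f1.3()->f1.3.3().
Step S413: determine the call link that causes the interface function to time out according to the weight value of the function call link branch.
Step S414: determine the end function of the call link as the function that causes the interface function to time out.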
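In some embodiments, the method further includes:
Step S415: determine a fifth request timeout function set according to the third sub-time-consuming and the total time-consuming of the upper-layer function corresponding to the third sub-time-consuming, wherein the fifth request timeout function set is the request timeout function set of the next-level functions called by the functions in the first sub-function set in the first request mode, and the number of preset levels is 3.
For example, when the number of preset levels is 3, as shown in FIG. 5C, the third sub-time-consuming includes the time consumed, in the low-pressure concurrent request mode, by the functions in the layer where f1.2.1.1() is located, and the total time-consuming of the upper-layer functions is the total time of the functions in the layer where f1.2.1() is located.
Step S416: determine, for each function in the fifth request timeout function set, the average percentage of its call time in the total call time of the second-layer function, to obtain the fifth time-consuming percentage set.
For example, in the low-pressure concurrent request mode, as shown in FIG. 5C, the fifth time-consuming percentage set includes the time-consuming percentages of f1.2.1.1(), f1.2.1.2(), and f1.3.3.1().
Step S417: determine a sixth request timeout function set according to the fourth sub-time-consuming, wherein the sixth request timeout function set is the request timeout function set of the next-level functions called by the functions in the first sub-function set in the second request mode.
For example, when the number of preset levels is 3, as shown in FIG. 5C, the fourth sub-time-consuming includes the time consumed, in the high-pressure concurrent request mode, by the functions in the layer where f1.2.1.1() is located, and the total time-consuming of the upper-layer functions is the total time of the functions in the layer where f1.2.1() is located.
Step S418: determine, for each function in the sixth request timeout function set, the average percentage of its call time in the total call time of the second-layer function, to obtain the sixth time-consuming percentage set.
For example, in the high-pressure concurrent request mode, as shown in FIG. 5C, the sixth time-consuming percentage set includes the time-consuming percentages of f1.2.1.1(), f1.2.1.2(), and f1.3.3.1().
Step S419: determine the percentage change of each function in the first sub-function set according to the fifth time-consuming percentage set and the sixth time-consuming percentage set.
Step S420: determine the weight value of each function in the second sub-function set according to the percentage change, wherein each function in the second sub-function set is a function, among the next-level functions called by the functions in the first sub-function set, whose percentage change is non-zero.
Step S421: determine the timed-out function call link branch according to the first function set to be weighted, the first sub-function set, and the second sub-function set.
For example, as shown in FIG. 5C, the functions in the first function set to be weighted are f1.2() and f1.3(); the functions in the first sub-function set are f1.2.1(), f1.3.1(), and f1.3.3(); and the functions in the second sub-function set are f1.2.1.1(), f1.2.1.2(), and f1.3.3.1(). It follows that there are four function call links that may cause the interface to time out:
Link 1: f1()->f1.2()->f1.2.1()->f1.2.1.1();
Link 2: f1()->f1.2()->f1.2.1()->f1.2.1.2();
Link 3: f1()->f1.3()->f1.3.1();
Link 4: f1()->f1.3()->f1.3.3()->f1.3.3.1().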
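One way to assemble the four links from the three function sets is sketched below. It rests on an assumption that is not stated in the text: that the dotted names used in FIG. 5C encode the call hierarchy, so that f1.2.1() is called by f1.2(). The helper names `parent` and `build_chains` are hypothetical.

```python
def parent(name):
    """Caller implied by the dotted name, e.g. 'f1.2.1()' -> 'f1.2()'; None for the root."""
    stem = name[:-2]  # drop the trailing "()"
    if "." not in stem:
        return None
    return stem.rsplit(".", 1)[0] + "()"


def build_chains(root, *levels):
    """Return one root-to-end chain for every suspect function that calls no other suspect."""
    suspects = set().union(*levels)
    callers = {parent(name) for name in suspects}
    end_functions = sorted(name for name in suspects if name not in callers)
    chains = []
    for end in end_functions:
        chain = [end]
        while chain[-1] != root:
            caller = parent(chain[-1])
            if caller is None:
                raise ValueError(f"{end} is not reachable from {root}")
            chain.append(caller)
        chains.append(list(reversed(chain)))
    return chains


chains = build_chains(
    "f1()",
    {"f1.2()", "f1.3()"},                           # first function set to be weighted
    {"f1.2.1()", "f1.3.1()", "f1.3.3()"},           # first sub-function set
    {"f1.2.1.1()", "f1.2.1.2()", "f1.3.3.1()"},     # second sub-function set
)
for chain in chains:
    print("->".join(chain))
```

Under that naming assumption the sketch yields the same four links as the example, with f1.3.1() ending link 3 because none of its callees appears in the second sub-function set.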
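In some embodiments, after the function call links that may cause the interface to time out have been determined, the method further includes: determining the link timeout probability according to the link weight of each function call link.
For example, the link weight value is calculated according to formula (3):
$W_{chain} = W_1 \times W_2 \times \dots \times W_n$   (3)
where $W_{chain}$ is the link weight value and $W_n$ is the weight value of the n-th function call branch on the link.
The link weights and timeout probabilities of link 1 to link 4 are calculated as in formulas (4) to (7):
Link 1 weight: 0.613 × 1 × 0.8 = 0.4904 ≈ 49.0%   (4)
Link 2 weight: 0.613 × 1 × 0.2 = 0.1226 ≈ 12.3%   (5)
Link 3 weight: 0.387 × 0.64 = 0.24768 ≈ 24.8%   (6)
Link 4 weight: 0.387 × 0.36 × 1 = 0.13932 ≈ 13.9%   (7)
The weight of the entire link represents the probability that the link contains the cause of the interface timeout. The link weight values are sorted in descending order, and the links that may cause the timeout are output together with the function positions that may cause the interface timeout.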
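In the embodiments of the present application, on the one hand, the weight value of each function in the first sub-function set is determined according to the percentage change, and the timed-out function call link branch is determined according to the first function set to be weighted and the first sub-function set. In this way, when the number of preset levels is 2, assigning weights to functions makes it possible to locate the links that may cause the timeout, and the function positions that may cause the interface timeout, in the function layers corresponding to the first function set to be weighted and the first sub-function set. On the other hand, the weight value of each function in the second sub-function set is determined according to the percentage change, and the timed-out function call link branch is determined according to the first function set to be weighted, the first sub-function set, and the second sub-function set. In this way, when the number of preset levels is 3, assigning weights to functions makes it possible to locate the links that may cause the timeout, and the function positions that may cause the interface timeout, in the function layer corresponding to the second sub-function set. The weighting approach thus locates, at a deeper level, the function in the interface's call link that is most likely to cause the timeout.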
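In application scenarios where a called interface is stress-tested, as the number of concurrent requests sent by the request sender increases, the interface receives too many concurrent requests and the request pressure rises, so the interface under test often experiences request timeouts. The people responsible need to analyze and diagnose the cause of the timeouts, find the problem that creates the performance bottleneck, resolve it, and tune performance.
In implementation, the time the interface spends calling each function grows linearly as request concurrency increases, and the ratio of each function's call time to the total time of its upper-layer function does not change significantly. Once the performance bottleneck is reached and interface requests begin to time out, that ratio in the timed-out requests differs markedly from the ratio in normally performing requests: the percentage increases noticeably. In this case, the slow function call is very likely the cause of the interface's request timeout. A method is therefore needed that can accurately locate the function position in the interface's call link that is most likely to cause the timeout and report the function call links that may cause it, to provide reference and guidance for the people looking for the performance bottleneck.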
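In the related art, the first method is as follows: a monitoring platform collects indicators related to the central processing unit (CPU), the Java Virtual Machine (JVM), the number of threads, the middleware backlog, and so on; a log platform collects error logs and development key performance indicators; the Arthas platform collects time-consuming methods; and a big data platform comprehensively analyzes the monitoring indicators, key performance indicators, time-consuming methods, and other data to derive a scenario route, looks up a result set in a database according to that route, and obtains the performance bottleneck analysis result with the highest score in the result set together with the corresponding suggested solution. The second method is as follows: the logs generated during the interface call are analyzed to obtain the ratio of the time consumed by each function call module in the interface function to the duration of the call, which identifies the function module that takes the most time and thereby locates the problem that causes the interface call to time out.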
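It can be seen that the related art has the following problems when addressing interface call timeouts:
The first method has two problems. First, its results are not accurate enough. In implementation, the key performance indicators printed in the logs of different subsystems are not necessarily the same, and differing indicators produce inconsistencies during scenario route matching, which skews the accuracy of the performance bottleneck analysis. In addition, the scoring criteria for result set matching are set according to the test requirements, and much of that setup is done manually, so improperly set test requirements easily lead to inaccurate bottleneck analysis. Second, it cannot locate the exact position of the abnormal function. When the Arthas platform collects time-consuming methods, the analysis can trace only the call link of first-level methods; it cannot locate the methods called within a first-level method or the methods that those methods call in turn.
The second method has three problems. First, the timing statistics depend heavily on logs, and in practice developers are unlikely to log the time consumed by every function call module in an interface; missing logs make the timing scheme infeasible. Second, only the call link of first-level methods in the interface is measured, so function positions in deeper calls cannot be located. Third, the technique computes the time-consuming proportions of the interface function's internal call modules over all requests. When the timeout rate is low (for example, 5%), the proportion of the abnormal call module may not stand out, and the result is inaccurate.
To solve the above problems, the embodiments of the present application provide a method for diagnosing concurrent request timeout. First, when an interface stress test times out, the method can accurately locate the exact position of the function in the interface's call link that is most likely to cause the timeout and report all function call links that may cause the timeout. Second, the weight values of the function call link branches are derived from the analysis of real-time timing data without manual intervention, so the results are highly accurate. Third, the overall design is simple and does not rely on many external systems, specific indicators, or log data, so it is general and suitable for all kinds of systems.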
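An embodiment of the present application provides an apparatus for diagnosing concurrent request timeout. FIG. 4 is a schematic diagram of the composition and structure of the apparatus according to an embodiment of the present application. As shown in FIG. 4, the apparatus includes:
A request sending module 41, configured to start a specified number of concurrent threads and send requests to the interface of the system under test 45 at a specified pressure. It collects the timeout rate of the requests (s%) and sends the timeout rate to the time-consuming data analysis module 43 in real time.
An Arthas module 42, configured to monitor and collect the time consumed by each function call inside a specified function, and to send the timing data to the time-consuming data analysis module 43 for analysis.
A time-consuming data analysis module 43. On the one hand, it is configured to sort the timing data sent by the Arthas module 42 in descending order to obtain a time-consuming ranking and, according to the timeout rate s% sent by the request sending module 41, to filter out the requests in the top s% of the ranking to obtain a time-consuming data set, which stores the functions of the timed-out requests. When the sender is in the low-pressure mode, or when the interface has not timed out, the requests are not filtered. On the other hand, it is configured to analyze the time-consuming data set, to compute, for each timed-out function call in the data set, the percentage of its time in the call time of its upper-layer function, and to send [M3, Uc] as one group of data to the decision and control module 44, where M3 is the request sender's real-time number of concurrent threads (the requester's pressure) and Uc is the average time-consuming percentage of each function call.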
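The filtering and averaging performed by the time-consuming data analysis module can be sketched as follows. This is a simplified illustration under stated assumptions, not the module's actual code: the record layout (`"total"` and `"calls"` keys), the function name `average_percentages`, the sample durations, and the rounding-down of the top-s% cut are all hypothetical.

```python
import math


def average_percentages(requests, timeout_rate):
    """Average share of the caller's total time spent in each callee.

    requests: list of {"total": ms, "calls": {callee_name: ms}} records,
              one per interface request (hypothetical record layout).
    """
    if not requests:
        raise ValueError("at least one request record is required")
    # Time-consuming ranking: slowest requests first.
    ranked = sorted(requests, key=lambda r: r["total"], reverse=True)
    if timeout_rate > 0:
        # Keep only the top s% of the ranking (rounded down, at least one record).
        keep = max(1, math.floor(timeout_rate * len(ranked) + 1e-9))
        ranked = ranked[:keep]
    names = ranked[0]["calls"].keys()
    return {
        name: sum(r["calls"][name] / r["total"] for r in ranked) / len(ranked)
        for name in names
    }


# Hypothetical timing records for four requests to the same interface.
records = [
    {"total": 100, "calls": {"f1.1()": 20, "f1.2()": 80}},
    {"total": 200, "calls": {"f1.1()": 50, "f1.2()": 150}},
    {"total": 1000, "calls": {"f1.1()": 100, "f1.2()": 900}},
    {"total": 2000, "calls": {"f1.1()": 400, "f1.2()": 1600}},
]
top_half = average_percentages(records, timeout_rate=0.5)
unfiltered = average_percentages(records, timeout_rate=0)
print(top_half, unfiltered)
```

Pairing the returned dictionary with the sender's thread count gives a record of the [M3, Uc] shape described above.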
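A decision and control module 44. First, it is configured, on receiving the [M3, Uc] data from the time-consuming data analysis module 43, to compare the change in each function's time-consuming percentage between low pressure and high pressure and to select the function calls whose time-consuming percentage has increased; the cause of the timeout most likely lies in the function call whose percentage has increased the most. Based on the growth in time-consuming percentage, it assigns weight values to the function call link branches whose percentage has increased. The larger the weight value of a function call link branch, the more likely that branch is the cause of the function call timeout. Second, it is configured to send, for the selected timed-out function call link branches, a function call timing monitoring instruction for the timed-out functions to the Arthas module 42, so that deeper function call timing statistics are collected. Third, it is configured to repeat the two preceding processes until the function call analysis has been completed for the specified number of levels. Based on the weight value assigned to each function call link branch, it calculates the weight of each function call link and sorts the weights in descending order. The link with the largest weight is the function call link branch most likely to cause the timeout, and the cause of the timeout is most likely in the function at the end of that branch. In this way, the position of the function that causes the interface request to time out can be located.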
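An embodiment of the present application provides a method for diagnosing concurrent request timeout, the method including:
Step S501: the request sending module sends requests to the interface of the system under test in the first request mode and determines a first timeout rate.
Step S502: the request sending module sends the first timeout rate to the time-consuming data analysis module.
Here, the first request mode is the low-pressure mode, in which the request sending module sends requests with a small number of concurrent threads.
In some embodiments, the number of concurrent request threads is considered small when it is less than $10^2$. In some embodiments, the request sending module can calculate the timeout rate of the requests. Each HTTP request has a user-defined timeout, and a request that exceeds it is considered timed out. The request sending module calculates the timeout rate from the timeout duration recorded in the request headers of the HTTP requests, the number of requests that time out per second, and the total number of requests sent per second. For example, the request sending module first starts a small number of threads, such as 10 threads, then sends requests to the interface of the system under test in the low-pressure mode, and calculates the request timeout rate s% within the period.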
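Step S503: the time-consuming data analysis module receives the first timeout rate sent by the request sending module.
Step S504: the time-consuming data analysis module determines a first request timeout function set according to the first timeout rate and the function call times.
Here, when the first timeout rate is S1%, the elements of the first request timeout function set are the requests in the top S1% of the time-consuming ranking, the ranking being obtained by sorting the functions of the timed-out requests by time consumed in descending order.
Step S505: the time-consuming data analysis module determines, for each function in the first request timeout function set, the average percentage of its call time in the total call time of the upper-layer function, to obtain a first time-consuming percentage.
Here, the content of the first time-consuming percentage is [M4, Ud], where M4 is the first request sender's real-time number of concurrent threads (the first requester's pressure) and Ud is the average time-consuming percentage of each function call.
For example, the time-consuming data analysis module filters out, according to the current first timeout rate S1%, the requests ranked in the top S1% by interface request time. It calculates, over a period of time, the average percentage of each function call's time in the total call time of the upper-layer function for these requests, and records the average time-consuming percentages of the function calls corresponding to M4 equal to 10 threads (low pressure). FIG. 5A is a schematic diagram of the average function call time-consuming percentages in the low-pressure mode of the method according to an embodiment of the present application. As shown in FIG. 5A, the function f1 internally calls four functions, f1.1(), f1.2(), f1.3(), and f1.4(), whose call times account on average for 20%, 20%, 30%, and 30% of the total time of the upper-layer function, respectively.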
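Step S506: the time-consuming data analysis module sends the first time-consuming percentage to the decision and control module.
Step S507: the decision and control module receives the first time-consuming percentage sent by the time-consuming data analysis module.
Step S508: the request sending module sends requests to the interface of the system under test in the second request mode and determines a second timeout rate.
Here, the second request mode is the high-pressure mode, in which the request sending module sends requests with a large number of concurrent threads. In some embodiments, the number of concurrent threads of the second requests is considered large when it is greater than or equal to $10^2$. In some embodiments, the decision and control module controls the request sending module to gradually increase the number of concurrent request threads (e.g., to 100 threads) and send the second requests to the interface of the system under test in the high-pressure mode.
Step S509: the request sending module sends the second timeout rate to the time-consuming data analysis module.
Step S510: the time-consuming data analysis module receives the second timeout rate sent by the request sending module.
For example, the decision and control module controls the request sending module to gradually increase the number of concurrent request threads, for example to 100 threads, and to send requests to the interface of the system under test in the high-pressure mode, at which point many interface requests time out. The second timeout rate S2% within the period is then obtained.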
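Step S511: the time-consuming data analysis module determines a second request timeout function set according to the second timeout rate.
Here, when the second timeout rate is S2%, the elements of the second request timeout function set are the requests in the top S2% of the time-consuming ranking, the ranking being obtained by sorting the functions of the timed-out requests by time consumed in descending order.
Step S512: the time-consuming data analysis module determines, for each function in the second request timeout function set, the average percentage of its call time in the total call time of the upper-layer function, to obtain a second time-consuming percentage.
Here, the content of the second time-consuming percentage is [M5, Ue], where M5 is the second request sender's real-time number of concurrent threads (the second requester's pressure) and Ue is the average time-consuming percentage of each function call.
For example, the time-consuming data analysis module filters out, according to the current second timeout rate S2%, the requests ranked in the top S2% by interface request time. It calculates, over a period of time, the average percentage of each function call's time in the total call time of the upper-layer function for these requests, and records the average time-consuming percentages of the function calls corresponding to M5 equal to 100 threads (high pressure). FIG. 5B is a schematic diagram of the average function call time-consuming percentages in the high-pressure mode of the method according to an embodiment of the present application. As shown in FIG. 5B, the function f1 internally calls four functions, f1.1(), f1.2(), f1.3(), and f1.4(), whose call times account on average for 10%, 35%, 35%, and 20% of the total time of the upper-layer function, respectively.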
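Step S513: the time-consumption data analysis module sends the second time-consumption percentage to the decision and control module;
Step S514: the decision and control module receives the second time-consumption percentage sent by the time-consumption data analysis module;
Step S515: the decision and control module compares the first time-consumption percentage with the second time-consumption percentage and determines the functions whose average call time-consumption percentage has changed as functions to be weighted;
Here, a function to be weighted is a function that is to be assigned a weight value.
In implementation, the decision and control module obtains the function call time-consumption share data from the time-consumption data analysis module and calculates how the average time-consumption percentage of each function call changes between the request sending module sending requests to the interface of the system under test in high-pressure mode and in low-pressure mode.
In implementation, as the number of requests sent by the sender grows, the time consumed by each function call inside the interface also grows linearly, but each call's share of the total time stays essentially unchanged. When interface requests time out, the shares of the function calls inside the interface differ markedly from those under normal requests, which shows up as a marked increase in call time consumption.
For example, comparing Figure 5A with Figure 5B, the average time-consumption percentage of function f1.2() rises from 20% to 35%, an increase of 15 percentage points, and that of function f1.3() rises from 30% to 35%, an increase of 5 percentage points. The time-consumption percentages of f1.2() and f1.3() have clearly increased, so the cause of the interface call timeout may lie in the calls to these two functions. The decision and control module assigns weights to these two functions whose average call percentage has grown.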
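The comparison between the two pressure modes can be sketched with a small helper. The dictionaries below hold the averages quoted for Figures 5A and 5B; the function name `functions_to_weight` is hypothetical, and the sketch keeps only functions whose share grew, as in the worked example.

```python
def functions_to_weight(low_share, high_share):
    """Return {function: growth in percentage points} for every function
    whose average share is larger under high pressure than under low pressure."""
    return {
        name: high_share[name] - low
        for name, low in low_share.items()
        if name in high_share and high_share[name] > low
    }


low = {"f1.1()": 20.0, "f1.2()": 20.0, "f1.3()": 30.0, "f1.4()": 30.0}    # Figure 5A
high = {"f1.1()": 10.0, "f1.2()": 35.0, "f1.3()": 35.0, "f1.4()": 20.0}   # Figure 5B
print(functions_to_weight(low, high))  # {'f1.2()': 15.0, 'f1.3()': 5.0}
```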
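Step S516: the decision and control module determines the weights of the functions to be weighted;
In some embodiments, step S516, determining the weights of the functions to be weighted, includes:
Step S517: the decision and control module determines the change in the average time-consumption percentage of each function to be weighted;
In some embodiments, this change can be calculated by formula (8),
$$P_{f_n\text{-incre}} = \frac{C_{f_n\text{-high}} \cdot T_{\text{high}} - C_{f_n\text{-low}} \cdot T_{\text{low}}}{C_{f_n\text{-low}} \cdot T_{\text{low}}} \qquad (8);$$
where $C_{f_n\text{-high}}$ denotes the average time-consumption percentage of function call $f_n()$ in the sender's high-pressure mode; $T_{\text{high}}$ denotes the average time consumption of the upper-level function $f()$ in the sender's high-pressure mode; $C_{f_n\text{-low}}$ denotes the time-consumption share of function call $f_n()$ in the sender's low-pressure mode; $T_{\text{low}}$ denotes the average time consumption of function $f()$ in the sender's low-pressure mode; and $P_{f_n\text{-incre}}$ denotes the percentage by which the average time consumption of function call $f_n()$ in the sender's high-pressure mode increased relative to the low-pressure mode.
For example, for Figures 5A and 5B, the time-consumption increase percentage of function f1.2() is given by formula (9),
$$P_{f1.2()} = \frac{3000 \times 35\% - 500 \times 20\%}{500 \times 20\%} = 950\% \qquad (9);$$
and the time-consumption increase percentage of function f1.3() is given by formula (10),
$$P_{f1.3()} = \frac{3000 \times 35\% - 500 \times 30\%}{500 \times 30\%} = 600\% \qquad (10);$$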
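Formula (8) and the two worked values can be checked with a few lines of code. This is only a sketch: the function name `increase_ratio` is hypothetical, shares are passed as fractions (0.35 for 35%), and the result is returned as a ratio, so 9.5 corresponds to 950%.

```python
def increase_ratio(c_high, t_high, c_low, t_low):
    """Relative growth of a callee's absolute time between low and high pressure.

    c_high, c_low: the callee's share of the upper-level function's time (fractions).
    t_high, t_low: the upper-level function's average time in each mode.
    """
    high_cost = c_high * t_high   # callee's absolute time under high pressure
    low_cost = c_low * t_low      # callee's absolute time under low pressure
    return (high_cost - low_cost) / low_cost


p_f12 = increase_ratio(0.35, 3000, 0.20, 500)   # about 9.5, i.e. 950%
p_f13 = increase_ratio(0.35, 3000, 0.30, 500)   # about 6.0, i.e. 600%
```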
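Step S518: the weight value of each function call chain branch is determined according to the time-consumption increase percentage;
In some embodiments, it can be calculated by formula (11),
$$W_{f_n} = \frac{P_{f_n\text{-incre}}}{P_{f_1\text{-incre}} + P_{f_2\text{-incre}} + \dots + P_{f_m\text{-incre}}} \qquad (11);$$
where $W_{f_n}$ denotes the call chain branch weight value of function $f_n()$, and the denominator sums the increase percentages of all $m$ functions to be weighted at the same level.
For example, for Figures 5A and 5B, computing the branch weight value from the increase percentage of function f1.2() by formula (11) gives formula (12),
$$W_{f1.2()} = \frac{950\%}{950\% + 600\%} = 0.613 \qquad (12);$$
and computing the branch weight value from the increase percentage of function f1.3() by formula (11) gives formula (13),
$$W_{f1.3()} = \frac{600\%}{950\% + 600\%} = 0.387 \qquad (13);$$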
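The normalization in formula (11) amounts to dividing each increase by the sum of the increases at the same level. A minimal sketch, with the hypothetical helper name `branch_weights` and the increases expressed as ratios (9.5 for 950%):

```python
def branch_weights(increases):
    """Normalize per-function increase ratios so that the weights sum to 1."""
    total = sum(increases.values())
    return {name: inc / total for name, inc in increases.items()}


weights = branch_weights({"f1.2()": 9.5, "f1.3()": 6.0})
rounded = {name: round(w, 3) for name, w in weights.items()}
print(rounded)  # {'f1.2()': 0.613, 'f1.3()': 0.387}
```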
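In some embodiments, the larger the weight value of a function call chain branch, the more likely the cause of the timeout lies in that function.
Step S519: the decision and control module sends to the Arthas module a time-consumption statistics instruction for the functions to be weighted;
Here, Arthas is an open-source diagnostic tool for Java that helps developers and testers view, in real time and from a global perspective, the running state of the system and of the JVM (Java Virtual Machine), generate CPU hot-spot flame graphs, monitor method execution and calls, and so on. Arthas supports JDK 6+ and Linux/Mac/Windows, uses an interactive command-line mode, and provides rich auto-completion, which makes it convenient to locate and diagnose problems.
Here, the time-consumption statistics instruction is used to collect call time-consumption statistics on the internal functions of the functions to be weighted.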
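The text does not say which Arthas command the time-consumption statistics instruction maps to. One plausible choice is Arthas's `trace` command, which reports the time spent in each call made inside a method. The sketch below only builds the command strings under that assumption; the class name `com.example.OrderService` and the method names are hypothetical.

```python
def build_trace_command(class_name, method_name, sample_count=100):
    """Build an Arthas `trace` command for one function to be weighted.

    `-n` limits how many invocations Arthas traces before it stops.
    """
    return f"trace {class_name} {method_name} -n {sample_count}"


# One command per function selected for deeper analysis (hypothetical names).
commands = [
    build_trace_command("com.example.OrderService", method)
    for method in ("f1_2", "f1_3")
]
```

Each string would then be sent to an attached Arthas session, and the per-call costs it prints would feed the next round of share calculations.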
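In some embodiments, after the instruction to collect call time-consumption statistics on the internal functions of these two functions has been sent to the Arthas module, steps S505 to S519 are repeated to analyze the call time consumption of the second-level functions. The number of call levels to trace is configured by the user; assuming the trace depth is set to 3, the result may be as shown in Figure 5C.
For example, Figure 5C is a schematic diagram of the function call chain analysis of this embodiment. As shown in Figure 5C, there are four function call chains that may cause the interface to time out:
Chain 1: f1()->f1.2()->f1.2.1()->f1.2.1.1();
Chain 2: f1()->f1.2()->f1.2.1()->f1.2.1.2();
Chain 3: f1()->f1.3()->f1.3.1();
Chain 4: f1()->f1.3()->f1.3.3()->f1.3.3.1().
In chain 3, f1.3.1.1() is not annotated with a weight value because the difference in its time consumption between the sender's high-pressure and low-pressure modes is less than 5%. In that case this function call is not regarded as the cause of the timeout, so no function call chain branch weight value is assigned to the call branch f1.3.1.1(). Function f1.3.1() makes only this one function call, f1.3.1.1(); once the influence of that call on the timeout is ruled out, the problem can only lie in the code of f1.3.1() itself, so f1.3.1() is the terminal function of chain 3.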
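Step S520: the decision and control module determines the timeout probability of each chain according to the chain weight of the function call chain.
For example, the chain weight value is calculated according to formula (14),
$$W_{\text{chain}} = W_1 \cdot W_2 \cdots W_n \qquad (14);$$
where $W_{\text{chain}}$ is the chain weight value and $W_n$ is the weight value of the $n$-th function call branch along the chain.
The chain weights and timeout probabilities of chains 1 to 4 are calculated as in formulas (15) to (18),
Weight of chain 1: 0.613 × 1 × 0.8 = 0.4904 ≈ 49.0%   (15);
Weight of chain 2: 0.613 × 1 × 0.2 = 0.1226 ≈ 12.3%   (16);
Weight of chain 3: 0.387 × 0.64 = 0.24768 ≈ 24.8%   (17);
Weight of chain 4: 0.387 × 0.36 × 1 = 0.13932 ≈ 13.9%   (18).
The weight of a whole chain represents the probability that the chain contains the cause of the interface timeout. The chain weight values are sorted in descending order, and the chains that may cause the timeout and the positions of the functions that may cause the interface timeout are output.
From formulas (15) to (18) it can be seen that the cause of the interface timeout has a 49.0% probability of lying in chain 1, so developers and testers are advised to check chain 1 first and look for the cause in its terminal function f1.2.1.1(). If they determine that the problem is not in chain 1, the other candidate call chains are checked in descending order of weight.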
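The chain weighting of formula (14) and the final ranking can be reproduced with the sketch below. The branch weights are the ones used in formulas (15) to (18); the function name `rank_chains` is hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from math import prod


def rank_chains(chains):
    """Multiply the branch weights along each chain and sort chains by the product."""
    ranked = [(name, prod(branch_weights)) for name, branch_weights in chains.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


chains = {
    "chain 1": [0.613, 1, 0.8],    # f1()->f1.2()->f1.2.1()->f1.2.1.1()
    "chain 2": [0.613, 1, 0.2],    # f1()->f1.2()->f1.2.1()->f1.2.1.2()
    "chain 3": [0.387, 0.64],      # f1()->f1.3()->f1.3.1()
    "chain 4": [0.387, 0.36, 1],   # f1()->f1.3()->f1.3.3()->f1.3.3.1()
}
ranking = rank_chains(chains)
order = [name for name, _ in ranking]   # chain 1, chain 3, chain 4, chain 2
```

The resulting order is the order in which the chains would be investigated, starting with chain 1.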
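Based on the foregoing embodiments, an embodiment of the present application provides an apparatus for diagnosing concurrent request timeouts. The modules included in the apparatus, the units included in the modules, and the subunits included in the units may be implemented by a processor in a computer device or, of course, by logic circuits. In implementation, the processor may be a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
Figure 6 is a schematic structural diagram of the apparatus for diagnosing concurrent request timeouts according to an embodiment of the present application. As shown in Figure 6, the apparatus 600 includes a first determining module 601, a first obtaining module 602, a second determining module 603, a second obtaining module 604, a third determining module 605 and a fourth determining module 606, where: the first determining module 601 is configured to determine, when concurrent requests to an interface function time out, a timeout rate of the interface function, the number of levels of the interface function being greater than 1; the first obtaining module 602 is configured to obtain a first time-consumption set, which includes the time consumed by the interface function in calling first-level functions; the second determining module 603 is configured to determine, according to the timeout rate and the first time-consumption set, a first set of functions to be weighted among the first-level functions, where the first set of functions to be weighted consists of the first-level functions whose time-consumption percentage ranking is greater than or equal to the timeout rate; the second obtaining module 604 is configured to obtain a second time-consumption set, which includes the time consumed by each function in the first set of functions to be weighted in calling functions of preset levels; the third determining module 605 is configured to determine, according to the second time-consumption set, the call chain that causes the interface function to time out; and the fourth determining module 606 is configured to determine the terminal function of the call chain as the function that causes the interface function to time out.
In some embodiments, the third determining module includes a first determining unit, a second determining unit, a third determining unit and a fourth determining unit, where: the first determining unit is configured to determine a time-consumption percentage set according to each second time consumption in the second time-consumption set and the total time consumption of the upper-level function corresponding to that second time consumption; the second determining unit is configured to determine, according to the time-consumption percentage set, the weight value of each function in a second set of functions to be weighted, where each such function is a function of the preset levels whose time-consumption percentage change is non-zero; the third determining unit is configured to determine the timed-out function call chain branches according to the weight value of each function; and the fourth determining unit is configured to determine, according to the weight values of the function call chain branches, the call chain that causes the interface function to time out.
In some embodiments, the fourth determining unit includes a product subunit, a sorting subunit and a first determining subunit, where: the product subunit is configured to multiply the weight values of the functions in a function call chain branch to obtain the corresponding chain weight value; the sorting subunit is configured to arrange the chain weight values in a specific order; and the first determining subunit is configured to determine the function call chain branch whose chain weight value satisfies a preset condition as the call chain that causes the interface function to time out.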
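In some embodiments, the first determining module 601 includes a fifth determining unit and a sixth determining unit, where: the fifth determining unit is configured to determine a first timeout rate when concurrent requests to the interface function made in the first request mode time out; and the sixth determining unit is configured to determine a second timeout rate when concurrent requests to the interface function made in the second request mode time out. The second determining module 603 includes a seventh determining unit, an eighth determining unit, a ninth determining unit and a tenth determining unit, where: the seventh determining unit is configured to determine a first time-consumption percentage set according to the first timeout rate and the first sub time consumption; the eighth determining unit is configured to determine a second time-consumption percentage set according to the second timeout rate and the second sub time consumption; the ninth determining unit is configured to determine the percentage change according to the first and second time-consumption percentage sets; and the tenth determining unit is configured to determine, according to the percentage change, the first set of functions to be weighted among the first-level functions.
In some embodiments, the seventh determining unit includes a second determining subunit and a third determining subunit, where: the second determining subunit is configured to determine a first request-timeout function set according to the first timeout rate and the first sub time consumption, where the first request-timeout function set is the request-timeout function set of the first-level functions in the first request mode; and the third determining subunit is configured to determine, for each function in the first request-timeout function set, the average percentage of its call time in the total call time of its upper-level function, obtaining the first time-consumption percentage set.
The eighth determining unit includes a fourth determining subunit and a fifth determining subunit, where: the fourth determining subunit is configured to determine a second request-timeout function set according to the second timeout rate and the second sub time consumption, where the second sub time consumption is the time consumed by the first-level functions called by the interface function in the second request mode, and the second request-timeout function set is the request-timeout function set of the first-level functions in the second request mode; and the fifth determining subunit is configured to determine, for each function in the second request-timeout function set, the average percentage of its call time in the total call time of its upper-level function, obtaining the second time-consumption percentage set.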
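In some embodiments, the first determining unit includes a sixth determining subunit, a seventh determining subunit, an eighth determining subunit and a ninth determining subunit, where: the sixth determining subunit is configured to determine a third request-timeout function set according to the third sub time consumption and the total time consumption of the first functions to be weighted, where the third sub time consumption is the time consumed by the first set of functions to be weighted in calling functions of the preset levels in the first request mode, and the third request-timeout function set is the request-timeout function set of the next-level functions called by the first functions to be weighted in the first request mode; the seventh determining subunit is configured to determine, for each function in the third request-timeout function set, the average percentage of its call time in the total call time of the first-level function, obtaining a third time-consumption percentage set; the eighth determining subunit is configured to determine a fourth request-timeout function set according to the fourth sub time consumption and the total time consumption of the first functions to be weighted, where the fourth sub time consumption is the time consumed by the first set of functions to be weighted in calling functions of the preset levels in the second request mode, and the fourth request-timeout function set is the request-timeout function set of the next-level functions of the first functions to be weighted in the second request mode; and the ninth determining subunit is configured to determine, for each function in the fourth request-timeout function set, the average percentage of its call time in the total call time of the first-level function, obtaining a fourth time-consumption percentage set.
In some embodiments, the second determining unit includes a tenth determining subunit, an eleventh determining subunit and a twelfth determining subunit, where: the tenth determining subunit is configured to compare the third and fourth time-consumption percentage sets and determine the percentage change of each function in a first sub-function set; the eleventh determining subunit is configured to determine, according to the percentage change, the weight value of each function in the first sub-function set, where each function in the first sub-function set is a next-level function of the first functions to be weighted whose percentage change is non-zero; and the twelfth determining subunit is configured to determine the timed-out function call chain branches according to the first set of functions to be weighted and the first sub-function set.
In some embodiments, the second determining unit further includes a thirteenth determining subunit, a fourteenth determining subunit and a fifteenth determining subunit, where: the thirteenth determining subunit is configured to determine, according to the fifth and sixth time-consumption percentage sets, the percentage change of each function in the first sub-function set; the fourteenth determining subunit is configured to determine, according to the percentage change, the weight value of each function in a second sub-function set, where each function in the second sub-function set is a next-level function called by the functions in the first sub-function set whose percentage change is non-zero; and the fifteenth determining subunit is configured to determine the timed-out function call chain branches according to the first set of functions to be weighted, the first sub-function set and the second sub-function set.
In some embodiments, the second determining unit further includes a sixteenth determining subunit, a seventeenth determining subunit, an eighteenth determining subunit and a nineteenth determining subunit, where: the sixteenth determining subunit is configured to determine a fifth request-timeout function set according to the third sub time consumption and the total time consumption of the upper-level function corresponding to the third sub time consumption, where the fifth request-timeout function set is the request-timeout function set of the next-level functions called by the functions in the first sub-function set in the first request mode; the seventeenth determining subunit is configured to determine, for each function in the fifth request-timeout function set, the average percentage of its call time in the total call time of the second-level function, obtaining a fifth time-consumption percentage set; the eighteenth determining subunit is configured to determine a sixth request-timeout function set according to the fourth sub time consumption, where the sixth request-timeout function set is the request-timeout function set of the next-level functions called by the functions in the first sub-function set in the second request mode;
and the nineteenth determining subunit is configured to determine, for each function in the sixth request-timeout function set, the average percentage of its call time in the total call time of the second-level function, obtaining a sixth time-consumption percentage set.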
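The description of the above apparatus embodiments is similar to that of the above method embodiments and has similar beneficial effects. For technical details not disclosed in the apparatus embodiments of the present application, refer to the description of the method embodiments.
It should be noted that, in the embodiments of the present application, if the above method for diagnosing concurrent request timeouts is implemented in the form of software function modules and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the related art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk or an optical disc. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present application provides a computer device including a memory and a processor, where the memory stores a computer program that can run on the processor, and the processor implements the steps of the above method when executing the program. Correspondingly, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and the computer program implements the steps of the above method when executed by a processor. It should be pointed out here that the description of the above storage medium and device embodiments is similar to that of the method embodiments and has similar beneficial effects. For technical details not disclosed in the storage medium and device embodiments of the present application, refer to the description of the method embodiments.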
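It should be noted that Figure 7 is a schematic diagram of a hardware entity of the computer device in an embodiment of the present application. As shown in Figure 7, the hardware entity of the computer device 700 includes a processor 701, a communication interface 702 and a memory 703, where:
The processor 701 generally controls the overall operation of the computer device 700.
The communication interface 702 enables the computer device to communicate with other terminals or servers over a network.
The memory 703 is configured to store instructions and applications executable by the processor 701, and may also cache data to be processed or already processed by the processor 701 and by the modules of the computer device 700 (for example, image data, audio data, voice communication data and video communication data). It may be implemented with flash memory (FLASH) or random access memory (RAM).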
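It should be understood that references throughout the specification to "one embodiment" or "an embodiment" mean that a particular feature, structure or characteristic related to the embodiment is included in at least one embodiment of the present application. Therefore, occurrences of "in one embodiment" or "in an embodiment" in various places throughout the specification do not necessarily refer to the same embodiment. In addition, these particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic and should not limit the implementation of the embodiments of the present application in any way. The sequence numbers of the above embodiments are for description only and do not indicate that one embodiment is better than another.
It should be noted that, herein, the terms "include", "comprise" and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or apparatus that includes the element.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are illustrative. For example, the division into units is a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.
The above are merely implementations of the present application, but the scope of protection of the present application is not limited to them. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.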
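Industrial Applicability
In this embodiment, when an interface pressure test times out, the time consumption of functions at preset levels can be collected in addition to that of the first-level functions, and the collected time-consumption information is used to accurately locate the function in the interface function's call chain that is most likely to cause the interface timeout and to list all function call chains that may cause the timeout. This avoids the problem in the related art that only the call chain of the first-level methods of an interface function can be measured, so that the position of a more deeply called function cannot be located.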

Claims (12)

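  1. A method for diagnosing concurrent request timeouts, the method comprising:
    when concurrent requests to an interface function time out, determining a timeout rate of the interface function, wherein the number of levels of the interface function is greater than 1;
    obtaining a first time-consumption set, wherein the first time-consumption set includes the time consumed by the interface function in calling first-level functions;
    determining, according to the timeout rate and the first time-consumption set, a first set of functions to be weighted among the first-level functions, wherein the first set of functions to be weighted consists of the first-level functions whose time-consumption percentage ranking is greater than or equal to the timeout rate;
    obtaining a second time-consumption set, wherein the second time-consumption set includes the time consumed by each function in the first set of functions to be weighted in calling functions of preset levels;
    determining, according to the second time-consumption set, the call chain that causes the interface function to time out;
    determining the terminal function of the call chain as the function that causes the interface function to time out.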
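  2. The method according to claim 1, wherein determining, according to the second time-consumption set, the call chain that causes the interface function to time out comprises:
    determining a time-consumption percentage set according to each second time consumption in the second time-consumption set and the total time consumption of the upper-level function corresponding to that second time consumption;
    determining, according to the time-consumption percentage set, the weight value of each function in a second set of functions to be weighted, wherein each such function is a function of the preset levels whose time-consumption percentage change is non-zero;
    determining the timed-out function call chain branches according to the weight value of each function;
    determining, according to the weight values of the function call chain branches, the call chain that causes the interface function to time out.
  3. The method according to claim 2, wherein determining, according to the weight values of the function call chain branches, the call chain that causes the interface function to time out comprises:
    multiplying the weight values of the functions in a function call chain branch to obtain the corresponding chain weight value;
    arranging the chain weight values in a specific order;
    determining the function call chain branch whose chain weight value satisfies a preset condition as the call chain that causes the interface function to time out.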
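  4. The method according to any one of claims 1 to 3, wherein the concurrent request mode comprises a first request mode and a second request mode, and the first time-consumption set comprises a first sub time consumption and a second sub time consumption, wherein the first sub time consumption is the time consumed by the interface function in calling first-level functions in the first request mode, and the second sub time consumption is the time consumed by the interface function in calling first-level functions in the second request mode;
    determining the timeout rate of the interface function when concurrent requests to the interface function time out comprises:
    determining a first timeout rate when concurrent requests to the interface function made in the first request mode time out;
    determining a second timeout rate when concurrent requests to the interface function made in the second request mode time out;
    correspondingly, determining, according to the timeout rate and the first time-consumption set, the first set of functions to be weighted among the first-level functions comprises:
    determining a first time-consumption percentage set according to the first timeout rate and the first sub time consumption;
    determining a second time-consumption percentage set according to the second timeout rate and the second sub time consumption;
    determining the percentage change according to the first time-consumption percentage set and the second time-consumption percentage set;
    determining, according to the percentage change, the first set of functions to be weighted among the first-level functions.
  5. The method according to claim 4, wherein determining the first time-consumption percentage set according to the first timeout rate and the first sub time consumption comprises:
    determining a first request-timeout function set according to the first timeout rate and the first sub time consumption, wherein the first request-timeout function set is the request-timeout function set of the first-level functions in the first request mode;
    determining, for each function in the first request-timeout function set, the average percentage of its call time in the total call time of its upper-level function, to obtain the first time-consumption percentage set;
    and determining the second time-consumption percentage set according to the second timeout rate and the second sub time consumption comprises:
    determining a second request-timeout function set according to the second timeout rate and the second sub time consumption, wherein the second sub time consumption is the time consumed by the first-level functions called by the interface function in the second request mode, and the second request-timeout function set is the request-timeout function set of the first-level functions in the second request mode;
    determining, for each function in the second request-timeout function set, the average percentage of its call time in the total call time of its upper-level function, to obtain the second time-consumption percentage set.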
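  6. The method according to claim 4, wherein the second time-consumption set comprises a third sub time consumption and a fourth sub time consumption, the time-consumption percentage set comprises a third time-consumption percentage set and a fourth time-consumption percentage set, and determining the time-consumption percentage set according to each second time consumption in the second time-consumption set and the total time consumption of the upper-level function corresponding to that second time consumption comprises:
    determining a third request-timeout function set according to the third sub time consumption and the total time consumption of the first functions to be weighted, wherein the third sub time consumption is the time consumed by the first set of functions to be weighted in calling functions of the preset levels in the first request mode, and the third request-timeout function set is the request-timeout function set of the next-level functions called by the first functions to be weighted in the first request mode;
    determining, for each function in the third request-timeout function set, the average percentage of its call time in the total call time of the first-level function, to obtain the third time-consumption percentage set;
    determining a fourth request-timeout function set according to the fourth sub time consumption and the total time consumption of the first functions to be weighted, wherein the fourth sub time consumption is the time consumed by the first set of functions to be weighted in calling functions of the preset levels in the second request mode, and the fourth request-timeout function set is the request-timeout function set of the next-level functions of the first functions to be weighted in the second request mode;
    determining, for each function in the fourth request-timeout function set, the average percentage of its call time in the total call time of the first-level function, to obtain the fourth time-consumption percentage set.
  7. The method according to claim 6, wherein the number of levels of the preset-level functions is 2, the second set of functions to be weighted comprises a first sub-function set, and determining, according to the time-consumption percentage set, the weight value of each function in the second set of functions to be weighted comprises:
    comparing the third time-consumption percentage set with the fourth time-consumption percentage set to determine the percentage change of each function in the first sub-function set;
    determining, according to the percentage change, the weight value of each function in the first sub-function set, wherein each function in the first sub-function set is a next-level function of the first functions to be weighted whose percentage change is non-zero;
    and determining the timed-out function call chain branches according to the weight value of each function comprises: determining the timed-out function call chain branches according to the first set of functions to be weighted and the first sub-function set.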
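  8. The method according to claim 7, wherein the number of levels of the preset-level functions is 3, the second set of functions to be weighted further comprises a second sub-function set, and determining, according to the time-consumption percentage set, the weight value of each function in the second set of functions to be weighted further comprises:
    determining a fifth time-consumption percentage set and a sixth time-consumption percentage set, wherein the fifth time-consumption percentage set is, for the request-timeout function set of the next-level functions called by the functions in the first sub-function set in the first request mode, the average percentage of each function's call time in the total call time of the second-level function, and the sixth time-consumption percentage set is, for the request-timeout function set of the next-level functions called by the functions in the first sub-function set in the second request mode, the average percentage of each function's call time in the total call time of the second-level function;
    determining, according to the fifth time-consumption percentage set and the sixth time-consumption percentage set, the percentage change of each function in the first sub-function set;
    determining, according to the percentage change, the weight value of each function in the second sub-function set, wherein each function in the second sub-function set is a next-level function called by the functions in the first sub-function set whose percentage change is non-zero;
    and determining the timed-out function call chain branches according to the weight value of each function comprises: determining the timed-out function call chain branches according to the first set of functions to be weighted, the first sub-function set and the second sub-function set.
  9. The method according to claim 8, wherein determining the fifth time-consumption percentage set and the sixth time-consumption percentage set comprises:
    determining a fifth request-timeout function set according to the third sub time consumption and the total time consumption of the upper-level function corresponding to the third sub time consumption, wherein the fifth request-timeout function set is the request-timeout function set of the next-level functions called by the functions in the first sub-function set in the first request mode;
    determining, for each function in the fifth request-timeout function set, the average percentage of its call time in the total call time of the second-level function, to obtain the fifth time-consumption percentage set;
    determining a sixth request-timeout function set according to the fourth sub time consumption, wherein the sixth request-timeout function set is the request-timeout function set of the next-level functions called by the functions in the first sub-function set in the second request mode;
    determining, for each function in the sixth request-timeout function set, the average percentage of its call time in the total call time of the second-level function, to obtain the sixth time-consumption percentage set.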
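  10. An apparatus for diagnosing concurrent request timeouts, the apparatus comprising:
    a first determining module, configured to determine, when concurrent requests to an interface function time out, a timeout rate of the interface function, wherein the number of levels of the interface function is greater than 1;
    a first obtaining module, configured to obtain a first time-consumption set, wherein the first time-consumption set includes the time consumed by the interface function in calling first-level functions;
    a second determining module, configured to determine, according to the timeout rate and the first time-consumption set, a first set of functions to be weighted among the first-level functions, wherein the first set of functions to be weighted consists of the first-level functions whose time-consumption percentage ranking is greater than or equal to the timeout rate;
    a second obtaining module, configured to obtain a second time-consumption set, wherein the second time-consumption set includes the time consumed by each function in the first set of functions to be weighted in calling functions of preset levels;
    a third determining module, configured to determine, according to the second time-consumption set, the call chain that causes the interface function to time out;
    a fourth determining module, configured to determine the terminal function of the call chain as the function that causes the interface function to time out.
  11. A computer device, comprising a memory and a processor, wherein the memory stores a computer program that can run on the processor, and the processor implements the steps of the method according to any one of claims 1 to 9 when executing the program.
  12. A computer-readable storage medium on which a computer program is stored, wherein the computer program implements the steps of the method according to any one of claims 1 to 9 when executed by a processor.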
PCT/CN2021/129625 2020-11-30 2021-11-09 一种并发请求超时的诊断方法及装置、设备、存储介质 WO2022111278A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011379546.4A CN112328335B (zh) 2020-11-30 2020-11-30 一种并发请求超时的诊断方法及装置、设备、存储介质
CN202011379546.4 2020-11-30

Publications (1)

Publication Number Publication Date
WO2022111278A1 true WO2022111278A1 (zh) 2022-06-02

Family

ID=74309704

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/129625 WO2022111278A1 (zh) 2020-11-30 2021-11-09 一种并发请求超时的诊断方法及装置、设备、存储介质

Country Status (2)

Country Link
CN (1) CN112328335B (zh)
WO (1) WO2022111278A1 (zh)

Also Published As

Publication number Publication date
CN112328335B (zh) 2023-03-21
CN112328335A (zh) 2021-02-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21896779

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 140923)

122 Ep: pct application non-entry in european phase

Ref document number: 21896779

Country of ref document: EP

Kind code of ref document: A1