CN114706733A - Section program abnormity monitoring method and device - Google Patents

Section program abnormity monitoring method and device

Info

Publication number
CN114706733A
Authority
CN
China
Prior art keywords
index
program
abnormal
service
tangent plane
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210596591.8A
Other languages
Chinese (zh)
Other versions
CN114706733B (en)
Inventor
张锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210596591.8A
Publication of CN114706733A
Application granted
Publication of CN114706733B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of this specification describe a method and a device for monitoring anomalies of an aspect program (section program). According to the method of one embodiment, a first indicator of the aspect program and a second indicator of a non-aspect program are monitored. Whether each of the first indicator and the second indicator is abnormal is then judged, and the abnormal state of the aspect program is determined from the two judgment results. Both indicators are ones that can cause an anomaly in the service to be implemented by the running program; the first belongs to the aspect program and the second to a non-aspect program. Combining the two therefore makes it possible to determine more accurately whether a service anomaly is caused by the aspect program.

Description

Section program abnormity monitoring method and device
Technical Field
One or more embodiments of this specification relate to the field of computer technologies, and in particular to a method and an apparatus for monitoring aspect program anomalies.
Background
In Java aspect-oriented programming (AOP), code functionality can be enhanced by dynamically injecting an aspect program into a target method of a business application. This is a common development technique in the field of computer software.
However, the injected aspect enhancement code essentially runs parasitically within the process of the original program. When a function or performance indicator of the program into which the aspect program has been injected becomes abnormal, it is difficult to tell whether the anomaly is caused by the aspect program or by the business program. It is therefore necessary to monitor the aspect program for anomalies.
Disclosure of Invention
One or more embodiments of this specification describe a method and an apparatus for monitoring aspect program anomalies, which can determine whether a service anomaly is caused by an anomaly of the aspect program.
According to a first aspect, a method for monitoring aspect program anomalies is provided, which includes:
monitoring, in real time, a first indicator of the aspect program and a second indicator of at least one non-aspect program, where the first indicator and the second indicator are both indicators that can cause an anomaly in the service to be implemented by the running program, and the running program is obtained by injecting the aspect program into a main program;
judging separately whether the first indicator and the second indicator are abnormal; and
determining the abnormal state of the aspect program according to the results of judging whether the first indicator and the second indicator are abnormal.
In a possible implementation, when the aspect program is injected into the main program, the memory usage of the aspect program satisfies at least one of the following conditions:
the memory occupied by the aspect program can be released;
after the aspect program is injected, the virtual machine executing the running program still has remaining memory for the running program.
In a possible implementation, judging whether the first indicator is abnormal includes performing at least one of the following judgments:
judging whether the number of exceptions thrown by the aspect program during execution is greater than or equal to an exception-throw count threshold;
judging whether the number of requests intercepted by the aspect program during execution is greater than or equal to a request-interception count threshold;
judging whether the response time during execution of the aspect program is greater than or equal to a response time threshold;
judging whether the number of times the aspect program has been executed is greater than or equal to a cumulative execution count threshold;
if so, the first indicator of the aspect program is abnormal; otherwise, the first indicator of the aspect program is not abnormal.
In one possible implementation, the second indicator of the at least one non-aspect program includes a resource indicator of the system that executes the running program.
In one possible implementation, the resource indicator of the system that executes the running program includes at least one of: the utilization of the CPU executing the running program, the memory utilization of the system executing the running program, the load of the CPU, and the number of garbage collections performed by the virtual machine executing the running program;
judging whether the second indicator is abnormal includes performing at least one of the following judgments:
judging whether the utilization of the CPU executing the running program is greater than or equal to a first utilization threshold;
judging whether the memory utilization of the system executing the running program is greater than or equal to a second utilization threshold;
judging whether the load of the CPU is greater than or equal to a CPU load threshold;
judging whether the number of garbage collections performed by the virtual machine executing the running program is greater than or equal to a garbage collection count threshold;
if so, the resource indicator of the system that executes the running program is abnormal; otherwise, the resource indicator is not abnormal.
In one possible implementation, the second indicator of the at least one non-aspect program includes a service indicator for evaluating service quality.
In one possible implementation, the service indicator for evaluating service quality includes at least one of: the number of failures of the web page service to be implemented by the running program, the number of failures of the program interface service to be implemented by the running program, and the number of failures of message subscription in the service to be implemented by the running program;
judging whether the second indicator is abnormal includes performing at least one of the following judgments:
judging whether the number of failures of the web page service to be implemented by the running program is greater than or equal to a first failure count threshold;
judging whether the number of failures of the program interface service to be implemented by the running program is greater than or equal to a second failure count threshold;
judging whether the number of failures of message subscription in the service to be implemented by the running program is greater than or equal to a third failure count threshold;
if so, the service indicator for evaluating service quality is abnormal; otherwise, the service indicator is not abnormal.
In one possible implementation, the second indicator of the at least one non-aspect program includes both a resource indicator of the system that executes the running program and a service indicator for evaluating service quality;
determining the abnormal state of the aspect program according to the results of judging whether the first indicator and the second indicator are abnormal includes:
determining an anomaly risk level of the aspect program according to the judgment result for the first indicator, the judgment result for the resource indicator, and the judgment result for the service indicator.
In a possible implementation, determining the anomaly risk level of the aspect program according to these three judgment results includes:
if the first indicator is abnormal, judging whether the resource indicator and the service indicator are abnormal;
if neither the resource indicator nor the service indicator is abnormal, determining that the anomaly risk level of the aspect program is a first risk level;
if the resource indicator is abnormal and the service indicator is not abnormal, determining that the anomaly risk level of the aspect program is a second risk level, where the second risk level represents a higher degree of anomaly risk than the first risk level;
if the resource indicator is not abnormal and the service indicator is abnormal, determining that the anomaly risk level of the aspect program is a third risk level, where the third risk level represents a higher degree of anomaly risk than the second risk level;
and if both the resource indicator and the service indicator are abnormal, determining that the anomaly risk level of the aspect program is the third risk level.
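The risk-level rules listed above can be sketched as a small Java function. This is an illustrative sketch only: the class name `AspectRiskClassifier`, the enum `RiskLevel`, and the `NONE` value returned when the first indicator is not abnormal are assumptions, since the text does not state an outcome for that case.

```java
/** Hypothetical sketch of the risk-level rules; all names are illustrative. */
final class AspectRiskClassifier {

    /** NONE is an assumed placeholder for "first indicator not abnormal". */
    enum RiskLevel { NONE, FIRST, SECOND, THIRD }

    static RiskLevel classify(boolean firstAbnormal,
                              boolean resourceAbnormal,
                              boolean serviceAbnormal) {
        if (!firstAbnormal) {
            // Not covered by the rules in the text; assumption of this sketch.
            return RiskLevel.NONE;
        }
        if (serviceAbnormal) {
            // Service indicator abnormal, with or without a resource anomaly.
            return RiskLevel.THIRD;
        }
        // Service indicator normal: the resource indicator decides.
        return resourceAbnormal ? RiskLevel.SECOND : RiskLevel.FIRST;
    }
}
```

For example, `classify(true, true, false)` yields `SECOND`, matching the rule for an abnormal resource indicator with a normal service indicator.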
According to a second aspect, an apparatus for monitoring aspect program anomalies is provided, which includes an indicator monitoring module, an indicator anomaly judging module, and an aspect program anomaly determining module;
the indicator monitoring module is configured to monitor, in real time, a first indicator of the aspect program and a second indicator of at least one non-aspect program, where the first indicator and the second indicator are both indicators that can cause an anomaly in the service to be implemented by the running program, and the running program is obtained by injecting the aspect program into a main program;
the indicator anomaly judging module is configured to judge separately whether the first indicator and the second indicator monitored by the indicator monitoring module are abnormal;
the aspect program anomaly determining module is configured to determine the abnormal state of the aspect program according to the judgment results produced by the indicator anomaly judging module for the first indicator and the second indicator.
According to a third aspect, a computing device is provided, which includes a memory storing executable code and a processor that, when executing the executable code, implements the method of any implementation of the first aspect.
According to the method and the device provided by the embodiments of this specification, when the aspect program is monitored for anomalies, the first indicator of the aspect program and the second indicator of the non-aspect program are monitored first. Whether each of the first indicator and the second indicator is abnormal is then judged, and the abnormal state of the aspect program is determined from the two judgment results. Both indicators are ones that can cause an anomaly in the service to be implemented by the running program; the first belongs to the aspect program and the second to a non-aspect program. Combining the two therefore makes it possible to determine more accurately whether a service anomaly is caused by the aspect program.
Drawings
To illustrate the embodiments of this specification or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of this specification; a person skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for monitoring aspect program anomalies according to an embodiment of this specification;
Fig. 2 is a flowchart of a method for determining the anomaly risk level of an aspect program according to an embodiment of this specification;
Fig. 3 is a schematic diagram of a device for monitoring aspect program anomalies according to an embodiment of this specification.
Detailed Description
In the field of computer software development, an aspect is a code block that an injection framework injects into a method of a target program. Code functionality is enhanced by dynamically injecting aspect code into a business application.
However, the injected aspect enhancement code and the original program are coupled and affect each other. As a result, when a service function or performance indicator becomes abnormal after the aspect program is injected, it is difficult to determine whether the anomaly is caused by the aspect program or by the business program itself, independently of the aspect code. In addition, if the aspect program is deployed to a large number of online business application containers, abnormal changes in all business service and system resource monitoring indicators must be monitored in full, without distinction. If anomalies caused by the aspect code cannot be told apart, the volume of monitoring alarms becomes too large for problems to be investigated.
This scheme therefore monitors the indicators of the aspect program that can cause an anomaly in the service to be implemented, and determines the abnormal state of the aspect program by combining them with indicators of non-aspect programs. This makes it possible to determine whether a service anomaly is caused by the aspect program, and avoids the situation in which problems cannot be investigated because the volume of monitoring alarms is too large.
As shown in Fig. 1, an embodiment of this specification provides a method for monitoring aspect program anomalies, which may include the following steps:
step 101: monitoring, in real time, a first indicator of the aspect program and a second indicator of at least one non-aspect program, where the first indicator and the second indicator are both indicators that can cause an anomaly in the service to be implemented by the running program, and the running program is the program obtained after the aspect program is injected into a main program;
step 103: judging separately whether the first indicator and the second indicator are abnormal;
step 105: determining the abnormal state of the aspect program according to the results of judging whether the first indicator and the second indicator are abnormal.
In this embodiment, when the aspect program is monitored for anomalies, the first indicator of the aspect program and the second indicator of the non-aspect program are monitored first. Whether each of the first indicator and the second indicator is abnormal is then judged, and the abnormal state of the aspect program is determined from the two judgment results. Both indicators are ones that can cause an anomaly in the service to be implemented by the running program; the first belongs to the aspect program and the second to a non-aspect program. Combining the two therefore makes it possible to determine more accurately whether a service anomaly is caused by the aspect program.
Moreover, when the aspect program is deployed in business applications and abnormal changes in all business service or system resource monitoring indicators would otherwise have to be monitored in full without distinction, the scheme provided by this embodiment monitors the aspect program itself for anomalies. Screening alarms by whether the aspect program is abnormal can greatly reduce the volume of monitoring alarms. Even when the aspect program is deployed to a massive number of online business applications, this avoids the situation in which problems cannot be investigated because there are too many alarms, and it improves troubleshooting efficiency.
It is easily understood that the result of judging whether the first indicator and/or the second indicator is abnormal is either that the indicator is abnormal or that the indicator is not abnormal.
The steps in FIG. 1 are described below with reference to specific examples.
First, in step 101, a first indicator of the aspect program and a second indicator of at least one non-aspect program are monitored in real time.
In this step, the first indicator of the aspect program and the second indicator of the non-aspect program are monitored in real time simultaneously. An indicator of a non-aspect program is a related indicator that does not directly characterize the execution of the aspect program; for example, it may describe system resources or service quality.
As described above, the aspect program is injected into the main program to obtain a running program that implements certain functions or services. The first indicator of the aspect program and the second indicator of the non-aspect program should therefore be indicators that can cause an anomaly in the function or service to be implemented by the running program. Using such indicators allows anomalies of the aspect program to be monitored more accurately.
When the aspect program is injected into the main program, its memory usage can be controlled in order to rule out anomalies caused by improper memory use by the aspect program. In one possible implementation, the memory occupied by the aspect program can be released. In another possible implementation, after the aspect program is injected, the virtual machine executing the running program still has remaining memory for the running program. In yet another possible implementation, both conditions hold: the memory occupied by the aspect program can be released, and after injection the virtual machine executing the running program still has remaining memory for the running program.
Controlling the memory usage of the aspect program in this way rules out anomalies caused by improper memory use and further improves the accuracy of anomaly monitoring. For example, the memory allocation and use of the aspect program can be strictly controlled at the code audit level to ensure that there is no memory leak and that the memory occupied by the aspect program can be released, that is, that its memory footprint is controllable. As another example, the execution memory of the virtual machine can be controlled by tuning JVM parameters so that the running program that implements the service and functions has enough remaining memory, and running after the aspect program is injected is not affected by insufficient memory.
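As an illustration of the second condition, a program can estimate how much heap the JVM could still allocate by using the standard `Runtime` API. This is a minimal sketch under assumptions: the class name `MemoryHeadroom` and the idea of comparing the headroom against a required byte count are hypothetical, and the text does not prescribe any particular check.

```java
/** Hypothetical helper; not part of the described method. */
final class MemoryHeadroom {

    /** Bytes the JVM could still allocate before reaching its maximum heap. */
    static long remainingBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // heap currently in use
        return rt.maxMemory() - used;                   // headroom up to the heap limit
    }

    /** True if at least requiredBytes of heap headroom remains. */
    static boolean hasHeadroom(long requiredBytes) {
        return remainingBytes() >= requiredBytes;
    }
}
```

The heap limit that `maxMemory()` reports is the one configured through JVM parameters such as `-Xmx`, which is the kind of tuning the paragraph above refers to.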
Then, in step 103, whether the first indicator and the second indicator are abnormal is judged separately.
After the first indicator of the aspect program and the second indicator of the non-aspect program have been monitored in real time, whether each is abnormal can be judged from the monitoring results. The monitoring results for the first and second indicators should be results obtained over a set period of time.
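One way to obtain a count over a set period of time is a fixed-window counter. The following is a minimal sketch, not the implementation described here: the class name `WindowedCounter`, the window length, and the injected millisecond clock are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;

/** Hypothetical fixed-window event counter; names are illustrative. */
final class WindowedCounter {
    private final long windowMillis;
    private final LongSupplier clock; // supplies the current time in milliseconds
    private long windowStart;
    private long count;

    WindowedCounter(long windowMillis, LongSupplier clock) {
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Records one event (for example, one thrown exception). */
    synchronized void record() {
        rollIfExpired();
        count++;
    }

    /** Returns the number of events recorded in the current window. */
    synchronized long current() {
        rollIfExpired();
        return count;
    }

    // Starts a new window once the configured period has elapsed.
    private void rollIfExpired() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now;
            count = 0;
        }
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`; injecting it keeps the sketch easy to exercise with a fake clock.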
The judgments for the first indicator and the second indicator are described in turn below.
First, judging whether the first indicator is abnormal is described.
Because the first indicator is used to monitor the aspect program, it should be an indicator that can affect the normal operation of the original program's service once the aspect program has been injected into the business program that executes the service. Accordingly, when judging whether the first indicator is abnormal, at least one of the following judgments may be performed:
judging whether the number of exceptions thrown by the aspect program during execution is greater than or equal to an exception-throw count threshold;
judging whether the number of requests intercepted by the aspect program during execution is greater than or equal to a request-interception count threshold;
judging whether the response time during execution of the aspect program is greater than or equal to a response time threshold;
judging whether the number of times the aspect program has been executed is greater than or equal to a cumulative execution count threshold;
if so, the first indicator of the aspect program is abnormal; otherwise, the first indicator of the aspect program is not abnormal.
In this embodiment, for a single execution at one injection point, the following may be considered: the number of exceptions thrown by the aspect program against the exception-throw count threshold, the number of requests intercepted by the aspect program against the request-interception count threshold, or the response time of the aspect program against the response time threshold. Whether the first indicator is abnormal can be determined by judging whether at least one of these values reaches its threshold. In addition, across multiple executions at one injection point, the number of times the aspect program has been executed can be compared with the cumulative execution count threshold; if the count is greater than or equal to the threshold, the first indicator of the aspect program is determined to be abnormal.
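The threshold comparisons for the first indicator can be sketched in Java as follows. This is a minimal sketch under assumptions: the class `FirstIndicatorCheck` and its field names are hypothetical, and the sketch evaluates all four judgments and combines them with a logical OR, whereas the text only requires that at least one of them be performed.

```java
/** Hypothetical sketch of the first-indicator judgment; names are illustrative. */
final class FirstIndicatorCheck {
    private final long exceptionThreshold;
    private final long interceptionThreshold;
    private final long responseTimeThresholdMillis;
    private final long executionThreshold;

    FirstIndicatorCheck(long exceptionThreshold, long interceptionThreshold,
                        long responseTimeThresholdMillis, long executionThreshold) {
        this.exceptionThreshold = exceptionThreshold;
        this.interceptionThreshold = interceptionThreshold;
        this.responseTimeThresholdMillis = responseTimeThresholdMillis;
        this.executionThreshold = executionThreshold;
    }

    /** Abnormal if at least one monitored value is >= its threshold. */
    boolean isAbnormal(long exceptions, long interceptions,
                       long responseTimeMillis, long executions) {
        return exceptions >= exceptionThreshold
                || interceptions >= interceptionThreshold
                || responseTimeMillis >= responseTimeThresholdMillis
                || executions >= executionThreshold;
    }
}
```

The threshold values themselves are deployment choices; none are fixed by the text beyond the examples it gives.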
Each of the above judgment items is explained below.
Case 1: exception throwing. During execution of the injected aspect program, the program may be unable to continue processing because of a program defect or a similar problem. An exception is then thrown to tell the upper-layer caller that an anomaly has occurred. The exception-throw count threshold can be set according to the required fault tolerance. For example, it may be set to 1, so that the first indicator of the aspect program is judged abnormal as soon as a single exception is thrown.
Case 2: request interception. When a certain condition is met, the injected aspect program skips the original execution flow of the method it was injected into, so that the request or transaction is intercepted and normal execution of the service is affected. For example, suppose the original execution flow is A, B, C, D, and a program is inserted between B and C that evaluates a condition: if condition 1 is satisfied, C and D are executed; otherwise they are not, and C and D appear to have been intercepted. As another example, in a business scenario a certain step may check whether the operating user is a member. If the user is determined to be a non-member, the user's subsequent operation requests are intercepted.
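The interception behaviour in the A, B, C, D example can be sketched as a wrapper that either runs the remaining flow or skips it and counts the interception. This is an illustrative sketch in plain Java rather than any specific AOP framework; the class `InterceptingAspect` and its method names are hypothetical.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical sketch of an aspect that may skip the original flow. */
final class InterceptingAspect {
    private final AtomicLong interceptedCount = new AtomicLong();

    /**
     * Runs the remaining flow (steps C and D in the example) only when the
     * condition is met; otherwise records an interception and skips it.
     */
    <T> Optional<T> around(boolean conditionMet, Supplier<T> remainingFlow) {
        if (!conditionMet) {
            interceptedCount.incrementAndGet();
            return Optional.empty();
        }
        return Optional.ofNullable(remainingFlow.get());
    }

    /** Number of requests intercepted so far, to be compared with a threshold. */
    long interceptedCount() {
        return interceptedCount.get();
    }
}
```

The counter returned by `interceptedCount()` is the value that would be compared with the request-interception count threshold.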
Case 3: response time. During execution, if a single action takes too long, the whole injected method takes too long to execute, which can in turn cause the service to time out or fail. For example, suppose an authentication program is expected to finish within 5 s but returns no value within 5 s. The response time then exceeds the response time threshold, and authentication fails.
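Measuring the response time of an injected action and comparing it with a threshold can be sketched as follows. The class `ResponseTimeCheck` and the injected nanosecond clock are assumptions made for illustration; in real use the clock would typically be `System::nanoTime`.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of the response-time judgment. */
final class ResponseTimeCheck {
    private final long thresholdNanos;
    private final LongSupplier nanoClock; // monotonic clock in nanoseconds

    ResponseTimeCheck(long thresholdMillis, LongSupplier nanoClock) {
        this.thresholdNanos = thresholdMillis * 1_000_000L;
        this.nanoClock = nanoClock;
    }

    /** Runs the action and reports whether its elapsed time reached the threshold. */
    boolean runAndCheck(Runnable action) {
        long start = nanoClock.getAsLong();
        action.run();
        long elapsed = nanoClock.getAsLong() - start;
        return elapsed >= thresholdNanos;
    }
}
```

With a 5 s threshold, as in the authentication example, the check would be constructed as `new ResponseTimeCheck(5000, System::nanoTime)`.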
Case 4: execution count. Whether the first indicator is abnormal may be judged with respect to a specific scenario or time period, or by setting different cumulative execution count thresholds for different scenarios or time periods. For example, during the Double 11 promotion, the aspect program being called 100,000 times within a certain time period does not make the first indicator abnormal, because the scenario is a special one. By contrast, outside a promotion period, the aspect program being called 100,000 times within a certain 30 minutes should be judged as an anomaly of the first indicator.
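Scenario-dependent thresholds for the execution count can be sketched with a per-scenario lookup. This is a hypothetical sketch: the class `ExecutionCountPolicy`, the two scenarios, and the promotion threshold used in the usage note are assumptions; the text only gives the 100,000-call example.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical per-scenario execution-count thresholds. */
final class ExecutionCountPolicy {

    enum Scenario { PROMOTION, NORMAL }

    private final Map<Scenario, Long> thresholds = new EnumMap<>(Scenario.class);

    ExecutionCountPolicy(long promotionThreshold, long normalThreshold) {
        thresholds.put(Scenario.PROMOTION, promotionThreshold);
        thresholds.put(Scenario.NORMAL, normalThreshold);
    }

    /** Abnormal if the count within the period reaches the scenario's threshold. */
    boolean isAbnormal(Scenario scenario, long executionsInPeriod) {
        return executionsInPeriod >= thresholds.get(scenario);
    }
}
```

With an assumed promotion threshold of 1,000,000 and a normal threshold of 100,000, a count of 100,000 is normal under `PROMOTION` and abnormal under `NORMAL`, mirroring the two examples above.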
Next, judging whether the second indicator is abnormal is described.
The second indicator is an indicator of a non-aspect program, such as a resource indicator of the system that executes the running program, or a service indicator for evaluating service quality. These two types of second indicator are explained separately below.
In the first type, when the second indicator of the at least one non-aspect program includes a resource indicator of the system that executes the running program, the resource indicator may include at least one of: the utilization of the CPU executing the running program, the memory utilization of the system executing the running program, the load of the CPU, and the number of garbage collections performed by the virtual machine executing the running program;
then, when judging whether the second indicator is abnormal, at least one of the following judgments may be performed:
judging whether the utilization of the CPU executing the running program is greater than or equal to a first utilization threshold;
judging whether the memory utilization of the system executing the running program is greater than or equal to a second utilization threshold;
judging whether the load of the CPU is greater than or equal to a CPU load threshold;
judging whether the number of garbage collections performed by the virtual machine executing the running program is greater than or equal to a garbage collection count threshold;
if so, the resource indicator of the system that executes the running program is abnormal; otherwise, the resource indicator is not abnormal.
In this embodiment, when the second indicator includes a resource indicator, the following comparisons may be made: the utilization of the CPU executing the running program against the first utilization threshold, the memory utilization of the system executing the running program against the second utilization threshold, the load of the CPU against the CPU load threshold, or the number of garbage collections performed by the virtual machine executing the running program against the garbage collection count threshold. In general, when a service degrades because of an aspect problem, the aspect program indicators become abnormal first, system resources deteriorate next, and service indicators deteriorate last. Monitoring the resource indicator as the second indicator and combining it with the first indicator obtained by monitoring the aspect program therefore allows the degree of anomaly of the aspect program to be determined more accurately.
It should be noted that if at least one of the above judgments regarding the resource indicator is satisfied, the second indicator may be determined to be abnormal.
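Some of these resource indicators can be read inside a JVM through the standard `java.lang.management` API. The following is a sketch under assumptions: the class `ResourceIndicators` and its thresholds are hypothetical, heap usage stands in for the system memory utilization named in the text, and CPU utilization is omitted because the standard `java.lang.management` interfaces used here do not expose it.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical reader for a few JVM-level resource indicators. */
final class ResourceIndicators {

    /** Total collections across all collectors; undefined (-1) counts are skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    /** Heap used divided by heap max, or -1 if the maximum is undefined. */
    static double heapUsageRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    /** System load average for the last minute; negative if unavailable. */
    static double systemLoadAverage() {
        return ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    }

    /** Abnormal if at least one value is >= its threshold. */
    static boolean isAbnormal(double heapRatio, double load, long gcCount,
                              double heapThreshold, double loadThreshold, long gcThreshold) {
        return heapRatio >= heapThreshold || load >= loadThreshold || gcCount >= gcThreshold;
    }
}
```

Unavailable readings are reported as negative values, so they never reach a positive threshold and do not by themselves mark the resource indicator as abnormal.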
When the second index of the at least one non-aspect program includes a service index for evaluating service quality, the service index may include at least one of: the number of failures of the web page service to be implemented by the running program, the number of failures of the program interface service to be implemented by the running program, and the number of failures of message subscription in the service to be implemented by the running program;
then, in determining whether the second index is abnormal, at least one of the following judgments may be performed:
judging whether the number of failures of the web page service to be implemented by the running program is greater than or equal to a first failure count threshold;
judging whether the number of failures of the program interface service to be implemented by the running program is greater than or equal to a second failure count threshold;
judging whether the number of failures of message subscription in the service to be implemented by the running program is greater than or equal to a third failure count threshold;
if so, the service index for evaluating service quality is abnormal; otherwise, the service index for evaluating service quality is not abnormal.
In this embodiment of the present specification, when the second index includes a service index, the number of failures of the web page service to be implemented by the running program may be compared with the first failure count threshold, the number of failures of the program interface service to be implemented by the running program may be compared with the second failure count threshold, and the number of failures of message subscription in the service to be implemented by the running program may be compared with the third failure count threshold. The service mainly consists of the web page service, the program interface service, and message subscription and delivery, so judging these three indexes makes it possible to determine accurately whether the service index for evaluating service quality is abnormal.
It should be noted that the program interface service is a service in which a program is called through an interface. If at least one of the above judgment items regarding the service index is satisfied, it is determined that the second index is abnormal.
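The service index judgment can be sketched in the same way. In the following minimal example, the metric keys and the choice to skip any metric without a configured threshold (reflecting that only "at least one" judgment needs to be performed) are assumptions made for illustration, not details given in the specification.

```python
from typing import Mapping

# Hypothetical metric keys for the three failure counts named in the text.
WEB_PAGE = "web_page_failures"
PROGRAM_INTERFACE = "program_interface_failures"
MESSAGE_SUBSCRIPTION = "message_subscription_failures"


def service_index_abnormal(failures: Mapping[str, int],
                           thresholds: Mapping[str, int]) -> bool:
    """Return True if any monitored failure count reaches its threshold.

    Metrics that have no configured threshold are not judged.
    """
    return any(
        count >= thresholds[name]
        for name, count in failures.items()
        if name in thresholds
    )
```

For example, with thresholds of 10, 5, and 3 for the three keys, a sample reporting 5 program interface failures would be judged abnormal even if the other two counts are low.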
Finally, in step 105, the abnormal state of the aspect program is determined according to the judgment results of whether the first index and the second index are abnormal.
In general, a service that deteriorates because of an aspect problem passes through abnormal aspect program indexes first, then system resource deterioration, and finally service index deterioration. The second index may therefore include several types of index at the same time. For example, in one possible implementation, the second index of the at least one non-aspect program includes both a resource index of the system executing the running program and a service index for evaluating service quality. Step 105 is described below for the case where the second index includes both the resource index and the service index.
When the second index includes both the resource index and the service index, step 105 may determine the abnormal risk level of the aspect program according to three judgment results: whether the first index is abnormal, whether the resource index is abnormal, and whether the service index is abnormal.
For example, as shown in fig. 2, according to these three judgment results the abnormal state of the aspect program may be classified as no abnormal risk, a first risk level, a second risk level, or a third risk level, with the degree of risk increasing in that order:
step 201: judging whether the first index is abnormal; if so, executing step 203; otherwise, executing step 207;
step 203: judging whether the resource index and the service index are abnormal, and executing step 205;
step 205: determining the abnormal risk level of the aspect program according to the judgment results of whether the resource index and the service index are abnormal:
if neither the resource index nor the service index is abnormal, determining the abnormal risk level of the aspect program as the first risk level;
if the resource index is abnormal and the service index is not abnormal, determining the abnormal risk level of the aspect program as the second risk level, wherein the degree of abnormal risk of the second risk level is greater than that of the first risk level;
if the resource index is not abnormal and the service index is abnormal, determining the abnormal risk level of the aspect program as the third risk level, wherein the degree of abnormal risk of the third risk level is greater than that of the second risk level;
if both the resource index and the service index are abnormal, determining the abnormal risk level of the aspect program as the third risk level;
step 207: if the first index is not abnormal, determining that the aspect program has no abnormal risk, regardless of the states of the resource index and the service index.
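Steps 201 to 207 amount to a small decision table over three Boolean judgment results. The following sketch shows one way to write it; the enumeration and function names are hypothetical and are used only for illustration.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Abnormal risk levels, ordered by increasing degree of risk."""
    NO_RISK = 0
    FIRST = 1
    SECOND = 2
    THIRD = 3


def abnormal_risk_level(first_abnormal: bool,
                        resource_abnormal: bool,
                        service_abnormal: bool) -> RiskLevel:
    # Steps 201 and 207: without an abnormal first index the aspect program
    # carries no abnormal risk, whatever the other two indexes show.
    if not first_abnormal:
        return RiskLevel.NO_RISK
    # Steps 203 and 205: an abnormal service index yields the third level,
    # whether or not the resource index is also abnormal.
    if service_abnormal:
        return RiskLevel.THIRD
    if resource_abnormal:
        return RiskLevel.SECOND
    return RiskLevel.FIRST
```

Using an ordered enumeration keeps the stated relationship between the levels (first < second < third) directly comparable in code.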
In the embodiment of the present specification, when determining the abnormal risk level of the aspect program, whether the first index is abnormal may be judged first. If the first index is not abnormal, any abnormal operation is not caused by the aspect program, and therefore the aspect program has no abnormal risk. If the first index is abnormal, whether the resource index and the service index are abnormal can be judged further. The states of the resource index and the service index help to determine the stage to which the current service abnormality has deteriorated, so that the abnormal risk level of the aspect program can be determined accurately.
Service deterioration caused by an aspect problem typically passes through abnormal aspect program indexes, then system resource deterioration, and finally service index deterioration. Therefore, when the first index is abnormal, the following four situations may arise:
(1) if only the first index is abnormal and neither the resource index nor the service index is abnormal, only the aspect program is abnormal and the system resources and service quality have not yet deteriorated, that is, the abnormal risk of the aspect program is currently low;
(2) if the first index and the resource index are abnormal and the service index is not abnormal, the abnormal aspect program has already caused the resource index to deteriorate and may go on to cause the service index to deteriorate, that is, the abnormal risk of the aspect program is currently relatively high;
(3) if the first index and the service index are abnormal and the resource index is not abnormal, the abnormality of the aspect program has very likely caused the service quality to deteriorate directly, and the abnormal risk of the aspect program is very high;
(4) if the first index, the resource index, and the service index are all abnormal, the abnormal aspect program indexes have likely caused the system resource environment to deteriorate and, in turn, the service quality to deteriorate, and the abnormal risk of the aspect program is likewise very high.
Therefore, the states of the aspect program abnormality monitoring indexes are designed according to the process by which service quality deteriorates, and the grading principle for aspect abnormality monitoring is designed by analyzing how the indexes are correlated at the different stages of that process. After memory problems are ruled out, the other possible influence logic is traversed, so that monitoring of the aspect program is effectively decoupled from service problems and abnormalities caused by the aspect program can be monitored better.
As shown in fig. 3, the present embodiment provides an aspect program abnormality monitoring device, including: an index monitoring module 301, an index abnormality judging module 302, and an aspect program abnormality determining module 303;
the index monitoring module 301 is configured to monitor, in real time, a first index of an aspect program and a second index of at least one non-aspect program; the first index and the second index are both indexes capable of causing the service to be implemented by the running program to become abnormal; the running program is the program obtained after the aspect program is injected into the main program;
the index abnormality judging module 302 is configured to judge, respectively, whether the first index and the second index monitored by the index monitoring module 301 are abnormal;
the aspect program abnormality determining module 303 is configured to determine the abnormal state of the aspect program according to the judgment results, produced by the index abnormality judging module 302, of whether the first index and the second index are abnormal.
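The division of work among modules 301, 302, and 303 can be sketched as three small classes wired in sequence. This is a minimal sketch under stated assumptions: the class names, the callable-based metric readers, the dictionary metric format, and the state strings returned by the determining module are all hypothetical and are not taken from the specification.

```python
from typing import Callable, Dict, Tuple

Metrics = Dict[str, float]


class IndexMonitoringModule:
    """Module 301: reads the first and second indexes on demand."""

    def __init__(self, read_first: Callable[[], Metrics],
                 read_second: Callable[[], Metrics]) -> None:
        self._read_first = read_first
        self._read_second = read_second

    def sample(self) -> Tuple[Metrics, Metrics]:
        return self._read_first(), self._read_second()


class IndexAbnormalityJudgingModule:
    """Module 302: compares each monitored value with its threshold."""

    def __init__(self, first_thresholds: Metrics,
                 second_thresholds: Metrics) -> None:
        self._first = first_thresholds
        self._second = second_thresholds

    @staticmethod
    def _any_reached(values: Metrics, thresholds: Metrics) -> bool:
        # An index is abnormal if at least one monitored value
        # is greater than or equal to its threshold.
        return any(name in values and values[name] >= limit
                   for name, limit in thresholds.items())

    def judge(self, first: Metrics, second: Metrics) -> Tuple[bool, bool]:
        return (self._any_reached(first, self._first),
                self._any_reached(second, self._second))


class AspectAbnormalityDeterminingModule:
    """Module 303: maps the two judgment results to an abnormal state."""

    def determine(self, first_abnormal: bool, second_abnormal: bool) -> str:
        if not first_abnormal:
            return "no abnormal risk"
        if second_abnormal:
            return "aspect abnormal, second index deteriorated"
        return "aspect abnormal"
```

In use, the output of `sample()` is passed to `judge()`, and the resulting pair of Booleans is passed to `determine()`, mirroring the data flow between the three modules.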
In a possible implementation, when the aspect program is injected into the main program, the memory usage of the aspect program monitored by the index monitoring module 301 should satisfy at least one of the following conditions:
the memory occupied by the aspect program can be released;
after the aspect program is injected, the virtual machine executing the running program still has remaining memory available for the running program.
In one possible implementation, the index abnormality judging module 302, when judging whether the first index is abnormal, is configured to perform at least one of the following judgments:
judging whether the number of exceptions thrown by the aspect program during execution is greater than or equal to an exception count threshold;
judging whether the number of requests intercepted by the aspect program during execution is greater than or equal to a request interception count threshold;
judging whether the response time during execution of the aspect program is greater than or equal to a response time threshold;
judging whether the cumulative number of executions of the aspect program is greater than or equal to a cumulative execution count threshold;
if so, the first index of the aspect program is abnormal; otherwise, the first index of the aspect program is not abnormal.
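The four judgments on the first index can be sketched as follows. The field names, the millisecond unit for response time, and the default threshold values are assumptions made for the example and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass
class AspectSample:
    """Counters collected while the aspect program executes."""
    exceptions_thrown: int
    requests_intercepted: int
    response_time_ms: float
    executions: int  # cumulative number of executions


@dataclass
class AspectThresholds:
    # Hypothetical defaults, chosen only for illustration.
    exceptions_thrown: int = 10
    requests_intercepted: int = 100
    response_time_ms: float = 200.0
    executions: int = 1_000_000


def first_index_abnormal(s: AspectSample, t: AspectThresholds) -> bool:
    """Return True if at least one judgment on the aspect program holds."""
    return (
        s.exceptions_thrown >= t.exceptions_thrown
        or s.requests_intercepted >= t.requests_intercepted
        or s.response_time_ms >= t.response_time_ms
        or s.executions >= t.executions
    )
```

Each comparison uses "greater than or equal to", matching the wording of the judgments above.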
In one possible implementation, the second index of the at least one non-aspect program includes: a resource index of the system executing the running program; the resource index of the system executing the running program includes at least one of: the utilization of the CPU executing the running program, the memory utilization of the system executing the running program, the load of the CPU, and the number of garbage collections performed by the virtual machine executing the running program;
the index abnormality judging module 302, when judging whether the second index is abnormal, is configured to perform at least one of the following judgments:
judging whether the utilization of the CPU executing the running program is greater than or equal to a first utilization threshold;
judging whether the memory utilization of the system executing the running program is greater than or equal to a second utilization threshold;
judging whether the load of the CPU is greater than or equal to a CPU load threshold;
judging whether the number of garbage collections performed by the virtual machine executing the running program is greater than or equal to a garbage collection count threshold;
if so, the resource index of the system executing the running program is abnormal; otherwise, the resource index of the system executing the running program is not abnormal.
In one possible implementation, the second index of the at least one non-aspect program includes: a service index for evaluating service quality; the service index for evaluating service quality includes at least one of: the number of failures of the web page service to be implemented by the running program, the number of failures of the program interface service to be implemented by the running program, and the number of failures of message subscription in the service to be implemented by the running program;
the index abnormality judging module 302, when judging whether the second index is abnormal, is configured to perform at least one of the following judgments:
judging whether the number of failures of the web page service to be implemented by the running program is greater than or equal to a first failure count threshold;
judging whether the number of failures of the program interface service to be implemented by the running program is greater than or equal to a second failure count threshold;
judging whether the number of failures of message subscription in the service to be implemented by the running program is greater than or equal to a third failure count threshold;
if so, the service index for evaluating service quality is abnormal; otherwise, the service index for evaluating service quality is not abnormal.
In one possible implementation, the second index of the at least one non-aspect program includes: a resource index of the system executing the running program and a service index for evaluating service quality; the aspect program abnormality determining module 303, when determining the abnormal state of the aspect program according to the judgment results of whether the first index and the second index are abnormal, is configured to perform the following operation:
determining the abnormal risk level of the aspect program according to the judgment result of whether the first index is abnormal, the judgment result of whether the resource index is abnormal, and the judgment result of whether the service index is abnormal.
In a possible implementation, the aspect program abnormality determining module 303, when determining the abnormal risk level of the aspect program according to the judgment result of whether the first index is abnormal, the judgment result of whether the resource index is abnormal, and the judgment result of whether the service index is abnormal, is configured to perform the following operations:
if the first index is abnormal, judging whether the resource index and the service index are abnormal;
if neither the resource index nor the service index is abnormal, determining the abnormal risk level of the aspect program as a first risk level;
if the resource index is abnormal and the service index is not abnormal, determining the abnormal risk level of the aspect program as a second risk level, wherein the degree of abnormal risk of the second risk level is greater than that of the first risk level;
if the resource index is not abnormal and the service index is abnormal, determining the abnormal risk level of the aspect program as a third risk level, wherein the degree of abnormal risk of the third risk level is greater than that of the second risk level;
and if both the resource index and the service index are abnormal, determining the abnormal risk level of the aspect program as the third risk level.
The present specification also provides a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of the embodiments of the specification.
The present specification also provides a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor, when executing the executable code, implementing the method in any of the embodiments of the specification.
It is to be understood that the structure illustrated in the embodiments of this specification does not constitute a specific limitation on the aspect program abnormality monitoring device. In other embodiments of the specification, the aspect program abnormality monitoring device may include more or fewer components than shown, may combine some components, may split some components, or may arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Because the information interaction and execution processes between the units of the above device are based on the same concept as the method embodiments of this specification, reference may be made to the description of the method embodiments for details, which are not repeated here.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this specification can be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on, or transmitted as one or more instructions or code on, a computer-readable medium.
The specific embodiments described above further explain the purpose, technical solutions, and advantages of this specification in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent substitution, improvement, or the like made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.

Claims (11)

1. An aspect program abnormality monitoring method, comprising:
monitoring, in real time, a first index of an aspect program and a second index of at least one non-aspect program; the first index and the second index are both indexes capable of causing the service to be implemented by a running program to become abnormal; the running program is obtained by injecting the aspect program into a main program;
judging, respectively, whether the first index and the second index are abnormal;
and determining the abnormal state of the aspect program according to the judgment results of whether the first index and the second index are abnormal.
2. The method of claim 1, wherein, when the aspect program is injected into the main program, the memory usage of the aspect program satisfies at least one of the following conditions:
the memory occupied by the aspect program can be released;
after the aspect program is injected, the virtual machine executing the running program still has remaining memory available for the running program.
3. The method of claim 1, wherein the judging whether the first index is abnormal comprises performing at least one of the following judgments:
judging whether the number of exceptions thrown by the aspect program during execution is greater than or equal to an exception count threshold;
judging whether the number of requests intercepted by the aspect program during execution is greater than or equal to a request interception count threshold;
judging whether the response time during execution of the aspect program is greater than or equal to a response time threshold;
judging whether the cumulative number of executions of the aspect program is greater than or equal to a cumulative execution count threshold;
if so, the first index of the aspect program is abnormal; otherwise, the first index of the aspect program is not abnormal.
4. The method of claim 1, wherein the second index of the at least one non-aspect program comprises: a resource index of the system executing the running program.
5. The method of claim 4, wherein,
the resource index of the system executing the running program includes at least one of: the utilization of the CPU executing the running program, the memory utilization of the system executing the running program, the load of the CPU, and the number of garbage collections performed by the virtual machine executing the running program;
the judging whether the second index is abnormal includes performing at least one of the following judgments:
judging whether the utilization of the CPU executing the running program is greater than or equal to a first utilization threshold;
judging whether the memory utilization of the system executing the running program is greater than or equal to a second utilization threshold;
judging whether the load of the CPU is greater than or equal to a CPU load threshold;
judging whether the number of garbage collections performed by the virtual machine executing the running program is greater than or equal to a garbage collection count threshold;
if so, the resource index of the system executing the running program is abnormal; otherwise, the resource index of the system executing the running program is not abnormal.
6. The method of claim 1, wherein the second index of the at least one non-aspect program comprises: a service index for evaluating service quality.
7. The method of claim 6, wherein,
the service index for evaluating service quality includes at least one of: the number of failures of the web page service to be implemented by the running program, the number of failures of the program interface service to be implemented by the running program, and the number of failures of message subscription in the service to be implemented by the running program;
the judging whether the second index is abnormal includes performing at least one of the following judgments:
judging whether the number of failures of the web page service to be implemented by the running program is greater than or equal to a first failure count threshold;
judging whether the number of failures of the program interface service to be implemented by the running program is greater than or equal to a second failure count threshold;
judging whether the number of failures of message subscription in the service to be implemented by the running program is greater than or equal to a third failure count threshold;
if so, the service index for evaluating service quality is abnormal; otherwise, the service index for evaluating service quality is not abnormal.
8. The method of claim 1, wherein the second index of the at least one non-aspect program comprises: a resource index of the system executing the running program and a service index for evaluating service quality;
the determining the abnormal state of the aspect program according to the judgment results of whether the first index and the second index are abnormal includes:
determining the abnormal risk level of the aspect program according to the judgment result of whether the first index is abnormal, the judgment result of whether the resource index is abnormal, and the judgment result of whether the service index is abnormal.
9. The method of claim 8, wherein the determining the abnormal risk level of the aspect program according to the judgment result of whether the first index is abnormal, the judgment result of whether the resource index is abnormal, and the judgment result of whether the service index is abnormal comprises:
if the first index is abnormal, judging whether the resource index and the service index are abnormal;
if neither the resource index nor the service index is abnormal, determining the abnormal risk level of the aspect program as a first risk level;
if the resource index is abnormal and the service index is not abnormal, determining the abnormal risk level of the aspect program as a second risk level, wherein the degree of abnormal risk of the second risk level is greater than that of the first risk level;
if the resource index is not abnormal and the service index is abnormal, determining the abnormal risk level of the aspect program as a third risk level, wherein the degree of abnormal risk of the third risk level is greater than that of the second risk level;
and if both the resource index and the service index are abnormal, determining the abnormal risk level of the aspect program as the third risk level.
10. An aspect program abnormality monitoring device, comprising: an index monitoring module, an index abnormality judging module, and an aspect program abnormality determining module;
the index monitoring module is configured to monitor, in real time, a first index of an aspect program and a second index of at least one non-aspect program; the first index and the second index are both indexes capable of causing the service to be implemented by a running program to become abnormal; the running program is obtained after the aspect program is injected into a main program;
the index abnormality judging module is configured to judge, respectively, whether the first index and the second index monitored by the index monitoring module are abnormal;
the aspect program abnormality determining module is configured to determine the abnormal state of the aspect program according to the judgment results, produced by the index abnormality judging module, of whether the first index and the second index are abnormal.
11. A computing device comprising a memory having executable code stored therein and a processor that, when executing the executable code, implements the method of any of claims 1-9.
CN202210596591.8A 2022-05-30 2022-05-30 Section program abnormity monitoring method and device Active CN114706733B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210596591.8A CN114706733B (en) 2022-05-30 2022-05-30 Section program abnormity monitoring method and device


Publications (2)

Publication Number Publication Date
CN114706733A true CN114706733A (en) 2022-07-05
CN114706733B CN114706733B (en) 2022-09-20

Family

ID=82176711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210596591.8A Active CN114706733B (en) 2022-05-30 2022-05-30 Section program abnormity monitoring method and device

Country Status (1)

Country Link
CN (1) CN114706733B (en)


Patent Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040060055A1 (en) * 2000-01-28 2004-03-25 Kukura Robert A. Method and system for dynamic configuration of interceptors in a client-server environment
US20070162246A1 (en) * 2006-01-06 2007-07-12 Roland Barcia Exception thrower
US20150242623A1 (en) * 2014-02-26 2015-08-27 Ca, Inc. Real-time recording and monitoring of mobile applications
JP2018088211A (en) * 2016-11-30 2018-06-07 株式会社日立システムズ Resource monitoring system and resource monitoring method
US20190294796A1 (en) * 2018-03-23 2019-09-26 Microsoft Technology Licensing, Llc Resolving anomalies for network applications using code injection
CN111427738A (en) * 2019-01-09 2020-07-17 阿里巴巴集团控股有限公司 Display method, application monitoring module, bytecode enhancement module and display system
CN109800101A (en) * 2019-02-01 2019-05-24 北京字节跳动网络技术有限公司 Report method, device, terminal device and the storage medium of small routine abnormal conditions
CN109976973A (en) * 2019-02-19 2019-07-05 深圳点猫科技有限公司 Abnormality monitoring method and electronic equipment on a kind of small routine line
CN109960635A (en) * 2019-04-18 2019-07-02 江苏满运软件科技有限公司 The monitoring of real-time computing platform and alarm method, system, equipment and storage medium
CN110362459A (en) * 2019-06-18 2019-10-22 中国平安人寿保险股份有限公司 A kind of system performance monitoring method and device, electronic equipment based on SpringAop
WO2021008031A1 (en) * 2019-07-16 2021-01-21 平安普惠企业管理有限公司 Processing method for implementing monitoring intellectualization on the basis of micro-services, and electronic device
WO2021212756A1 (en) * 2020-04-23 2021-10-28 平安科技(深圳)有限公司 Index anomaly analysis method and apparatus, and electronic device and storage medium
CN111522746A (en) * 2020-04-23 2020-08-11 腾讯科技(深圳)有限公司 Data processing method, device, equipment and computer readable storage medium
CN111444065A (en) * 2020-05-18 2020-07-24 江苏电力信息技术有限公司 AspectJ-based mobile terminal performance index monitoring method
WO2021248754A1 (en) * 2020-06-09 2021-12-16 北京旷视科技有限公司 System testing method and apparatus, and storage medium and electronic device
CN112162908A (en) * 2020-09-30 2021-01-01 中国工商银行股份有限公司 Program call link monitoring implementation method and device based on bytecode injection technology
CN112948835A (en) * 2021-03-26 2021-06-11 支付宝(杭州)信息技术有限公司 Applet risk detection method and device
CN113285836A (en) * 2021-05-27 2021-08-20 中国人民解放军陆军工程大学 System and method for enhancing toughness of software system based on micro-service real-time migration
CN113537590A (en) * 2021-07-14 2021-10-22 深圳供电局有限公司 Data anomaly prediction method and system
CN113590429A (en) * 2021-08-18 2021-11-02 北京爱奇艺科技有限公司 Server fault diagnosis method and device and electronic equipment
CN113900896A (en) * 2021-10-11 2022-01-07 北京博睿宏远数据科技股份有限公司 Code operation monitoring method, device, equipment and storage medium
CN113986669A (en) * 2021-10-28 2022-01-28 北京航天云路有限公司 Call chain tracking and business analysis method based on AOP annotation
CN114528204A (en) * 2022-01-13 2022-05-24 阿里巴巴(中国)有限公司 Method for processing code, method for processing exception and respective device
CN114428546A (en) * 2022-01-25 2022-05-03 惠州Tcl移动通信有限公司 Background application cleaning method and device, storage medium and terminal equipment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BRUNO CABRAL et al.: "Implementing Retry - Featuring AOP", 《2009 FOURTH LATIN-AMERICAN SYMPOSIUM ON DEPENDABLE COMPUTING》 *
GILJONG YOO et al.: "Monitoring methodology using Aspect Oriented Programming in functional based system", 《2010 THE 12TH INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY (ICACT)》 *
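秦东 et al.: "Research on the Application of AOP in Exception Handling", 《唐山学院学报》 *
陈亮 et al.: "Research on a Probe-Based Runtime Monitoring Method for Web Services", 《装备学院学报》 *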

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115080356A (en) * 2022-07-21 2022-09-20 支付宝(杭州)信息技术有限公司 Abnormity warning method and device
CN115080356B (en) * 2022-07-21 2022-12-13 支付宝(杭州)信息技术有限公司 Abnormity warning method and device

Also Published As

Publication number Publication date
CN114706733B (en) 2022-09-20

Similar Documents

Publication Publication Date Title
US7707588B2 (en) Software application action monitoring
US9529694B2 (en) Techniques for adaptive trace logging
US20080005281A1 (en) Error capture and reporting in a distributed computing environment
CN110309029B (en) Abnormal data acquisition method and device, computer equipment and storage medium
US9158606B2 (en) Failure repetition avoidance in data processing
US20130226526A1 (en) Automated Performance Data Management and Collection
US8627150B2 (en) System and method for using dependency in a dynamic model to relate performance problems in a complex middleware environment
CN108763089B (en) Test method, device and system
KR20060046276A (en) Method, system, and apparatus for providing custom product support for a software program based upon states of program execution instability
CN108255716B (en) Software evaluation method based on cloud computing technology
US7818625B2 (en) Techniques for performing memory diagnostics
CN113468009B (en) Pressure testing method and device, electronic equipment and storage medium
CN108416665B (en) Data interaction method and device, computer equipment and storage medium
US20090204946A1 (en) Intelligent software code updater
CN111209110A (en) Task scheduling management method, system and storage medium for realizing load balance
CN114706733B (en) Section program abnormity monitoring method and device
US7487380B2 (en) Execution recovery escalation policy
CN111723380A (en) Method and device for detecting component bugs
US11502894B2 (en) Predicting performance of a network order fulfillment system
US11847207B2 (en) Security-adaptive code execution
Araujo et al. A software maintenance methodology: An approach applied to software aging
CN115935341A (en) Vulnerability defense method, system, server and storage medium
CN114386047A (en) Application vulnerability detection method and device, electronic equipment and storage medium
CN113094243B (en) Node performance detection method and device
CN101964922B (en) Abnormal condition capturing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant