CN109388546B - Method, device and system for processing faults of application program - Google Patents

Publication number
CN109388546B
Authority
CN
China
Prior art keywords
abnormal
name
data
key
information
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710665615.XA
Other languages
Chinese (zh)
Other versions
CN109388546A (en)
Inventor
李政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201710665615.XA priority Critical patent/CN109388546B/en
Publication of CN109388546A publication Critical patent/CN109388546A/en
Publication of CN109388546B publication Critical patent/CN109388546B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging

Abstract

The invention discloses a method, a device and a system for processing faults of an application program, and relates to the field of computers. One embodiment of the method comprises: separately acquiring abnormal data generated when an application program runs, wherein each item of abnormal data comprises at least one type of abnormal characteristic information; aggregating the abnormal data according to the abnormal characteristic information to obtain the quantity of abnormal data corresponding to any abnormal characteristic information; and displaying any abnormal characteristic information together with the corresponding quantity of abnormal data, so that the fault of the application program can be processed. This embodiment can find faults quickly and locate the fault point accurately.

Description

Method, device and system for processing faults of application program
Technical Field
The present invention relates to the field of computers, and in particular, to a method, an apparatus, and a system for processing a failure of an application.
Background
In the technical field of computers, faults generated in the running process of an application program need to be processed in time, so that normal operation of services can be guaranteed, and user experience is improved.
The fault discovery and processing flow in the prior art is generally as follows:
S101: An acquisition program is produced by hard coding and outputs the abnormal information generated when the application program runs to a log file.
S102: The log collection service reports the contents of the log file to a log system for storage.
S103: When the monitoring system finds that the system has a fault, the abnormal logs are queried manually in the log system to locate the cause of the fault; if the cause cannot be located, the only option is to modify the code temporarily.
In the process of implementing the invention, the inventor finds that the prior art has at least the following problems:
1. Log output depends on hard coding in the program. When a fault occurs, a fault point for which no abnormal data is output cannot be located, and the log output has to be handled online, which is inefficient.
2. The code that outputs abnormal data is coupled with the service code, which makes maintenance difficult.
3. The content of the output log cannot be controlled and includes a large amount of content unrelated to faults, so the volume of log data is huge and queries are slow or cannot be run at all.
4. Manual querying is inefficient. In addition, logs are unstructured data, so aggregation analysis is not possible, fault points are difficult to locate accurately, and fault causes are difficult to obtain. The above-described flow is shown in FIG. 1.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method, an apparatus, and a system for processing a fault of an application program, which can output abnormal data isolated from service data, and display the abnormal data after aggregating the abnormal data, so as to quickly find a fault and locate a fault point.
To achieve the above object, according to one aspect of the present invention, a method, apparatus, and system for processing a failure of an application are provided.
The method for processing the faults of the application program comprises the following steps: separately acquiring abnormal data generated when an application program runs, wherein each item of abnormal data comprises at least one type of abnormal characteristic information; aggregating the abnormal data according to the abnormal characteristic information to obtain the quantity of abnormal data corresponding to any abnormal characteristic information; and displaying any abnormal characteristic information and the corresponding quantity of abnormal data, so as to process the fault of the application program.
Optionally, the abnormal characteristic information includes at least one of: abnormal time, abnormal application program name, abnormal interface name, abnormal calling method name, abnormal IP address, abnormal description information and abnormal call stack information.
Optionally, the method further comprises: before the abnormal data is aggregated, removing from the abnormal call stack information all data located after the first line that contains a preset configuration value.
Optionally, the method further comprises: before the abnormal data is aggregated, removing from the abnormal call stack information all data other than the stack top data and the first line that contains the preset configuration value.
Optionally, aggregating the abnormal data according to the abnormal characteristic information to obtain the quantity of abnormal data corresponding to any abnormal characteristic information includes at least one of: taking the abnormal time as a first key and the abnormal application program name as a second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key; taking the abnormal time and the abnormal application program name as the first key and the abnormal interface name and the abnormal calling method name as the second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key; and taking the abnormal time, the abnormal application program name, the abnormal interface name and the abnormal calling method name as the first key and the abnormal description information, the abnormal call stack information and the abnormal IP address as the second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key.
Optionally, displaying any abnormal characteristic information and the corresponding quantity of abnormal data includes at least one of the following, based on the aggregated abnormal data: displaying the quantity of abnormal data corresponding to any abnormal time and any abnormal application program name to obtain application dimension display information; displaying the quantity of abnormal data corresponding to any abnormal interface name and any abnormal calling method name to obtain method dimension display information; and displaying the quantity of abnormal data corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address to obtain IP dimension display information.
Optionally, the abnormal time is expressed in units of seconds or minutes.
Optionally, the method further comprises: based on application dimension display information, when an abnormal application program name and abnormal time meeting a first alarm condition exist, determining the abnormal application program name as a fault application program name, and determining the abnormal time as fault time; for the name of the fault application program and the fault time, based on method dimension display information, when an abnormal interface name and an abnormal calling method name which accord with a second alarm condition exist, determining the abnormal interface name as a fault interface name, and determining the abnormal calling method name as a fault calling method name; and for the fault interface name and the fault calling method name, based on IP dimension display information, when abnormal description information, abnormal call stack information and an abnormal IP address which accord with a third alarm condition exist, determining the abnormal description information as fault description information, determining the abnormal call stack information as fault call stack information, and determining the abnormal IP address as a fault IP address.
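As an illustration of the three-level determination described above, the following Python sketch drills down from the application dimension to the method dimension and then to the IP dimension. The text does not define the alarm conditions; the sketch assumes each one is a simple count threshold. The function name `locate_faults`, the record field names and the threshold parameters are all hypothetical.

```python
from collections import Counter

def locate_faults(records, first_threshold, second_threshold, third_threshold):
    """Drill down application -> method -> IP.

    ASSUMPTION: each alarm condition is "count >= threshold"; the source
    leaves the conditions unspecified. Each record is a dict whose "stack"
    field is a single string so that it can be used as part of a key.
    """
    faults = []
    # Application dimension: count by (abnormal time, application name).
    app_counts = Counter((r["time"], r["app"]) for r in records)
    for (time, app), n_app in app_counts.items():
        if not n_app >= first_threshold:
            continue
        in_app = [r for r in records if r["time"] == time and r["app"] == app]
        # Method dimension, restricted to the fault application and fault time.
        method_counts = Counter((r["interface"], r["method"]) for r in in_app)
        for (interface, method), n_method in method_counts.items():
            if not n_method >= second_threshold:
                continue
            in_method = [r for r in in_app
                         if r["interface"] == interface and r["method"] == method]
            # IP dimension, restricted to the fault interface and fault method.
            ip_counts = Counter((r["description"], r["stack"], r["ip"])
                                for r in in_method)
            for (description, stack, ip), n_ip in ip_counts.items():
                if n_ip >= third_threshold:
                    faults.append({
                        "time": time, "app": app,
                        "interface": interface, "method": method,
                        "description": description, "stack": stack,
                        "ip": ip, "count": n_ip,
                    })
    return faults
```

Each level only examines records that already passed the previous level, which mirrors the narrowing from fault application to fault method to fault IP address.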
To achieve the above object, according to another aspect of the present invention, there is provided an apparatus for processing a failure of an application.
The device for processing the faults of the application program comprises: an acquisition unit, configured to separately acquire abnormal data generated when an application program runs, wherein each item of abnormal data comprises at least one type of abnormal characteristic information; an aggregation unit, configured to aggregate the abnormal data according to the abnormal characteristic information to obtain the quantity of abnormal data corresponding to any abnormal characteristic information; and a display unit, configured to display any abnormal characteristic information and the corresponding quantity of abnormal data, so as to process the fault of the application program.
Optionally, the abnormal characteristic information includes at least one of: abnormal time, abnormal application program name, abnormal interface name, abnormal calling method name, abnormal IP address, abnormal description information and abnormal call stack information;
and the aggregation unit is configured to perform at least one of: taking the abnormal time as a first key and the abnormal application program name as a second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key; taking the abnormal time and the abnormal application program name as the first key and the abnormal interface name and the abnormal calling method name as the second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key; taking the abnormal time, the abnormal application program name, the abnormal interface name and the abnormal calling method name as the first key and the abnormal description information, the abnormal call stack information and the abnormal IP address as the second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key;
and the display unit is configured to perform at least one of: displaying, based on the aggregated abnormal data, the quantity of abnormal data corresponding to any abnormal time and any abnormal application program name to obtain application dimension display information; displaying, based on the aggregated abnormal data, the quantity of abnormal data corresponding to any abnormal interface name and any abnormal calling method name to obtain method dimension display information; displaying, based on the aggregated abnormal data, the quantity of abnormal data corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address to obtain IP dimension display information;
and the display unit is further configured to: based on the application dimension display information, when there is an abnormal application program name and an abnormal time that meet a first alarm condition, determine the abnormal application program name as a fault application program name and the abnormal time as a fault time; for the fault application program name and the fault time, based on the method dimension display information, when there is an abnormal interface name and an abnormal calling method name that meet a second alarm condition, determine the abnormal interface name as a fault interface name and the abnormal calling method name as a fault calling method name; and for the fault interface name and the fault calling method name, based on the IP dimension display information, when there is abnormal description information, abnormal call stack information and an abnormal IP address that meet a third alarm condition, determine the abnormal description information as fault description information, the abnormal call stack information as fault call stack information, and the abnormal IP address as a fault IP address.
To achieve the above object, according to still another aspect of the present invention, there is provided a system for processing a failure of an application.
The system for processing the faults of the application program comprises at least one application node, a data aggregation node and a data display node; wherein: the application node is used for separately acquiring abnormal data generated when the application program runs, wherein any one of the abnormal data comprises at least one abnormal characteristic information; the data aggregation node is used for aggregating the abnormal data according to the abnormal characteristic information to acquire the quantity of the abnormal data corresponding to any abnormal characteristic information; the data display node is used for displaying any abnormal characteristic information and abnormal data quantity corresponding to the abnormal characteristic information so as to process faults of the application program.
Optionally, the abnormal characteristic information includes at least one of: abnormal time, abnormal application program name, abnormal interface name, abnormal calling method name, abnormal IP address, abnormal description information and abnormal call stack information;
and the data aggregation node is configured to perform at least one of: taking the abnormal time as a first key and the abnormal application program name as a second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key; taking the abnormal time and the abnormal application program name as the first key and the abnormal interface name and the abnormal calling method name as the second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key; taking the abnormal time, the abnormal application program name, the abnormal interface name and the abnormal calling method name as the first key and the abnormal description information, the abnormal call stack information and the abnormal IP address as the second key, and counting the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key;
and the data display node is configured to perform at least one of: displaying, based on the aggregated abnormal data, the quantity of abnormal data corresponding to any abnormal time and any abnormal application program name to obtain application dimension display information; displaying, based on the aggregated abnormal data, the quantity of abnormal data corresponding to any abnormal interface name and any abnormal calling method name to obtain method dimension display information; displaying, based on the aggregated abnormal data, the quantity of abnormal data corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address to obtain IP dimension display information;
and the data display node is further configured to: based on the application dimension display information, when there is an abnormal application program name and an abnormal time that meet a first alarm condition, determine the abnormal application program name as a fault application program name and the abnormal time as a fault time; for the fault application program name and the fault time, based on the method dimension display information, when there is an abnormal interface name and an abnormal calling method name that meet a second alarm condition, determine the abnormal interface name as a fault interface name and the abnormal calling method name as a fault calling method name; and for the fault interface name and the fault calling method name, based on the IP dimension display information, when there is abnormal description information, abnormal call stack information and an abnormal IP address that meet a third alarm condition, determine the abnormal description information as fault description information, the abnormal call stack information as fault call stack information, and the abnormal IP address as a fault IP address.
To achieve the above object, according to still another aspect of the present invention, there is provided an electronic apparatus.
An electronic device of the present invention includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for processing faults of an application program provided by the present invention.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer-readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the method of handling a failure of an application provided by the present invention.
According to the technical scheme of the invention, one embodiment of the invention has the following advantages or beneficial effects. Because the abnormal data is isolated from the service data and output completely, the log data volume is greatly reduced, which solves the prior-art problems of abnormal data being coupled with service data, inconvenient system maintenance and inefficient log queries. Because the output abnormal data is filtered, the log data volume is reduced further. Because the abnormal data is aggregated and counted in multiple dimensions, subsequent data analysis and fault point location are supported, which solves the prior-art problem that abnormal data cannot be aggregated and analyzed and the fault point therefore cannot be determined. Because the abnormal data is displayed from multiple dimensions, the fault point can be found and located quickly and the data needed to obtain the fault cause is provided, which solves the prior-art problems that manual querying is inefficient, that the fault point cannot be found and accurately located in time, and that the fault cause is difficult to obtain.
Further effects of the above non-conventional optional implementations are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic flow diagram of a prior art method for handling application failures;
FIG. 2 is a schematic diagram of the main steps of a method of handling a failure of an application according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating application dimension exposure information of a method for handling failures of an application according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main parts of an apparatus for processing a failure of an application according to an embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
FIG. 6 is a schematic structural diagram of an electronic device for implementing the method for processing the failure of the application program according to the embodiment of the present invention;
FIG. 7 is a schematic diagram of the composition of a system for handling a failure of an application according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
According to the technical scheme of the embodiment of the invention, the abnormal data is isolated from the service data and output completely, so that the log data volume is greatly reduced, which solves the prior-art problems of abnormal data being coupled with service data, inconvenient system maintenance and inefficient log queries. The output abnormal data is filtered, so that the log data volume is reduced further. The abnormal data is aggregated and counted in multiple dimensions, which supports subsequent data analysis and fault point location and solves the prior-art problem that abnormal data cannot be aggregated and analyzed and the fault point therefore cannot be determined. The abnormal data is displayed from multiple dimensions, so that fault points can be found and located quickly without depending on a monitoring system and the data needed to obtain the fault cause is provided, which solves the prior-art problems that manual querying is inefficient, that fault points cannot be found and accurately located in time, and that the fault cause is difficult to obtain.
Embodiment One
Fig. 2 is a schematic diagram of the main steps of the method for processing the failure of the application according to the present embodiment.
As shown in fig. 2, the method for processing a failure of an application according to an embodiment of the present invention mainly includes the following steps:
Step S201: Separately acquire abnormal data generated while the application program runs, wherein each item of abnormal data comprises at least one type of abnormal characteristic information.
In the present embodiment, the abnormal data refers to data that is different from most of data generated when the application program runs. Generally, when a fault occurs, abnormal data will be concentrated and appear in a large amount, so that the concentrated and large-amount appearance of the abnormal data can be regarded as that the application program has a fault, and the abnormal data can be used as key data for judging a fault point. The embodiment realizes the discovery and the positioning of the fault by monitoring the abnormal data.
It should be noted that separately acquiring the abnormal data generated while the application program runs means separating the abnormal data from the service data in the data generated at run time and acquiring the abnormal data on its own. In practical applications, the abnormal data can be obtained separately through AOP (Aspect Oriented Programming). Specifically, for a predetermined business method that may fail, a customized collection program may be woven in before and after execution of the business method by means of AOP Around (AOP around advice). The collection program calls the business method and, when an exception occurs, captures the abnormal data, converts it into structured data and outputs it to an exception log.
It is understood that the step of acquiring the abnormal data may also be implemented by means of AOP Before (AOP before advice) or AOP After (AOP after advice).
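The collection step can be sketched in Python with a decorator, which plays a role similar to around advice: it wraps the business method, lets normal results pass through, and on an exception writes a structured record to a separate log. This is an analogue under stated assumptions, not the AOP mechanism of the embodiment; `collect_exceptions`, `EXCEPTION_LOG` and the record field names are hypothetical.

```python
import functools
import json
import socket
import time
import traceback

# Hypothetical stand-in for the separate exception log (one JSON string per entry).
EXCEPTION_LOG = []

def _local_ip():
    """Best-effort address of the current node; falls back to loopback."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        return "127.0.0.1"

def collect_exceptions(app_name, interface_name):
    """Wrap a business method so that exceptions are recorded as structured data."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)  # call the business method
            except Exception as exc:
                record = {
                    "time": int(time.time()),  # timestamp in seconds
                    "app": app_name,
                    "interface": interface_name,
                    "method": f"{func.__module__}.{func.__qualname__}",
                    "ip": _local_ip(),
                    "description": f"{type(exc).__name__}: {exc}",
                    "stack": traceback.format_exc().splitlines(),
                }
                EXCEPTION_LOG.append(json.dumps(record))
                raise  # the business behaviour itself is left unchanged
        return wrapper
    return decorator
```

Because the collection logic lives in the wrapper, the business method contains no logging code, which is the decoupling the embodiment aims for.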
In this embodiment, each item of abnormal data includes at least one type of abnormal characteristic information for subsequent aggregation analysis. Abnormal characteristic information is the characteristic information carried in the abnormal data and includes at least one of: abnormal time, abnormal application program name, abnormal interface name, abnormal calling method name, abnormal IP address, abnormal description information and abnormal call stack information. The abnormal time is the time at which the exception occurred; it is generally expressed as a timestamp, in units of seconds or minutes. The abnormal application program name is the name of the application program in which the exception occurred. The abnormal interface name is the name of the interface in which the exception occurred. The abnormal calling method name is the name of the calling method in which the exception occurred; in this embodiment it consists of the name of the class in which the exception occurred and the corresponding method name. The abnormal IP address is the IP (Internet Protocol) address of the node on which the abnormal application program runs. It can be understood that when the abnormal data includes all of the above information, all of the data aggregation and data display functions can be realized; when it includes only part of the information, only some of the subsequent functions can be realized.
The following example is formatted exception data:
[Figure BDA0001371685900000101: example of formatted abnormal data (image not reproduced)]
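Since the original example is an image, the following Python sketch shows one possible shape for a structured abnormal-data record carrying the seven kinds of abnormal characteristic information. It is not a reproduction of the figure: the class name `ExceptionRecord`, the field names, the JSON encoding and every sample value are made up for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

@dataclass
class ExceptionRecord:
    """One item of abnormal data (field names are hypothetical)."""
    time: str          # abnormal time
    app: str           # abnormal application program name
    interface: str     # abnormal interface name
    method: str        # abnormal calling method name (class name + method name)
    ip: str            # abnormal IP address
    description: str   # abnormal description information
    stack: List[str] = field(default_factory=list)  # abnormal call stack lines

    def to_log_line(self) -> str:
        """Serialize to a single JSON line for the exception log."""
        return json.dumps(asdict(self), ensure_ascii=False)

# Illustrative values only.
sample = ExceptionRecord(
    time="2017-08-07 10:15:00",
    app="order-service",
    interface="OrderApi",
    method="com.example.shop.OrderService.place",
    ip="192.0.2.10",
    description="java.lang.NullPointerException",
    stack=["at com.example.shop.OrderService.place(OrderService.java:42)"],
)
line = sample.to_log_line()
```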
In an optional implementation of this embodiment, the abnormal call stack information in the abnormal data may be further filtered to reduce the data volume. Specifically, a configuration value is determined first. The configuration value is generally the name of a self-developed code package in the current application program. The first line that contains the configuration value can be regarded as the position where the exception first occurred; data that occurred before that line on the time axis (that is, the data after that line in the abnormal call stack) need not be considered and can be removed. Among the lines remaining in the abnormal call stack information, that line reflects the relative position at which the exception occurred and the stack top data reflects the actual position at which it occurred. Therefore, according to actual needs, only the stack top data and that line may be retained and the rest of the abnormal call stack information removed, which further reduces the data volume. It is understood that the stack top data is the first line of data (at the top of the stack) in the abnormal call stack, and the first line containing the configuration value is the first line found to contain the configuration value when the lines of the abnormal call stack are traversed from the top of the stack to the bottom.
For example, data filtering is performed by:
[Figure BDA0001371685900000111: example of data filtering (image not reproduced)]
With the above configuration, this embodiment avoids outputting a large amount of useless data without affecting fault determination.
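The two filtering options described above can be sketched in Python as follows. The stack is taken to be a list of lines ordered from the top of the stack to the bottom. The function names are hypothetical, and the behaviour when no line contains the configuration value is an assumption, since the text does not cover that case.

```python
from typing import List

def truncate_stack(stack: List[str], config_value: str) -> List[str]:
    """Remove every line after the first line that contains config_value."""
    for index, line in enumerate(stack):
        if config_value in line:
            return stack[: index + 1]
    # ASSUMPTION: with no matching line, the stack is kept unchanged.
    return list(stack)

def top_and_first_match(stack: List[str], config_value: str) -> List[str]:
    """Keep only the stack top line and the first line containing config_value."""
    if not stack:
        return []
    for index, line in enumerate(stack):
        if config_value in line:
            # If the top line itself matches, a single line is enough.
            return [stack[0]] if index == 0 else [stack[0], line]
    # ASSUMPTION: with no matching line, only the stack top is kept.
    return [stack[0]]

# Illustrative stack; "com.example.shop" stands for a self-developed package name.
stack = [
    "at java.util.Objects.requireNonNull(Objects.java:203)",
    "at com.example.shop.OrderService.place(OrderService.java:42)",
    "at com.example.shop.OrderController.submit(OrderController.java:17)",
    "at org.framework.Dispatcher.dispatch(Dispatcher.java:88)",
]
truncated = truncate_stack(stack, "com.example.shop")
reduced = top_and_first_match(stack, "com.example.shop")
```

Here `truncated` holds the first two lines of `stack`, and `reduced` holds the same two lines because the first matching line directly follows the stack top.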
Step S202: Aggregate the abnormal data according to the abnormal characteristic information to obtain the quantity of abnormal data corresponding to any abnormal characteristic information.
In this step, aggregation means merging or counting data according to its characteristics, so that the data becomes more targeted and more useful for fault handling.
Specifically, this step may be performed as follows:
(I) Taking the abnormal time as the first key and the abnormal application program name as the second key, count the quantity of abnormal data corresponding to the key value of the first key and the key value of the second key.
It is to be understood that "first" and "second" are used only to distinguish one key from another. For example, the first key could be called the second key, or the second key the first key, without departing from the scope of the invention; both are keys, but they are not the same key. In a specific application, the first key and the second key can be set flexibly according to the application environment. For example, in an HBase database the row key may serve as the first key and the column key as the second key. A key value is a specific value of a key; for example, when the abnormal time is the first key, the specific abnormal times t1 and t2 are both key values of the first key. The quantity of abnormal data is the quantity corresponding to an abnormal application program name and an abnormal time, that is, the number of exceptions of that application program from the previous abnormal time (counting from that time) up to this abnormal time.
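Because the abnormal time is expressed in units of seconds or minutes, every exception that falls in the same second or minute shares one key value. A minimal Python sketch of deriving such a key value from a timestamp is given below; the function name `time_bucket` and the string layout of the key value are assumptions.

```python
from datetime import datetime

def time_bucket(moment: datetime, unit: str = "minute") -> str:
    """Return the abnormal-time key value for a timestamp (layout is illustrative)."""
    if unit == "second":
        return moment.strftime("%Y%m%d%H%M%S")
    if unit == "minute":
        return moment.strftime("%Y%m%d%H%M")
    raise ValueError("unit must be 'second' or 'minute'")

# Two exceptions 30 seconds apart share a minute-level key value.
t1 = time_bucket(datetime(2017, 8, 7, 10, 15, 12))
t2 = time_bucket(datetime(2017, 8, 7, 10, 15, 42))
```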
For example: the abnormal data quantity is stored by taking the abnormal time as a row key (RowKey) and the name of an abnormal application program as a column key (ColumnKey), and is shown in the following table:
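[Figure BDA0001371685900000121: table of abnormal data quantities, with the abnormal time as the row key and the abnormal application program name as the column key (image not reproduced)]
Through step (I), aggregated data of the application dimension is generated.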
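Step (I) can be sketched in Python as a nested mapping that imitates a row-key and column-key layout: the outer key is the abnormal time and the inner key is the abnormal application program name. The function name and record field names are hypothetical, and an in-memory dictionary stands in for whatever store is actually used.

```python
from collections import defaultdict

def aggregate_by_application(records):
    """Count abnormal data per (abnormal time, application name).

    Returns {abnormal_time: {application_name: count}}.
    """
    table = defaultdict(lambda: defaultdict(int))
    for record in records:
        table[record["time"]][record["app"]] += 1
    # Convert to plain dicts so the result is easy to print and compare.
    return {row: dict(columns) for row, columns in table.items()}

# Illustrative records; only the two fields used by this step are shown.
records = [
    {"time": "201708071015", "app": "order-service"},
    {"time": "201708071015", "app": "order-service"},
    {"time": "201708071015", "app": "cart-service"},
    {"time": "201708071016", "app": "order-service"},
]
app_table = aggregate_by_application(records)
```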
And (II) counting the number of abnormal data corresponding to the first key value and the second key value by using the abnormal time and the name of the abnormal application program as a first key and using the name of the abnormal interface and the name of the abnormal calling method as a second key.
In a specific application, the MD5 code or HASH code corresponding to the name of the abnormal application may be combined with the abnormal time as the first key. For example:
Figure BDA0001371685900000122
Through step (II), aggregated data for the method dimension is generated.
(III) Taking the abnormal time, the abnormal application name, the abnormal interface name and the abnormal calling method name as the first key, and the abnormal description information, the abnormal call stack information and the abnormal IP address as the second key, counting the abnormal data quantity corresponding to the key value of the first key and the key value of the second key.
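The key layout of step (II) might be sketched as follows, with the MD5 code of the abnormal application name combined with the abnormal time as the first key. The `_` and `#` separators, the time format, and all function and variable names are hypothetical; the embodiment does not specify them.

```python
import hashlib
from collections import defaultdict


def md5_hex(text):
    """Return the 32-character hexadecimal MD5 code of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def method_row_key(app_name, abnormal_time):
    # Assumed layout: MD5(application name) + "_" + abnormal time.
    return md5_hex(app_name) + "_" + abnormal_time


def method_column_key(interface_name, method_name):
    # Assumed layout: interface name + "#" + calling method name.
    return interface_name + "#" + method_name


# first key -> second key -> abnormal data quantity
method_dimension = defaultdict(lambda: defaultdict(int))


def record_exception(app_name, abnormal_time, interface_name, method_name):
    row = method_row_key(app_name, abnormal_time)
    col = method_column_key(interface_name, method_name)
    method_dimension[row][col] += 1


record_exception("cart", "20170807100001", "cartAdd", "xxx.execute")
record_exception("cart", "20170807100001", "cartAdd", "xxx.execute")
record_exception("cart", "20170807100001", "cartChange", "xxxx.execute")
```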
In specific application, the name of an abnormal application program, the name of an abnormal interface and the name of an abnormal calling method can be combined, an MD5 code or a HASH code is generated through an MD5 algorithm or a HASH algorithm, and the MD5 code or the HASH code is combined with abnormal time to serve as a first key; combining the abnormal description information with the abnormal call stack information, generating an MD5 code or a HASH code through an MD5 algorithm or a HASH algorithm, and combining the MD5 code or the HASH code with an abnormal IP address to serve as a second key. For example:
Figure BDA0001371685900000131
Through step (III), aggregated data for the IP dimension is generated.
Preferably, in this embodiment, MD5(abnormal description information + abnormal call stack information) may additionally be stored together with the corresponding abnormal description information and abnormal call stack information. The abnormal description information and the abnormal call stack information are thus stored separately from the abnormal data quantity, which saves storage space.
It should be noted that steps (I), (II) and (III) need not be executed in the fixed order given above; the execution order can be chosen flexibly in practical applications. Likewise, one or more of steps (I), (II) and (III) may be selected to aggregate the abnormal data.
In particular, the abnormal time in steps (I), (II) and (III) may be measured in seconds or in minutes, so that second-level and minute-level aggregated data can be generated in the application dimension, the method dimension and the IP dimension, which facilitates subsequent processing.
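The IP dimension keys and the separate storage of the description and call stack could look like the sketch below. It assumes plain string concatenation before hashing and a `_` separator; the names `ip_dimension`, `exception_details` and `record`, as well as the sample values, are illustrative and not part of the embodiment.

```python
import hashlib
from collections import Counter


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


ip_dimension = Counter()   # (first key, second key) -> abnormal data quantity
exception_details = {}     # MD5(description + stack) -> (description, stack)


def record(time, app, interface, method, description, stack, ip):
    """Count one exception and store its long text only once."""
    first_key = md5_hex(app + interface + method) + "_" + time
    detail_hash = md5_hex(description + stack)
    second_key = detail_hash + "_" + ip
    ip_dimension[(first_key, second_key)] += 1
    # The description and call stack are kept apart from the counters,
    # so repeated exceptions do not store the same text again.
    exception_details.setdefault(detail_hash, (description, stack))
    return first_key, second_key


sample = ("20170807100001", "cart", "cartAdd", "xxx.execute",
          "connect timed out", "at a.b.C.d(C.java:1)")
key_a = record(*sample, "10.0.0.1")
key_a_again = record(*sample, "10.0.0.1")
key_b = record(*sample, "10.0.0.2")
```

Because the counters hold only fixed-length hashes, the size of each counter entry does not grow with the length of the call stack.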
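A short sketch of aggregating the same exceptions at two time granularities follows. It assumes that the two granularities are one second and one minute and that the abnormal time is available as a `datetime`; the bucket formats and names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def second_bucket(ts):
    """Key value of the abnormal time at one-second granularity."""
    return ts.strftime("%Y%m%d%H%M%S")


def minute_bucket(ts):
    """Key value of the abnormal time at one-minute granularity."""
    return ts.strftime("%Y%m%d%H%M")


# Hypothetical abnormal times of four exceptions.
events = [
    datetime(2017, 8, 7, 10, 0, 1),
    datetime(2017, 8, 7, 10, 0, 1),
    datetime(2017, 8, 7, 10, 0, 59),
    datetime(2017, 8, 7, 10, 1, 0),
]

per_second = Counter(second_bucket(t) for t in events)
per_minute = Counter(minute_bucket(t) for t in events)
```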
Step S203: displaying any abnormal characteristic information and the abnormal data quantity corresponding to the abnormal characteristic information, so as to process the fault of the application program.
In this step, displaying refers to outputting the data in a form that is easily perceived by humans, for example as a chart, a table or voice. The processing comprises one or more of the following actions: finding a fault, locating the fault, obtaining the fault cause, and solving the fault.
In an optional implementation manner of this embodiment, step S203 is performed according to the following steps:
(1) Displaying, based on the aggregated abnormal data, the abnormal data quantity corresponding to any abnormal time and any abnormal application name, to obtain application dimension display information.
In practical application, the application dimension display information can be output in a graph, a table and the like.
Fig. 3 is a schematic diagram of application dimension presentation information of the method for processing a failure of an application according to the embodiment.
As shown in fig. 3, each of the plurality of curves represents a change in the amount of abnormal data generated by one application program over time, from which an abnormal change in a different application program can be intuitively grasped.
(2) Displaying, based on the aggregated abnormal data, the abnormal data quantity corresponding to any abnormal interface name and any abnormal calling method name, to obtain method dimension display information.
The following table shows specific method dimension display information; the records in the table are arranged in descending order of abnormal data quantity:
Interface       Method            Quantity   Operation
cartAdd         xxx.execute       4618       Details
cartChange      xxxx.execute      232        Details
cartCheckAll    xxxxx.execute     38         Details
cart            xxxxxx.execute    28         Details
(3) Displaying, based on the aggregated abnormal data, the abnormal data quantity corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address, to obtain IP dimension display information.
The following table shows specific IP dimension display information; the records in the table are arranged in descending order of abnormal data quantity:
Figure BDA0001371685900000151
In the above table, the abnormal information includes the abnormal description information and the abnormal call stack information.
It should be noted that steps (1), (2) and (3) need not be executed in the fixed order given above; the execution order can be chosen flexibly in practical applications. Likewise, one or more of steps (1), (2) and (3) may be selected according to actual needs to display the abnormal data.
In particular, the abnormal time in steps (1), (2) and (3) may be measured in seconds or in minutes, so that the abnormal data can be displayed at second-level or minute-level granularity in the application dimension, the method dimension and the IP dimension respectively. The second-level display reveals abnormal trends of the application program quickly, so that faults are found earlier, while the minute-level display shows the trend of the abnormal quantity over a period of time. In practical applications, the second-level display, the minute-level display, or a combination of the two may be selected as required to judge faults. With this arrangement, a fault point can in practice be displayed and located within one minute of the fault occurring.
Through the application dimension display information, the method dimension display information and the IP dimension display information, faults can be found quickly and accurately.
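The descending-order listing used for the method dimension display in step (2) could be produced as in the following sketch. The counts are those given in the method dimension table of this embodiment; the function name `method_dimension_rows` and the dictionary layout are assumptions.

```python
# (abnormal interface name, abnormal calling method name) -> abnormal data quantity
method_counts = {
    ("cartCheckAll", "xxxxx.execute"): 38,
    ("cartAdd", "xxx.execute"): 4618,
    ("cart", "xxxxxx.execute"): 28,
    ("cartChange", "xxxx.execute"): 232,
}


def method_dimension_rows(counts):
    """Return (interface, method, quantity) rows, largest quantity first."""
    rows = [(interface, method, n) for (interface, method), n in counts.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


rows = method_dimension_rows(method_counts)
for interface, method, quantity in rows:
    print(f"{interface:<14}{method:<18}{quantity}")
```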
In an optional implementation of this embodiment, after the data is displayed, the fault point may further be located and the fault cause analyzed based on the display information of each dimension, specifically according to the following steps:
Step one: judging, based on the application dimension display information, whether there is an abnormal application name and an abnormal time that meet a first alarm condition; if so, determining the abnormal application name as the fault application name and the abnormal time as the fault time.
In practical applications, the first alarm condition may be set according to an application environment. In general, the first alarm condition may be one of:
1. The amount of anomalous data is greater than a first amount threshold.
2. The temporal rate of change of the amount of anomalous data is greater than a first rate of change threshold.
3. The duration of the number of anomalous data being greater than the second number threshold exceeds the first time threshold.
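The three kinds of condition listed above can be sketched as simple checks over a series of per-interval counts for one application. This is only an illustration: the series layout (oldest value first, one value per `interval_seconds`), the function names, and the way the rate of change and the duration are computed are assumptions, since the embodiment leaves these details to the application environment.

```python
def exceeds_count(counts, count_threshold):
    """Condition 1: the latest abnormal data quantity is above the threshold."""
    return counts[-1] > count_threshold


def exceeds_rate(counts, rate_threshold, interval_seconds=1):
    """Condition 2: the change between the last two samples, per second,
    is above the rate-of-change threshold."""
    if len(counts) < 2:
        return False
    rate = (counts[-1] - counts[-2]) / interval_seconds
    return rate > rate_threshold


def exceeds_duration(counts, count_threshold, time_threshold, interval_seconds=1):
    """Condition 3: the quantity has stayed above the count threshold for
    longer than time_threshold seconds, measured back from the latest sample."""
    run = 0
    for n in reversed(counts):
        if n <= count_threshold:
            break
        run += 1
    return run * interval_seconds > time_threshold


series = [2, 3, 50, 60, 70]  # hypothetical per-second counts, oldest first
```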
Step two: for the fault application name and the fault time, judging, based on the method dimension display information, whether there is an abnormal interface name and an abnormal calling method name that meet a second alarm condition; if so, determining the abnormal interface name as the fault interface name and the abnormal calling method name as the fault calling method name.
In practical applications, the second alarm condition may be set according to an application environment. In general, the second alarm condition may be one of:
1. The number of anomalous data is greater than a third number threshold.
2. The temporal rate of change of the amount of anomalous data is greater than a second rate of change threshold.
3. The duration of the number of anomalous data being greater than the fourth number threshold exceeds a second time threshold.
4. The proportion of the abnormal data quantity corresponding to a certain abnormal interface name and a certain abnormal calling method name in the total quantity of the abnormal data is larger than a proportion threshold value.
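The proportion condition in item 4 can be illustrated with the counts from the method dimension table of this embodiment. The function names and the threshold value of 0.5 are hypothetical.

```python
def share_of_total(counts):
    """Map each (interface, method) pair to its share of all abnormal data."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}


def over_proportion(counts, proportion_threshold):
    """Return the pairs whose share is greater than the proportion threshold."""
    shares = share_of_total(counts)
    return [key for key, share in shares.items() if share > proportion_threshold]


method_counts = {
    ("cartAdd", "xxx.execute"): 4618,
    ("cartChange", "xxxx.execute"): 232,
    ("cartCheckAll", "xxxxx.execute"): 38,
    ("cart", "xxxxxx.execute"): 28,
}

suspects = over_proportion(method_counts, 0.5)
```

With these counts, `cartAdd` accounts for 4618 of 4916 exceptions, so it is the only pair above a threshold of one half.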
Step three: for the fault interface name and the fault calling method name, judging, based on the IP dimension display information, whether there is abnormal description information, abnormal call stack information and an abnormal IP address that meet a third alarm condition; if so, determining the abnormal description information as the fault description information, the abnormal call stack information as the fault call stack information, and the abnormal IP address as the fault IP address.
In practical applications, the third alarm condition may be set according to an application environment. In general, the third alarm condition may be one of:
1. The number of anomalous data is greater than a fifth number threshold.
2. The temporal rate of change of the amount of anomalous data is greater than a third rate of change threshold.
3. The duration of the number of anomalous data being greater than the sixth number threshold exceeds a third time threshold.
Step four: sending at least one of the fault application name, the fault time, the fault interface name, the fault calling method name, the fault description information, the fault call stack information and the fault IP address.
Generally, in this embodiment, at least one of the above-mentioned information may be sent to the relevant personnel through WeChat, SMS, or the like, so as to implement fault alarm.
In particular, the fault point can be located accurately by determining the fault interface name, the fault calling method name and the fault IP address, and the fault cause can be judged from the fault description information and the fault call stack information. For example, when an external interface has a problem, the abnormal information is typically a connection timeout or a socket exception; the fault can then be attributed to the external interface and resolved through technical means such as degradation or switching. In addition, because an IP address corresponds to a computer in the machine room, the fault situation of each IP address can be used to judge whether there is a network fault in the machine room.
Through steps one, two, three and four above, this embodiment achieves rapid discovery and accurate location of the fault point and can provide the data needed to determine the fault cause.
It should be noted that, the above steps one, two, and three may be executed according to the above sequence, or one or more steps may be selected and executed according to actual needs, so as to realize the discovery and location of the fault point.
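The drill-down from application to method to IP address might be sketched as below. For brevity only the simple "quantity greater than a threshold" form of each alarm condition is used; the nested dictionary layout, the `thresholds` keys, the report field names and the sample values are all assumptions for illustration.

```python
def worst_over(counts, threshold):
    """Return the key with the largest quantity above threshold, or None."""
    hits = {key: n for key, n in counts.items() if n > threshold}
    return max(hits, key=hits.get) if hits else None


def locate_fault(app_counts, method_counts, ip_counts, thresholds):
    """Narrow a fault down dimension by dimension; stop when nothing alarms."""
    report = {}
    hit = worst_over(app_counts, thresholds["app"])
    if hit is None:
        return report
    time, app = hit
    report.update(fault_time=time, fault_app=app)

    hit = worst_over(method_counts.get((time, app), {}), thresholds["method"])
    if hit is None:
        return report
    interface, method = hit
    report.update(fault_interface=interface, fault_method=method)

    details = ip_counts.get((time, app, interface, method), {})
    hit = worst_over(details, thresholds["ip"])
    if hit is None:
        return report
    description, stack, ip = hit
    report.update(fault_description=description, fault_stack=stack, fault_ip=ip)
    return report


app_counts = {("10:00:01", "cart"): 120, ("10:00:01", "order"): 3}
method_counts = {
    ("10:00:01", "cart"): {("cartAdd", "xxx.execute"): 110,
                           ("cartChange", "xxxx.execute"): 10},
}
ip_counts = {
    ("10:00:01", "cart", "cartAdd", "xxx.execute"): {
        ("connect timed out", "at a.b.C.d(C.java:1)", "10.0.0.1"): 100,
        ("socket closed", "at a.b.C.e(C.java:2)", "10.0.0.2"): 10,
    },
}
report = locate_fault(app_counts, method_counts, ip_counts,
                      {"app": 50, "method": 50, "ip": 50})
```

The returned dictionary contains only the fields that could be determined, which matches the idea of sending "at least one of" the fault items.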
According to the method for processing the fault of the application program, the technical means of isolating and completely outputting the abnormal data and the service data is adopted, so that the log data volume is greatly reduced, and the technical problems of coupling of the abnormal data and the service data, inconvenience in system maintenance and low log query efficiency in the prior art are solved; the output abnormal data is subjected to data filtering, so that the log data volume is further reduced; by carrying out aggregation statistics on abnormal data in multiple dimensions, support is provided for subsequent data analysis and fault point positioning, and the technical problem that the abnormal data cannot be subjected to aggregation analysis and thus a fault point cannot be determined in the prior art is solved; due to the adoption of the technical means of displaying abnormal data from multiple dimensions, the technical effects of rapidly finding and positioning the fault point and providing necessary data for obtaining the fault reason are achieved, and the technical problems that in the prior art, the manual query efficiency is low, the fault point cannot be found and accurately positioned in time, and the fault reason is difficult to obtain are solved.
Example two
Fig. 4 is a schematic diagram of a main part of an apparatus for processing a failure of an application according to the present embodiment.
As shown in fig. 4, the apparatus 400 for processing a failure of an application program of the present embodiment includes an obtaining unit 401, an aggregating unit 402, and a presenting unit 403. Wherein:
The obtaining unit 401 may be configured to separately obtain exception data generated during the running of the application, where any of the exception data includes at least one exception characteristic information.
The aggregation unit 402 may be configured to aggregate the abnormal data according to the abnormal feature information, and obtain an abnormal data quantity corresponding to any abnormal feature information.
The presentation unit 403 may be configured to present any exception characteristic information and an exception data amount corresponding to the exception characteristic information to handle a failure of the application.
Preferably, in this embodiment, the abnormal feature information includes at least one of the following: exception time, exception application name, exception interface name, exception calling method name, exception IP address, exception description information, and exception call stack information.
In an optional implementation of this embodiment, the obtaining unit 401 may further filter the exception call stack information in the exception data to reduce the data size. Specifically, a configuration value is first determined. The configuration value is generally the name of a self-developed code package in the current application program. The first line that contains the configuration value can be regarded as the position where the exception first occurred; the data that occurred before this line on the time axis (that is, the lines after this line in the exception call stack) need not be considered and can be removed. Among the lines remaining in the exception call stack information, this line reflects the relative position at which the exception occurred and the line at the top of the stack reflects the actual position at which the exception occurred. Therefore, according to actual needs, only the top-of-stack line and this line may be retained in the exception call stack and the remaining lines removed, so that the data volume is further reduced.
As a preferred approach, the aggregation unit 402 can be used for at least one of the following: taking the abnormal time as the first key and the abnormal application name as the second key, counting the abnormal data quantity corresponding to the key value of the first key and the key value of the second key; taking the abnormal time and the abnormal application name as the first key, and the abnormal interface name and the abnormal calling method name as the second key, counting the abnormal data quantity corresponding to the key value of the first key and the key value of the second key; and taking the abnormal time, the abnormal application name, the abnormal interface name and the abnormal calling method name as the first key, and the abnormal description information, the abnormal call stack information and the abnormal IP address as the second key, counting the abnormal data quantity corresponding to the key value of the first key and the key value of the second key.
The presentation unit 403 may be used for at least one of the following, based on the aggregated abnormal data: displaying the abnormal data quantity corresponding to any abnormal time and any abnormal application name to obtain application dimension display information; displaying the abnormal data quantity corresponding to any abnormal interface name and any abnormal calling method name to obtain method dimension display information; and displaying the abnormal data quantity corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address to obtain IP dimension display information.
In an optional implementation of this embodiment, the presentation unit 403 may further be configured to: judge, based on the application dimension display information, whether there is an abnormal application name and an abnormal time that meet a first alarm condition, and if so, determine the abnormal application name as the fault application name and the abnormal time as the fault time; for the fault application name and the fault time, judge, based on the method dimension display information, whether there is an abnormal interface name and an abnormal calling method name that meet a second alarm condition, and if so, determine the abnormal interface name as the fault interface name and the abnormal calling method name as the fault calling method name; and for the fault interface name and the fault calling method name, judge, based on the IP dimension display information, whether there is abnormal description information, abnormal call stack information and an abnormal IP address that meet a third alarm condition, and if so, determine the abnormal description information as the fault description information, the abnormal call stack information as the fault call stack information, and the abnormal IP address as the fault IP address.
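The call stack filtering described above might be sketched as follows. The sketch assumes that the call stack is a list of text lines with the top of the stack first, that the configuration value is matched by substring, and that a stack without any matching line is kept unchanged; the function name, parameter names and the sample stack are hypothetical.

```python
def filter_call_stack(stack_lines, package_name, keep_only_key_lines=False):
    """Drop everything below the first line that mentions package_name.

    With keep_only_key_lines=True, keep just the top-of-stack line and the
    first line that belongs to the self-developed package.
    """
    for index, line in enumerate(stack_lines):
        if package_name in line:
            break
    else:
        # Assumption: no line matches the configuration value, keep the stack.
        return list(stack_lines)

    kept = list(stack_lines[: index + 1])
    if keep_only_key_lines and len(kept) > 1:
        kept = [kept[0], kept[-1]]
    return kept


stack = [
    "at java.net.Socket.connect(Socket.java:589)",
    "at org.example.http.Client.open(Client.java:120)",
    "at com.example.cart.CartService.add(CartService.java:42)",
    "at com.example.cart.CartController.execute(CartController.java:17)",
    "at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)",
]

trimmed = filter_call_stack(stack, "com.example.cart")
minimal = filter_call_stack(stack, "com.example.cart", keep_only_key_lines=True)
```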
In practical application, the first alarm condition, the second alarm condition and the third alarm condition can be flexibly set according to requirements.
In addition, after locating the failure point, the presentation unit 403 may further send at least one of a failure application name, a failure time, a failure interface name, a failure calling method name, failure description information, failure calling stack information, and a failure IP address.
The apparatus 400 for processing failure of an application program according to the present embodiment is installed as software in a device such as a computer or a mobile terminal. In a specific application, the acquiring unit 401, the aggregating unit 402, and the presenting unit 403 included in the apparatus 400 for processing a failure of an application program according to this embodiment may be installed in the same device, or may be installed in different devices in various ways, which is not limited in this invention. For example, the apparatus 400 for processing the failure of the application program according to the embodiment may be installed in any of the following manners:
1. The apparatus 400 for processing the failure of the application is installed in computer A; that is, the obtaining unit 401, the aggregating unit 402 and the presenting unit 403 are all installed in computer A.
2. The obtaining unit 401 is installed in computers A, B and C, the aggregating unit 402 is installed in computer D, and the presenting unit 403 is installed in computer F.
3. The obtaining unit 401 is installed in computers A, B and C, and the aggregating unit 402 and the presenting unit 403 are both installed in computer F.
From the above description, it can be seen that the technical means of isolating and completely outputting the abnormal data and the service data is adopted, so that the log data volume is greatly reduced, and the technical problems of coupling of the abnormal data and the service data, inconvenience in system maintenance and low log query efficiency in the prior art are solved; the output abnormal data is subjected to data filtering, so that the log data volume is further reduced; by carrying out aggregation statistics on abnormal data in multiple dimensions, support is provided for subsequent data analysis and fault point positioning, and the technical problem that the abnormal data cannot be subjected to aggregation analysis and thus a fault point cannot be determined in the prior art is solved; due to the adoption of the technical means of displaying abnormal data from multiple dimensions, the technical effects of rapidly finding and positioning the fault point and providing necessary data for obtaining the fault reason are achieved, and the technical problems that in the prior art, the manual query efficiency is low, the fault point cannot be found and accurately positioned in time, and the fault reason is difficult to obtain are solved.
Example three
Fig. 5 shows an exemplary system architecture 500 to which the method of handling a failure of an application or the apparatus of handling a failure of an application of the present embodiment may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505 (this architecture is merely an example, and the components included in a particular architecture may be adapted to the specific application). The network 504 provides a medium for communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. The terminal devices 501, 502, 503 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 505 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 501, 502, 503. The background management server may analyze and otherwise process received data such as a product information query request, and feed back the processing result (for example, target push information or product information; this is only an example) to the terminal device.
It should be noted that the method for processing the failure of the application provided by the present embodiment is generally executed by the server 505, and accordingly, a device for processing the failure of the application is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Example four
The embodiment provides an electronic device.
The electronic device of this embodiment includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for processing faults of an application program.
Referring now to FIG. 6, there is illustrated a schematic block diagram of a computer system 600 suitable for use in implementing the electronic device of the present embodiment. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the range of use of the present embodiment.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM603, various programs and data necessary for the operation of the computer system 600 are also stored. The CPU601, ROM 602, and RAM603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, the processes described in the main step diagrams above may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the main step diagram. In the above-described embodiment, the computer program can be downloaded and installed from the network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the central processing unit 601.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes an acquisition unit, an aggregation unit, and a presentation unit. The names of these units do not in some cases form a limitation on the unit itself, and for example, the acquisition unit may also be described as a "unit that sends abnormal data to the aggregation unit".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to perform steps comprising: separately obtaining abnormal data generated when an application program runs, wherein any one of the abnormal data comprises at least one abnormal characteristic information; aggregating the abnormal data according to the abnormal characteristic information to obtain the abnormal data quantity corresponding to any abnormal characteristic information; and displaying any abnormal characteristic information and the abnormal data quantity corresponding to the abnormal characteristic information, so as to process the fault of the application program.
According to the technical scheme of the embodiment of the invention, the log data volume is greatly reduced by isolating and completely outputting the abnormal data and the service data, and the technical problems of coupling of the abnormal data and the service data, inconvenience in system maintenance and low log query efficiency in the prior art are solved; the output abnormal data is subjected to data filtering, so that the log data volume is further reduced; by carrying out aggregation statistics on abnormal data in multiple dimensions, support is provided for subsequent data analysis and fault point positioning, and the technical problem that the abnormal data cannot be subjected to aggregation analysis and thus a fault point cannot be determined in the prior art is solved; by displaying abnormal data from multiple dimensions, the technical effects of quickly finding and positioning fault points and providing necessary data for obtaining fault reasons are achieved, and the technical problems that in the prior art, the manual query efficiency is low, the fault points cannot be found and accurately positioned in time, and the fault reasons are difficult to obtain are solved.
EXAMPLE five
Fig. 7 is a schematic composition diagram of a system for processing a failure of an application according to the present embodiment.
As shown in fig. 7, the system for processing the failure of the application program of the present embodiment may include: at least one application node 701, a data aggregation node 702, and a data presentation node 703.
It is understood that the application node 701, the data aggregation node 702, or the data presentation node 703 may be a computer, a server in a distributed system, a mobile terminal, or other devices, which is not limited in this respect.
Specifically, the application node 701 may be configured to separately obtain abnormal data generated while the application program runs, wherein any item of the abnormal data comprises at least one piece of abnormal characteristic information.
The data aggregation node 702 may be configured to aggregate the abnormal data according to the abnormal feature information, and acquire an abnormal data quantity corresponding to any abnormal feature information.
The data presentation node 703 may be configured to present any abnormal feature information and the abnormal data amount corresponding to the abnormal feature information, so as to handle the failure of the application program.
As a preferable solution, the system for processing a failure of an application program according to this embodiment may further include a message queue node 704, configured to obtain abnormal data from the application node 701 and send the abnormal data to the data aggregation node 702. In practical application, a data acquisition unit may be deployed in each application node 701 to acquire abnormal data. Specifically, the data acquisition unit may run in an independent process and monitor the exception log. When new data is written to the exception log, the data acquisition unit reads the data and sends it to the message queue node 704, which in turn sends the abnormal data to the data aggregation node 702.
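The collection step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `read_new_lines` and `forward_new_records`, the one-record-per-line log layout, and the use of any object with a `put` method (for example `queue.Queue`) as a stand-in for the message queue node 704 are all assumptions made for illustration.

```python
def read_new_lines(path, offset):
    """Return (complete_lines, new_offset) for data appended after `offset`.

    Assumption: the exception log is UTF-8 text with one record per line.
    A trailing partial line is left unread until its newline arrives.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    end = chunk.rfind(b"\n") + 1  # 0 when no complete line is available yet
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end


def forward_new_records(path, offset, mq):
    """Send each newly appended, non-empty log line to `mq`.

    `mq` is a hypothetical stand-in for the message queue node: any object
    with a put() method. Returns the offset to resume from on the next poll.
    """
    lines, new_offset = read_new_lines(path, offset)
    for line in lines:
        if line.strip():
            mq.put(line)
    return new_offset
```

A collector process would call `forward_new_records` in a polling loop, carrying the returned offset from one call to the next so that each record is forwarded once.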
In an optional implementation manner of this embodiment, the abnormal characteristic information may include at least one of the following: abnormal time, abnormal application name, abnormal interface name, abnormal calling method name, abnormal IP address, abnormal description information, and abnormal call stack information.
In an optional implementation manner of this embodiment, the application node 701 may include a data filtering unit, which may be configured to remove, from the abnormal call stack information, the data that follows the first line containing the preset configuration value; or to remove, from the abnormal call stack information, all data other than the stack-top data and the first line containing the preset configuration value.
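The two filtering options can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the call stack is assumed to be a list of text lines whose first element is the stack top, and the behavior when no line contains the preset configuration value (leave the stack unchanged, or keep only the stack top) is an assumption that the text does not specify.

```python
def truncate_after_first_match(stack_lines, marker):
    """Option 1: drop everything after the first line containing `marker`."""
    for i, line in enumerate(stack_lines):
        if marker in line:
            return list(stack_lines[: i + 1])
    # Assumption: with no matching line, the stack is left unchanged.
    return list(stack_lines)


def keep_top_and_first_match(stack_lines, marker):
    """Option 2: keep only the stack top and the first line containing `marker`."""
    if not stack_lines:
        return []
    top = stack_lines[0]  # assumption: the first line is the stack top
    for i, line in enumerate(stack_lines):
        if marker in line:
            # If the stack top itself is the first match, keep it only once.
            return [top] if i == 0 else [top, line]
    # Assumption: with no matching line, only the stack top is kept.
    return [top]
```

Here `marker` plays the role of the preset configuration value, for example a package prefix that identifies the application's own code in a stack trace.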
In particular, in this embodiment, the data aggregation node 702 may be configured to perform at least one of the following: counting the number of abnormal data corresponding to the key values of a first key and a second key, with the abnormal time as the first key and the abnormal application name as the second key; counting the number of abnormal data corresponding to the key values of the first key and the second key, with the abnormal time and the abnormal application name as the first key and the abnormal interface name and the abnormal calling method name as the second key; and counting the number of abnormal data corresponding to the key values of the first key and the second key, with the abnormal time, the abnormal application name, the abnormal interface name and the abnormal calling method name as the first key and the abnormal description information, the abnormal call stack information and the abnormal IP address as the second key.
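The three two-level aggregations can be illustrated with a short Python sketch. The record layout (a dict per abnormal record), the field names, and the `LEVELS` table are hypothetical choices made for this example; the text itself only specifies which characteristics form the first and second keys.

```python
from collections import defaultdict

# Hypothetical field names; each level maps to (first-key fields, second-key fields).
LEVELS = {
    "application": (("time",), ("app",)),
    "method": (("time", "app"), ("interface", "method")),
    "ip": (("time", "app", "interface", "method"),
           ("description", "stack", "ip")),
}


def aggregate(records, level):
    """Count abnormal records per (first key value, second key value) pair.

    Returns a nested dict: {first_key_tuple: {second_key_tuple: count}}.
    Assumption: the "time" field is already truncated to the desired
    granularity (for example, to the minute).
    """
    first_fields, second_fields = LEVELS[level]
    counts = defaultdict(lambda: defaultdict(int))
    for record in records:
        first_key = tuple(record[f] for f in first_fields)
        second_key = tuple(record[f] for f in second_fields)
        counts[first_key][second_key] += 1
    return {k1: dict(inner) for k1, inner in counts.items()}
```

Because each level's first key is the previous level's full key, the result of one level can be looked up directly from a row selected at the level above it.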
In addition, in this embodiment, the data presentation node 703 may be configured to perform at least one of the following, based on the aggregated abnormal data: displaying the quantity of abnormal data corresponding to any abnormal time and any abnormal application name to obtain application dimension display information; displaying the quantity of abnormal data corresponding to any abnormal interface name and any abnormal calling method name to obtain method dimension display information; and displaying the quantity of abnormal data corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address to obtain IP dimension display information.
In an optional implementation manner of this embodiment, the data presentation node 703 may be further configured to: based on the application dimension display information, when an abnormal application name and an abnormal time that meet the first alarm condition exist, determine the abnormal application name as a fault application name and determine the abnormal time as a fault time; for the fault application name and the fault time, judge, based on the method dimension display information, whether an abnormal interface name and an abnormal calling method name that meet the second alarm condition exist, and if so, determine the abnormal interface name as a fault interface name and the abnormal calling method name as a fault calling method name; and for the fault interface name and the fault calling method name, judge, based on the IP dimension display information, whether abnormal description information, abnormal call stack information and an abnormal IP address that meet the third alarm condition exist, and if so, determine the abnormal description information as fault description information, the abnormal call stack information as fault call stack information, and the abnormal IP address as a fault IP address. The data presentation node 703 may be further configured to send at least one of the fault application name, the fault time, the fault interface name, the fault calling method name, the fault description information, the fault call stack information, and the fault IP address.
In practical application, the first alarm condition, the second alarm condition and the third alarm condition can be flexibly set according to requirements.
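The three-stage drill-down from application to method to IP can be sketched as below. This is a hedged illustration, not the patented logic: the alarm conditions are assumed to be simple count thresholds (`t1`, `t2`, `t3`), the input dictionaries and the function name `locate_faults` are hypothetical, and the sketch reports only faults that pass all three stages.

```python
def locate_faults(app_counts, method_counts, ip_counts, t1, t2, t3):
    """Drill down through the three display dimensions.

    Assumed input shapes (hypothetical):
      app_counts:    {(time, app): count}
      method_counts: {(time, app): {(interface, method): count}}
      ip_counts:     {(time, app, interface, method):
                          {(description, stack, ip): count}}
    Assumption: each alarm condition is "count >= threshold".
    """
    faults = []
    for (time, app), n_app in app_counts.items():
        if n_app < t1:  # first alarm condition not met
            continue
        methods = method_counts.get((time, app), {})
        for (interface, method), n_method in methods.items():
            if n_method < t2:  # second alarm condition not met
                continue
            details = ip_counts.get((time, app, interface, method), {})
            for (description, stack, ip), n_ip in details.items():
                if n_ip >= t3:  # third alarm condition met
                    faults.append({
                        "time": time, "app": app,
                        "interface": interface, "method": method,
                        "description": description, "stack": stack,
                        "ip": ip, "count": n_ip,
                    })
    return faults
```

Each stage only inspects rows under the keys selected by the previous stage, which mirrors how the fault application name and fault time narrow the method dimension, and the fault interface and method names narrow the IP dimension.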
According to the system for processing the fault of the application program of this embodiment, outputting the abnormal data in complete isolation from the service data greatly reduces the log data volume and solves the prior-art problems of coupling between abnormal data and service data, inconvenient system maintenance, and low log query efficiency; filtering the output abnormal data further reduces the log data volume; aggregating statistics on the abnormal data in multiple dimensions supports subsequent data analysis and fault point positioning, and solves the prior-art problem that abnormal data cannot be aggregated and analyzed, so that a fault point cannot be determined; and displaying the abnormal data from multiple dimensions allows fault points to be found and positioned quickly and provides the data needed to determine the fault cause, solving the prior-art problems that manual queries are inefficient, fault points cannot be found and accurately positioned in time, and fault causes are difficult to obtain.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (13)

1. A method of handling failures of an application, comprising:
separately obtaining abnormal data generated when an application program runs, wherein any item of the abnormal data comprises at least one piece of abnormal characteristic information; the abnormal feature information includes at least one of: abnormal time, abnormal application program name, abnormal interface name, abnormal calling method name, abnormal IP address, abnormal description information and abnormal call stack information;
aggregating the abnormal data according to the abnormal characteristic information to obtain the quantity of the abnormal data corresponding to any abnormal characteristic information;
displaying any abnormal characteristic information and the abnormal data quantity corresponding to the abnormal characteristic information so as to process the fault of the application program;
the aggregating the abnormal data according to the abnormal feature information, and acquiring the quantity of the abnormal data corresponding to any abnormal feature information includes: counting the number of abnormal data corresponding to the key value of the first key and the key value of the second key by taking the abnormal time as the first key and taking the name of the abnormal application program as the second key.
2. The method of claim 1, further comprising: before the abnormal data is aggregated, removing, from the abnormal call stack information, the data after the first line that contains the preset configuration value.
3. The method of claim 1, further comprising: before the abnormal data is aggregated, removing, from the abnormal call stack information, all data other than the stack top data and the first line that contains the preset configuration value.
4. The method according to claim 1, wherein the aggregating the abnormal data according to the abnormal feature information, and acquiring the abnormal data quantity corresponding to any abnormal feature information includes at least one of:
counting the number of abnormal data corresponding to the key value of the first key and the key value of the second key by using the abnormal time and the name of the abnormal application program as the first key and using the name of the abnormal interface and the name of the abnormal calling method as the second key;
and counting the number of abnormal data corresponding to the key value of the first key and the key value of the second key by taking the abnormal time, the name of the abnormal application program, the name of the abnormal interface and the name of the abnormal calling method as the first key and taking the abnormal description information, the abnormal calling stack information and the abnormal IP address as the second key.
5. The method according to claim 1, wherein the displaying any abnormal feature information and the abnormal data amount corresponding to the abnormal feature information comprises at least one of:
based on the aggregated anomaly data:
displaying the quantity of abnormal data corresponding to any abnormal time and any abnormal application program name to obtain application dimension display information;
displaying the quantity of abnormal data corresponding to any abnormal interface name and any abnormal calling method name to obtain method dimension display information;
and displaying the abnormal data quantity corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address to obtain IP dimension display information.
6. The method according to any one of claims 1 to 5, wherein the abnormal time is in units of seconds or in units of minutes.
7. The method of claim 5, further comprising:
based on application dimension display information, when an abnormal application program name and abnormal time meeting a first alarm condition exist, determining the abnormal application program name as a fault application program name, and determining the abnormal time as fault time;
for the name of the fault application program and the fault time, based on method dimension display information, when an abnormal interface name and an abnormal calling method name which accord with a second alarm condition exist, determining the abnormal interface name as a fault interface name, and determining the abnormal calling method name as a fault calling method name; and
for the fault interface name and the fault calling method name, based on IP dimension display information, when the abnormal description information, the abnormal call stack information and the abnormal IP address which accord with a third alarm condition exist, determining the abnormal description information as fault description information, determining the abnormal call stack information as fault call stack information, and determining the abnormal IP address as a fault IP address.
8. An apparatus for handling failures of applications, comprising:
an acquisition unit, used for separately acquiring abnormal data generated when an application program runs, wherein any item of the abnormal data comprises at least one type of abnormal characteristic information; the abnormal feature information includes at least one of: abnormal time, abnormal application program name, abnormal interface name, abnormal calling method name, abnormal IP address, abnormal description information and abnormal call stack information;
the aggregation unit is used for aggregating the abnormal data according to the abnormal characteristic information to acquire the quantity of the abnormal data corresponding to any abnormal characteristic information;
the display unit is used for displaying any abnormal characteristic information and the abnormal data quantity corresponding to the abnormal characteristic information so as to process the fault of the application program;
the aggregation unit is further used for: counting the number of abnormal data corresponding to the key value of the first key and the key value of the second key by taking the abnormal time as the first key and taking the name of the abnormal application program as the second key.
9. The apparatus of claim 8, wherein the aggregation unit is configured to at least one of:
counting the number of abnormal data corresponding to the key value of the first key and the key value of the second key by using the abnormal time and the name of the abnormal application program as the first key and using the name of the abnormal interface and the name of the abnormal calling method as the second key;
counting the number of abnormal data corresponding to the key value of the first key and the key value of the second key by taking the abnormal time, the name of the abnormal application program, the name of the abnormal interface and the name of the abnormal calling method as the first key and taking the abnormal description information, the abnormal calling stack information and the abnormal IP address as the second key; and the presentation unit is used for at least one of the following:
displaying the quantity of abnormal data corresponding to any abnormal time and any abnormal application program name based on the aggregated abnormal data to obtain application dimension display information;
displaying the quantity of abnormal data corresponding to any abnormal interface name and any abnormal calling method name based on the aggregated abnormal data to obtain method dimension display information;
displaying the quantity of abnormal data corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address based on the aggregated abnormal data to obtain IP dimension display information; and the presentation unit is further configured to:
based on application dimension display information, when an abnormal application program name and abnormal time meeting a first alarm condition exist, determining the abnormal application program name as a fault application program name, and determining the abnormal time as fault time;
for the name of the fault application program and the fault time, based on method dimension display information, when an abnormal interface name and an abnormal calling method name which accord with a second alarm condition exist, determining the abnormal interface name as a fault interface name, and determining the abnormal calling method name as a fault calling method name; and
for the fault interface name and the fault calling method name, based on IP dimension display information, when the abnormal description information, the abnormal call stack information and the abnormal IP address which accord with a third alarm condition exist, determining the abnormal description information as fault description information, determining the abnormal call stack information as fault call stack information, and determining the abnormal IP address as a fault IP address.
10. A system for handling failures of applications, comprising: the system comprises at least one application node, a data aggregation node and a data presentation node; wherein:
the application node is used for separately acquiring abnormal data generated when the application program runs; wherein any item of the abnormal data comprises at least one piece of abnormal characteristic information; the abnormal feature information includes at least one of: abnormal time, abnormal application program name, abnormal interface name, abnormal calling method name, abnormal IP address, abnormal description information and abnormal call stack information;
the data aggregation node is used for aggregating the abnormal data according to the abnormal characteristic information to acquire the quantity of the abnormal data corresponding to any abnormal characteristic information;
the data display node is used for displaying any abnormal characteristic information and abnormal data quantity corresponding to the abnormal characteristic information so as to process the fault of the application program;
the data aggregation node is configured to: count the number of abnormal data corresponding to the key value of the first key and the key value of the second key by taking the abnormal time as the first key and taking the name of the abnormal application program as the second key.
11. The system of claim 10, wherein the data aggregation node is configured to at least one of:
counting the number of abnormal data corresponding to the key value of the first key and the key value of the second key by using the abnormal time and the name of the abnormal application program as the first key and using the name of the abnormal interface and the name of the abnormal calling method as the second key;
counting the number of abnormal data corresponding to the key value of the first key and the key value of the second key by taking the abnormal time, the name of the abnormal application program, the name of the abnormal interface and the name of the abnormal calling method as the first key and taking the abnormal description information, the abnormal calling stack information and the abnormal IP address as the second key; and the data presentation node is for at least one of:
displaying the quantity of abnormal data corresponding to any abnormal time and any abnormal application program name based on the aggregated abnormal data to obtain application dimension display information;
displaying the quantity of abnormal data corresponding to any abnormal interface name and any abnormal calling method name based on the aggregated abnormal data to obtain method dimension display information;
displaying the quantity of abnormal data corresponding to any abnormal description information, any abnormal call stack information and any abnormal IP address based on the aggregated abnormal data to obtain IP dimension display information; and the data presentation node is further configured to:
based on application dimension display information, when an abnormal application program name and abnormal time meeting a first alarm condition exist, determining the abnormal application program name as a fault application program name, and determining the abnormal time as fault time;
for the name of the fault application program and the fault time, based on method dimension display information, when an abnormal interface name and an abnormal calling method name which accord with a second alarm condition exist, determining the abnormal interface name as a fault interface name, and determining the abnormal calling method name as a fault calling method name; and
for the fault interface name and the fault calling method name, based on IP dimension display information, when the abnormal description information, the abnormal call stack information and the abnormal IP address which accord with a third alarm condition exist, determining the abnormal description information as fault description information, determining the abnormal call stack information as fault call stack information, and determining the abnormal IP address as a fault IP address.
12. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
13. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN201710665615.XA 2017-08-07 2017-08-07 Method, device and system for processing faults of application program Active CN109388546B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710665615.XA CN109388546B (en) 2017-08-07 2017-08-07 Method, device and system for processing faults of application program

Publications (2)

Publication Number Publication Date
CN109388546A CN109388546A (en) 2019-02-26
CN109388546B CN109388546B (en) 2022-06-07

Family

ID=65413434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710665615.XA Active CN109388546B (en) 2017-08-07 2017-08-07 Method, device and system for processing faults of application program

Country Status (1)

Country Link
CN (1) CN109388546B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109798A (en) * 2019-03-19 2019-08-09 中国平安人寿保险股份有限公司 Application exception processing method, device, computer equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101354317A (en) * 2007-07-27 2009-01-28 通用电气公司 Abnormal polymerization method
CN105653432A (en) * 2015-12-22 2016-06-08 北京奇虎科技有限公司 Processing method and device of crash data
CN105893248A (en) * 2015-12-30 2016-08-24 乐视致新电子科技(天津)有限公司 Method and device for obtaining abnormal relevant information in terminal equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8176559B2 (en) * 2009-12-16 2012-05-08 Mcafee, Inc. Obfuscated malware detection

Also Published As

Publication number Publication date
CN109388546A (en) 2019-02-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant