CN116302640A - Abnormality analysis method, abnormality analysis device, abnormality analysis computer device, and abnormality analysis program - Google Patents

Info

Publication number
CN116302640A
Authority
CN
China
Prior art keywords
dimension
abnormal
data
analysis
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310126982.8A
Other languages
Chinese (zh)
Inventor
周坤
史峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lazas Network Technology Shanghai Co Ltd
Original Assignee
Lazas Network Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lazas Network Technology Shanghai Co Ltd filed Critical Lazas Network Technology Shanghai Co Ltd
Priority to CN202310126982.8A priority Critical patent/CN116302640A/en
Publication of CN116302640A publication Critical patent/CN116302640A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079Root cause analysis, i.e. error or fault diagnosis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9035Filtering based on additional data, e.g. user or group profiles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses an anomaly analysis method, an anomaly analysis device, computer equipment and a computer readable storage medium, which relate to the technical field of Internet. The method comprises the following steps: responding to an abnormality analysis request, acquiring a plurality of historical data to be analyzed, and determining the abnormality data in the plurality of historical data to be analyzed; extracting at least one target first dimension from a plurality of first dimensions associated with the abnormal data, calculating an abnormal contribution value of each target first dimension, and taking the target first dimension of which the abnormal contribution value meets a contribution value threshold as an abnormal dimension; determining a preset analysis depth, and performing dimension disassembly analysis on a second dimension related to the abnormal dimension according to the preset analysis depth to obtain at least one abnormal cause of which the number meets the preset analysis depth; based on at least one abnormality cause, an abnormality analysis result is generated, and the abnormality analysis result is output.

Description

Abnormality analysis method, abnormality analysis device, abnormality analysis computer device, and abnormality analysis program
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to an anomaly analysis method, an anomaly analysis device, a computer device, and a computer readable storage medium.
Background
With the development of internet technology, platforms generate a large amount of data every day in the course of operation, such as report data. In daily operation, indicators such as the payment success rate often fluctuate suddenly and abnormally. To understand the specific cause of a fluctuation, it is particularly important to detect whether the data generated by the platform each day fluctuates abnormally and to analyze the cause of the abnormal fluctuation.
In the related art, abnormal fluctuation of data is usually analyzed by an analysis model. The analysis model analyzes the input fluctuation anomaly, explains the cause of the fluctuation, and provides some abnormal features so that staff can prepare for potential risks in advance, or discover better ideas and promote them.
In carrying out the present application, the applicant has found that the related art has at least the following problems:
the abnormal features obtained by the analysis model are coarse-grained, so it is difficult to locate the real cause and details of an index fluctuation from them; this leads to a large number of invalid operations, low analysis efficiency, and low accuracy.
Disclosure of Invention
In view of this, the present application provides an anomaly analysis method, an anomaly analysis apparatus, a computer device and a computer readable storage medium, mainly aiming to solve the problems that it is difficult to locate the real cause and details of index fluctuation from the coarse-grained abnormal features produced by an analysis model, which leads to a large number of invalid operations, low analysis efficiency, and low accuracy.
According to a first aspect of the present application, there is provided an anomaly analysis method, the method comprising:
responding to an abnormality analysis request, acquiring a plurality of historical data to be analyzed, and determining the abnormality data in the plurality of historical data to be analyzed;
extracting at least one target first dimension from a plurality of first dimensions associated with the abnormal data, respectively calculating an abnormal contribution value of each target first dimension, and taking the target first dimension of which the abnormal contribution value meets a contribution value threshold as an abnormal dimension;
determining a preset analysis depth, and performing dimension disassembly analysis on a second dimension related to the abnormal dimension according to the preset analysis depth to obtain at least one abnormal cause with the quantity meeting the preset analysis depth;
generating an abnormality analysis result based on the at least one abnormality cause, and outputting the abnormality analysis result.
Optionally, the acquiring, in response to the abnormality analysis request, of a plurality of historical data to be analyzed and the determining of the abnormal data in the plurality of historical data to be analyzed include:
responding to the abnormality analysis request, and determining the data type to be analyzed and the time period to be analyzed indicated by the abnormality analysis request;
taking the historical data that occurred in the time period to be analyzed and whose data type is consistent with the data type to be analyzed as the plurality of historical data to be analyzed;
performing time sequence decomposition processing on the plurality of historical data to be analyzed by using a time sequence decomposition algorithm to obtain a data sequence arranged in time order;
and calculating, based on an anomaly detection algorithm, abnormal values of every two adjacent historical data to be analyzed in the data sequence, and taking the historical data to be analyzed whose abnormal values meet an abnormal condition as the abnormal data.
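The selection and ordering steps above can be sketched as follows. This is a minimal illustration only: the `Record` structure, the field names, and the sample values are assumptions introduced here, and the time sequence decomposition and anomaly detection algorithms themselves are not shown because the text does not name them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """One historical data point (hypothetical structure, not from the source)."""
    day: date
    data_type: str
    value: float


def select_history(records, data_type, start, end):
    """Keep records of the requested type inside [start, end], in time order."""
    matched = [
        r for r in records
        if r.data_type == data_type and start <= r.day <= end
    ]
    return sorted(matched, key=lambda r: r.day)


# Hypothetical sample data, deliberately out of order and of mixed types.
records = [
    Record(date(2023, 1, 3), "order_quantity", 12),
    Record(date(2023, 1, 1), "order_quantity", 10),
    Record(date(2023, 1, 2), "complaint_quantity", 1),
    Record(date(2023, 1, 2), "order_quantity", 11),
    Record(date(2023, 1, 9), "order_quantity", 10),
]
history = select_history(
    records, "order_quantity", date(2023, 1, 1), date(2023, 1, 7)
)
print([r.value for r in history])  # [10, 11, 12]
```

The record dated outside the period and the record of a different data type are both dropped before any anomaly detection runs.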
Optionally, the extracting at least one target first dimension from the plurality of first dimensions associated with the abnormal data includes:
querying the plurality of first dimensions associated with the abnormal data, and calculating the cross entropy of each first dimension in the plurality of first dimensions;
acquiring a cross entropy threshold, and extracting, from the plurality of first dimensions, the first dimensions whose cross entropy is larger than the cross entropy threshold as a plurality of candidate first dimensions;
and sorting the plurality of candidate first dimensions in descending order of cross entropy to obtain a first dimension sequence, and taking at least one candidate first dimension at the head of the first dimension sequence as the at least one target first dimension.
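A rough sketch of the threshold-and-rank step follows. The text does not state which two distributions the cross entropy is computed over, so this example assumes it compares the share of the metric held by each value of a dimension on the comparison date (`p`) with the share on the current date (`q`). The function names, dimension names, counts, and threshold are all hypothetical.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_i p_i * log(q_i) for two discrete distributions."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))


def to_distribution(counts):
    """Normalize non-negative counts into shares that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def pick_target_dimensions(dimensions, threshold, top_k):
    """dimensions: name -> (comparison_counts, current_counts), aligned by value."""
    scores = {
        name: cross_entropy(to_distribution(ctrl), to_distribution(cur))
        for name, (ctrl, cur) in dimensions.items()
    }
    # Keep dimensions above the threshold, then rank them from large to small.
    candidates = [name for name, score in scores.items() if score > threshold]
    candidates.sort(key=lambda name: scores[name], reverse=True)
    return candidates[:top_k], scores


# Hypothetical per-value counts for three first dimensions.
dimensions = {
    "merchant": ([80, 20, 20], [5, 20, 20]),
    "platform": ([60, 60], [22, 23]),
    "scene": ([100, 20], [38, 7]),
}
targets, scores = pick_target_dimensions(dimensions, threshold=0.5, top_k=1)
print(targets)  # ['merchant']
```

Here "scene" falls below the threshold and is filtered out, "platform" survives as a candidate, and only the head of the sorted sequence is kept as the target dimension.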
Optionally, the calculating the abnormal contribution value of each target first dimension includes:
for each target first dimension, reading a current-day data value of the abnormal data in the target first dimension, and querying a comparison data value of the abnormal data in the target first dimension, wherein the comparison data value is the data value on a comparison date corresponding to the occurrence date of the current-day data value;
querying a current-day total value of the abnormal data on the occurrence date of the current-day data value, and querying a comparison total value of the abnormal data on the comparison date;
calculating a first difference between the current-day data value and the comparison data value, and calculating a second difference between the current-day total value and the comparison total value;
and taking the ratio of the first difference to the second difference as the abnormal contribution value of the target first dimension.
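The ratio described above can be written compactly. The symbols are notation introduced here for illustration and do not appear in the source: $x_d$ is the data value in target first dimension $d$, $X$ is the total value, and the superscripts mark the current day and the comparison date.

```latex
C_d = \frac{x_d^{\mathrm{cur}} - x_d^{\mathrm{cmp}}}{X^{\mathrm{cur}} - X^{\mathrm{cmp}}}
```

With hypothetical numbers: if the total falls from 12 to 2 while the value in dimension $d$ falls from 8 to 1, then $C_d = (1 - 8)/(2 - 12) = 0.7$, meaning that this dimension accounts for 70% of the total change.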
Optionally, after extracting at least one target first dimension from the plurality of first dimensions associated with the abnormal data, calculating an abnormal contribution value of each target first dimension, and taking the target first dimension whose abnormal contribution value meets the contribution value threshold as an abnormal dimension, the method further includes:
acquiring the preset analysis depth;
when the preset analysis depth indicates single-dimensional analysis, taking the abnormal dimension as an abnormal reason;
and generating a single-dimensional analysis result comprising the abnormal reason, and outputting the single-dimensional analysis result.
Optionally, the determining a preset analysis depth, and performing dimension disassembly analysis on the second dimension associated with the abnormal dimension according to the preset analysis depth to obtain at least one abnormality cause whose number meets the preset analysis depth, includes:
querying a plurality of second dimensions associated with the abnormal dimension when the preset analysis depth indicates multi-dimensional analysis, and determining another abnormal dimension among the plurality of second dimensions;
determining the depth value indicated by the preset analysis depth, and continuing to determine an abnormal dimension among a plurality of third dimensions associated with the other abnormal dimension, until the number of determined abnormal dimensions is equal to the depth value, so as to obtain at least one abnormal dimension whose number meets the preset analysis depth;
and taking the at least one abnormal dimension as the at least one abnormality cause.
Optionally, the determining another abnormal dimension among the plurality of second dimensions includes:
calculating cross entropy of each of the plurality of second dimensions;
acquiring a cross entropy threshold, and extracting a plurality of second dimensions with cross entropy larger than the cross entropy threshold from the plurality of second dimensions as a plurality of candidate second dimensions;
sorting the plurality of candidate second dimensions in descending order of cross entropy to obtain a second dimension sequence, and taking at least one candidate second dimension at the head of the second dimension sequence as at least one target second dimension;
and respectively calculating an abnormal contribution value of each target second dimension in the at least one target second dimension, and taking the target second dimension, of which the abnormal contribution value meets the contribution value threshold, in the at least one target second dimension as the other abnormal dimension.
Optionally, the generating an anomaly analysis result based on the at least one anomaly cause includes:
identifying repeated abnormality causes among the at least one abnormality cause, and filtering out the repeated abnormality causes;
reading the abnormal contribution value corresponding to each of the filtered abnormality causes, and sorting the filtered abnormality causes in descending order of abnormal contribution value to obtain a cause sequence;
and determining a preset output number, extracting the preset output number of target abnormality causes from the head of the cause sequence, and generating the abnormality analysis result comprising the preset output number of target abnormality causes.
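The filter, sort, and truncate steps above can be sketched as follows. The cause labels and contribution values are hypothetical, and treating two causes as repeated when their labels are equal is an assumption of this sketch.

```python
def build_result(causes, output_count):
    """causes: list of (cause_name, contribution) pairs, possibly with repeats."""
    unique = {}
    for name, contribution in causes:
        # Keep the first occurrence of each cause; later repeats are dropped.
        unique.setdefault(name, contribution)
    # Rank from the highest contribution to the lowest.
    ranked = sorted(unique.items(), key=lambda item: item[1], reverse=True)
    # Keep only the preset output number of causes from the head.
    return ranked[:output_count]


# Hypothetical causes collected during dimension disassembly.
causes = [
    ("merchant=A", 0.7),
    ("platform=app", 0.2),
    ("merchant=A", 0.7),
    ("scene=lunch", 0.55),
]
result = build_result(causes, output_count=2)
print(result)  # [('merchant=A', 0.7), ('scene=lunch', 0.55)]
```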
Optionally, the method further comprises:
reading a target abnormality cause included in the abnormality analysis result, and inquiring explanation information associated with the target abnormality cause;
marking the abnormal analysis result by adopting the interpretation information, generating a result verification prompt comprising the marked abnormal analysis result, and outputting the result verification prompt;
and when a verification passing instruction is received in response to the result verification prompt, determining a result receiver, and pushing the marked abnormality analysis result to the result receiver.
According to a second aspect of the present application, there is provided an abnormality analysis apparatus comprising:
the determining module is used for responding to the abnormality analysis request, acquiring a plurality of historical data to be analyzed and determining the abnormality data in the historical data to be analyzed;
the calculating module is used for extracting at least one target first dimension from a plurality of first dimensions associated with the abnormal data, calculating an abnormal contribution value of each target first dimension respectively, and taking the target first dimension of which the abnormal contribution value meets a contribution value threshold as an abnormal dimension;
the analysis module is used for determining a preset analysis depth, and carrying out dimension disassembly analysis on a second dimension related to the abnormal dimension according to the preset analysis depth to obtain at least one abnormal reason with the quantity meeting the preset analysis depth;
and the generation module is used for generating an abnormality analysis result based on the at least one abnormality cause and outputting the abnormality analysis result.
Optionally, the determining module is configured to determine, in response to the anomaly analysis request, a type of data to be analyzed and a time period to be analyzed indicated by the anomaly analysis request; the matched historical data which occur in the time period to be analyzed and have the data type consistent with the data type to be analyzed are used as the historical data to be analyzed; performing time sequence decomposition processing on the plurality of historical data to be analyzed by using a time sequence decomposition algorithm to obtain a data sequence arranged according to a time sequence; and calculating abnormal values of two adjacent historical data to be analyzed in the data sequence based on an abnormal detection algorithm, and taking the historical data to be analyzed, of which the abnormal values reach abnormal conditions, in the data sequence as the abnormal data.
Optionally, the calculating module is configured to query the plurality of first dimensions associated with the abnormal data, and calculate cross entropy of each first dimension in the plurality of first dimensions; acquiring a cross entropy threshold, and extracting a plurality of first dimensions with cross entropy larger than the cross entropy threshold from the plurality of first dimensions as a plurality of candidate first dimensions; and sequencing the plurality of candidate first dimensions according to the order of the cross entropy from large to small to obtain a first dimension sequence, and taking at least one candidate first dimension arranged at the head of the first dimension sequence as the at least one target first dimension.
Optionally, the calculating module is configured to, for each target first dimension, read a current-day data value of the abnormal data in the target first dimension, and query a comparison data value of the abnormal data in the target first dimension, where the comparison data value is the data value on a comparison date corresponding to the occurrence date of the current-day data value; query a current-day total value of the abnormal data on the occurrence date of the current-day data value, and query a comparison total value of the abnormal data on the comparison date; calculate a first difference between the current-day data value and the comparison data value, and calculate a second difference between the current-day total value and the comparison total value; and take the ratio of the first difference to the second difference as the abnormal contribution value of the target first dimension.
Optionally, the generating module is further configured to obtain the preset analysis depth; when the preset analysis depth indicates single-dimensional analysis, taking the abnormal dimension as an abnormal reason; and generating a single-dimensional analysis result comprising the abnormal reason, and outputting the single-dimensional analysis result.
Optionally, the analysis module is configured to query a plurality of second dimensions associated with the abnormal dimension when the preset analysis depth indicates multidimensional analysis, and determine another abnormal dimension in the plurality of second dimensions; determining the depth value indicated by the preset analysis depth, and continuing to determine the abnormal dimension in a plurality of third dimensions associated with the other abnormal dimension until the number of the determined abnormal dimensions is equal to the depth value, so as to obtain at least one abnormal dimension with the number meeting the preset analysis depth; and taking the at least one abnormal dimension as the at least one abnormal reason.
Optionally, the analysis module is configured to calculate a cross entropy of each of the plurality of second dimensions; acquiring a cross entropy threshold, and extracting a plurality of second dimensions with cross entropy larger than the cross entropy threshold from the plurality of second dimensions as a plurality of candidate second dimensions; sequencing the plurality of candidate second dimensions according to the order of the cross entropy from large to small to obtain a second dimension sequence, and taking at least one candidate second dimension arranged at the head of the second dimension sequence as at least one target second dimension; and respectively calculating an abnormal contribution value of each target second dimension in the at least one target second dimension, and taking the target second dimension, of which the abnormal contribution value meets the contribution value threshold, in the at least one target second dimension as the other abnormal dimension.
Optionally, the generating module is configured to identify repeated abnormality causes among the at least one abnormality cause, and filter out the repeated abnormality causes; read the abnormal contribution value corresponding to each of the filtered abnormality causes, and sort the filtered abnormality causes in descending order of abnormal contribution value to obtain a cause sequence; and determine a preset output number, extract the preset output number of target abnormality causes from the head of the cause sequence, and generate the abnormality analysis result comprising the preset output number of target abnormality causes.
Optionally, the apparatus further comprises:
the query module is used for reading target abnormal reasons included in the abnormal analysis results and querying explanation information associated with the target abnormal reasons;
the labeling module is used for labeling the abnormal analysis result by adopting the interpretation information, generating a result verification prompt comprising the labeled abnormal analysis result, and outputting the result verification prompt;
and the pushing module is used for determining a result receiver when a verification passing instruction is received in response to the result verification prompt, and pushing the marked abnormality analysis result to the result receiver.
According to a third aspect of the present application there is provided a computer device comprising a memory storing a computer program and a processor implementing the steps of the method of any of the first aspects described above when the computer program is executed by the processor.
According to a fourth aspect of the present application there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of any of the first aspects described above.
By means of the above technical solution, the anomaly analysis method, the anomaly analysis device, the computer device and the computer readable storage medium provided by the application respond to an abnormality analysis request, acquire a plurality of historical data to be analyzed, and determine the abnormal data among them. At least one target first dimension is extracted from a plurality of first dimensions associated with the abnormal data, an abnormal contribution value of each target first dimension is calculated, and the target first dimension whose abnormal contribution value meets a contribution value threshold is taken as an abnormal dimension. A preset analysis depth is then determined, and dimension disassembly analysis is performed on the second dimensions associated with the abnormal dimension according to the preset analysis depth, to obtain at least one abnormality cause whose number meets the preset analysis depth. Finally, an abnormality analysis result is generated based on the at least one abnormality cause and output.
The foregoing description is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the present application are more readily apparent, specific embodiments of the present application are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 shows a flow chart of an anomaly analysis method provided in an embodiment of the present application;
FIG. 2A shows a flow chart of another anomaly analysis method provided in an embodiment of the present application;
FIG. 2B shows a schematic diagram of an anomaly analysis method provided in an embodiment of the present application;
FIG. 2C shows a schematic diagram of a dimension association provided in an embodiment of the present application;
FIG. 2D shows a schematic flow chart of an anomaly analysis method provided in an embodiment of the present application;
FIG. 2E shows a schematic diagram of an anomaly analysis method provided in an embodiment of the present application;
FIG. 3 shows a schematic structural diagram of an abnormality analysis apparatus provided in an embodiment of the present application;
FIG. 4 shows a schematic structural diagram of a computer device provided in an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the application provides an anomaly analysis method which, as shown in FIG. 1, comprises the following steps:
101. In response to an abnormality analysis request, acquire a plurality of historical data to be analyzed, and determine the abnormal data in the plurality of historical data to be analyzed.
Analysis of abnormal data fluctuation is usually completed by an analysis model, which analyzes the input fluctuation anomaly, explains the cause of the fluctuation, and provides some abnormal features so that staff can prepare for potential risks in advance, or discover better ideas and promote them. The applicant has recognized that, although this procedure can give the impact weight of each feature and the correlation between features and data trends, two problems remain. First, it is difficult to pinpoint where the data fluctuation actually differs, and the contribution of each feature to the difference still needs to be quantified. Second, the features of a linear model suffer from multicollinearity. The former is a lack of anomaly localization; the latter is inherent to the linear classification model itself. Therefore, the present application provides an anomaly analysis method that automatically judges whether an anomaly has occurred, locates the abnormal data, finds the main dimensions causing the abnormal data among a large number of dimensions through dimension disassembly analysis, and generates an anomaly analysis result that explains those main dimensions. Other interfering dimensions are filtered out, so the cause of the anomaly can be located quickly, invalid operations are reduced, and analysis efficiency and accuracy are improved.
The embodiment of the application can be applied to a platform that provides services such as take-out and group purchase. The platform can run on an independent server, or on a server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms. The platform can provide an anomaly analysis entry. When a staff member needs an analysis, the entry can be triggered so that the platform receives an abnormality analysis request, responds to it, acquires a plurality of historical data to be analyzed, and determines the abnormal data among them. Alternatively, an analysis period may be set in the platform, and an abnormality analysis request is deemed to be received once every analysis period, which starts the data anomaly analysis. It should be noted that the platform may also perform real-time anomaly analysis while data is being generated, and use currently generated data that is determined to be abnormal as the abnormal data in the subsequent analysis; the method for acquiring the abnormal data is not specifically limited in this application.
Further, the historical data to be analyzed may be key industry index data that a platform, a merchant, or a staff member pays attention to, such as order quantity, return quantity, or complaint quantity, which is not specifically limited in the present application. When determining the abnormal data, the historical data with larger fluctuation among the plurality of historical data to be analyzed can be used as the abnormal data. For example, suppose the historical data to be analyzed is the order quantity, and the order quantities on 9 consecutive days are 10, 11, 12, 2, 10, 11, 10, 11 and 12. The order quantity on the 4th day drops suddenly, so it can be used as the abnormal data, and the subsequent steps analyze the specific cause of the sudden drop.
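As a minimal sketch of how the order-quantity example could be detected, the code below compares each value with the most recent value that was not itself flagged, and marks it as abnormal when the relative change exceeds a limit. The function name, the 50% limit, and the choice of baseline are assumptions made for illustration; the text does not specify the anomaly detection algorithm.

```python
def find_anomalies(values, max_relative_change=0.5):
    """Return 1-based positions whose value departs sharply from the baseline.

    The baseline is the latest value that was not flagged, so a recovery
    right after a drop is not reported as a second anomaly.
    Assumes all values are positive.
    """
    anomalies = []
    baseline = values[0]
    for position, current in enumerate(values[1:], start=2):
        change = abs(current - baseline) / baseline
        if change > max_relative_change:
            anomalies.append(position)
        else:
            baseline = current
    return anomalies


# Order quantities from the example in the text.
orders = [10, 11, 12, 2, 10, 11, 10, 11, 12]
print(find_anomalies(orders))  # [4]
```

A plain comparison of each value with its immediate predecessor would also flag the 5th value, because the rebound from 2 to 10 is itself a large change; keeping a baseline of the last normal value avoids that.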
102. At least one target first dimension is extracted from a plurality of first dimensions associated with the abnormal data, the abnormal contribution value of each target first dimension is calculated respectively, and the target first dimension of which the abnormal contribution value meets the threshold value of the contribution value is taken as an abnormal dimension.
In the embodiment of the application, different types of data are associated with different dimensions that affect them. For example, the dimensions affecting the order quantity may include user, merchant, platform, scene, and the like. To analyze which dimension specifically caused the abnormal data, the platform queries a plurality of first dimensions associated with the abnormal data and performs a preliminary analysis on them. First dimensions that have little influence on the abnormal data are not worth analyzing and are filtered out directly, so the platform extracts at least one target first dimension from the plurality of first dimensions associated with the abnormal data, and these target first dimensions are the first dimensions that have a large influence on the abnormal data.
Further, each target first dimension influences the abnormal data to a different degree, and the greater the influence, the more attention the dimension deserves as a cause of the anomaly. Therefore, in the embodiment of the present application, after determining the at least one target first dimension, the platform calculates an abnormal contribution value of each target first dimension, sets a contribution value threshold, and takes the target first dimension whose abnormal contribution value meets the contribution value threshold as an abnormal dimension, so that the abnormal dimension serves as a potential cause of the abnormal data. The contribution value threshold may be set by staff as needed, for example to 50% or 80%, or may be calculated from big data, which is not specifically limited in this application.
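The contribution calculation and threshold check can be sketched as follows, assuming the contribution value is the change in one dimension divided by the change in the total, and that "meets the threshold" means greater than or equal to it. The dimension names and numbers are hypothetical.

```python
def contribution(current, comparison, current_total, comparison_total):
    """Share of the total change that is attributable to one dimension."""
    total_change = current_total - comparison_total
    if total_change == 0:
        raise ValueError("total did not change; contribution is undefined")
    return (current - comparison) / total_change


def abnormal_dimensions(per_dimension, current_total, comparison_total,
                        threshold=0.5):
    """per_dimension: name -> (current_value, comparison_value)."""
    scores = {
        name: contribution(cur, cmp_value, current_total, comparison_total)
        for name, (cur, cmp_value) in per_dimension.items()
    }
    # Assumption: a dimension meets the threshold when its score is >= it.
    return {name: s for name, s in scores.items() if s >= threshold}


# Hypothetical values: the total fell from 12 to 2.
per_dimension = {
    "dimension_a": (1, 8),
    "dimension_b": (1, 3),
    "dimension_c": (0, 1),
}
selected = abnormal_dimensions(per_dimension, current_total=2,
                               comparison_total=12)
print(selected)  # {'dimension_a': 0.7}
```

With a 50% threshold only `dimension_a` is kept; lowering the threshold to 20% would also keep `dimension_b`.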
103. Determining a preset analysis depth, and performing dimension disassembly analysis on a second dimension related to the abnormal dimension according to the preset analysis depth to obtain at least one abnormal cause of which the number meets the preset analysis depth.
In the embodiment of the present application, because finer-grained secondary dimensions are associated under each dimension, a preset analysis depth is set in the platform to decide whether the secondary dimensions under the abnormal dimension are analyzed further; the preset analysis depth indicates how many layers of dimension analysis are performed. At this point, determining the abnormal dimension completes one layer of dimension analysis. If the preset analysis depth is greater than 1, the platform continues with dimension disassembly analysis on the second dimensions associated with the abnormal dimension, takes all abnormal dimensions determined during the whole disassembly process as abnormality causes, and finally obtains at least one abnormality cause whose number meets the preset analysis depth.
For example, if the preset analysis depth is 3, the platform performs dimension disassembly analysis on the second dimensions associated with the abnormal dimension to obtain another abnormal dimension, then performs dimension disassembly analysis on the third dimensions associated with that other abnormal dimension to obtain a further abnormal dimension. Three abnormal dimensions are thus obtained and taken as the abnormality causes. It should be noted that, in practical applications, if the preset analysis depth equals 1, the currently determined abnormal dimension can be used directly as the abnormality cause, and no dimension disassembly analysis of the associated second dimensions is needed. The value of the preset analysis depth can be configured freely and is generally set to 3 to 5.
104. Based on at least one abnormality cause, an abnormality analysis result is generated, and the abnormality analysis result is output.
In the embodiment of the application, after at least one abnormality cause is determined, the platform can merge and re-rank the at least one abnormality cause, generate an abnormality analysis result according to the ranking, and output the result to the relevant staff for reference, so that potential risks can be prepared for in advance and better ideas can be identified and promoted.
According to the method provided by the embodiment of the application, a plurality of historical data to be analyzed are obtained in response to an abnormality analysis request, and the abnormal data are determined among them. At least one target first dimension is extracted from a plurality of first dimensions associated with the abnormal data, the abnormal contribution value of each target first dimension is calculated, and each target first dimension whose abnormal contribution value meets the contribution value threshold is taken as an abnormal dimension. A preset analysis depth is determined, and dimension disassembly analysis is performed on the second dimensions associated with the abnormal dimension according to the preset analysis depth, yielding at least one abnormality cause whose number meets the preset analysis depth. An abnormality analysis result is generated based on the at least one abnormality cause and output. Through dimension disassembly analysis, the main dimensions causing the abnormal data are found among a large number of dimensions and an abnormality analysis result explaining them is generated, so that interfering dimensions are filtered out, the abnormality cause is located quickly, invalid operations are reduced, and analysis efficiency and accuracy are improved.
Further, as a refinement and extension of the foregoing embodiment, in order to fully describe a specific implementation procedure of the embodiment, another anomaly analysis method is provided in the embodiment of the present application, as shown in fig. 2A, where the method includes:
201. In response to the abnormality analysis request, the type of data to be analyzed and the time period to be analyzed indicated by the request are determined, and the plurality of historical data that occur in the time period to be analyzed and whose data type matches the type of data to be analyzed are taken as the plurality of historical data to be analyzed.
The embodiment of the application can be applied to a platform providing services such as take-out and group purchase. The platform can provide an abnormality analysis entry that staff trigger when they need an analysis; the platform then receives an abnormality analysis request, responds to it by acquiring a plurality of historical data to be analyzed, and determines the abnormal data among them. Alternatively, an analysis period may be set in the platform, and each time the analysis period elapses the platform deems an abnormality analysis request to have been received and starts the data abnormality analysis. It should be noted that the platform may also perform real-time abnormality analysis while data are being generated, using data that are currently generated and determined to be abnormal as the abnormal data in the subsequent analysis; the manner of acquiring the abnormal data is not specifically limited in this application.
The historical data to be analyzed can be industry key index data, such as order quantity, return quantity, complaint quantity and the like, focused by a platform, a merchant or a staff, so that when the historical data to be analyzed is acquired, the platform can determine the type of the data to be analyzed indicated by an abnormal analysis request and the time period to be analyzed, and a plurality of historical data which are matched in the time period to be analyzed and have the data type consistent with the type of the data to be analyzed are used as a plurality of historical data to be analyzed. For example, assuming that the type of data to be analyzed indicated by the abnormality analysis request is an order amount and the period of time to be analyzed is the past 30 days, the platform acquires the order amount per day in the past 30 days as historical data to be analyzed for the subsequent analysis process.
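The selection in step 201 can be sketched as a simple filter. This is a minimal illustration only: the record layout (`day`, `type`, `value`), the function name `select_history`, and the sample figures are assumptions made for the example and do not come from the application.

```python
from datetime import date, timedelta

def select_history(records, data_type, end_day, days):
    """Return (day, value) pairs of the requested type inside the window, oldest first."""
    start_day = end_day - timedelta(days=days - 1)
    selected = [
        (r["day"], r["value"])
        for r in records
        if r["type"] == data_type and start_day <= r["day"] <= end_day
    ]
    return sorted(selected)

# Hypothetical records; only the order-volume rows inside the 30-day window survive.
records = [
    {"day": date(2023, 1, 1), "type": "order_volume", "value": 120},
    {"day": date(2023, 1, 2), "type": "refund_volume", "value": 7},
    {"day": date(2023, 1, 31), "type": "order_volume", "value": 95},
    {"day": date(2023, 2, 1), "type": "order_volume", "value": 130},
]
history = select_history(records, "order_volume", date(2023, 2, 1), 30)
```

The 30-day window ending on 1 February starts on 3 January, so the record dated 1 January is excluded along with the refund record.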
202. Time sequence decomposition is performed on the plurality of historical data to be analyzed with a time sequence decomposition algorithm to obtain a data sequence arranged in time order, outlier values are calculated for adjacent historical data to be analyzed in the data sequence based on an anomaly detection algorithm, and the historical data to be analyzed whose outlier value meets the abnormality condition are taken as the abnormal data.
In the embodiment of the present application, when determining the abnormal data, the historical data to be analyzed that fluctuate strongly may be taken as the abnormal data. To determine which historical data fluctuate strongly, a time sequence decomposition algorithm and an anomaly detection algorithm are combined, and the abnormal data are mined with the two algorithms. The time sequence decomposition algorithm may be the STL (Seasonal-Trend decomposition procedure based on Loess) algorithm, and the anomaly detection algorithm may be the GESD (Generalized Extreme Studentized Deviate) algorithm. Specifically, the platform may perform time sequence decomposition on the plurality of historical data to be analyzed with the time sequence decomposition algorithm to obtain a data sequence arranged in time order, calculate outlier values for adjacent historical data in the data sequence based on the anomaly detection algorithm, and take the historical data whose outlier value meets the abnormality condition as the abnormal data.
Taking the order volume of the last 2 months as the historical data to be analyzed as an example, time sequence decomposition is performed on the historical data with the time sequence decomposition algorithm to obtain a data sequence arranged in time order. The trend chart of the data sequence is shown in fig. 2B, where the horizontal axis represents the date and the vertical axis represents the order volume. Then, based on the anomaly detection algorithm, outlier values are calculated for adjacent historical data in the data sequence, and the historical data whose outlier value meets the abnormality condition is taken as the abnormal data, namely point A circled in fig. 2B. After point A is determined, the cause of the abnormality at point A is located further.
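The two-stage idea of step 202 (remove the regular pattern, then test the remainder for extreme values) can be illustrated with a deliberately simplified stand-in. The sketch below is not STL and not the full GESD test: it subtracts per-weekday means instead of fitting Loess, and it compares the studentized deviate against a fixed threshold instead of the t-distribution critical values that GESD uses. The function names, the threshold of 3.0, and the sample series are all assumptions.

```python
from statistics import mean, stdev

def weekly_residuals(series, period=7):
    """Subtract the mean of each weekday slot: a crude stand-in for a seasonal step."""
    slot_means = [mean(series[s::period]) for s in range(period)]
    return [v - slot_means[i % period] for i, v in enumerate(series)]

def esd_outliers(values, max_outliers=2, threshold=3.0):
    """Repeatedly remove the most extreme studentized deviate while it exceeds threshold."""
    remaining = list(enumerate(values))
    flagged = []
    for _ in range(max_outliers):
        if len(remaining) < 3:
            break
        vals = [v for _, v in remaining]
        mu, sigma = mean(vals), stdev(vals)
        if sigma == 0:
            break
        idx, val = max(remaining, key=lambda p: abs(p[1] - mu))
        if abs(val - mu) / sigma <= threshold:
            break
        flagged.append(idx)
        remaining.remove((idx, val))
    return sorted(flagged)

# Four hypothetical weeks of order volume with one sharp drop at index 17.
series = [100, 102, 98, 101, 99, 130, 128] * 4
series[17] = 40
anomalies = esd_outliers(weekly_residuals(series))
```

Here the weekend peaks are absorbed by the weekday means, so only the drop at index 17 is flagged; a production system would more plausibly rely on a tested STL and GESD implementation.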
203. At least one target first dimension is extracted from a plurality of first dimensions associated with the anomaly data.
In the embodiment of the application, different types of data are affected by different dimensions. For example, the dimensions affecting the order quantity may include the user, the merchant, the platform, the scene, and the like. To determine which dimension caused the abnormal data, the platform queries a plurality of first dimensions associated with the abnormal data and performs a preliminary analysis on them. A plurality of preset dimensions may be configured in advance, from which the first dimensions associated with the abnormal data are selected; the secondary dimensions associated with each first dimension, and the dimensions further associated with each secondary dimension, may also be configured in advance. As shown in fig. 2C, the preset dimensions may include user, user-merchant cross, merchant, platform, and scene. The secondary dimensions associated with the "user" dimension include "occupation" and "frequency (lifecycle)"; the secondary dimensions associated with the "frequency (lifecycle)" dimension include "active user", "inactive user", "super member", and "new user". The associations under the other dimensions, such as user-merchant cross, merchant, platform, and scene, are similar to those under the user dimension and are not repeated here.
Furthermore, some of the plurality of first dimensions have little influence on the abnormal data. Such first dimensions are not worth analyzing and are filtered out directly, so the platform extracts at least one target first dimension from the plurality of first dimensions associated with the abnormal data; the target first dimensions are those with a large influence on the abnormal data. The process of determining the at least one target first dimension is described below:
First, the platform queries a plurality of first dimensions associated with the abnormal data and calculates the cross entropy of each of them. The cross entropy can be computed with the JSD (Jensen-Shannon divergence) algorithm, as shown in equation 1 below:
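Equation 1 (standard form of the Jensen-Shannon divergence):
$$D_{JS}(P \,\|\, Q) = \frac{1}{2}\sum_{i} P(i)\log\frac{P(i)}{M(i)} + \frac{1}{2}\sum_{i} Q(i)\log\frac{Q(i)}{M(i)}, \qquad M(i) = \frac{P(i) + Q(i)}{2}$$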
Equation 1 measures the distance between different data distributions. D_JS represents the calculated cross entropy; P represents the abnormal data; Q represents the control data group of the abnormal data, that is, data in the first dimension whose occurrence date has a comparison relationship with the occurrence date of the abnormal data and whose data type is the same as that of the abnormal data; and i indexes the individual samples in the control data group. For example, P may be the data set of the current day, and Q may be the data set of the corresponding week-over-week comparison day.
To recall the dimensions with the largest change, a cross entropy threshold is set in the platform. After the cross entropy of each first dimension is calculated, the platform obtains the cross entropy threshold and extracts, from the plurality of first dimensions, those whose cross entropy is greater than the threshold as a plurality of candidate first dimensions. The platform then sorts the candidate first dimensions in descending order of cross entropy to obtain a first dimension sequence, and takes at least one candidate first dimension at the head of the sequence as the at least one target first dimension; that is, the top-ranked candidate first dimensions are output as target first dimensions for subsequent analysis. In this way, the JSD algorithm recalls the dimensions with the largest change as candidate first dimensions. It should be noted that, in practical applications, information entropy, information gain, or KL divergence (Kullback-Leibler divergence, relative entropy) may also be used in place of the JSD; however, the JSD is symmetric, is insensitive to the number of dimension values, suffers little interference, and gives the best effect. The specific process of calculating the cross entropy is not limited in this application.
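The recall step described above (score each first dimension with the Jensen-Shannon divergence, keep those above a threshold, sort in descending order, take the head of the sequence) can be sketched as follows. The function names, the count dictionaries, the threshold of 0.01, and the base-2 logarithm are assumptions chosen for illustration; the application does not fix any of them.

```python
import math

def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (base 2) between two count distributions."""
    keys = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    div = 0.0
    for k in keys:
        p = p_counts.get(k, 0) / p_total
        q = q_counts.get(k, 0) / q_total
        m = (p + q) / 2
        if p > 0:
            div += 0.5 * p * math.log2(p / m)
        if q > 0:
            div += 0.5 * q * math.log2(q / m)
    return div

def select_target_dimensions(current, control, threshold, top_k):
    """Keep dimensions scoring above threshold, sorted descending, and return the first top_k."""
    scores = {dim: js_divergence(current[dim], control[dim]) for dim in current}
    candidates = [d for d, s in scores.items() if s > threshold]
    candidates.sort(key=lambda d: scores[d], reverse=True)
    return candidates[:top_k]

# Hypothetical order counts per dimension value: current day (P) and comparison day (Q).
current = {"weather": {"cloudy": 20, "sunny": 80}, "platform": {"ios": 50, "android": 50}}
control = {"weather": {"cloudy": 60, "sunny": 40}, "platform": {"ios": 51, "android": 49}}
targets = select_target_dimensions(current, control, threshold=0.01, top_k=1)
```

In this toy data the weather distribution shifts strongly between the two days while the platform split barely moves, so only "weather" passes the threshold.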
204. The abnormal contribution value of each target first dimension is calculated respectively, and the target first dimension whose abnormal contribution value meets the contribution value threshold is taken as an abnormal dimension.
In the embodiment of the present application, each target first dimension influences the abnormal data to a different degree, and the greater the influence, the more attention the dimension deserves as a cause of the abnormality. Therefore, after determining at least one target first dimension, the platform calculates the abnormal contribution value of each target first dimension, sets a contribution value threshold, and takes each target first dimension whose abnormal contribution value meets the threshold as an abnormal dimension, that is, as a potential cause of the abnormal data. The process of calculating the abnormal contribution value of a target first dimension is described below:
For each target first dimension, the platform reads the current-day data value of the abnormal data in the target first dimension and queries the comparison data value of the abnormal data in the target first dimension. The comparison data value is the data value on the comparison date corresponding to the occurrence date of the current-day data value; for example, it may be the week-over-week comparison value corresponding to the current-day data value.
Then, the platform queries the current-day total value corresponding to the occurrence date of the current-day data value of the abnormal data, and queries the comparison total value corresponding to the comparison date. It calculates a first difference between the current-day data value and the comparison data value, calculates a second difference between the current-day total value and the comparison total value, and takes the ratio of the first difference to the second difference as the abnormal contribution value of the target first dimension.
Wherein, the above-described calculation process of the abnormal contribution value can be realized by the following formula 2:
Equation 2:
$$EP_{ij} = \frac{A_{ij}(m) - F_{ij}(m)}{A(m) - F(m)}$$
Here, EP_ij represents the calculated abnormal contribution value; A_ij(m) represents the current-day data value of the abnormal data in the target first dimension; F_ij(m) represents the comparison data value of the abnormal data in the target first dimension; A(m) represents the current-day total value corresponding to the occurrence date of the current-day data value; and F(m) represents the comparison total value corresponding to the comparison date. The subscripts i and j index the individual samples, which are treated here as two-dimensional data samples; in practical applications the samples may also be one-dimensional, which is not specifically limited in this application. Calculating the abnormal contribution value thus measures how much of the overall change is attributable to the change of a given value. The value with the largest change under the target first dimension is found, and the target first dimension to which that value belongs is taken as the abnormal dimension, which is a potential cause of the abnormal data. The contributions of the values in each dimension sum to 1.
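The arithmetic of equation 2 is small enough to check by hand. The sketch below computes the contribution of each value of one dimension and applies a contribution value threshold; the function name, the order counts, and the 50% threshold are assumptions for illustration.

```python
def contribution(current_value, control_value, current_total, control_total):
    """Share of the overall change explained by one value: (A_ij - F_ij) / (A - F)."""
    total_change = current_total - control_total
    if total_change == 0:
        raise ValueError("overall value did not change; contribution is undefined")
    return (current_value - control_value) / total_change

# Hypothetical order counts per weather value on the current day and the comparison day.
current = {"cloudy": 300, "sunny": 500}
control = {"cloudy": 450, "sunny": 550}
current_total = sum(current.values())   # 800
control_total = sum(control.values())   # 1000

contributions = {
    value: contribution(current[value], control[value], current_total, control_total)
    for value in current
}
# Values meeting an assumed 50% contribution threshold.
abnormal = [value for value, share in contributions.items() if share >= 0.5]
```

Orders fell by 200 overall, of which 150 came from "cloudy" (75%) and 50 from "sunny" (25%), so the two shares sum to 1 as stated above. When values move in opposite directions, an individual share can exceed 100%.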
205. Acquiring a preset analysis depth, and executing the following step 206 when the preset analysis depth indicates single-dimension analysis; when the preset analysis depth indicates multi-dimensional analysis, the following steps 207 to 209 are performed.
In the embodiment of the present application, since fine-grained secondary dimensions are also associated under dimensions, in order to determine whether to further analyze secondary dimensions under abnormal dimensions, a preset analysis depth is set in the platform, where the preset analysis depth is used for indicating how many layers of dimension analysis are performed.
For the current progress, after the abnormal dimension is determined, the dimension analysis of 1 layer is completed, so that the platform can obtain the preset analysis depth, when the preset analysis depth is equal to 1, the preset analysis depth indicates single-dimension analysis, the platform can directly take the currently determined abnormal dimension as an abnormal reason, and the dimension disassembly analysis is not required to be performed on the second dimension related to the abnormal dimension, namely, the following step 206 is executed. And when the preset analysis depth is greater than 1, indicating that the preset analysis depth indicates multi-dimensional analysis, continuing to perform dimension disassembly analysis on the second dimension associated with the abnormal dimension by the platform, taking all abnormal dimensions determined in the whole dimension disassembly process as abnormal reasons, and finally obtaining at least one abnormal reason with the quantity meeting the preset analysis depth, namely executing the following steps 207 to 208.
It should be noted that, in the actual application process, a judgment condition may also be set, and the judgment condition determines whether to continue to analyze the currently determined abnormal dimension. Specific judgment conditions may include cross entropy variability, current dimension characteristics, depth threshold, etc., such that when the abnormal dimension satisfies the judgment conditions, the dimension disassembly analysis of the abnormal dimension is stopped, and the following step 206 is performed; and when the abnormal dimension does not meet the judgment condition, continuing to perform dimension disassembly analysis on the abnormal dimension, and executing the following steps 207 to 208. For example, assuming that the judgment condition is cross entropy variation, when the cross entropy of the abnormal dimension reaches the cross entropy variation, dimension disassembly analysis is not needed; assuming that the judgment condition is the "weather" dimension, when the determined abnormal dimension is related to the "weather", dimension disassembly analysis is not needed; assuming that the judgment condition indicates that the analysis depth is 1, and that the analysis of 1-layer depth is already performed at present, the dimension disassembly analysis is not required. The method for determining whether to continue to execute the dimension disassembly analysis is not particularly limited.
206. When the preset analysis depth indicates single-dimensional analysis, taking the abnormal dimension as an abnormal reason, generating a single-dimensional analysis result comprising the abnormal reason, and outputting the single-dimensional analysis result.
In the embodiment of the application, when the preset analysis depth indicates single-dimensional analysis, the platform directly takes the currently determined abnormal dimension as an abnormal reason and does not need dimension disassembly analysis on the second dimension related to the abnormal dimension, so that the platform takes the abnormal dimension as the abnormal reason, generates a single-dimensional analysis result comprising the abnormal reason, and outputs the single-dimensional analysis result. For example, assuming that the currently determined anomaly dimension is "weather", a single-dimension analysis result may be generated by using the "weather" as the anomaly cause. Specifically, when outputting the single-dimensional analysis result, the method can query the contact ways such as a mailbox and a mobile phone number set in the platform by the staff, take the contact ways as a result receiver, and output the single-dimensional analysis result to the result receiver, so that the staff can perform early warning or find new ideas based on the single-dimensional analysis result.
207. When the preset analysis depth indicates multi-dimensional analysis, inquiring a plurality of second dimensions related to the abnormal dimension, determining another abnormal dimension in the plurality of second dimensions, determining a depth value of the preset analysis depth indication, continuing to determine the abnormal dimension in a plurality of third dimensions related to the other abnormal dimension until the number of the determined abnormal dimensions is equal to the depth value, obtaining at least one abnormal dimension with the number meeting the preset analysis depth, and taking the at least one abnormal dimension as at least one abnormality reason.
In the embodiment of the application, when the preset analysis depth indicates multi-dimensional analysis, the platform continues to perform dimension disassembly analysis on the second dimension associated with the abnormal dimension, all abnormal dimensions determined in the whole dimension disassembly process are used as abnormal reasons, and at least one abnormal reason with the quantity meeting the preset analysis depth is finally obtained. Thus, the platform will continue to query the plurality of second dimensions associated with the anomalous dimension, determining another anomalous dimension in the plurality of second dimensions. Specifically, when another abnormal dimension is determined, the platform calculates cross entropy of each second dimension in the plurality of second dimensions, acquires a cross entropy threshold, extracts a plurality of second dimensions with cross entropy larger than the cross entropy threshold from the plurality of second dimensions as a plurality of candidate second dimensions, sorts the plurality of candidate second dimensions according to the order of the cross entropy from large to small, obtains a second dimension sequence, and takes at least one candidate second dimension arranged at the head of the second dimension sequence as at least one target second dimension. And then, respectively calculating the abnormal contribution value of each target second dimension in at least one target second dimension, and taking the target second dimension of which the abnormal contribution value meets the contribution value threshold in the at least one target second dimension as the other abnormal dimension. The process of determining the other abnormal dimension specifically corresponds to the processes described in steps 203 to 204, and will not be described herein.
And then, the platform determines the depth value indicated by the preset analysis depth, and continues to determine the abnormal dimension in a plurality of third dimensions related to another abnormal dimension until the number of the determined abnormal dimensions is equal to the depth value, so as to obtain at least one abnormal dimension with the number meeting the preset analysis depth, and the at least one abnormal dimension is used as at least one abnormal reason. The process of determining the abnormal dimension among the plurality of third dimensions is identical to the processes described in the above steps 203 to 204, and will not be described herein.
For example, if the preset analysis depth is 3, the platform continues to perform dimension disassembly analysis on the second dimension associated with the abnormal dimension, then obtains another abnormal dimension, continues to perform dimension disassembly analysis on the third dimension associated with the other abnormal dimension, and then obtains an abnormal dimension, obtains 3 abnormal dimensions, and takes the 3 abnormal dimensions as the reason of the abnormality.
In this way, some dimensions with larger influence are selected through the JSD algorithm, the dimensions are screened through the abnormal contribution value, and the dimension with the largest abnormal contribution value is selected as the basis of next-layer dimension disassembly or the abnormal reason is directly output.
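The layer-by-layer control flow of steps 205 to 207 can be sketched as a loop that stops when the preset analysis depth is reached or when no dimension meets the contribution value threshold. This is a simplified sketch: it assumes the contribution of every dimension is already known and keyed by name, whereas the method recomputes cross entropy and contribution at each layer. The function name `disassemble`, the scores, and the threshold are hypothetical; the dimension names follow fig. 2C.

```python
def disassemble(root_dims, children, contributions, depth, threshold):
    """Walk down at most `depth` layers, keeping the top qualifying dimension of each layer."""
    causes = []
    layer = root_dims
    while layer and len(causes) < depth:
        # Keep only dimensions whose contribution meets the threshold.
        qualified = [d for d in layer if contributions.get(d, 0.0) >= threshold]
        if not qualified:
            break
        abnormal = max(qualified, key=lambda d: contributions[d])
        causes.append(abnormal)
        # Continue with the dimensions associated with the abnormal dimension.
        layer = children.get(abnormal, [])
    return causes

root_dims = ["user", "merchant", "platform"]
children = {
    "user": ["occupation", "frequency"],
    "frequency": ["active user", "new user"],
}
# Hypothetical contribution values, assumed to be precomputed.
contributions = {
    "user": 0.7, "merchant": 0.2, "platform": 0.1,
    "occupation": 0.3, "frequency": 0.6,
    "active user": 0.55, "new user": 0.4,
}
causes_depth_3 = disassemble(root_dims, children, contributions, depth=3, threshold=0.5)
causes_depth_1 = disassemble(root_dims, children, contributions, depth=1, threshold=0.5)
```

With a depth of 1 the loop returns after the first layer, which corresponds to the single-dimension branch of step 206.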
208. An anomaly analysis result is generated based on at least one anomaly cause.
In this embodiment of the present application, after determining at least one abnormality cause, the platform may merge and reorder the at least one abnormality cause, and generate an abnormality analysis result according to the ranking of the at least one abnormality cause, where a specific process of generating the abnormality analysis result is as follows:
First, the platform identifies duplicate abnormality causes among the at least one abnormality cause and filters them out. Then, the platform reads the abnormal contribution value corresponding to each remaining abnormality cause and sorts the remaining causes in descending order of abnormal contribution value to obtain a cause sequence. Because every abnormality cause is derived from an abnormal dimension, the abnormal contribution value of an abnormality cause is the abnormal contribution value of the corresponding abnormal dimension. Finally, the platform determines the preset output quantity, extracts that many target abnormality causes from the head of the cause sequence, and generates an abnormality analysis result comprising those target abnormality causes. The preset output quantity can be set to any value from 3 to 5, so that 3 to 5 target abnormality causes are subsequently output as the abnormality analysis result.
209. The abnormality analysis result is output.
In the embodiment of the application, after the abnormality analysis result is determined, the platform outputs it to the relevant staff for reference, so that potential risks can be prepared for in advance and better ideas can be identified and promoted.
In practical applications, the platform can set explanation information for each abnormality cause in advance. The explanation information explains the abnormality that the cause brings about and is sent to staff for verification; after the verification passes, the abnormality analysis result and the explanation information are pushed together. In this way the abnormality cause is not only located and explained but also verified, which further ensures the accuracy of the abnormality analysis. The specific process is as follows:
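The merge-and-rank procedure of step 208 can be sketched in a few lines. The function name `build_result` and the sample values are hypothetical, and one detail is an assumption: the text does not say which contribution value is kept when a cause appears more than once, so this sketch keeps the largest.

```python
def build_result(causes, output_count):
    """Drop duplicate causes, sort by contribution descending, keep the first output_count."""
    best = {}
    for name, contribution in causes:
        # Assumption: for a duplicated cause, keep its largest contribution value.
        if name not in best or contribution > best[name]:
            best[name] = contribution
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:output_count]

# Hypothetical causes collected during dimension disassembly; "weather" appears twice.
causes = [
    ("weather", 0.59),
    ("client version", 0.30),
    ("weather", 0.45),
    ("merchant quality grade", 0.425),
    ("platform", 0.10),
]
result = build_result(causes, output_count=3)
```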
First, the platform reads the target abnormality causes included in the abnormality analysis result and queries the explanation information associated with them. Then, the platform marks the abnormality analysis result with the explanation information, generates a result verification prompt comprising the marked abnormality analysis result, and outputs the prompt for staff to verify. When a verification-passed instruction is received in response to the result verification prompt, the platform determines a result receiver and pushes the marked abnormality analysis result to the result receiver. The result receiver may be a contact channel, such as a mailbox or a mobile phone number, that the staff has set in the platform, which is not specifically limited in this application.
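The mark, verify, and push sequence above can be sketched as two small functions. Everything named here is hypothetical: `annotate`, `push_if_verified`, the `send` callback standing in for a mail or SMS channel, the receiver address, and the explanation text.

```python
def annotate(result, explanations):
    """Attach the configured explanation to each target abnormality cause."""
    return [
        {
            "cause": cause,
            "contribution": value,
            "explanation": explanations.get(cause, "no explanation configured"),
        }
        for cause, value in result
    ]

def push_if_verified(marked, verified, receivers, send):
    """Push the marked result to every receiver only after staff verification passed."""
    if not verified:
        return 0
    for receiver in receivers:
        send(receiver, marked)
    return len(receivers)

sent = []
result = [("weather", 0.59), ("merchant quality grade", 0.425)]
explanations = {"weather": "Weather on the day shifted take-out demand."}
marked = annotate(result, explanations)
pushed = push_if_verified(
    marked, True, ["ops@example.com"], lambda to, body: sent.append((to, body))
)
```

Passing the delivery channel in as a callback keeps the sketch independent of any particular mail or messaging service.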
In the practical application process, the platform can also automatically verify the accuracy of the interpretation information, for example, if the reasons of the abnormality included in the abnormality analysis result are weather, the platform can automatically inquire the weather condition and the order quantity after the date corresponding to the abnormality data, and the number of people who take dinner out in clear weather is higher than the number of people who take takeaway in rainy weather, so that the order quantity of takeaway is influenced, and the abnormality analysis result is successfully verified.
In summary, the specific flow of the technical scheme of the application is as follows. Referring to fig. 2D, the platform determines the abnormal data and acquires a plurality of preset dimensions. The platform then queries, among the preset dimensions, a plurality of first dimensions associated with the abnormal data, calculates the cross entropy of each first dimension, and extracts at least one target first dimension whose cross entropy meets the condition. Next, the platform calculates the abnormal contribution value of each target first dimension, compares it with the contribution value threshold, and judges whether the target first dimension is to be taken as an abnormality cause: when the abnormal contribution value of a target first dimension is lower than the contribution value threshold, the target first dimension is not taken as an abnormality cause; when it is greater than or equal to the threshold, the target first dimension is taken as an abnormality cause. The platform then determines the preset analysis depth and decides accordingly whether to continue dimension disassembly analysis on the second dimensions associated with the abnormal dimension. If so, the above process is repeated: the platform queries, among the preset dimensions, the second dimensions associated with the abnormal dimension and determines another abnormal dimension, until the number of determined abnormal dimensions equals the depth value indicated by the preset analysis depth, at which point the dimension disassembly analysis is complete.
And finally, taking the determined at least one abnormal dimension as at least one abnormal reason, merging and rearranging the at least one abnormal reason, generating an abnormal analysis result according to the ranking of the at least one abnormal reason, and outputting the abnormal analysis result.
Taking the order quantity as an example and referring to fig. 2E, assume that the order quantity is the abnormal data. Among the plurality of first dimensions associated with the order quantity, the target first dimensions determined by calculating cross entropy are "client version number", "weather", and "take-out weather". The abnormal contribution value of version number "8.10.1" in the "client version number" dimension is calculated to be 141%, and that of version number "8.10.0" is 131%; the abnormal contribution value of "cloudy" in the "weather" dimension is 177%, and that of "partly_cloudy_day" is 131%; the abnormal contribution value of the "take-out weather" dimension is 300%. The dimension with the highest abnormal contribution value is therefore determined to be "take-out weather". Next, the "take-out weather" dimension is disassembled further, and its associated dimensions are determined to be "client version number", "weather", and "merchant quality grade". Here the abnormal contribution value of version number "8.10.1" in the "client version number" dimension is 30%, that of "cloudy" in the "weather" dimension is 59%, and that of "A" in the "merchant quality grade" dimension is 39.7%, so the dimension with the highest abnormal contribution value is determined to be "weather".
Dimension disassembly then continues on "weather", and its associated dimensions are determined to be "client version number" and "merchant quality grade". The abnormal contribution value of version number "8.10.1" in the "client version number" dimension is calculated to be 22.7%, and that of merchant quality grade "A" is 42.5%. The finally determined abnormal dimensions are therefore "take-out weather", "weather", and "merchant quality grade". As the path in fig. 2E shows, the change in order quantity is mainly due to a large, weather-related reduction in the order quantity of grade-A merchants, and the three dimensions are output as the abnormality causes. The present application thus provides a heuristic attribution algorithm that finds the main factors among a large number of dimension combinations, together with the dimension disassembly tree it forms, which reduces invalid operations and improves analysis efficiency. Moreover, the analysis framework can monitor index changes in real time, raise alarms, and locate the causes of the index changes.
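The path described in the order-quantity example can be retraced with a few lines of Python. This is only an illustrative tabulation of the percentages quoted above: the `levels` list and the `drill_path` helper are hypothetical names, and each dimension is represented by its highest quoted contribution value at that level.

```python
# Contribution values (in percent) quoted in the order-quantity example.
# Each dict is one level of the disassembly; where the text quotes several
# values for one dimension, only the highest is kept (an illustrative choice).
levels = [
    {"client version number": 141.0, "weather": 177.0, "take-out weather": 300.0},
    {"client version number": 30.0, "weather": 59.0, "merchant quality grade": 39.7},
    {"client version number": 22.7, "merchant quality grade": 42.5},
]


def drill_path(levels):
    """Pick the dimension with the highest contribution value at each level."""
    return [max(level, key=level.get) for level in levels]


path = drill_path(levels)
print(" -> ".join(path))
```

Selecting only the top dimension per level is what keeps the search from enumerating every combination of dimensions.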
According to the method provided by the embodiment of the application, through dimension disassembly analysis, the main dimension causing the abnormal data is found in a large number of dimensions, and the abnormal analysis result for explaining the main dimension is generated, so that other interference dimensions can be filtered, the abnormal reasons can be rapidly positioned, invalid operation is reduced, and analysis efficiency and accuracy are improved.
Further, as a specific implementation of the method shown in fig. 1, an embodiment of the present application provides an anomaly analysis device, as shown in fig. 3, where the device includes: a determination module 301, a calculation module 302, an analysis module 303 and a generation module 304.
The determining module 301 is configured to obtain a plurality of historical data to be analyzed in response to an anomaly analysis request, and determine anomaly data in the plurality of historical data to be analyzed;
the calculating module 302 is configured to extract at least one target first dimension from the plurality of first dimensions associated with the abnormal data, calculate an abnormal contribution value of each target first dimension, and use the target first dimension whose abnormal contribution value meets the contribution value threshold as an abnormal dimension;
the analysis module 303 is configured to determine a preset analysis depth, and perform dimension disassembly analysis on a second dimension associated with the abnormal dimension according to the preset analysis depth to obtain at least one abnormal cause of which the number meets the preset analysis depth;
The generating module 304 is configured to generate an anomaly analysis result based on the at least one anomaly cause, and output the anomaly analysis result.
In a specific application scenario, the determining module 301 is configured to: determine, in response to the anomaly analysis request, the data type to be analyzed and the time period to be analyzed indicated by the anomaly analysis request; take, as the plurality of historical data to be analyzed, the matching historical data that occurred within the time period to be analyzed and whose data type is consistent with the data type to be analyzed; perform time sequence decomposition processing on the plurality of historical data to be analyzed by using a time sequence decomposition algorithm to obtain a data sequence arranged in time order; and calculate, based on an anomaly detection algorithm, an abnormal value for every two adjacent historical data to be analyzed in the data sequence, and take the historical data to be analyzed whose abnormal value meets the abnormal condition as the abnormal data.
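The adjacent-point check can be sketched as follows. The application does not name a specific decomposition or detection algorithm, so this sketch makes two loud assumptions: the decomposition step is reduced to ordering the records by date, and the "abnormal value" of two adjacent points is taken to be their relative change, compared against a hypothetical `threshold`. The function name `detect_anomalies` is likewise illustrative.

```python
from datetime import date


def detect_anomalies(records, threshold=0.3):
    """Flag points whose relative change versus the previous point is large.

    records: list of (date, value) pairs in any order.
    threshold: assumed abnormal condition on the absolute relative change.
    Returns a list of (date, relative_change) for the flagged points.
    """
    # Arrange the historical data in time order (stand-in for decomposition).
    series = sorted(records, key=lambda r: r[0])
    anomalies = []
    for (_, prev_val), (day, val) in zip(series, series[1:]):
        if prev_val == 0:
            continue  # relative change is undefined; skipped in this sketch
        change = (val - prev_val) / prev_val
        if abs(change) >= threshold:
            anomalies.append((day, change))
    return anomalies


history = [
    (date(2023, 2, 1), 100),
    (date(2023, 2, 3), 60),
    (date(2023, 2, 2), 105),
]
flagged = detect_anomalies(history)
```

A production system would more plausibly separate trend and seasonal components first and test the residual, which this sketch does not attempt.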
In a specific application scenario, the computing module 302 is configured to query the plurality of first dimensions associated with the abnormal data, and compute cross entropy of each of the plurality of first dimensions; acquiring a cross entropy threshold, and extracting a plurality of first dimensions with cross entropy larger than the cross entropy threshold from the plurality of first dimensions as a plurality of candidate first dimensions; and sequencing the plurality of candidate first dimensions according to the order of the cross entropy from large to small to obtain a first dimension sequence, and taking at least one candidate first dimension arranged at the head of the first dimension sequence as the at least one target first dimension.
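The filter-then-rank step can be sketched as below. This passage does not state which two distributions the cross entropy is taken over, so the sketch assumes it is computed between a dimension's value shares on the comparison date (`p`) and on the current date (`q`), using the standard definition H(p, q) = -Σ p·log q. The names `cross_entropy`, `select_target_dimensions`, `ce_threshold`, and `top_k` are hypothetical.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """Standard cross entropy H(p, q) = -sum(p_i * log(q_i)), natural log."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))


def select_target_dimensions(dim_distributions, ce_threshold, top_k):
    """Keep dimensions whose cross entropy exceeds the threshold, then
    return the top_k of them in descending order of cross entropy.

    dim_distributions: {dimension_name: (p, q)} with p and q as share lists.
    """
    scored = {name: cross_entropy(p, q) for name, (p, q) in dim_distributions.items()}
    candidates = [(name, score) for name, score in scored.items() if score > ce_threshold]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in candidates[:top_k]]


dims = {
    "weather": ([0.5, 0.5], [0.9, 0.1]),
    "city": ([0.5, 0.5], [0.5, 0.5]),
    "client version number": ([0.5, 0.5], [0.7, 0.3]),
}
targets = select_target_dimensions(dims, ce_threshold=0.7, top_k=2)
```

In this toy input the "city" shares did not move, so its cross entropy (ln 2) falls below the assumed threshold and the dimension is filtered out before ranking.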
In a specific application scenario, the calculating module 302 is configured to: for each target first dimension, read the current-day data value of the abnormal data in the target first dimension, and query the comparison data value of the abnormal data in the target first dimension, where the comparison data value is the data value on the comparison date corresponding to the occurrence date of the current-day data value; query the current-day total value of the abnormal data on the occurrence date of the current-day data value, and query the comparison total value of the abnormal data on the comparison date; calculate a first difference between the current-day data value and the comparison data value, and a second difference between the current-day total value and the comparison total value; and take the ratio of the first difference to the second difference as the abnormal contribution value of the target first dimension.
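The contribution ratio described here is simple enough to write out directly. The function and parameter names below are illustrative, and the zero-denominator guard is an added assumption, since the text does not say how an unchanged total is handled.

```python
def anomaly_contribution(current_value, comparison_value, current_total, comparison_total):
    """Ratio of the change within one dimension value to the change in the total.

    first difference  = current_value - comparison_value
    second difference = current_total - comparison_total
    """
    total_delta = current_total - comparison_total
    if total_delta == 0:
        # Assumption: with no change in the total, the ratio is undefined.
        raise ValueError("total did not change; contribution is undefined")
    return (current_value - comparison_value) / total_delta


# Toy numbers: the total falls from 1000 to 900 while one slice falls from 300 to 180.
contribution = anomaly_contribution(180, 300, 900, 1000)
print(f"{contribution:.0%}")
```

A value above 100%, as in this toy case, means the slice fell by more than the total did, so the other slices must have moved in the opposite direction.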
In a specific application scenario, the generating module 304 is further configured to obtain the preset analysis depth; when the preset analysis depth indicates single-dimensional analysis, taking the abnormal dimension as an abnormal reason; and generating a single-dimensional analysis result comprising the abnormal reason, and outputting the single-dimensional analysis result.
In a specific application scenario, the analysis module 303 is configured to query a plurality of second dimensions associated with the abnormal dimension when the preset analysis depth indicates multidimensional analysis, and determine another abnormal dimension in the plurality of second dimensions; determining the depth value indicated by the preset analysis depth, and continuing to determine the abnormal dimension in a plurality of third dimensions associated with the other abnormal dimension until the number of the determined abnormal dimensions is equal to the depth value, so as to obtain at least one abnormal dimension with the number meeting the preset analysis depth; and taking the at least one abnormal dimension as the at least one abnormal reason.
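The depth-bounded loop can be sketched as below, assuming two caller-supplied hypothetical callables: `associated(dim)` returns the dimensions associated with an abnormal dimension, and `pick_abnormal(parent, candidates)` returns the next abnormal dimension or `None` when none qualifies. Excluding dimensions already on the path is an assumption of this sketch, not something the text requires.

```python
def drill_down(first_abnormal, associated, pick_abnormal, depth):
    """Collect abnormal dimensions until their count equals `depth`.

    first_abnormal: the abnormal dimension found among the first dimensions.
    associated: callable, dimension -> list of associated dimensions.
    pick_abnormal: callable, (parent, candidates) -> dimension or None.
    depth: depth value indicated by the preset analysis depth.
    """
    causes = [first_abnormal]
    while len(causes) < depth:
        parent = causes[-1]
        # Assumption: a dimension already on the path is not revisited.
        candidates = [d for d in associated(parent) if d not in causes]
        nxt = pick_abnormal(parent, candidates)
        if nxt is None:
            break  # no further abnormal dimension; stop early
        causes.append(nxt)
    return causes
```

With `depth` set to 1 the loop body never runs, which corresponds to the single-dimensional analysis case.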
In a specific application scenario, the analysis module 303 is configured to calculate cross entropy of each of the plurality of second dimensions; acquiring a cross entropy threshold, and extracting a plurality of second dimensions with cross entropy larger than the cross entropy threshold from the plurality of second dimensions as a plurality of candidate second dimensions; sequencing the plurality of candidate second dimensions according to the order of the cross entropy from large to small to obtain a second dimension sequence, and taking at least one candidate second dimension arranged at the head of the second dimension sequence as at least one target second dimension; and respectively calculating an abnormal contribution value of each target second dimension in the at least one target second dimension, and taking the target second dimension, of which the abnormal contribution value meets the contribution value threshold, in the at least one target second dimension as the other abnormal dimension.
In a specific application scenario, the generating module 304 is configured to: identify repeated abnormality causes in the at least one abnormality cause, and filter them out; read the abnormal contribution value corresponding to each abnormality cause in the filtered at least one abnormality cause, and sort the filtered at least one abnormality cause in descending order of abnormal contribution value to obtain a cause sequence; and determine a preset output number, extract from the head of the cause sequence the preset output number of target abnormality causes, and generate the abnormality analysis result including those target abnormality causes.
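The filter, sort, and truncate steps can be sketched as follows. The text does not say which copy of a repeated cause is kept, so keeping the copy with the highest contribution value is an assumption here; `rank_causes` and `output_count` are hypothetical names.

```python
def rank_causes(causes, output_count):
    """Deduplicate causes, sort by contribution descending, keep the head.

    causes: list of (cause_name, contribution_value) pairs, possibly repeated.
    output_count: the preset output number.
    """
    best = {}
    for name, contribution in causes:
        # Assumption: a repeated cause keeps its highest contribution value.
        if name not in best or contribution > best[name]:
            best[name] = contribution
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:output_count]


found = [
    ("weather", 0.59),
    ("take-out weather", 3.0),
    ("weather", 1.77),
    ("merchant quality grade", 0.425),
]
top = rank_causes(found, output_count=2)
```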
In a specific application scenario, the apparatus further includes:
the query module is used for reading target abnormal reasons included in the abnormal analysis results and querying explanation information associated with the target abnormal reasons;
the labeling module is used for labeling the abnormal analysis result by adopting the interpretation information, generating a result verification prompt comprising the labeled abnormal analysis result, and outputting the result verification prompt;
and the pushing module is used for determining a result receiver when a verification-pass instruction is received in response to the result verification prompt, and pushing the labeled abnormality analysis result to the result receiver.
According to the device provided by the embodiment of the application, a plurality of historical data to be analyzed are obtained in response to an abnormal analysis request, the abnormal data are determined in the historical data to be analyzed, at least one target first dimension is extracted in a plurality of first dimensions related to the abnormal data, the abnormal contribution value of each target first dimension is calculated respectively, the target first dimension of which the abnormal contribution value meets the threshold of the contribution value is used as an abnormal dimension, the preset analysis depth is determined, the second dimension related to the abnormal dimension is subjected to dimension disassembly analysis according to the preset analysis depth, at least one abnormal cause of which the number meets the preset analysis depth is obtained, an abnormal analysis result is generated based on the at least one abnormal cause, and the abnormal analysis result is output.
It should be noted that, for other corresponding descriptions of each functional unit related to the abnormality analysis apparatus provided in the embodiment of the present application, reference may be made to corresponding descriptions in fig. 1 and fig. 2A to fig. 2E, and no further description is given here.
It should be noted that, user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination of them that involves no contradiction should be considered to fall within the scope of this description.
The above examples represent only a few embodiments of the present application. They are described in relative detail, but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these would fall within the scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.
In an exemplary embodiment, referring to fig. 4, there is also provided a computer device, which includes a bus, a processor, a memory, and a communication interface, and may further include an input-output interface and a display device, where the functional units communicate with each other through the bus. The memory stores a computer program, and the processor is configured to execute the program stored in the memory so as to perform the abnormality analysis method of the above embodiments.
A computer readable storage medium is also provided, having stored thereon a computer program which, when executed by a processor, performs the steps of the abnormality analysis method.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented in hardware, or by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and which includes several instructions for causing a computer device (such as a personal computer, a server, or a network device) to perform the methods described in the various implementation scenarios of the present application.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application.
Those skilled in the art will appreciate that the modules of an apparatus in an implementation scenario may be distributed in the apparatus as described for that implementation scenario, or may, with corresponding changes, be located in one or more apparatuses different from that of the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The serial numbers used in the present application are for description only and do not represent the merits of the implementation scenarios.
The foregoing disclosure is merely a few specific implementations of the present application, but the present application is not limited thereto and any variations that can be considered by a person skilled in the art shall fall within the protection scope of the present application.

Claims (10)

1. An anomaly analysis method, comprising:
responding to an abnormality analysis request, acquiring a plurality of historical data to be analyzed, and determining the abnormality data in the plurality of historical data to be analyzed;
extracting at least one target first dimension from a plurality of first dimensions associated with the abnormal data, respectively calculating an abnormal contribution value of each target first dimension, and taking the target first dimension of which the abnormal contribution value meets a contribution value threshold as an abnormal dimension;
determining a preset analysis depth, and performing dimension disassembly analysis on a second dimension related to the abnormal dimension according to the preset analysis depth to obtain at least one abnormal cause with the quantity meeting the preset analysis depth;
generating an abnormality analysis result based on the at least one abnormality cause, and outputting the abnormality analysis result.
2. The method of claim 1, wherein the obtaining a plurality of historical data to be analyzed in response to the anomaly analysis request, determining anomaly data among the plurality of historical data to be analyzed, comprises:
Responding to the abnormality analysis request, and determining the type of data to be analyzed and the time period to be analyzed indicated by the abnormality analysis request;
the matched historical data which occur in the time period to be analyzed and have the data type consistent with the data type to be analyzed are used as the historical data to be analyzed;
performing time sequence decomposition processing on the plurality of historical data to be analyzed by using a time sequence decomposition algorithm to obtain a data sequence arranged according to a time sequence;
and calculating abnormal values of two adjacent historical data to be analyzed in the data sequence based on an abnormal detection algorithm, and taking the historical data to be analyzed, of which the abnormal values reach abnormal conditions, in the data sequence as the abnormal data.
3. The method of claim 1, wherein extracting at least one target first dimension among the plurality of first dimensions associated with the anomaly data comprises:
querying the plurality of first dimensions associated with the abnormal data, and calculating cross entropy of each first dimension in the plurality of first dimensions;
acquiring a cross entropy threshold, and extracting a plurality of first dimensions with cross entropy larger than the cross entropy threshold from the plurality of first dimensions as a plurality of candidate first dimensions;
And sequencing the plurality of candidate first dimensions according to the order of the cross entropy from large to small to obtain a first dimension sequence, and taking at least one candidate first dimension arranged at the head of the first dimension sequence as the at least one target first dimension.
4. The method of claim 1, wherein the separately calculating the anomaly contribution value for each target first dimension comprises:
for each target first dimension, reading a current-day data value of the abnormal data in the target first dimension, and querying a comparison data value of the abnormal data in the target first dimension, wherein the comparison data value is a data value on a comparison date corresponding to the occurrence date of the current-day data value;
querying a current-day total value of the abnormal data on the occurrence date of the current-day data value, and querying a comparison total value of the abnormal data on the comparison date;
calculating a first difference value between the current-day data value and the comparison data value, and calculating a second difference value between the current-day total value and the comparison total value;
and taking the ratio of the first difference value to the second difference value as an abnormal contribution value of the target first dimension.
5. The method of claim 1, wherein after the extracting at least one target first dimension from the plurality of first dimensions associated with the anomaly data, calculating an anomaly contribution value for each target first dimension, and using the target first dimension for which the anomaly contribution value meets a contribution value threshold as an anomaly dimension, the method further comprises:
acquiring the preset analysis depth;
when the preset analysis depth indicates single-dimensional analysis, taking the abnormal dimension as an abnormal reason;
and generating a single-dimensional analysis result comprising the abnormal reason, and outputting the single-dimensional analysis result.
6. The method according to claim 1, wherein determining the preset analysis depth, performing dimension disassembly analysis on the second dimension associated with the abnormal dimension according to the preset analysis depth, to obtain at least one abnormality cause whose number satisfies the preset analysis depth, includes:
querying a plurality of second dimensions associated with the abnormal dimension when the preset analysis depth indicates multi-dimensional analysis, and determining another abnormal dimension in the plurality of second dimensions;
determining the depth value indicated by the preset analysis depth, and continuing to determine the abnormal dimension in a plurality of third dimensions associated with the other abnormal dimension until the number of the determined abnormal dimensions is equal to the depth value, so as to obtain at least one abnormal dimension with the number meeting the preset analysis depth;
And taking the at least one abnormal dimension as the at least one abnormal reason.
7. The method of claim 6, wherein the determining another anomaly dimension among the plurality of second dimensions comprises:
calculating cross entropy of each of the plurality of second dimensions;
acquiring a cross entropy threshold, and extracting a plurality of second dimensions with cross entropy larger than the cross entropy threshold from the plurality of second dimensions as a plurality of candidate second dimensions;
sequencing the plurality of candidate second dimensions according to the order of the cross entropy from large to small to obtain a second dimension sequence, and taking at least one candidate second dimension arranged at the head of the second dimension sequence as at least one target second dimension;
and respectively calculating an abnormal contribution value of each target second dimension in the at least one target second dimension, and taking the target second dimension, of which the abnormal contribution value meets the contribution value threshold, in the at least one target second dimension as the other abnormal dimension.
8. An abnormality analysis device, comprising:
the determining module is used for responding to the abnormality analysis request, acquiring a plurality of historical data to be analyzed and determining the abnormality data in the historical data to be analyzed;
The calculating module is used for extracting at least one target first dimension from a plurality of first dimensions associated with the abnormal data, calculating an abnormal contribution value of each target first dimension respectively, and taking the target first dimension of which the abnormal contribution value meets a contribution value threshold as an abnormal dimension;
the analysis module is used for determining a preset analysis depth, and carrying out dimension disassembly analysis on a second dimension related to the abnormal dimension according to the preset analysis depth to obtain at least one abnormal reason with the quantity meeting the preset analysis depth;
and the generation module is used for generating an abnormality analysis result based on the at least one abnormality cause and outputting the abnormality analysis result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202310126982.8A 2023-02-16 2023-02-16 Abnormality analysis method, abnormality analysis device, abnormality analysis computer device, and abnormality analysis program Pending CN116302640A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310126982.8A CN116302640A (en) 2023-02-16 2023-02-16 Abnormality analysis method, abnormality analysis device, abnormality analysis computer device, and abnormality analysis program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310126982.8A CN116302640A (en) 2023-02-16 2023-02-16 Abnormality analysis method, abnormality analysis device, abnormality analysis computer device, and abnormality analysis program

Publications (1)

Publication Number Publication Date
CN116302640A true CN116302640A (en) 2023-06-23

Family

ID=86827924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310126982.8A Pending CN116302640A (en) 2023-02-16 2023-02-16 Abnormality analysis method, abnormality analysis device, abnormality analysis computer device, and abnormality analysis program

Country Status (1)

Country Link
CN (1) CN116302640A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116957421A (en) * 2023-09-20 2023-10-27 山东济宁运河煤矿有限责任公司 Washing and selecting production intelligent monitoring system based on artificial intelligence
CN116957421B (en) * 2023-09-20 2024-01-05 山东济宁运河煤矿有限责任公司 Washing and selecting production intelligent monitoring system based on artificial intelligence
CN117454089A (en) * 2023-11-17 2024-01-26 浙江预策科技有限公司 Real-time analysis method and device for instrument panel, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination