CN112819386A - Method, system and storage medium for generating time series data with abnormity - Google Patents

Method, system and storage medium for generating time series data with abnormity

Info

Publication number
CN112819386A
Authority
CN
China
Prior art keywords
anomaly
series data
anomalies
period
additive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110245171.0A
Other languages
Chinese (zh)
Inventor
蔡志平
王承禹
周桐庆
余广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202110245171.0A
Publication of CN112819386A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing

Abstract

The invention provides a method, a system and a storage medium for generating time series data with anomalies. A periodic component, a noise component and a trend component are generated, and the components are combined into normal time series data through an additive model or a multiplicative model; anomalies are then generated and injected into the normal time series data to produce time series data with anomalies and anomaly labels. The invention can generate time series data whose shape is random but whose characteristics, such as frequency, amplitude, the first to fourth order central moments of the noise, and degree of drift, are controllable. It can inject anomalies into normal time series data with controllable position, degree and type, and the resulting data can be used to evaluate the performance of anomaly detection algorithms.

Description

Method, system and storage medium for generating time series data with abnormity
Technical Field
The invention belongs to the field of KPI (Key Performance Indicator) anomaly detection in intelligent operation and maintenance, and particularly relates to a method for generating time series data with anomalies and anomaly labels.
Background
Large companies providing internet-based services need to closely monitor the real-time performance of their systems, since even a brief service interruption or quality degradation may cause significant traffic loss. These real-time performance data (e.g., search response time, CPU usage) are typically collected and stored as time series, referred to as Key Performance Indicators (KPIs). To keep the business running smoothly, these companies often develop anomaly detection systems that can detect KPI anomalies accurately so that faults can be diagnosed in time.
Before anomaly detection algorithms are actually deployed, KPI data must be collected and labeled so that the algorithms can be tested. Unlike traditional time series data (such as weather or climate data), KPI data is much larger in volume, and labeling its anomalies requires a highly experienced domain expert.
Despite the importance of KPIs, few KPI anomaly data sets are currently available for public use. There are two main reasons: (1) manually labeling KPI data requires domain knowledge and significant time; (2) large companies are reluctant to publish KPI data for privacy and security reasons. Regarding the first problem, even with advanced labeling aids it still takes tens of minutes to label one year of KPI data. Optimistically, the first problem may eventually be solved, but the second is unlikely to be alleviated because of the commercial value of KPI data.
The scarcity of public KPI data with anomaly labels causes the following problems for KPI anomaly detection: (1) Evaluation is not comprehensive, so a KPI anomaly detection algorithm may work well on some public data sets yet fall short of expectations in a production environment. (2) It is difficult to build hypothetical scenarios to evaluate how an algorithm performs under them. KPI data collected from a production environment is static and does not contain certain rare events. If no hypothetical scenario can be constructed to stress test an anomaly detector before deployment, fatal defects of the detector are likely to be overlooked and may ultimately cause serious problems.
In summary, KPI anomaly detection is one of the main functions of intelligent operation and maintenance, and there is a strong need for a method capable of generating time series data with various characteristics, especially data with anomalies and anomaly labels, so that hypothetical scenarios can be constructed and anomaly detection algorithms can be evaluated and stress tested, thereby solving the above problems.
Disclosure of Invention
In order to solve the above problems, the present invention provides a method, a system and a storage medium for generating time series data with anomalies.
To achieve this technical purpose, the invention adopts the following technical scheme:
The method for generating time series data with anomalies comprises the following steps:
generating a periodic component, a noise component and a trend component, and combining the components into normal time series data through an additive model or a multiplicative model;
injecting anomalies into the normal time series data to generate time series data with anomalies and anomaly labels.
As a further limitation of the present invention, the method further comprises running the detection algorithms to be evaluated on the generated time series data with anomalies and anomaly labels, and ranking their performance scores in a unified manner to measure the performance of each detection algorithm.
As a further limitation of the present invention, the periodic component is generated using a random midpoint displacement method. Specifically, the method comprises the following steps:
(1) generating the shape of a single period using the random midpoint displacement method;
(2) generating, by the random midpoint displacement differentiation method, a plurality of periods whose shapes differ slightly from the shape of the first period;
(3) normalizing each period so that its amplitude is 1, to obtain a period expression;
(4) sampling according to the period length specified by the user, and changing the amplitude and frequency of each period;
(5) connecting these periods to form the complete periodic component.
As a further limitation of the present invention, the noise component is generated using a Pearson distribution.
As a further limitation of the present invention, the trend component is generated using a linear or non-linear function.
As a further limitation of the invention, anomalies are generated using predefined anomaly patterns, which include two classes: additive anomalies and behavioral anomalies. Additive anomalies are injected into the normal time series data through an additive model, and behavioral anomalies are injected into the normal time series data by randomly changing the shape and length of a period and randomly changing the first to fourth order central moments of the noise. Additive anomalies are generated using predefined shape templates, and behavioral anomalies are generated by the random midpoint displacement differentiation method and the Pearson distribution system. Further, extreme value theory is used to ensure the quality of the generated additive and behavioral anomalies, i.e., to ensure that they are indeed anomalies: a low-probability boundary is established through extreme value theory so that additive anomalies fall in the low-probability region and behavioral anomalies fall in the non-low-probability region.
A system for generating time series data with anomalies includes:
a time series generation module, used for generating a periodic component, a noise component and a trend component and combining the components into normal time series data through an additive model or a multiplicative model; and
an anomaly injection module, used for injecting anomalies into the normal time series data to generate time series data with anomalies and anomaly labels.
The time series generation module comprises a period generator, a noise generator, a trend generator and a component fusion module, wherein the period generator generates the periodic component using a random midpoint displacement method; the noise generator generates the noise component using a Pearson distribution; the trend generator generates the trend component using a linear or non-linear function; and the component fusion module combines the periodic component, the noise component and the trend component into complete time series data through an additive model or a multiplicative model.
The anomaly injection module comprises an anomaly generation module and an injection module. The anomaly generation module generates anomalies using predefined anomaly patterns, which include two types: additive anomalies and behavioral anomalies. Additive anomalies are generated using predefined shape templates, and behavioral anomalies are generated by the random midpoint displacement differentiation method and the Pearson distribution system. The injection module injects additive anomalies into the normal time series data through an additive model, and injects behavioral anomalies into the normal time series data by randomly changing the shape and length of a period and randomly changing the first to fourth order central moments of the noise.
Furthermore, the system also comprises a benchmark evaluation module, which runs the detection algorithms to be evaluated on the generated time series data with anomalies and anomaly labels and ranks their performance scores in a unified manner to measure the performance of each detection algorithm.
The invention provides a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the steps of any one of the above methods for generating time series data with anomalies.
The present invention provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any one of the above methods for generating time series data with anomalies.
In general, the above technical solution conceived by the present invention provides the following beneficial effects:
(1) The invention can generate time series data whose shape is random but whose characteristics, such as frequency, amplitude, the first to fourth order central moments of the noise, and degree of drift, are controllable. It can inject anomalies into normal data with controllable position, degree and type, and the resulting data can be used to evaluate the performance of anomaly detection algorithms.
(2) By using extreme value theory, the invention can ensure the authenticity of the anomalies, so that the performance measured by evaluating a detector on the generated data has a strong positive correlation with the detector's performance on real data.
(3) The system changes only one variable at a time, so the influence of each controlled variable on the performance of a detection algorithm can be evaluated, and hypothetical scenarios can be set up to find the defects of the detection algorithm, which other technical schemes cannot do.
(4) The system controls the generation of anomalies of different categories, so the ability of a detection algorithm to detect different kinds of anomalies can be evaluated, which other technical schemes cannot do.
Drawings
FIG. 1 is a general block diagram of an embodiment of the present invention;
FIG. 2 is a schematic illustration of a random midpoint displacement differentiation process used in one embodiment of the present invention;
FIG. 3 is a schematic diagram of the additive anomaly templates used in one embodiment of the present invention, wherein (a) is a class A anomaly template and (b) is a class B anomaly template;
FIG. 4 is a schematic diagram showing the anomalies generated by the random midpoint displacement differentiation method used in one embodiment of the present invention, wherein (a) is a reference diagram for comparison, and (b), (c), (d), and (e) are period shapes generated when the differentiation depth is 2, 4, 8, and 9, respectively.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention addresses two problems in the existing intelligent operation and maintenance field caused by the lack of time series data with anomaly labels: hypothetical scenarios are difficult to construct, and the evaluation of KPI anomaly detection algorithms is incomplete.
An embodiment of the present invention provides a method for generating time series data with anomalies, including:
First, a periodic component, a noise component, and a trend component are generated, and the components are combined into normal time series data by an additive model or a multiplicative model.
Second, anomalies are injected into the normal time series data to generate time series data with anomalies and anomaly labels.
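The two steps above can be illustrated with a minimal sketch. It is not the patented implementation: the function names `combine` and `inject_additive`, the sine wave used as a stand-in period shape, and the Gaussian stand-in for Pearson noise are all assumptions made for illustration.

```python
import math
import random


def combine(periodic, trend, noise, model="additive"):
    """Fuse three equally long components into one normal series."""
    if not (len(periodic) == len(trend) == len(noise)):
        raise ValueError("components must have the same length")
    if model == "additive":
        return [p + t + e for p, t, e in zip(periodic, trend, noise)]
    if model == "multiplicative":
        return [p * t * e for p, t, e in zip(periodic, trend, noise)]
    raise ValueError(f"unknown model: {model}")


def inject_additive(series, anomaly, start):
    """Add `anomaly` onto `series` from index `start`; return data and 0/1 labels."""
    if start < 0 or start + len(anomaly) > len(series):
        raise ValueError("anomaly does not fit inside the series")
    data = list(series)
    labels = [0] * len(series)
    for offset, value in enumerate(anomaly):
        data[start + offset] += value
        labels[start + offset] = 1
    return data, labels


rng = random.Random(0)
n = 200
periodic = [math.sin(2 * math.pi * t / 50) for t in range(n)]  # stand-in period shape
trend = [0.01 * t for t in range(n)]                           # linear drift
noise = [rng.gauss(0.0, 0.1) for _ in range(n)]                # Gaussian stand-in for Pearson noise

normal = combine(periodic, trend, noise, model="additive")
data, labels = inject_additive(normal, [2.0, 3.0, 2.0], start=120)
```

For the multiplicative model, the trend and noise components would normally be centered around 1 rather than 0, since a zero factor wipes out the product.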
In one embodiment of the invention, the periodic component is generated using a random midpoint displacement method. Specifically, the method comprises the following steps:
(1) generating the shape of a single period using the random midpoint displacement method;
(2) generating, by the random midpoint displacement differentiation method, a plurality of periods whose shapes differ slightly from the shape of the first period;
(3) normalizing each period so that its amplitude is 1, to obtain a period expression;
(4) sampling according to the period length specified by the user, and changing the amplitude and frequency of each period;
(5) connecting the periods to form the complete periodic component.
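Steps (3) to (5) can be sketched as follows. This is an illustrative reading only: "amplitude 1" is assumed here to mean a peak-to-peak range of 1, resampling is assumed to be linear interpolation, and the helper names `normalize`, `resample` and `build_periodic` are hypothetical.

```python
def normalize(shape):
    """Rescale a period shape so that its peak-to-peak range is 1."""
    lo, hi = min(shape), max(shape)
    if hi == lo:
        raise ValueError("a flat shape cannot be normalized")
    return [(y - lo) / (hi - lo) for y in shape]


def resample(shape, length):
    """Sample `length` points from a shape by linear interpolation.

    The end point is excluded so that consecutive periods can be
    concatenated without repeating the joint sample.
    """
    out = []
    for k in range(length):
        pos = k * (len(shape) - 1) / length
        i = int(pos)
        frac = pos - i
        out.append(shape[i] * (1.0 - frac) + shape[i + 1] * frac)
    return out


def build_periodic(shapes, lengths, amplitudes):
    """Normalize, resample, rescale and concatenate the periods."""
    series = []
    for shape, length, amplitude in zip(shapes, lengths, amplitudes):
        series += [amplitude * v for v in resample(normalize(shape), length)]
    return series


# Two slightly different period shapes with different lengths and amplitudes.
series = build_periodic(
    shapes=[[0.0, 2.0, 0.0], [0.0, 1.0, 3.0, 0.0]],
    lengths=[4, 6],       # a shorter period corresponds to a higher frequency
    amplitudes=[1.0, 2.0],
)
```

Changing an entry of `lengths` changes the frequency of that period, and changing an entry of `amplitudes` changes its height, which matches the two controls named in step (4).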
In one embodiment of the invention, the noise component is generated using a Pearson distribution, and the trend component is generated using a linear or non-linear function.
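As a rough illustration, the sketch below builds a linear and a non-linear trend and computes the four moment-based quantities of a noise sample. It does not implement a Pearson distribution system; the sample is an arbitrary list, the power-law trend is one arbitrary choice of non-linear function, and reading "first to fourth order central moments" as mean, variance, skewness and kurtosis is an assumption. All function names are hypothetical.

```python
def linear_trend(n, slope, intercept=0.0):
    """Linear drift: intercept + slope * t."""
    return [intercept + slope * t for t in range(n)]


def power_trend(n, scale, exponent):
    """One possible non-linear drift: scale * t ** exponent."""
    return [scale * t ** exponent for t in range(n)]


def moment_summary(sample):
    """Return (mean, variance, skewness, kurtosis) of a sample.

    Skewness and kurtosis are the third and fourth central moments
    standardized by the variance; kurtosis is not excess kurtosis.
    """
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2


mean, variance, skewness, kurtosis = moment_summary([-1.0, 0.0, 1.0])
```

A summary like this can be used to check that a generated noise component actually has the moments the user asked for, whichever generator produced it.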
In an embodiment of the invention, anomalies are generated using predefined anomaly patterns, which include two types: additive anomalies and behavioral anomalies. Additive anomalies are injected into the normal time series data through an additive model, and behavioral anomalies are injected into the normal time series data by randomly changing the shape and length of a period and randomly changing the first to fourth order central moments of the noise. Additive anomalies are generated using predefined shape templates, and behavioral anomalies are generated by the random midpoint displacement differentiation method and the Pearson distribution system. Further, extreme value theory is used to ensure the quality of the generated additive and behavioral anomalies, i.e., to ensure that they are indeed anomalies: a low-probability boundary is established through extreme value theory so that additive anomalies fall in the low-probability region and behavioral anomalies fall in the non-low-probability region.
In an embodiment of the present invention, the method further includes a third step of running the detection algorithms to be evaluated on the generated time series data with anomalies and anomaly labels, and ranking their performance scores in a unified manner to measure the performance of each detection algorithm.
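The text does not say how the low-probability boundary is computed. One common way to obtain such a boundary from extreme value theory is the peaks-over-threshold approach, sketched below under several assumptions: a generalized Pareto distribution is fitted to the excesses over an initial threshold by the method of moments, only an upper boundary is computed, and the names `pot_boundary`, `init_quantile` and `risk` are hypothetical.

```python
import math
import random


def pot_boundary(values, init_quantile=0.98, risk=1e-3):
    """Estimate an upper boundary exceeded with probability about `risk`.

    Returns (u, boundary), where u is the initial empirical threshold.
    """
    ordered = sorted(values)
    n = len(ordered)
    u = ordered[int(init_quantile * n)]
    excesses = [v - u for v in ordered if v > u]
    k = len(excesses)
    if k < 2:
        raise ValueError("not enough excesses above the initial threshold")
    mean = sum(excesses) / k
    var = sum((e - mean) ** 2 for e in excesses) / (k - 1)
    # Method-of-moments estimates of the generalized Pareto parameters.
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)
    r = risk * n / k
    if abs(shape) < 1e-9:
        # Exponential tail as the limiting case of a zero shape parameter.
        return u, u - scale * math.log(r)
    return u, u + (scale / shape) * (r ** (-shape) - 1.0)


def in_low_probability_region(value, boundary):
    """True when a value lies beyond the boundary."""
    return value > boundary


rng = random.Random(1)
normal = [rng.gauss(0.0, 1.0) for _ in range(5000)]
u, boundary = pot_boundary(normal)
```

Under this reading, an injected additive anomaly would be accepted only if `in_low_probability_region` returns True for its values, while the values of a behavioral anomaly would be required to stay on the other side of the boundary.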
In an embodiment of the present invention, a system for generating time series data with anomalies is provided, including a time series generation module and an anomaly injection module, wherein:
the time series generation module comprises a period generator, a noise generator, a trend generator and a component fusion module, wherein the period generator generates the periodic component using a random midpoint displacement method; the noise generator generates the noise component using a Pearson distribution; the trend generator generates the trend component using a linear or non-linear function; the component fusion module combines the periodic component, the noise component and the trend component into complete time series data through an additive model or a multiplicative model;
the anomaly injection module comprises an anomaly generation module and an injection module, wherein the anomaly generation module generates anomalies using predefined anomaly patterns, which include two types, additive anomalies and behavioral anomalies; additive anomalies are generated using predefined shape templates, and behavioral anomalies are generated by the random midpoint displacement differentiation method and the Pearson distribution system; and the injection module injects additive anomalies into the normal time series data through an additive model, and injects behavioral anomalies into the normal time series data by randomly changing the shape and length of a period and randomly changing the first to fourth order central moments of the noise.
FIG. 1 is an overall block diagram of a system for generating time series data with anomalies in an embodiment of the present invention. As shown in the figure, the system comprises a time series generation module, an anomaly injection module and a benchmark evaluation module, wherein:
the time series generation module comprises a period generator, a noise generator, a trend generator and a component fusion module, wherein the period generator generates the periodic component using a random midpoint displacement method; the noise generator generates the noise component using a Pearson distribution; the trend generator generates the trend component using a linear or non-linear function; the component fusion module combines the periodic component, the noise component and the trend component into complete time series data through an additive model or a multiplicative model;
the anomaly injection module comprises an anomaly generation module and an injection module, wherein the anomaly generation module generates anomalies using predefined anomaly patterns, which include two types, additive anomalies and behavioral anomalies; additive anomalies are generated using predefined shape templates, and behavioral anomalies are generated by the random midpoint displacement differentiation method and the Pearson distribution system; the injection module injects additive anomalies into the normal time series data through the additive model, and injects behavioral anomalies into the normal time series data by randomly changing the shape and length of a period and randomly changing the first to fourth order central moments of the noise;
the benchmark evaluation module runs the detection algorithms to be evaluated on the generated time series data with anomalies and anomaly labels, and ranks their performance scores in a unified manner to measure the performance of each detection algorithm.
In FIG. 1, the system is divided into three modules, namely a time series generation module, an anomaly injection module and a benchmark evaluation module, which correspond in sequence to the three stages of the workflow of the method: a time series generation stage, an anomaly injection stage and a benchmark evaluation stage. In the time series generation stage, parameters are first received from the user; the period generator, the noise generator and the trend generator each generate their component according to the parameters; and an additive model is then used to combine the three components into complete time series data. In the anomaly injection stage, extreme value theory is first used to establish a low-probability boundary for the normal data, and anomalies of different types are then generated according to the parameters specified by the user. If an additive anomaly is injected, its values are required to fall in the low-probability region; if a behavioral anomaly is injected, its values are restricted to the non-low-probability region. This ensures the quality of both types of anomaly. In the benchmark evaluation stage, the algorithm under evaluation and the benchmark algorithms are run on the generated data, and the results are ranked and returned to the user.
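The benchmark evaluation stage can be sketched as scoring every detector against the generated labels and sorting by score. The text does not name the performance score; the point-wise F1 score used here, the toy threshold detectors, and the names `f1_score` and `rank_detectors` are assumptions made for illustration.

```python
def f1_score(labels, predictions):
    """Point-wise F1 score between 0/1 labels and 0/1 predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rank_detectors(data, labels, detectors):
    """Run each detector on the data and rank by score, best first."""
    scores = {name: f1_score(labels, detect(data)) for name, detect in detectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


data = [0.1, 0.2, 5.0, 0.1, 0.3, 4.0, 0.2]
labels = [0, 0, 1, 0, 0, 1, 0]
detectors = {
    "threshold_3.0": lambda xs: [int(x > 3.0) for x in xs],
    "threshold_4.5": lambda xs: [int(x > 4.5) for x in xs],
    "flag_everything": lambda xs: [1] * len(xs),
}
ranking = rank_detectors(data, labels, detectors)
```

Because the data are generated, the same ranking can be repeated while changing a single generation parameter, which is the controlled-variable evaluation the description refers to.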
FIG. 2 is an example of the random midpoint displacement method used by the period generator in an embodiment of the invention. For ease of explanation, points and line segments are written in normal font and vectors in bold, e.g., A denotes a point, AB a line segment, and bold AB a vector. As shown in the figure, suppose there are two points P1 and P2, and M is the midpoint of P1 and P2. A random displacement vector MP3 perpendicular to the vector P1P2 is generated, and the coordinates of P3 are calculated by the formula
[Equation: coordinates of P3; equation image BDA0002963837320000091 is not available in this text]
where a = ||P1M||_2 is the length of line segment P1M, b = ||MP3||_2 is the length of line segment MP3, the angle symbol (equation image BDA0002963837320000092, not available in this text) denotes the angle between line segment P1M and line segment P1P3, and ||·||_2 is the L2 norm. This yields three points P1, P2 and P3, also called control points, which are connected in sequence from left to right to obtain a function consisting of two line segments, as shown by the broken line in FIG. 2. Recursively performing this process on line segments P1P3 and P3P2 yields a curve with a random shape, as shown by the solid broken line in the figure.
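The recursion described above can be sketched as follows. The exact formula for P3 is not reproduced in this text, so the sketch simply moves the midpoint M along the unit normal of P1P2 by a signed random length b; drawing b uniformly and bounding it by `ratio` times the half-length a are assumptions, as are the function names.

```python
import math
import random


def displace_midpoint(p1, p2, rng, ratio=0.3):
    """Return P3: the midpoint M of P1P2 moved perpendicular to P1P2."""
    (x1, y1), (x2, y2) = p1, p2
    mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    nx, ny = -dy / length, dx / length           # unit normal of P1P2
    a = length / 2.0                             # a = |P1M|
    b = rng.uniform(-ratio, ratio) * a           # signed b = |MP3|
    return (mx + b * nx, my + b * ny)


def random_midpoint_displacement(p1, p2, depth, rng):
    """Recursively displace midpoints; returns 2**depth + 1 control points."""
    if depth == 0:
        return [p1, p2]
    p3 = displace_midpoint(p1, p2, rng)
    left = random_midpoint_displacement(p1, p3, depth - 1, rng)
    right = random_midpoint_displacement(p3, p2, depth - 1, rng)
    return left + right[1:]                      # drop the duplicated P3


rng = random.Random(0)
points = random_midpoint_displacement((0.0, 0.0), (1.0, 0.0), 5, rng)
```

Keeping `ratio` small limits how far each segment can rotate, which makes it unlikely, though not impossible, that the resulting polyline folds back on itself along the time axis.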
FIG. 3 shows examples of the predefined additive anomaly templates in one embodiment. It includes two subfigures: (a) a class A anomaly template and (b) a class B anomaly template. By varying the parameters of a predefined template, TSAGen can generate a large number of additive anomalies that differ in shape. Class A anomalies are generated by the following formula:
[Equation: class A template f1(x); equation image BDA0002963837320000101 is not available in this text]
where
[Equation image BDA0002963837320000102 is not available in this text]
The rise and decay curves indicate the rate at which an anomaly occurs and recovers, respectively. ε1 is the rise time of the anomaly, and the anomaly reaches its maximum degree at time ε1; ε2 is the decay time of the anomaly; ε1 + ε2 is the duration of the anomaly. a is the rise function, b is the decay function, h is the absolute degree of the anomaly, and x is time; given x, the value of the anomaly at that time can be calculated by f1. All parameters are illustrated in FIG. 3.
Class B anomalies are generated by the following formula:
[Equation: class B template f2(x); equation image BDA0002963837320000103 is not available in this text]
where ε is the duration of the anomaly, h1 and h2 are the absolute degrees of the anomaly at its start and end, respectively, and x is time; given x, the value of the anomaly at that time can be calculated by f2. All parameters are illustrated in FIG. 3.
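The two template formulas are not reproduced in this text, so the following is only a hypothetical pair of templates that matches the parameter descriptions. The class A sketch assumes a piecewise form with a rise over ε1 and a decay over ε2, using linear rise and decay functions by default; the class B sketch assumes a linear change from h1 to h2 over ε. The names `class_a` and `class_b` and both functional forms are assumptions, not the patent's formulas.

```python
def class_a(x, h, eps1, eps2, rise=lambda u: u, decay=lambda u: 1.0 - u):
    """Hypothetical class A template: rise to h over eps1, decay over eps2.

    `rise` maps [0, 1] to [0, 1] increasing; `decay` maps [0, 1] to [1, 0].
    """
    if 0 <= x <= eps1:
        return h * rise(x / eps1)
    if eps1 < x <= eps1 + eps2:
        return h * decay((x - eps1) / eps2)
    return 0.0


def class_b(x, h1, h2, eps):
    """Hypothetical class B template: level moving from h1 to h2 over eps."""
    if 0 <= x <= eps:
        return h1 + (h2 - h1) * x / eps
    return 0.0


# Sample both templates on an integer time grid.
spike = [class_a(x, h=2.0, eps1=5, eps2=10) for x in range(16)]
shift = [class_b(x, h1=1.0, h2=3.0, eps=4) for x in range(5)]
```

Sampled templates like `spike` and `shift` are the kind of value lists that an additive injection step would add onto the normal series while marking the affected points as anomalous.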
FIG. 4 shows examples of behavioral anomalies generated by the random midpoint displacement differentiation method in one embodiment. FIG. 4 contains five subfigures: (a) is a reference for comparison, and (b), (c), (d) and (e) show the period shapes generated when the differentiation depth is 2, 4, 8 and 9, respectively. The method generates a period whose shape differs from the initial period by starting differentiation at a certain depth of the random midpoint displacement recursion. All periods share the same control points before the differentiation depth is reached; after it is reached, each period generates new control points independently, so that the shape of each period is different. As shown in the figure, the larger the differentiation depth, the larger the difference in shape between the generated periods, and vice versa.
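A simplified sketch of the differentiation idea follows. It displaces midpoints vertically rather than perpendicular to each segment, and it assumes that the differentiation depth counts the remaining recursion levels, so that a larger value makes the periods diverge earlier; this convention is one reading chosen to agree with the statement that a larger depth gives a larger difference. The names `shared_displacements`, `period_shape` and `roughness` are hypothetical.

```python
import random


def shared_displacements(depth, diff_depth, rng, roughness=0.5):
    """Midpoint offsets for the levels that every period shares."""
    table = []
    for level in range(depth - diff_depth):
        scale = roughness ** level
        table.append([rng.uniform(-scale, scale) for _ in range(2 ** level)])
    return table


def period_shape(depth, diff_depth, shared, rng, roughness=0.5):
    """Build one period; levels with at most diff_depth levels remaining are private."""
    heights = [0.0, 0.0]
    for level in range(depth):
        scale = roughness ** level
        refined = []
        for i in range(len(heights) - 1):
            if depth - level > diff_depth:
                offset = shared[level][i]             # shared control point
            else:
                offset = rng.uniform(-scale, scale)   # period-specific control point
            refined += [heights[i], (heights[i] + heights[i + 1]) / 2.0 + offset]
        refined.append(heights[-1])
        heights = refined
    return heights


def mean_gap(depth, diff_depth, seed=0):
    """Mean absolute difference between two periods that share a prefix."""
    shared = shared_displacements(depth, diff_depth, random.Random(seed))
    a = period_shape(depth, diff_depth, shared, random.Random(seed + 1))
    b = period_shape(depth, diff_depth, shared, random.Random(seed + 2))
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)


identical = mean_gap(depth=6, diff_depth=0)   # nothing is private: same shape
different = mean_gap(depth=6, diff_depth=6)   # everything is private
```

With a differentiation depth of 0 the two periods coincide exactly, and with intermediate values they agree on the coarse control points and differ only in the finer ones.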
In one embodiment, a computer device is provided, which may be a server. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store sample data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a method for generating time series data with anomalies.
In one embodiment, a computer device is provided, which includes a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the method for generating time series data with anomalies in the above embodiments.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which, when executed by a processor, implements the steps of the method for generating time series data with anomalies in the above embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer readable storage medium, and the computer program can include the processes of the embodiments of the methods for generating time series data with an exception. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (12)

1. A method for generating time series data with anomalies, comprising:
generating a periodic component, a noise component and a trend component, and combining the components into normal time series data through an additive model or a multiplicative model;
injecting anomalies into the normal time series data to generate time series data with anomalies and anomaly labels.
2. The method for generating time series data with anomalies according to claim 1, further comprising running the detection algorithms to be evaluated on the generated time series data with anomalies and anomaly labels, and ranking their performance scores in a unified manner to measure the performance of each detection algorithm.
3. The method for generating time series data with anomalies according to claim 1, wherein the periodic component is generated using a random midpoint displacement method.
4. The method for generating time series data with anomalies according to claim 3, wherein generating the periodic component comprises the steps of:
(1) generating the shape of a single period using the random midpoint displacement method;
(2) generating, by the random midpoint displacement differentiation method, a plurality of periods whose shapes differ slightly from the shape of the first period;
(3) normalizing each period so that its amplitude is 1, to obtain a period expression;
(4) sampling according to the period length specified by the user, and changing the amplitude and frequency of each period;
(5) connecting these periods to form the complete periodic component.
5. The method for generating time series data with anomalies according to any one of claims 1 to 4, wherein the noise component is generated using a Pearson distribution.
6. The method for generating time series data with anomalies according to claim 5, wherein the trend component is generated using a linear or non-linear function.
7. The method for generating time series data with anomalies according to claim 1, 2, 3, 4 or 6, wherein the anomalies are generated using predefined anomaly patterns, the anomaly patterns including two types, additive anomalies and behavioral anomalies, wherein the additive anomalies are injected into the normal time series data through an additive model, and the behavioral anomalies are injected into the normal time series data by randomly changing the shape and length of a period and by randomly changing the first-order to fourth-order central moments of the noise.
8. The method for generating time series data with anomalies according to claim 7, wherein the additive anomalies are generated using a predefined shape template, and the behavioral anomalies are generated by a random midpoint displacement differentiation method and a Pearson distribution system.
9. The method according to claim 8, wherein extreme value theory is used to ensure the quality of the generated additive anomalies and behavioral anomalies, i.e. to ensure that the anomalies are indeed anomalous; a low-probability boundary is established through extreme value theory, so that the additive anomalies fall in the low-probability region and the behavioral anomalies fall in the non-low-probability region, thereby ensuring the quality of the injected anomalies.
10. A system for generating time series data with anomalies, comprising:
a time series generation module comprising a period generator, a noise generator, a trend generator and a component fusion module, wherein the period generator generates a period component using a random midpoint displacement method; the noise generator generates a noise component using a Pearson distribution; the trend generator generates a trend component using a linear or non-linear function; and the component fusion module combines the period component, the noise component and the trend component into complete time series data through an additive model or a multiplicative model; and
an anomaly injection module comprising an anomaly generation module and an injection module, wherein the anomaly generation module generates anomalies using predefined anomaly patterns, the anomaly patterns including two types, additive anomalies and behavioral anomalies; the additive anomalies are generated using a predefined shape template, and the behavioral anomalies are generated by a random midpoint displacement differentiation method and a Pearson distribution system; and the injection module is responsible for injecting the additive anomalies into the normal time series data through an additive model, and for injecting the behavioral anomalies into the normal time series data by randomly changing the shape and length of a period and randomly changing the first-order to fourth-order central moments of the noise.
11. The system for generating time series data with anomalies according to claim 10, further comprising a benchmark evaluation module, wherein the benchmark evaluation module is responsible for running a detection algorithm to be evaluated on the generated time series data with anomalies and the anomaly labels, and for uniformly ranking the performance scores of the detection algorithm to be evaluated so as to measure its performance.
12. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, carries out the steps of the method for generating time series data with anomalies as set forth in claim 1.
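
For illustration only, the generation and injection flow recited in claims 1, 3, 4 and 7 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: every function and parameter name (midpoint_displacement, roughness, generate_series, inject_additive_anomaly and so on) is hypothetical, Gaussian noise stands in for the Pearson distribution system, the trend is taken to be linear, and the per-period shape differentiation, behavioral anomalies and the extreme value theory check are omitted.

```python
import random


def midpoint_displacement(levels, roughness=0.5, rng=None):
    """Build a single-period shape with 2**levels + 1 points by random midpoint displacement."""
    rng = rng or random.Random()
    points = [0.0, 0.0]  # equal endpoints so that consecutive periods join without a jump
    scale = 1.0
    for _ in range(levels):
        refined = []
        for left, right in zip(points, points[1:]):
            # Displace the midpoint of each segment by a random offset.
            mid = (left + right) / 2 + rng.uniform(-scale, scale)
            refined += [left, mid]
        refined.append(points[-1])
        points = refined
        scale *= roughness  # smaller displacements at finer levels
    return points


def normalize(shape):
    """Scale a shape so that its peak absolute value (amplitude) is 1."""
    peak = max(abs(v) for v in shape) or 1.0
    return [v / peak for v in shape]


def resample(shape, length):
    """Linearly interpolate a shape to `length` samples (end point excluded)."""
    out = []
    for i in range(length):
        pos = i * (len(shape) - 1) / length
        lo = int(pos)
        frac = pos - lo
        out.append(shape[lo] * (1 - frac) + shape[lo + 1] * frac)
    return out


def generate_series(n_periods=8, period_len=50, slope=0.01, noise_std=0.05, seed=0):
    """Combine period, trend and noise components with an additive model."""
    rng = random.Random(seed)
    period = resample(normalize(midpoint_displacement(5, rng=rng)), period_len)
    series = []
    for t in range(n_periods * period_len):
        seasonal = period[t % period_len]
        trend = slope * t                   # assumption: linear trend
        noise = rng.gauss(0.0, noise_std)   # assumption: Gaussian stand-in for Pearson noise
        series.append(seasonal + trend + noise)
    return series


def inject_additive_anomaly(series, start, template, scale):
    """Add a scaled shape template at `start`; return the new series and 0/1 labels."""
    out = list(series)
    labels = [0] * len(series)
    for k, v in enumerate(template):
        if start + k < len(out):
            out[start + k] += scale * v
            labels[start + k] = 1
    return out, labels
```

A call such as inject_additive_anomaly(generate_series(), 100, [0.5, 1.0, 0.5], 3.0) would return the modified series together with a label list marking the three injected points, which is the kind of (data, label) pair a detection algorithm could then be scored against.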

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110245171.0A CN112819386A (en) 2021-03-05 2021-03-05 Method, system and storage medium for generating time series data with abnormity

Publications (1)

Publication Number Publication Date
CN112819386A (en) 2021-05-18

Family

ID=75862910

Country Status (1)

Country Link
CN (1) CN112819386A (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101660400A (en) * 2009-09-15 2010-03-03 洛阳乾禾仪器有限公司 Alarming method by monitoring shutdown of pumping machine based on acceleration sensor
CN104915568A (en) * 2015-06-24 2015-09-16 哈尔滨工业大学 Satellite telemetry data abnormity detection method based on DTW
CN106685750A (en) * 2015-11-11 2017-05-17 华为技术有限公司 System anomaly detection method and device
CN109347653A (en) * 2018-09-07 2019-02-15 阿里巴巴集团控股有限公司 A kind of Indexes Abnormality discovery method and apparatus
CN109754110A (en) * 2017-11-03 2019-05-14 株洲中车时代电气股份有限公司 A kind of method for early warning and system of traction converter failure
CN110059894A (en) * 2019-04-30 2019-07-26 无锡雪浪数制科技有限公司 Equipment state assessment method, apparatus, system and storage medium
CN110909811A (en) * 2019-11-28 2020-03-24 国网湖南省电力有限公司 OCSVM (online charging management system) -based power grid abnormal behavior detection and analysis method and system
CN110927655A (en) * 2019-11-21 2020-03-27 北京中宸泓昌科技有限公司 Diagnosis method for electric energy meter flying away and high-speed power line carrier module
CN111178110A (en) * 2019-12-31 2020-05-19 江苏金帆电源科技有限公司 Bar code abnormity detection method based on artificial intelligence
CN111340086A (en) * 2020-02-21 2020-06-26 同济大学 Method, system, medium and terminal for processing label-free data
CN111740991A (en) * 2020-06-19 2020-10-02 上海仪电(集团)有限公司中央研究院 Anomaly detection method and system
CN112000830A (en) * 2020-08-26 2020-11-27 中国科学技术大学 Time sequence data detection method and device
CN112070155A (en) * 2020-09-07 2020-12-11 常州微亿智造科技有限公司 Time series data labeling method and device
CN112423327A (en) * 2019-08-22 2021-02-26 中兴通讯股份有限公司 Capacity prediction method and device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
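Zhu Songhao; Zhao Yunbin: "Abnormal behavior detection based on semi-supervised generative adversarial networks", Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), no. 04 *
Heihe: "SR (Spectral Residual) for time series anomaly detection", Zhihu column, HTTPS://ZHUANLAN.ZHIHU.COM/P/150225585, pages 1 - 5 *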

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204590A (en) * 2021-05-31 2021-08-03 中国人民解放军国防科技大学 Unsupervised KPI (Key performance indicator) anomaly detection method based on serialization self-encoder
CN113204590B (en) * 2021-05-31 2021-11-23 中国人民解放军国防科技大学 Unsupervised KPI (Key performance indicator) anomaly detection method based on serialization self-encoder
CN113282876A (en) * 2021-07-20 2021-08-20 中国人民解放军国防科技大学 Method, device and equipment for generating one-dimensional time sequence data in anomaly detection
CN113282876B (en) * 2021-07-20 2021-10-01 中国人民解放军国防科技大学 Method, device and equipment for generating one-dimensional time sequence data in anomaly detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination