CN113282876B - Method, device and equipment for generating one-dimensional time sequence data in anomaly detection - Google Patents
- Publication number
- CN113282876B (application number CN202110817079.7A)
- Authority
- CN
- China
- Prior art keywords
- differentiation
- curve
- depth
- preset
- curves
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3024—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
- G06F11/3419—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time
- G06F11/3423—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time where the assessed time is active or idle time
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/20—Drawing from basic elements, e.g. lines or circles
- G06T11/203—Drawing of straight lines or curves
Abstract
The application relates to a method and a device for generating one-dimensional time series data in anomaly detection. The method comprises the following steps: acquiring time series data requirement information for anomaly detection from a service system; recursing on two preset endpoints by a random midpoint displacement method, and obtaining a control curve when the recursion reaches a first differentiation depth; acquiring a preset second differentiation depth, and differentiating and recursing the control curve to the second differentiation depth by a random midpoint displacement differentiation method to generate a plurality of differentiation curves; comparing the differentiation curves with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories; and sampling the similar curves to obtain one-dimensional time series data of a plurality of categories. With this method, the shape of the time series can be varied, which addresses the problem of an insufficient quantity of time series data.
Description
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method, an apparatus, and a device for generating one-dimensional time series data in anomaly detection.
Background
Currently, large companies providing internet-based services need to monitor the real-time performance of their internet systems closely, because even a short service outage or quality degradation can result in significant traffic loss. These real-time performance data (e.g., search response time, CPU usage) are typically collected and stored as time series. To keep the service running smoothly, a time-series anomaly detection system is usually used to monitor the time-series data so that faults can be removed promptly.
However, the amount of time series data in a large company is extremely large, and the overhead of training a model and monitoring its behavior for each key performance indicator (KPI) is significant. Many KPIs nevertheless have similar shape attributes; exploiting this, current practice typically clusters time series data by shape and trains one unified model per cluster for anomaly detection. Model training requires sample data, and the models monitored by a time-series anomaly detection system require time series as sample data.
Disclosure of Invention
In view of the above, it is necessary to provide a method and an apparatus for generating one-dimensional time-series data in anomaly detection that can address the shortage of time-series sample data.
A method for generating one-dimensional time sequence data in anomaly detection comprises the following steps:
acquiring time sequence data demand information for anomaly detection from a service system; the time series data requirement information comprises: two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals;
recursing on the two preset endpoints by a random midpoint displacement method, and obtaining a control curve when the recursion reaches the first differentiation depth; wherein shared control points are obtained when the recursion reaches the first differentiation depth, the shared control points comprising the differentiation points generated by the recursion up to the first differentiation depth and the two preset endpoints, and the shared control points are connected in sequence from left to right to obtain the control curve;
acquiring the preset second differentiation depth, and generating a plurality of differentiation curves by differentiating and recursing the control curve to the second differentiation depth by a random midpoint displacement differentiation method;
comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories;
and sampling the similar curve according to a preset sampling rate to obtain one-dimensional time sequence data of a plurality of categories.
In one embodiment, a random midpoint displacement method is adopted to carry out recursion on two preset endpoints, and when the recursion reaches a first differentiation depth, a comparison curve is obtained; wherein, obtaining the shared control point when recursing to the first differentiation depth comprises:
wherein two adjacent differentiation points generated before the first differentiation depth are taken; the differentiation point generated when the recursion reaches the first differentiation depth is obtained by displacing the center point of those two adjacent points by a random displacement vector perpendicular to the vector joining them; and the remaining quantities are the lengths of two line segments and the included angle between two line segments (the formula and its symbols are not preserved in this text).
In one embodiment, generating a plurality of differentiation curves by differentiating and recursing the control curve to the second differentiation depth by the random midpoint displacement differentiation method comprises: recursing the differentiation of the control curve multiple times, by the random midpoint displacement differentiation method, to the second differentiation depth to obtain a plurality of control point sets; and connecting the points in each control point set in sequence from left to right to obtain a plurality of differentiation curves.
In one embodiment, comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain a plurality of categories of similarity curves, includes: and calculating the similarity of the differentiation curve and the control curve according to a plurality of preset similarity intervals, and dividing the differentiation curve into corresponding similarity intervals according to the similarity of the differentiation curve and the control curve to obtain similar curves of a plurality of categories.
In one embodiment, calculating the similarity of the differentiation curve and the control curve comprises:
and obtaining the similarity of the differentiation curve and the control curve according to the dynamic time warping (DTW) cost and the root mean square error between the differentiation curve and the control curve.
In one embodiment, the sampling is equidistant.
An apparatus for generating one-dimensional time-series data in abnormality detection, the apparatus comprising:
the data acquisition module is used for acquiring time sequence data demand information for anomaly detection from a service system; the time series data requirement information comprises: two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals;
the control curve acquisition module is used for recursing on the two preset endpoints by a random midpoint displacement method and obtaining a control curve when the recursion reaches the first differentiation depth; wherein shared control points are obtained when the recursion reaches the first differentiation depth, the shared control points comprising the differentiation points generated by the recursion up to the first differentiation depth and the two preset endpoints, and the shared control points are connected in sequence from left to right to obtain the control curve;
the differentiation curve acquisition module is used for acquiring the preset second differentiation depth, and generating a plurality of differentiation curves by differentiating and recursing the control curve to the second differentiation depth by a random midpoint displacement differentiation method;
the curve comparison module is used for comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories;
and the sampling module is used for sampling the similar curve according to a preset sampling rate to obtain one-dimensional time sequence data of a plurality of categories.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring time sequence data demand information for anomaly detection from a service system; the time series data requirement information comprises: two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals;
recursing on the two preset endpoints by a random midpoint displacement method, and obtaining a control curve when the recursion reaches the first differentiation depth; wherein shared control points are obtained when the recursion reaches the first differentiation depth, the shared control points comprising the differentiation points generated by the recursion up to the first differentiation depth and the two preset endpoints, and the shared control points are connected in sequence from left to right to obtain the control curve;
acquiring the preset second differentiation depth, and generating a plurality of differentiation curves by differentiating and recursing the control curve to the second differentiation depth by a random midpoint displacement differentiation method;
comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories;
and sampling the similar curve according to a preset sampling rate to obtain one-dimensional time sequence data of a plurality of categories.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring time sequence data demand information for anomaly detection from a service system; the time series data requirement information comprises: two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals;
recursing on the two preset endpoints by a random midpoint displacement method, and obtaining a control curve when the recursion reaches the first differentiation depth; wherein shared control points are obtained when the recursion reaches the first differentiation depth, the shared control points comprising the differentiation points generated by the recursion up to the first differentiation depth and the two preset endpoints, and the shared control points are connected in sequence from left to right to obtain the control curve;
acquiring the preset second differentiation depth, and generating a plurality of differentiation curves by differentiating and recursing the control curve to the second differentiation depth by a random midpoint displacement differentiation method;
comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories;
and sampling the similar curve according to a preset sampling rate to obtain one-dimensional time sequence data of a plurality of categories.
The method and the device for generating one-dimensional time series data in anomaly detection acquire time series data requirement information for anomaly detection from a business system, the information comprising two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals. First, recursion is carried out on the two preset endpoints by a random midpoint displacement method, and a control curve is obtained when the recursion reaches the first differentiation depth: the shared control points obtained at that depth, namely the differentiation points generated by the recursion and the two preset endpoints, are connected in sequence from left to right. Next, the preset second differentiation depth is acquired, and the control curve is differentiated to the second differentiation depth by the random midpoint displacement differentiation method to obtain a plurality of control point sets; the points in each set are connected from left to right to obtain a plurality of differentiation curves. Similarity intervals between the control curve and the differentiation curves are set according to the differing numbers of control points, the similarity between each differentiation curve and the control curve is calculated, and each differentiation curve is assigned to the corresponding similarity interval, yielding similar curves of a plurality of categories; this ensures that the differentiation curves within one interval are similar curves of one category. Finally, the similar curves are sampled according to a preset sampling rate to obtain one-dimensional time series data of the plurality of categories.
According to the invention, a plurality of differentiation curves are generated through midpoint displacement differentiation, similarity intervals between the control curve and the differentiation curves are set, and the differentiation curves are divided into the corresponding similarity intervals, so that curves in the same interval have high shape similarity and curves in different intervals have low shape similarity. Sampling the similar curves in the same similarity interval yields one-dimensional time series data with controllable shape similarity. The shape of the time series can thus be changed easily and dynamic sample data generated, which helps a time-series anomaly detection system perform model training and addresses the problem of insufficient time series data.
Drawings
FIG. 1 is a schematic flow chart illustrating a method for generating one-dimensional time-series data in anomaly detection according to an embodiment;
FIG. 2 is a schematic diagram of a process of a random midpoint displacement method used in one embodiment;
FIG. 3 shows the effect of the random midpoint displacement differentiation method used in one embodiment;
FIG. 4 plots the dynamic time warping (DTW) cost between the control curve and the differentiation curves generated by the random midpoint displacement differentiation method against differentiation depth, in one embodiment;
FIG. 5 plots the root mean square error between the control curve and the differentiation curves generated by the random midpoint displacement differentiation method against differentiation depth, in one embodiment;
FIG. 6 is a block diagram showing a configuration of a one-dimensional time-series data generating apparatus in abnormality detection according to an embodiment;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, there is provided a method for generating one-dimensional time-series data in anomaly detection, including the steps of:
Step 102: acquiring time series data requirement information for anomaly detection from a business system; the time series data requirement information comprises: two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals.
The business system may be an internet system, in which case the time series data obtained from it are network performance index data. The business system may also be a plant status detection system, in which case the time series data may be status data collected by sensors.
Step 104: recursing on the two preset endpoints by a random midpoint displacement method, and obtaining a control curve when the recursion reaches the first differentiation depth.
Shared control points are obtained when the recursion reaches the first differentiation depth; the shared control points comprise the differentiation points generated by the recursion up to the first differentiation depth and the two preset endpoints; the shared control points are connected in sequence from left to right to obtain the control curve.
The random midpoint displacement method is a quick and convenient way to generate a shape and to add detail to an existing shape. Recursion means that a procedure or function calls itself, directly or indirectly, within its own definition. The random midpoint displacement method is applied recursively to the two preset endpoints until the first differentiation depth is reached, generating differentiation points. The first differentiation depth is a particular depth in the random midpoint displacement process; different differentiation depths generate different numbers of differentiation points, and the prefixes "first" and "second" merely distinguish the depths. The two preset endpoints and the differentiation points together form the shared control points, and the curve obtained by connecting the shared control points in sequence from left to right is the control curve.
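The recursion described in step 104 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented formula: the names `midpoint_displace` and `control_curve`, the `roughness` scale factor, and the uniform random offset are all hypothetical choices; the only properties taken from the text are that each new point is the midpoint of two adjacent points displaced perpendicular to the segment joining them, and that the resulting points are connected from left to right.

```python
import math
import random


def midpoint_displace(points, rng, roughness=0.2):
    """One recursion level: insert a displaced midpoint between each adjacent pair."""
    out = [points[0]]
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        mx, my = (x1 + x2) / 2, (y1 + y2) / 2
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        # Unit vector perpendicular to the segment (x1, y1) -> (x2, y2).
        nx, ny = -dy / length, dx / length
        # Assumed displacement law: uniform offset scaled by the segment length.
        offset = rng.uniform(-1.0, 1.0) * roughness * length
        out.append((mx + offset * nx, my + offset * ny))
        out.append((x2, y2))
    return out


def control_curve(p_start, p_end, first_depth, seed=0):
    """Recurse from two endpoints to first_depth; return the shared control points."""
    rng = random.Random(seed)
    points = [p_start, p_end]
    for _ in range(first_depth):
        points = midpoint_displace(points, rng)
    # Sort by x so the points can be connected from left to right.
    return sorted(points)


if __name__ == "__main__":
    curve = control_curve((0.0, 0.0), (1.0, 0.0), first_depth=3)
    print(len(curve), "shared control points")
```

Each level doubles the number of segments, so a recursion to depth k from two endpoints yields 2^k + 1 control points.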
Step 106: acquiring the preset second differentiation depth, and generating a plurality of differentiation curves by differentiating and recursing the control curve to the second differentiation depth by a random midpoint displacement differentiation method.
When the control curve is differentiated and recursed to the second differentiation depth by the random midpoint displacement differentiation method, a plurality of new differentiation points are generated. The new differentiation points form a plurality of control point sets, and connecting the points in each control point set in sequence from left to right generates a plurality of differentiation curves with different shapes. The greater the differentiation depth, the more new differentiation points are generated, and the larger the shape difference between the resulting differentiation curve and the control curve.
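One way to picture step 106 is to continue the recursion from the same shared control points several times with independent random draws. The sketch below makes that assumption explicit; `refine`, `differentiate`, `extra_levels` (taken here as the second depth minus the first depth), `n_curves`, and `roughness` are hypothetical names and parameters, not terms from the patent.

```python
import math
import random


def refine(points, rng, roughness=0.2):
    """Insert one perpendicular-displaced midpoint between each adjacent pair."""
    out = [points[0]]
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        offset = rng.uniform(-1.0, 1.0) * roughness * length
        out.append(((x1 + x2) / 2 - offset * dy / length,
                    (y1 + y2) / 2 + offset * dx / length))
        out.append((x2, y2))
    return out


def differentiate(shared_points, extra_levels, n_curves, seed=0):
    """Refine the shared control points n_curves times with independent randomness."""
    curves = []
    for k in range(n_curves):
        rng = random.Random(seed + k)  # a different random stream per curve
        points = list(shared_points)
        for _ in range(extra_levels):
            points = refine(points, rng)
        curves.append(sorted(points))  # connect from left to right
    return curves


if __name__ == "__main__":
    shared = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.0)]
    for curve in differentiate(shared, extra_levels=2, n_curves=3):
        print(len(curve), "control points")
```

Every generated curve keeps the shared control points and differs only in the points added after the shared depth, which is what lets the extra depth act as a knob on shape difference.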
Step 108: comparing the differentiation curves with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories.
Because the random midpoint displacement differentiation method is random, the shape similarity between a differentiation curve and the control curve is random as well. A plurality of similarity intervals are therefore preset, the shape similarity between each differentiation curve and the control curve is calculated, and differentiation curves whose shape similarity falls in the same similarity interval are classified into one category, yielding similar curves of a plurality of categories. When the time series are used to train a clustering algorithm, the category corresponding to the similarity interval can serve as the label of the time series, so the label does not need to be set separately and the time cost of manual labeling is saved.
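The interval-based classification in step 108 amounts to bucketing each curve by its similarity score. The following sketch assumes half-open intervals `[low, high)` and uses the interval index as the category label; both conventions, and the function names, are illustrative assumptions, since the text does not specify how interval boundaries are treated.

```python
def classify(similarity, intervals):
    """Return the index of the interval [low, high) containing similarity, or None."""
    for label, (low, high) in enumerate(intervals):
        if low <= similarity < high:
            return label
    return None


def group_by_interval(similarities, intervals):
    """Map each category label to the indices of the curves that fall in it."""
    groups = {label: [] for label in range(len(intervals))}
    for index, similarity in enumerate(similarities):
        label = classify(similarity, intervals)
        if label is not None:  # curves outside every interval are discarded
            groups[label].append(index)
    return groups


if __name__ == "__main__":
    intervals = [(0.0, 1.0), (1.0, 2.0), (2.0, 5.0)]
    print(group_by_interval([0.4, 1.7, 0.9, 3.2, 9.0], intervals))
```

The returned labels can be attached directly to the sampled time series, which is the labeling shortcut the paragraph above describes.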
Step 110: sampling the similar curves according to a preset sampling rate to obtain one-dimensional time series data of a plurality of categories.
Two similar curves in the same category can form a piecewise function; sampling that piecewise function equidistantly yields a segment of randomly shaped one-dimensional time series data of that category. Likewise, sampling similar curves of different categories equidistantly yields a plurality of one-dimensional time series data with different shapes.
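Equidistant sampling of a curve given by its control points can be sketched as linear interpolation on a uniform grid. The sketch treats a single curve as a piecewise linear function of x and expresses the preset sampling rate as a sample count `n_samples`; both are simplifying assumptions, and `sample_equidistant` is a hypothetical name.

```python
from bisect import bisect_right


def sample_equidistant(points, n_samples):
    """Sample a piecewise linear curve at n_samples equally spaced x positions."""
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")
    pts = sorted(points)  # left-to-right order
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    x_first, x_last = xs[0], xs[-1]
    step = (x_last - x_first) / (n_samples - 1)
    samples = []
    for k in range(n_samples):
        # Pin the last sample to the right endpoint to avoid float drift.
        x = x_last if k == n_samples - 1 else x_first + k * step
        # Index of the right end of the segment that contains x.
        j = max(1, min(bisect_right(xs, x), len(xs) - 1))
        xa, xb = xs[j - 1], xs[j]
        t = 0.0 if xb == xa else (x - xa) / (xb - xa)
        samples.append(ys[j - 1] + t * (ys[j] - ys[j - 1]))
    return samples


if __name__ == "__main__":
    print(sample_equidistant([(0, 0), (1, 2), (2, 0)], 5))
```

The output list is the one-dimensional time series; its length depends only on the chosen sample count, not on how many control points the curve has.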
In one embodiment, as shown in fig. 2, a random midpoint displacement method is used to recurse two preset endpoints, and when the two preset endpoints recur to the first differentiation depth, a comparison curve is obtained; wherein, obtaining the shared control point when recursing to the first differentiation depth comprises:
wherein two adjacent differentiation points generated before the first differentiation depth are taken; the differentiation point generated when the recursion reaches the first differentiation depth is obtained by displacing the center point of those two adjacent points by a random displacement vector perpendicular to the vector joining them; and the remaining quantities are the lengths of two line segments and the included angle between two line segments (the formula and its symbols are not preserved in this text). The resulting points are connected in sequence from left to right to obtain the control curve.
In one embodiment, generating a plurality of differentiation curves by differentiating and recursing the control curve to the second differentiation depth by the random midpoint displacement differentiation method comprises: recursing the differentiation of the control curve multiple times, by the random midpoint displacement differentiation method, to the second differentiation depth to obtain a plurality of control point sets; and connecting the points in each control point set in sequence from left to right to obtain a plurality of differentiation curves.
The second differentiation depth is a differentiation depth greater than the first differentiation depth in the random midpoint displacement process. The random midpoint displacement differentiation method is explained below with reference to FIG. 2. Three quantities are used: the maximum recursion depth of the random midpoint displacement method, the differentiation depth, and the shared depth (their symbols and the relation among them are not preserved in this text). For each recursion depth i, consider the set of control points generated at that depth during the random midpoint displacement, where | | denotes the number of elements in a set. A random midpoint displacement process with a given differentiation depth finally generates two control point sets, and the control points they have in common form the common control point set. Consequently, the number of control points that differ between differentiation curves of the same random midpoint displacement process (denoted m) is related to the differentiation depth by a formula that is not preserved in this text. As that relation indicates, the number of control points differing between the differentiation curves increases as the differentiation depth increases, and the shape difference between the differentiation curves and the control curve becomes larger, as shown in FIG. 3. Using this property, multiple differentiation curves whose shape similarity to the control curve is controlled can be generated by controlling the differentiation depth.
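Since the original relation is not recoverable here, the count can be worked out under one explicit assumption: every recursion level inserts exactly one new point between each pair of adjacent control points, starting from the two preset endpoints. The symbols below (n for the total recursion depth, s for the shared depth, N(k) for the number of control points after k levels) are hypothetical and are not the patent's notation.

```latex
% Assumed model: one new midpoint per segment per level, two endpoints at level 0.
\begin{align*}
N(0) &= 2, \qquad N(k) = 2\,N(k-1) - 1 \;\Longrightarrow\; N(k) = 2^{k} + 1, \\
m &= N(n) - N(s) = 2^{n} - 2^{s} = 2^{s}\left(2^{\,n-s} - 1\right), \qquad 0 \le s \le n .
\end{align*}
```

Under this model, with the shared depth s fixed, m grows with the number of extra levels n - s, which agrees with the qualitative statement that deeper differentiation produces more differing control points. For example, s = 2 and n = 4 give m = 16 - 4 = 12.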
In one embodiment, comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain a plurality of categories of similarity curves, includes: and calculating the similarity of the differentiation curve and the control curve according to a plurality of preset similarity intervals, and dividing the differentiation curve into corresponding similarity intervals according to the similarity of the differentiation curve and the control curve to obtain similar curves of a plurality of categories.
Differentiation curves whose shape similarity to the control curve falls in the same similarity interval are classified into one category, so that similar curves of both high and low similarity are obtained in a single classification pass. When the time series are used to train a clustering algorithm, the category corresponding to the similarity interval can serve as the label of the time series, so the label does not need to be set separately and the time cost of manual labeling is saved.
In one embodiment, calculating the similarity of the differentiation curve and the control curve comprises:
and obtaining the similarity of the differentiation curve and the control curve according to the dynamic time warping (DTW) cost and the root mean square error between the differentiation curve and the control curve.
The dynamic time warping cost, also called the DTW cost, is computed by building a distance matrix between every pair of points of the two sequences and finding a path from the upper-left corner to the lower-right corner of the matrix that minimizes the sum of the elements on the path; that minimum sum measures the similarity between the two sequences. The root mean square error (RMSE) is the square root of the mean square error (MSE), where the MSE is the expected value of the squared difference between the estimated value and the actual value; the smaller the value, the closer the two sequences are. The similarity is obtained by calculating the DTW cost and the RMSE between the differentiation curve and the control curve.
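Both measures can be computed with a few lines of standard code. The sketch below uses the textbook dynamic-programming form of DTW with absolute difference as the point distance, and a plain RMSE over equal-length sequences; the point distance and the function names are assumptions. How the two values are combined into a single similarity is not specified in the text, so the sketch returns them separately.

```python
import math


def dtw_cost(a, b):
    """Minimum cumulative |a[i] - b[j]| along a monotone path through the distance matrix."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = abs(a[i - 1] - b[j - 1])
            cost[i][j] = dist + min(cost[i - 1][j],      # step down
                                    cost[i][j - 1],      # step right
                                    cost[i - 1][j - 1])  # diagonal step
    return cost[n][m]


def rmse(a, b):
    """Root mean square error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


if __name__ == "__main__":
    control = [0.0, 1.0, 2.0, 1.0, 0.0]
    candidate = [0.0, 1.5, 2.5, 1.0, 0.0]
    print(dtw_cost(control, candidate), rmse(control, candidate))
```

DTW tolerates local stretching along the time axis, while RMSE compares samples index by index, so the two measures respond differently to shifted versus rescaled shapes.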
Fig. 4 and 5 show how the differentiation depth affects the dynamic time warping cost (DTW cost, one similarity measure) and the root mean square error (RMSE, another similarity measure) between the differentiation curve and the control curve: both grow as the differentiation depth increases. On this basis, similarity intervals between the differentiation curves and the control curve can be specified, and differentiation curves whose similarities fall in the same interval can be classified into one category.
In one embodiment, the sampling is equidistant.
Sampling the piecewise functions formed by the similar curves of the multiple categories at equal intervals yields multiple groups of one-dimensional time series data with different shapes. The generated time series are dynamic sample data with variable shapes, which a time series anomaly detection system can use for model training.
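Equidistant sampling of a piecewise-linear curve can be sketched as follows. The function name and the use of a sample count (rather than a sampling rate) are assumptions for illustration, and the control points are assumed to be sorted by strictly increasing x.

```python
def sample_equidistant(points, num_samples):
    """Sample a piecewise-linear curve at equally spaced x positions.

    points: list of (x, y) control points sorted by strictly increasing x.
    Returns num_samples y values taken from x_min to x_max inclusive.
    """
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    x_min, x_max = points[0][0], points[-1][0]
    step = (x_max - x_min) / (num_samples - 1)
    samples = []
    seg = 0  # index of the segment that currently contains x
    for k in range(num_samples):
        x = x_min + k * step
        # Advance to the segment [points[seg], points[seg + 1]] containing x.
        while seg < len(points) - 2 and x > points[seg + 1][0]:
            seg += 1
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        t = (x - x0) / (x1 - x0)
        samples.append(y0 + t * (y1 - y0))  # linear interpolation
    return samples

curve = [(0.0, 0.0), (2.0, 2.0), (4.0, 0.0)]
series = sample_equidistant(curve, 5)
```

Each returned list is one one-dimensional time series; applying the same sampling to every curve in a category gives series of equal length that can be fed to a model directly.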
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 6, there is provided a one-dimensional time-series data generating apparatus in anomaly detection, including: a data acquisition module 601, a control curve acquisition module 602, a differentiation curve acquisition module 603, a curve comparison module 604, and a sampling module 605, wherein:
a data acquisition module 601, configured to acquire time-series data demand information for anomaly detection from a service system; the time series data requirement information comprises: two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals;
a control curve acquisition module 602, configured to recurse on the two preset endpoints by using a random midpoint displacement method, and obtain a control curve when the recursion reaches the first differentiation depth; obtain shared control points when the recursion reaches the first differentiation depth; the shared control points include: the differentiation points generated when the recursion reaches the first differentiation depth and the two preset endpoints; the shared control points are connected in sequence from left to right to obtain the control curve;
a differentiation curve acquisition module 603, configured to obtain the preset second differentiation depth, and generate a plurality of differentiation curves by recursively differentiating the control curve to the second differentiation depth using a random midpoint displacement differentiation method;
a curve comparison module 604, configured to compare the differentiation curves with the control curve according to the plurality of preset similarity intervals, so as to obtain similar curves of multiple categories;
a sampling module 605, configured to sample the similar curves according to a preset sampling rate to obtain one-dimensional time series data of multiple categories.
In one embodiment, the control curve acquisition module 602 is further configured to recurse on the two preset endpoints by using a random midpoint displacement method, and obtain the control curve when the recursion reaches the first differentiation depth; wherein obtaining the shared control points when recursing to the first differentiation depth comprises:
[Formula and symbols not reproduced in this text.] The quantities defined for the formula are, in order: two differentiation points generated before the first differentiation depth, which are adjacent points; the differentiation point generated when the recursion reaches the first differentiation depth; the center point of the two adjacent points; a vector; a random displacement vector perpendicular to that vector; the lengths of two line segments; and the included angle between those two line segments.
In one embodiment, the differentiation curve acquisition module 603 is further configured to generate a plurality of differentiation curves by recursively differentiating the control curve to the second differentiation depth using a random midpoint displacement differentiation method, including: recursing the control curve to the second differentiation depth multiple times by the random midpoint displacement differentiation method to obtain a plurality of control point sets; and connecting the points in each control point set from left to right in sequence to obtain a plurality of differentiation curves.
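The two-stage generation (a control curve at the first depth, then several differentiation curves continued from the same shared control points) can be sketched roughly as below. This is only an illustration under stated assumptions: the displacement is taken perpendicular to each segment as described, but its magnitude rule (uniform within a fraction of the segment length, set by a hypothetical `roughness` parameter) is an assumption, since the displacement formula is not reproduced in this text. The sketch also assumes the second depth is counted from the endpoints and exceeds the first.

```python
import math
import random

def subdivide(points, rng, roughness=0.3):
    """One recursion level: insert a displaced midpoint into every segment."""
    result = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        # Unit vector perpendicular to the segment direction (dx, dy).
        nx, ny = -dy / length, dx / length
        # Assumed magnitude rule: uniform in +/- roughness * segment length.
        shift = rng.uniform(-roughness, roughness) * length
        result.append((mx + shift * nx, my + shift * ny))
        result.append((x1, y1))
    return result

def displace_to_depth(points, levels, rng):
    """Apply `levels` further recursion levels to an existing point list."""
    for _ in range(levels):
        points = subdivide(points, rng)
    return points

rng = random.Random(42)
endpoints = [(0.0, 0.0), (10.0, 0.0)]
first_depth, second_depth = 3, 5  # hypothetical depths
control = displace_to_depth(endpoints, first_depth, rng)
# Each differentiation curve continues from the same shared control points.
variants = [displace_to_depth(control, second_depth - first_depth, rng)
            for _ in range(4)]
```

Every variant keeps the shared control points unchanged and differs only in the points added below the first depth, which is why the resulting curves resemble the control curve to varying degrees. For steep segments this simple rule does not guarantee strictly increasing x, so a production version would need an additional constraint.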
In one embodiment, the curve comparison module 604 is further configured to compare the differentiation curves with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories, including: calculating the similarity between each differentiation curve and the control curve, and assigning each differentiation curve to the corresponding similarity interval according to that similarity, thereby obtaining similar curves of a plurality of categories.
In one embodiment, the curve comparison module 604 is further configured to obtain the similarity between the differentiation curve and the control curve according to the dynamic time warping (DTW) cost and the root mean square error (RMSE) between the differentiation curve and the control curve.
In one embodiment, the sampling module 605 is further configured to sample in an equidistant manner.
For specific limitations of the one-dimensional time series data generation apparatus in anomaly detection, reference may be made to the above limitations of the one-dimensional time series data generation method in anomaly detection, which are not repeated here. The modules of the apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a one-dimensional time-series data generation method in abnormality detection. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the method in the above embodiments when the processor executes the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the method in the above-mentioned embodiments.
Those skilled in the art will understand that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application, and although their description is specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (8)
1. A method for generating one-dimensional time-series data in anomaly detection, the method comprising:
acquiring time sequence data demand information for anomaly detection from a service system; the time series data requirement information comprises: two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals;
recursing on the two preset endpoints by using a random midpoint displacement method, and obtaining a control curve when the recursion reaches the first differentiation depth; obtaining shared control points when the recursion reaches the first differentiation depth; the shared control points include: the differentiation points generated when the recursion reaches the first differentiation depth and the two preset endpoints; the shared control points are connected in sequence from left to right to obtain the control curve;
acquiring a preset second differentiation depth, and generating a plurality of differentiation curves when the control curves are differentiated and recurred to the second differentiation depth by adopting a random midpoint displacement differentiation method;
comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories;
sampling the similar curve according to a preset sampling rate to obtain one-dimensional time sequence data of a plurality of categories;
the step of comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain a plurality of categories of similar curves comprises:
and calculating the similarity of the differentiation curve and the control curve according to a plurality of preset similarity intervals, and dividing the differentiation curve into corresponding similarity intervals according to the similarity of the differentiation curve and the control curve to obtain similar curves of a plurality of categories.
2. The method of claim 1, wherein a random midpoint displacement method is used to recurse two preset endpoints, and when the two preset endpoints recur to a first differentiation depth, a comparison curve is obtained; wherein, obtaining the shared control point when recursing to the first differentiation depth comprises:
[Formula and symbols not reproduced in this text.] The quantities defined for the formula are, in order: two differentiation points generated before the first differentiation depth, which are adjacent points; the differentiation point generated when the recursion reaches the first differentiation depth; the center point of the two adjacent points; a vector; a random displacement vector perpendicular to that vector; the lengths of two line segments; and the included angle between those two line segments.
3. The method of claim 1, wherein generating a plurality of differentiation curves when differentiating and recursing the control curve to a second differentiation depth using a random midpoint-shift differentiation method comprises:
performing multiple recursions on the differentiation of the control curve to a second differentiation depth by adopting a random midpoint displacement differentiation method to obtain multiple control point sets;
and sequentially connecting the points in the control point set from left to right to obtain a plurality of differentiation curves.
4. The method of claim 1, wherein calculating the similarity of the differentiation curve and the control curve comprises:
and obtaining the similarity of the differentiation curve and the control curve according to the dynamic time warping (DTW) cost and the root mean square error (RMSE) between the differentiation curve and the control curve.
5. The method of claim 1, wherein the sampling pattern is equidistant sampling.
6. An apparatus for generating one-dimensional time-series data in abnormality detection, the apparatus comprising:
the data acquisition module is used for acquiring time sequence data demand information for anomaly detection from a service system; the time series data requirement information comprises: two preset endpoints, a first differentiation depth, a second differentiation depth and a plurality of similarity intervals;
the control curve acquisition module is used for recursing on the two preset endpoints by adopting a random midpoint displacement method and obtaining a control curve when the recursion reaches the first differentiation depth; obtaining shared control points when the recursion reaches the first differentiation depth; the shared control points include: the differentiation points generated when the recursion reaches the first differentiation depth and the two preset endpoints; the shared control points are connected in sequence from left to right to obtain the control curve;
the differentiation curve acquisition module is used for acquiring a preset second differentiation depth, and generating a plurality of differentiation curves when the control curve is differentiated and recurred to the second differentiation depth by adopting a random midpoint displacement differentiation method;
the curve comparison module is used for comparing the differentiation curve with the control curve according to a plurality of preset similarity intervals to obtain similar curves of a plurality of categories;
the sampling module is used for sampling the similar curve according to a preset sampling rate to obtain one-dimensional time sequence data of a plurality of categories;
and the curve comparison module is further used for calculating the similarity between the differentiation curve and the control curve according to a plurality of preset similarity intervals, and dividing the differentiation curve into corresponding similarity intervals according to the similarity between the differentiation curve and the control curve to obtain similar curves of a plurality of categories.
7. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 5 when executing the computer program.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110817079.7A CN113282876B (en) | 2021-07-20 | 2021-07-20 | Method, device and equipment for generating one-dimensional time sequence data in anomaly detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113282876A (en) | 2021-08-20 |
CN113282876B (en) | 2021-10-01 |
Family
ID=77286895
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110817079.7A Active CN113282876B (en) | 2021-07-20 | 2021-07-20 | Method, device and equipment for generating one-dimensional time sequence data in anomaly detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113282876B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114726749B (en) * | 2022-03-02 | 2023-10-31 | 阿里巴巴(中国)有限公司 | Data anomaly detection model acquisition method, device, equipment and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3474237A (en) * | 1966-10-03 | 1969-10-21 | Automation Ind Inc | Strain gage rosette calculator |
JP2007173907A (en) * | 2005-12-19 | 2007-07-05 | Nippon Telegr & Teleph Corp <Ntt> | Abnormal traffic detection method and device |
CN110473084A (en) * | 2019-07-17 | 2019-11-19 | 中国银行股份有限公司 | A kind of method for detecting abnormality and device |
CN110909046A (en) * | 2019-12-02 | 2020-03-24 | 上海舵敏智能科技有限公司 | Time series abnormality detection method and device, electronic device, and storage medium |
CN112037182A (en) * | 2020-08-14 | 2020-12-04 | 中南大学 | Locomotive running gear fault detection method and device based on time sequence image and storage medium |
CN112819386A (en) * | 2021-03-05 | 2021-05-18 | 中国人民解放军国防科技大学 | Method, system and storage medium for generating time series data with abnormity |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7613576B2 (en) * | 2007-04-12 | 2009-11-03 | Sun Microsystems, Inc. | Using EMI signals to facilitate proactive fault monitoring in computer systems |
CN107133343B (en) * | 2017-05-19 | 2018-04-13 | 哈工大大数据产业有限公司 | Big data abnormal state detection method and device based on time series approximate match |
CN108982106B (en) * | 2018-07-26 | 2020-09-22 | 安徽大学 | Effective method for rapidly detecting kinetic mutation of complex system |
CN111506637B (en) * | 2020-06-17 | 2020-11-27 | 北京必示科技有限公司 | Multi-dimensional anomaly detection method and device based on KPI (Key Performance indicator) and storage medium |
CN111967508A (en) * | 2020-07-31 | 2020-11-20 | 复旦大学 | Time series abnormal point detection method based on saliency map |
CN112685476A (en) * | 2021-01-06 | 2021-04-20 | 银江股份有限公司 | Periodic multivariate time series anomaly detection method and system |
Non-Patent Citations (1)
Title |
---|
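"Research and Application of Time Series Anomaly Detection Algorithms"; Lü Yuhong (吕玉红); China Master's Theses Full-text Database, Information Science and Technology; 2018-03-23; full text * |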
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||