WO2019179248A1

WO2019179248A1 - Anomaly detection method and device

Info

Publication number: WO2019179248A1
Application number: PCT/CN2019/073880
Authority: WO
Inventors: 周扬
Original assignee: 阿里巴巴集团控股有限公司
Priority date: 2018-03-19
Filing date: 2019-01-30
Publication date: 2019-09-26
Also published as: CN108563548B; CN108563548A; TW201941058A

Abstract

Disclosed are an anomaly detection method and device. The method comprises using, as a normal sample of a training set, data sampled during normal operation of a system; acquiring anomaly data; and cyclically performing the following steps until an expected identification effect of an anomaly detection model is achieved, so as to facilitate performing anomaly detection on data under detection by means of the anomaly detection model having achieved the expected identification effect: extending the anomaly data, and adding, as anomaly samples, the anomaly data and the extended anomaly data to the training set; training the anomaly detection model according to the training set, and determining the identification effect of the anomaly detection model; and when the identification effect of the anomaly detection model is worse than expected, acquiring new anomaly data. The method is adopted to acquire more anomaly samples, such that a training sample set having sufficient positive samples and negative samples is acquired with reference to normal samples, thereby improving the accuracy of fault identification performed by an anomaly detection model trained by the training set.

Description

Anomaly detection method and device

Technical field

The present specification relates to the field of computer technology, and in particular, to an abnormality detecting method and apparatus.

Background

With the continuous development of technology, data processing systems must cope with ever-increasing amounts of data, especially systems that support multiple services. Such systems usually rely on a large number of cooperating servers to process data at scale. A system that provides multiple services generally uses separate platforms to support the different services, and each platform can include one or more servers. As a result, the system may need hundreds or even thousands of servers. While the system is running, the code, databases, and configuration of these servers change very frequently; the number of changes per week may reach tens of thousands or more. Negligence or an error in any link may cause a platform, or even the whole system, to fail. Because the system is large and its servers may be distributed across different regions, a fault is difficult to locate, and a long resolution time causes huge losses. Therefore, when a system failure occurs, identifying the abnormality accurately and promptly allows the damage to be contained and losses to be reduced in the shortest possible time.

At present, a commonly used approach is to form a time series from business-critical indicators calculated per minute and to identify faults by detecting abnormalities in that time series. However, this approach relies mainly on historical data from system operation. Because abnormalities are usually rare in the historical data, they are not sufficient as a basis for fault identification, so abnormalities are generally identified by analyzing the patterns in normal data. The samples are therefore of a single kind, and the rates of misjudged and missed faults are relatively high.

Summary of the invention

In view of the above technical problems, the present specification provides an abnormality detecting method and apparatus.

Specifically, the present specification is implemented by the following technical solutions:

In a first aspect, an embodiment of the present specification provides an abnormality detecting method. The method includes:

Obtaining sampling data when the system is in normal operation, and using the sampling data as a normal sample in the training set;

Obtaining abnormal data according to a pre-made rule, and cyclically performing the following steps until the recognition effect of the abnormality detection model reaches the expected level, so that the abnormality detection model whose recognition effect reaches the expected level is used to perform abnormality detection on the data to be detected:

Extending the abnormal data, adding the abnormal data and the extended abnormal data as abnormal samples in the training set;

The abnormality detection model is trained according to a training set after the abnormal data is added, and the recognition effect of the abnormality detection model is determined;

When the recognition effect of the abnormality detection model is lower than expected, acquiring new abnormal data according to the pre-made rule.

In a second aspect, an embodiment of the present specification provides an abnormality detecting device. The device includes:

a first acquiring unit, configured to acquire sampling data when the system is in normal operation, and use the sampling data as a normal sample in the training set;

a second acquiring unit, configured to acquire abnormal data according to the pre-made rule;

a looping unit, configured to cyclically execute the steps of the following extension unit, training unit, and second acquiring unit until the recognition effect of the abnormality detection model reaches the expected level, so that the abnormality detection model whose recognition effect reaches the expected level is used to perform abnormality detection on the data to be detected;

The extension unit is configured to extend the abnormal data, and add the abnormal data and the extended abnormal data as abnormal samples in the training set;

The training unit is configured to train the abnormality detection model according to a training set after adding abnormal data, and determine a recognition effect of the abnormality detection model;

The second obtaining unit is further configured to: when the recognition effect of the abnormality detecting model is lower than expected, acquire new abnormal data according to the prefabricated rule.

In a third aspect, an embodiment of the present specification provides a computer device, including a memory, a processor, and a computer program stored on the memory and operable on the processor, wherein the processor, when executing the program, implements the method steps of the aforementioned first aspect.

In a fourth aspect, a computer readable storage medium is provided having stored thereon a computer program that, when executed by a processor, implements the method of the first aspect described above.

In a fifth aspect, a computer program product is provided, comprising instructions that, when run on a computer, cause the computer to perform the method of the first aspect described above.

Through the embodiments of the present specification, abnormal data can be acquired and extended to obtain more abnormal samples, which, combined with the normal samples, yield a training set with sufficient positive and negative samples, thereby improving the accuracy of fault identification performed by the abnormality detection model trained on that training set.

The above general description and the following detailed description are merely exemplary and explanatory and are not intended to limit the embodiments.

Moreover, any of the embodiments of the present specification does not need to achieve all of the above effects.

Drawings

In order to illustrate the embodiments of the present specification or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some of the embodiments described in the present specification, and those skilled in the art can obtain other drawings based on these drawings.

FIG. 1 is a schematic diagram of an application scenario shown in an embodiment of the present specification;

FIG. 2 is a schematic diagram of an abnormality detecting method according to an embodiment of the present specification;

FIG. 3 is a schematic diagram of another abnormality detecting method shown in an embodiment of the present specification;

FIG. 4 is a schematic diagram of another abnormality detecting method shown in an embodiment of the present specification;

FIG. 5 is a schematic flow chart of an abnormality detecting method according to an embodiment of the present specification;

FIG. 6 is a schematic structural diagram of an abnormality detecting device according to an embodiment of the present specification;

FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of the present specification.

Detailed description

Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. The following description refers to the same or similar elements in the different figures unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present specification. Instead, they are merely examples of devices and methods consistent with aspects of the present specification as detailed in the appended claims.

The terminology used in the present specification is for the purpose of describing particular embodiments only and is not intended to limit the present specification. The singular forms "a", "said", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination."

Data processing systems must deal with ever-increasing amounts of data, especially systems that support multiple services. Such systems usually rely on a large number of cooperating servers to process data at scale. A system that supports multiple services generally uses separate platforms to support the different services, and each platform can include one or more servers.

Take the Ant Financial Services data processing system as an example. Ant Financial Services covers hundreds of businesses in areas such as convenience services, wealth management, capital exchanges, and shopping and entertainment, and hundreds of platforms support these businesses. Because there are so many platforms, changes to code, databases, and configuration are very frequent, and tens of thousands of changes or more may occur every week. However, actual failures during operation of the system are infrequent, and some platforms have never experienced a failure at all. As a result, the abnormal data in the historical data sampled during operation has insufficient coverage, and abnormality detection based on that historical data is unsatisfactory. In addition, because historical abnormal data is scarce, newly identified abnormal data is difficult to match against it, so the root cause of an abnormality is hard to analyze from historical data and must be judged by experienced technical personnel, which is costly and inefficient.

In view of the above problem, the embodiments of the present specification provide an abnormality detecting method and apparatus. The system architecture in which the solution operates is described first. Referring to FIG. 1, the entities involved in the embodiments of the present specification include a data processing system 100 and a computer device 200. The data processing system 100 may include a service server, a terminal, and the like. The computer device 200 can be implemented independently of the data processing system 100 or by devices in the data processing system 100. For example, the functionality of the computer device 200 can be implemented by a service server in the data processing system 100.

In the embodiments of the present specification, the abnormality detection model is trained by the computer device 200, and the trained abnormality detection model is used to perform abnormality detection on the data to be detected of the data processing system 100.

As shown in FIG. 2, in one example, the computer device 200 updates the abnormal samples in the training set by acquiring abnormal data and extending it, and trains the abnormality detection model according to the updated training set. If the recognition effect of the trained model fails to reach the expected level, the computer device 200 continues to acquire abnormal data and extend it to update the abnormal samples in the training set. When the recognition effect of the model trained on the updated training set reaches the expected level, training ends, and the model finally obtained is used to perform abnormality detection on the data to be detected of the data processing system. Each update of the training set increases the number of abnormal samples, so that enough abnormal samples are obtained to serve as the basis for abnormality detection.
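The loop of FIG. 2 can be illustrated with a minimal Python sketch. All names here (`train_until_expected`, `acquire`, `extend`, `train_and_score`) are hypothetical placeholders rather than names from the specification, and the `max_rounds` safety limit is an added assumption; the text itself loops until the expected level is reached.

```python
from typing import Callable, List, Tuple

Sample = Tuple[List[float], int]  # (features, label); label 1 = abnormal, 0 = normal


def train_until_expected(
    normal_samples: List[Sample],
    acquire: Callable[[], List[Sample]],
    extend: Callable[[List[Sample]], List[Sample]],
    train_and_score: Callable[[List[Sample]], float],
    expected_score: float,
    max_rounds: int = 20,  # assumption: a safety limit not present in the text
) -> Tuple[List[Sample], float, int]:
    """Grow the abnormal part of the training set until the score is good enough."""
    training_set = list(normal_samples)
    abnormal = acquire()
    score = 0.0
    for round_no in range(1, max_rounds + 1):
        # Add the acquired abnormal data together with its extensions.
        training_set.extend(abnormal)
        training_set.extend(extend(abnormal))
        # Train on the updated set and measure the recognition effect.
        score = train_and_score(training_set)
        if score >= expected_score:
            return training_set, score, round_no
        # Recognition effect below expectation: acquire new abnormal data.
        abnormal = acquire()
    return training_set, score, max_rounds
```

The caller supplies the three callables, so the same loop works with any way of producing faults and any model; the returned round count shows how many updates of the training set were needed.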

In another example, as shown in FIG. 3, the computer device 200 can quantify the abnormal data acquired and extended each time the training set is updated, so that after each update the abnormal samples increase by a specified number or by 100%. For example, the abnormal samples added in each update can be controlled through the parameter coverage of the abnormal samples.

Based on this, in the embodiments of the present specification, the abnormal samples in the training set are first updated by acquiring abnormal data and extending it, and it is then determined whether the parameter coverage of the abnormal samples in the updated training set reaches the expected value.

If not, the abnormal samples continue to be extended.

If so, the abnormality detection model is trained according to the updated training set. If the recognition effect of the trained model does not reach the expected level, abnormal data continues to be acquired and extended to update the abnormal samples in the training set, while ensuring that the parameter coverage of the abnormal samples in the updated training set reaches the expected value. This continues until the recognition effect of the model trained on the updated training set reaches the expected level, at which point training ends.

In another example, as shown in FIG. 4, the computer device 200 may also take the recognition effect into account when acquiring or extending abnormal samples each time the training set is updated. In one example, the manner of extending the abnormal samples may be adjusted according to the recognition effect. For example, when the trained abnormality detection model recognizes the abnormal samples of a certain service poorly, the extension may focus on increasing the data volume or parameter coverage of the abnormal samples corresponding to that service.

In the embodiments of the present specification, the generation of abnormal samples (including the acquisition and extension of abnormal data) can be regarded as an offensive closed loop, and the training of the abnormality detection model according to the updated training set can be regarded as a defensive closed loop. The offensive closed loop yields a sufficient number of abnormal samples, and the defensive closed loop trains the abnormality detection model effectively. This attack-and-defense confrontation can effectively improve the recognition effect of the model. Further, the attack can be quantified by the parameter coverage or data volume of the abnormal samples, which makes the training of the model easier to iterate.

Embodiments of the present invention will be further described below with reference to the accompanying drawings.

FIG. 5 is a schematic flowchart of an abnormality detecting method according to an embodiment of the present disclosure. The method is applicable to a computer device. As shown in FIG. 5, the method includes steps 510-570:

Step 510: Acquire sampling data when the system is in normal operation, and use the sampled data as a normal sample in the training set.

The solution provided in this specification can sample periodically while the data processing system is in normal operation to obtain sampling data for normal operation. For example, data on the normal operation of the data processing system can be sampled every minute. The sampling data acquired during normal operation is then labeled with a class and used as normal samples in the training set. For example, the class of the sampling data acquired during normal operation is marked as "0", and class "0" indicates that the data so marked is a normal sample.

Detailed data of several kinds is collected at the level of system call links, parameters, and system changes while the system is in normal operation. Abnormality detection based on such detailed data is flexible, and the theoretical upper limit of its recognition effect is high. Here, the data on the normal operation of the data processing system includes one or more of call data, indicator data, change data, and operation and maintenance data.

Specifically, the call data may include one or more of a call link, an interface name, input parameters, output parameters, and call duration. The call link can be a directed acyclic graph in which the nodes are the called interfaces and the edges are the calling relationships. The call data may relate to a single call request, for example a request by a terminal to invoke a payment service in the Ant Financial Services data processing system.
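As an illustration of a call link represented as a directed acyclic graph, the following Python sketch stores interfaces as nodes and calling relationships as edges, and checks that the graph has no cycle. The class name `CallLink`, its methods, and the interface names in the usage example are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CallLink:
    """A call link: nodes are called interfaces, edges are calling relationships."""
    edges: Dict[str, List[str]] = field(default_factory=dict)

    def add_call(self, caller: str, callee: str) -> None:
        # Register the edge caller -> callee and make sure both nodes exist.
        self.edges.setdefault(caller, []).append(callee)
        self.edges.setdefault(callee, [])

    def is_acyclic(self) -> bool:
        """Depth-first search; reaching a node already on the current path means a cycle."""
        done: Set[str] = set()
        on_path: Set[str] = set()

        def visit(node: str) -> bool:
            if node in on_path:
                return False
            if node in done:
                return True
            on_path.add(node)
            ok = all(visit(nxt) for nxt in self.edges[node])
            on_path.discard(node)
            done.add(node)
            return ok

        return all(visit(node) for node in list(self.edges))
```

For example, a hypothetical payment request might call `pay.create`, which in turn calls `risk.check` and `account.debit`; `is_acyclic()` confirms that the recorded link is a valid directed acyclic graph.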

The indicator data can be key indicators of the data processing system, for example the number of system calls for each service, aggregated per minute in the form of a time series.
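A per-minute call-count time series of this kind can be sketched in Python as follows. The function name `calls_per_minute` and the event format (a timestamp paired with a service name) are assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, Tuple


def calls_per_minute(
    events: Iterable[Tuple[datetime, str]]
) -> Dict[str, Dict[datetime, int]]:
    """Aggregate (timestamp, service) call events into per-service minute buckets."""
    counts: Counter = Counter()
    for ts, service in events:
        # Truncate the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        counts[(service, minute)] += 1
    series: Dict[str, Dict[datetime, int]] = {}
    # Sorting keeps each service's buckets in chronological order.
    for (service, minute), n in sorted(counts.items()):
        series.setdefault(service, {})[minute] = n
    return series
```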

The change data can describe actions that trigger changes, such as code releases and modifications to the configuration of the data processing system.

The operation and maintenance data can include hardware data, for example CPU usage, network latency, and memory usage.

Step 520: Acquire abnormal data according to the pre-made rule.

The pre-made rule may be determined according to actual requirements. For example, the pre-made rule may generate a fault request for each service in the data processing system in turn, so that the abnormal samples obtained correspond to every service in the data processing system and the coverage of the abnormal samples is high.

In the solution provided by the embodiment of the present disclosure, a fault request may be generated according to the pre-made rule, the context data of the fault request is obtained, and the context data is added to the training set as an abnormal sample.

The context data of the fault request may be the running data of the data processing system collected after the system receives the fault request. The context data may include one or more of call data, indicator data, change data, and operation and maintenance data.
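Step 520 can be sketched in Python under the assumption that the pre-made rule generates one fault request per service in turn. The types `FaultRequest` and `ContextData`, the function `acquire_abnormal_data`, and the two callables passed to it are hypothetical; how a real system injects a fault and collects its context is not specified here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class FaultRequest:
    service: str
    params: Dict[str, Any]


@dataclass
class ContextData:
    # The four kinds of data named in the text; each may be empty.
    call_data: Dict[str, Any] = field(default_factory=dict)
    indicator_data: Dict[str, Any] = field(default_factory=dict)
    change_data: Dict[str, Any] = field(default_factory=dict)
    ops_data: Dict[str, Any] = field(default_factory=dict)


def acquire_abnormal_data(
    services: List[str],
    make_fault: Callable[[str], FaultRequest],
    collect_context: Callable[[FaultRequest], ContextData],
) -> List[Tuple[FaultRequest, ContextData, int]]:
    """Generate one fault request per service, in order, and collect its context."""
    samples = []
    for service in services:
        request = make_fault(service)
        # Label 1 marks the collected context as an abnormal sample.
        samples.append((request, collect_context(request), 1))
    return samples
```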

Steps 530-560 are executed cyclically until the recognition effect of the abnormality detection model reaches the expected level:

Step 530: Extend the abnormal data, and add the abnormal data and the extended abnormal data to the training set as abnormal samples.

In one example, the abnormal data can be extended by extending the rule. In this case, the abnormal data generated according to the pre-made rule may first be added to the training set; the pre-made rule is then extended, an extended fault request is generated according to the extended pre-made rule, the context data of the extended fault request is obtained, and that context data is added to the training set as an abnormal sample.

In another example, the abnormal data generated according to the pre-made rule may first be added to the training set, and the following steps are then performed cyclically until the parameter coverage of the abnormal samples in the training set reaches the expected value: the pre-made rule is extended, an extended fault request is generated according to the extended pre-made rule, the context data of the extended fault request is acquired and added to the training set as an abnormal sample, and it is determined whether the parameter coverage of the abnormal samples in the training set reaches the expected value. When it does not, the extended pre-made rule is used as the new pre-made rule. For example, whether the parameter coverage reaches the expected value can be determined by checking whether the abnormal samples in the training set cover every service and whether the number of abnormal samples corresponding to each service reaches a threshold.
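The coverage check described in the example above (every service covered, and a minimum number of abnormal samples per service) can be sketched in Python as follows. The function names and the representation of each abnormal sample by the name of its service are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, List


def uncovered_services(
    sample_services: Iterable[str],
    all_services: List[str],
    min_per_service: int,
) -> List[str]:
    """Return the services whose abnormal-sample count is below the threshold."""
    counts = Counter(sample_services)
    # A Counter returns 0 for services that have no abnormal sample at all.
    return [s for s in all_services if counts[s] < min_per_service]


def coverage_reached(
    sample_services: Iterable[str],
    all_services: List[str],
    min_per_service: int,
) -> bool:
    """True when every service has at least min_per_service abnormal samples."""
    return not uncovered_services(sample_services, all_services, min_per_service)
```

Returning the list of uncovered services, rather than only a boolean, lets the next extension of the pre-made rule target exactly the services that still lack samples.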

If an abnormality detection algorithm trained on a training set whose abnormal samples have reached the expected parameter coverage still does not reach the expected recognition effect, the expected value of the parameter coverage of the abnormal samples can be raised.

The pre-made rule can be extended in combination with business rules or by heuristic techniques. For example, it can be extended in one or more of the following ways:

Extending according to historical faults in the operation of the data processing system;

Extending according to the same type of historical fault as the fault request;

Extending based on possible failures in the use case library;

Intelligent fault extension; for example, the context collected for the fault request can be used as a seed sample, and a genetic algorithm can be used to perform fault extension.
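The genetic-algorithm variant of fault extension is not detailed in the text. The following Python sketch shows one possible form, assuming that a fault is a dictionary of request parameters, that each parameter has a finite set of candidate values (`domains`), and that a caller-supplied `fitness` function scores how useful a fault is. All of these, together with the function name `extend_faults` and its default settings, are assumptions.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence

Fault = Dict[str, object]


def extend_faults(
    seeds: List[Fault],
    domains: Dict[str, Sequence[object]],
    fitness: Callable[[Fault], float],
    generations: int = 5,
    population_size: int = 8,
    mutation_rate: float = 0.2,
    rng: Optional[random.Random] = None,
) -> List[Fault]:
    """Evolve new fault requests from seed faults by crossover and mutation."""
    rng = rng or random.Random(0)
    population = [dict(s) for s in seeds]
    for _ in range(generations):
        children: List[Fault] = []
        while len(children) < population_size:
            a, b = rng.choice(population), rng.choice(population)
            # Uniform crossover: each parameter comes from one of the two parents.
            child = {k: rng.choice((a[k], b[k])) for k in domains}
            # Mutation: occasionally replace a parameter with a random candidate value.
            for k, values in domains.items():
                if rng.random() < mutation_rate:
                    child[k] = rng.choice(values)
            children.append(child)
        # Selection: keep the fittest among parents and children.
        ranked = sorted(population + children, key=fitness, reverse=True)
        population = ranked[:population_size]
    return population
```

Because parents compete with their children during selection, the best fitness in the population never decreases from one generation to the next.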

In addition, the context data of the fault request can be labeled with a class and used as an abnormal sample in the training set. For example, the class of the context data of the fault request is marked as "1", and class "1" indicates that the data so marked is an abnormal sample.

Step 540: Train the abnormality detection model according to the training set after adding the abnormal data, and determine the recognition effect of the abnormality detection model.

In the solution provided by this specification, feature preprocessing can first be performed on the samples in the training set. Multiple feature preprocessing methods can be used to obtain features in one or more expression forms among parameter expression, structure expression, indicator aggregation, and change expression. The features of each expression form may correspond to one or more abnormality detection models, and features of different expression forms correspond to different abnormality detection models.

Then, the corresponding abnormality detection models are trained according to the features of each expression form. For example, a time series abnormality detection model can be trained on the indicator aggregation features; a graph-based abnormality detection algorithm can be trained on the structure expression features; and proximity-based, linear, subspace-based, and supervised-learning-based abnormality detection models can be trained on the parameter expression or change expression features.
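The correspondence between expression forms and model families can be sketched as a simple registry in Python. The dictionary keys, the family names, and the `train_all` function are hypothetical identifiers; the actual training routine for each family is supplied by the caller and is not specified here.

```python
from typing import Any, Callable, Dict, List, Tuple

# Expression form -> model families named in the text (identifiers are hypothetical).
MODEL_FAMILIES: Dict[str, List[str]] = {
    "indicator_aggregation": ["time_series"],
    "structure": ["graph_based"],
    "parameter": ["proximity", "linear", "subspace", "supervised"],
    "change": ["proximity", "linear", "subspace", "supervised"],
}


def train_all(
    features_by_form: Dict[str, Any],
    trainers: Dict[str, Callable[[Any], Any]],
) -> Dict[Tuple[str, str], Any]:
    """Train every model family registered for each available expression form."""
    models: Dict[Tuple[str, str], Any] = {}
    for form, features in features_by_form.items():
        for family in MODEL_FAMILIES[form]:
            # One trained model per (expression form, model family) pair.
            models[(form, family)] = trainers[family](features)
    return models
```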

When the abnormality detection model is trained, its recognition effect can be determined; once the recognition effect becomes stable, that stable value is taken as the recognition effect of the trained abnormality detection model.

In addition, the recognition effect can be expressed by one or more of recognition accuracy, recognition coverage, and KS value.

Step 550: Determine whether the recognition effect of the abnormality detection model reaches the expected level.

The expectation may be a threshold for one or more of the recognition accuracy, the recognition coverage, the KS value, and the like. For example, the expectation may be that the recognition accuracy is not less than 99.5%.
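The three measures and the threshold check can be sketched in Python. Two interpretations are assumed here and are not confirmed by the text: "recognition coverage" is taken to be the share of truly abnormal samples that the model flags (recall), and the KS value is taken to be the Kolmogorov-Smirnov statistic, that is, the largest gap between the empirical score distributions of abnormal and normal samples. The sketch also assumes both classes are present in the evaluation data.

```python
from bisect import bisect_right
from typing import Dict, Sequence


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Share of samples whose predicted class equals the true class."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def coverage(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Share of truly abnormal samples (label 1) that were flagged (assumed meaning)."""
    flagged = [p for p, y in zip(preds, labels) if y == 1]
    return sum(flagged) / len(flagged)


def ks_value(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Largest gap between the empirical score CDFs of the two classes."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    return max(
        abs(bisect_right(pos, t) / len(pos) - bisect_right(neg, t) / len(neg))
        for t in set(scores)
    )


def meets_expectation(metrics: Dict[str, float], thresholds: Dict[str, float]) -> bool:
    """True when every thresholded metric is at least its expected minimum."""
    return all(metrics[name] >= minimum for name, minimum in thresholds.items())
```

With the example threshold from the text, `meets_expectation({"accuracy": acc}, {"accuracy": 0.995})` decides whether the loop may stop.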

Step 560: When the recognition effect of the abnormality detection model is lower than expected, acquire new abnormal data according to the pre-made rule.

The pre-made rule in step 560 may be an extended pre-made rule or the initial pre-made rule, where the initial pre-made rule is the pre-made rule before any extension.

In addition, each time the training set is updated, the recognition effect may be taken into account when acquiring or extending abnormal samples. In one example, the manner of extending the pre-made rule may be adjusted according to the recognition effect. For example, when the trained abnormality detection model recognizes the abnormal samples of a certain service poorly, the extension of the pre-made rule may focus on generating more fault requests for that service, so as to obtain richer abnormal samples for the service and thereby improve the ability of the trained model to identify the data to be detected for that service.
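One way to focus the extension on poorly recognized services is to split a budget of new fault requests in proportion to each service's miss rate. This proportional rule, the function name `allocate_fault_budget`, and the use of a per-service recognition rate between 0 and 1 are assumptions; the text only states that the extension may focus on such services.

```python
from typing import Dict


def allocate_fault_budget(
    recognition_by_service: Dict[str, float],
    total_requests: int,
) -> Dict[str, int]:
    """Split a budget of new fault requests in proportion to each service's miss rate."""
    miss = {s: 1.0 - r for s, r in recognition_by_service.items()}
    total_miss = sum(miss.values())
    if total_miss == 0:
        # Every service is already recognized perfectly; nothing to focus on.
        return {s: 0 for s in miss}
    # Rounding means the shares may not add up exactly to total_requests.
    return {s: round(total_requests * m / total_miss) for s, m in miss.items()}
```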

Step 570: When the recognition effect of the abnormality detection model reaches the expected level, use that abnormality detection model to perform abnormality detection on the data to be detected.

In the embodiments of the present specification, when the data processing system receives a service processing request, the abnormality detection model whose recognition effect has reached the expected level may be triggered to perform abnormality detection. After abnormality detection is triggered, the data to be detected generated by the service processing request may be collected in real time or periodically. The data to be detected includes one or more of call data, indicator data, change data, and operation and maintenance data.

When the abnormality detection model is used to detect the data to be detected, feature preprocessing may first be performed on that data. Here, multiple feature preprocessing methods may be used to obtain features in one or more expression forms among parameter expression, structure expression, indicator aggregation, and change expression.

The abnormality detection model corresponding to the features of each expression form is used to identify whether the features are abnormal. When the features of one expression form correspond to multiple abnormality detection models and the detection results of those models are inconsistent, whether the features are abnormal may be determined by voting.
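The voting step can be sketched as a simple majority vote in Python. The text does not say how votes are weighted or how ties are resolved; this sketch assumes equal weights and treats a tie as "not abnormal".

```python
from typing import Sequence


def vote(verdicts: Sequence[bool]) -> bool:
    """Majority vote over per-model verdicts (True = abnormal).

    Assumption: models are weighted equally and a tie counts as not abnormal.
    """
    abnormal_votes = sum(verdicts)
    return abnormal_votes * 2 > len(verdicts)
```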

By constructing offensive and defensive closed loops through confrontation, the effects of attack and defense are quantified and a virtuous cycle of iteration is formed, which solves the problem that abnormality detection is difficult to iterate.

Identification and localization based on fine-grained data raise the ceiling on the recognition effect and provide a basis for determining the root cause of a fault, which helps locate problems in the system faster. Detection can be performed at the level of system call links, parameters, and system changes. The context slices collected during fault injection preserve fine-grained data from which the state of the system at the time of the fault can be restored. Multiple detailed data sources can be combined for identification, which gives high performance and good recognition, and the fine-grained data can also be used when locating faults.

Corresponding to the above method embodiment, the embodiment of the present specification further provides an abnormality detecting device. As shown in FIG. 6, the device may include:

The first obtaining unit 601 is configured to acquire sampling data when the system is in normal operation, and use the sampling data as a normal sample in the training set;

The second obtaining unit 602 is configured to acquire abnormal data according to the pre-made rule.

The looping unit 603 is configured to cyclically execute the steps of the following extension unit, training unit, and second obtaining unit until the recognition effect of the abnormality detection model reaches the expected level, so that the abnormality detection model whose recognition effect reaches the expected level is used to perform abnormality detection on the data to be detected;

The extension unit 604 is configured to extend the abnormal data, and add the abnormal data and the extended abnormal data as abnormal samples in the training set;

The training unit 605 is configured to train the abnormality detection model according to a training set after adding abnormal data, and determine an identification effect of the abnormality detection model;

The second obtaining unit 602 is further configured to acquire new abnormal data according to the pre-made rule when the recognition effect of the abnormality detecting model is lower than expected.

In one example, the samples in the training set include one or more of call data, indicator data, change data, and operation and maintenance data.

In another example, the training unit 605 is specifically configured to:

perform feature preprocessing on the samples in the training set to obtain features in one or more expression forms among parameter expression, structure expression, indicator aggregation, and change expression, wherein each expression form corresponds to one or more abnormality detection models;

train the corresponding abnormality detection model according to the features of each expression form.

In another example, the second obtaining unit 602 is specifically configured to generate a fault request according to the pre-made rule, and acquire context data of the fault request.

In another example, the extension unit 604 is specifically configured to: extend the pre-made rule, generate an extended fault request according to the extended pre-made rule, acquire context data of the extended fault request, and add the context data of the fault request and the context data of the extended fault request to the training set as abnormal samples.

In another example, the extension unit 604 is specifically configured to:

cyclically perform the following steps until the parameter coverage of the abnormal samples in the training set reaches the expected value:

extending the pre-made rule, generating an extended fault request according to the extended pre-made rule, acquiring context data of the extended fault request, and adding the context data of the fault request and the context data of the extended fault request to the training set as abnormal samples;

when the parameter coverage of the abnormal samples in the training set does not reach the expected value, taking the extended pre-made rule as the new pre-made rule.

For details of how the functions and roles of the modules in the device are implemented, refer to the implementation of the corresponding steps in the foregoing method; the details are not repeated here.

The embodiments of the present specification further provide a computer device including at least a memory, a processor, and a computer program stored on the memory and operable on the processor, the computer device being implemented in the form of an anomaly detection server. Wherein, when the processor executes the program, the foregoing abnormality detecting method is implemented. The method at least includes:

In another example, the training the anomaly detection model according to the training set includes:

In another example, the obtaining the abnormal data according to the pre-made rule includes:

A fault request is generated according to the pre-made rule, and the context data of the fault request is obtained.

In another example, extending the abnormal data, adding the abnormal data and the extended abnormal data as abnormal samples in the training set includes:

Extending the pre-made rule, generating an extended fault request according to the extended pre-made rule, acquiring context data of the extended fault request, and adding the context data of the fault request and the context data of the extended fault request to the training set as abnormal samples.

In another example, the extending the pre-made rule to generate an extended fault request according to the extended pre-made rule, and acquiring the context data of the extended fault request includes:

FIG. 7 shows a schematic diagram of a more specific computer device structure provided by an embodiment of the present specification. The computer device may include a processor 710, a memory 720, an input/output interface 730, a communication interface 740, and a bus 750. The processor 710, the memory 720, the input/output interface 730, and the communication interface 740 are communicatively connected to one another within the device via the bus 750.

The processor 710 can be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is used to execute the related programs so as to implement the technical solutions provided by the embodiments of the present specification.

The memory 720 can be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 720 can store the operating system and other applications. When the technical solution provided by the embodiment of the present specification is implemented by software or firmware, the related program code is saved in the memory 720 and is called and executed by the processor 710.

The input/output interface 730 is used to connect an input/output module to implement information input and output. The input/output module can be configured as a component in the device (not shown) or externally connected to the device to provide the corresponding function. The input device may include a keyboard, a mouse, a touch screen, a microphone, various types of sensors, and the like, and the output device may include a display, a speaker, a vibrator, an indicator light, and the like.

The communication interface 740 is used to connect a communication module (not shown) to implement communication interaction between the device and other devices. The communication module can communicate by wired means (such as USB, network cable, etc.), or can communicate by wireless means (such as mobile network, WIFI, Bluetooth, etc.).

Bus 750 includes a path for transferring information between various components of the device, such as processor 710, memory 720, input/output interface 730, and communication interface 740.

It should be noted that although the above device only shows the processor 710, the memory 720, the input/output interface 730, the communication interface 740, and the bus 750, in a specific implementation the device may also include other components necessary for normal operation. In addition, it will be understood by those skilled in the art that the above device may also include only the components necessary for implementing the embodiments of the present specification, and need not include all the components shown in the drawings.

The embodiments of the present specification further provide a computer readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the aforementioned abnormality detecting method. The method at least includes:

Computer readable media include persistent and non-persistent, removable and non-removable media, and information storage can be implemented by any method or technology. The information can be computer readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic tape cartridges, magnetic tape storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory computer readable media, such as modulated data signals and carrier waves.

It can be clearly understood by those skilled in the art that the embodiments of the present specification can be implemented by means of software plus a necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may, in essence, be embodied in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present specification or in parts of those embodiments.

The system, device, module or unit illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer, and the specific form of the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

The various embodiments in the specification are described in a progressive manner; the same or similar parts of the various embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the device embodiment is basically similar to the method embodiment, its description is relatively brief, and the relevant parts can be found in the description of the method embodiment. The device embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separated, and the functions of the modules may be implemented in one or more pieces of software and/or hardware when implementing the solutions of the embodiments of the present specification. Some or all of the modules may also be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement this without any creative effort.

The above is only a specific implementation of the embodiments of the present specification. It should be noted that those skilled in the art can make improvements and refinements without departing from the principles of the embodiments of the present specification, and such improvements and refinements should also be regarded as falling within the scope of protection of the embodiments of the present specification.

Claims

An abnormality detecting method, characterized in that the method comprises:

Obtaining sampling data when the system is in normal operation, and using the sampling data as normal samples in a training set;

Obtaining abnormal data according to a pre-made rule, and cyclically performing the following steps until the recognition effect of an abnormality detection model reaches an expectation, so that the abnormality detection model whose recognition effect has reached the expectation is used to perform abnormality detection on data to be detected:

Extending the abnormal data, and adding the abnormal data and the extended abnormal data to the training set as abnormal samples;

Training the abnormality detection model according to the training set to which the abnormal data has been added, and determining the recognition effect of the abnormality detection model;

When the recognition effect of the abnormality detection model is lower than expected, acquiring new abnormal data according to the pre-made rule.
The method according to claim 1, wherein the samples in the training set comprise one or more of call data, indicator data, change data, and operation and maintenance data.
The method according to claim 2, wherein training the abnormality detection model according to the training set comprises:

Performing feature pre-processing on the samples in the training set to obtain features in one or more expression forms among parameter expression, structure expression, index convergence, and altered expression, wherein each expression form corresponds to one or more abnormality detection models;

Training the corresponding abnormality detection model according to the features of each expression form.
The method according to claim 1, wherein obtaining the abnormal data according to the pre-made rule comprises:

A fault request is generated according to the pre-made rule, and the context data of the fault request is obtained.
The method according to claim 4, wherein extending the abnormal data, adding the abnormal data and the extended abnormal data as abnormal samples in the training set comprises:

Extending the pre-made rule, generating an extended fault request according to the extended pre-made rule, acquiring context data of the extended fault request, and adding the context data of the fault request and the context data of the extended fault request to the training set as abnormal samples.
The method according to claim 5, wherein the extending the pre-made rule to generate an extended fault request according to the extended pre-made rule, and acquiring the context data of the extended fault request comprises:

Cyclically performing the following steps until the parameter coverage of the abnormal samples in the training set reaches the expected value:

Extending the pre-made rule, generating an extended fault request according to the extended pre-made rule, acquiring context data of the extended fault request, and adding the context data of the fault request and the context data of the extended fault request to the training set as abnormal samples;

When the parameter coverage of the abnormal samples in the training set does not reach the expected value, taking the extended pre-made rule as the new pre-made rule.
An abnormality detecting device, characterized in that the device comprises:

a first acquiring unit, configured to acquire sampling data when the system is in normal operation, and use the sampling data as a normal sample in the training set;

a second acquiring unit, configured to acquire abnormal data according to the pre-made rule;

a looping unit, configured to cyclically execute the steps of the following extension unit, training unit, and second acquiring unit until the recognition effect of the abnormality detection model reaches an expectation, so that the abnormality detection model whose recognition effect has reached the expectation is used to perform abnormality detection on data to be detected:

The extension unit is configured to extend the abnormal data, and add the abnormal data and the extended abnormal data as abnormal samples in the training set;

The training unit is configured to train the abnormality detection model according to a training set after adding abnormal data, and determine a recognition effect of the abnormality detection model;

The second acquiring unit is further configured to: when the recognition effect of the abnormality detection model is lower than expected, acquire new abnormal data according to the pre-made rule.
The apparatus according to claim 7, wherein the samples in the training set comprise one or more of call data, indicator data, change data, and operation and maintenance data.
The apparatus according to claim 8, wherein said training unit is specifically configured to:

Perform feature pre-processing on the samples in the training set to obtain features in one or more expression forms among parameter expression, structure expression, index convergence, and altered expression, wherein each expression form corresponds to one or more abnormality detection models;

Train the corresponding abnormality detection model according to the features of each expression form.
The apparatus according to claim 7, wherein the second acquiring unit is specifically configured to: generate a fault request according to the pre-made rule, and acquire context data of the fault request.
The apparatus according to claim 10, wherein the extension unit is specifically configured to: extend the pre-made rule, generate an extended fault request according to the extended pre-made rule, acquire context data of the extended fault request, and add the context data of the fault request and the context data of the extended fault request to the training set as abnormal samples.
The device according to claim 11, wherein the extension unit is specifically configured to:

Cyclically perform the following steps until the parameter coverage of the abnormal samples in the training set reaches the expected value:

Extend the pre-made rule, generate an extended fault request according to the extended pre-made rule, acquire context data of the extended fault request, and add the context data of the fault request and the context data of the extended fault request to the training set as abnormal samples;

When the parameter coverage of the abnormal samples in the training set does not reach the expected value, take the extended pre-made rule as the new pre-made rule.
A computer device comprising a memory, a processor, and a computer program stored on the memory and operable on the processor, wherein the processor performs the following steps when executing the program:

Obtaining sampling data when the system is in normal operation, and using the sampling data as normal samples in a training set;

Obtaining abnormal data according to a pre-made rule, and cyclically performing the following steps until the recognition effect of an abnormality detection model reaches an expectation, so that the abnormality detection model whose recognition effect has reached the expectation is used to perform abnormality detection on data to be detected:

Extending the abnormal data, and adding the abnormal data and the extended abnormal data to the training set as abnormal samples;

Training the abnormality detection model according to the training set to which the abnormal data has been added, and determining the recognition effect of the abnormality detection model;

When the recognition effect of the abnormality detection model is lower than expected, acquiring new abnormal data according to the pre-made rule.