WO2021189904A1 - Data anomaly detection method, device, electronic equipment and storage medium - Google Patents

Data anomaly detection method, device, electronic equipment and storage medium

Info

Publication number
WO2021189904A1
WO2021189904A1 PCT/CN2020/131984 CN2020131984W WO2021189904A1 WO 2021189904 A1 WO2021189904 A1 WO 2021189904A1 CN 2020131984 W CN2020131984 W CN 2020131984W WO 2021189904 A1 WO2021189904 A1 WO 2021189904A1
Authority
WO
WIPO (PCT)
Prior art keywords
data set
data
anomaly detection
detection model
detected
Prior art date
Application number
PCT/CN2020/131984
Other languages
English (en)
French (fr)
Inventor
邓悦
郑立颖
徐亮
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021189904A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452: Performance evaluation by statistical analysis

Definitions

  • This application relates to artificial intelligence technology, and in particular to a data abnormality detection method, device, electronic equipment, and computer-readable storage medium.
  • KPI (Key Performance Indicator) anomaly detection
  • the inventor realizes that KPI anomaly detection methods in the prior art train anomaly detection models with low robustness and low stability, which leads to insufficiently accurate detection results; at the same time, the prior-art detection methods generate a large number of labels, which consumes computer resources and reduces detection efficiency.
  • a data anomaly detection method provided by this application includes:
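  • obtaining a standard training data set, the standard training data set including anomaly detection data and missing data;
  • acquiring a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower limit function;
  • adjusting the variational lower limit function by using the missing data to obtain an optimized variational lower limit function;
  • training the anomaly detection model framework including the optimized variational lower limit function with the standard training data set to obtain an anomaly detection model;
  • acquiring a data set to be detected, and detecting it with the anomaly detection model to obtain a reconstruction probability of the data to be detected in the data set to be detected;
  • if there is target to-be-detected data whose reconstruction probability is greater than or equal to the reconstruction threshold, determining that the target to-be-detected data is abnormal data.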
  • the present application also provides a data abnormality detection device, which includes:
  • the model acquisition module is used to acquire a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower limit function;
  • the function adjustment module is used to adjust the variational lower limit function by using the missing data to obtain an optimized variational lower limit function;
  • a model training module configured to use the standard training data set to train the anomaly detection model framework including the optimized variational lower limit function to obtain an anomaly detection model
  • a reconstruction probability acquisition module configured to acquire a data set to be detected, use the anomaly detection model to detect the data set to be detected, and obtain a reconstruction probability of the data to be detected in the data set to be detected;
  • the anomaly detection module is configured to determine that the target to-be-detected data is abnormal data if there is target to-be-detected data whose reconstruction probability is greater than or equal to the reconstruction threshold.
  • This application also provides an electronic device, which includes:
  • Memory storing at least one instruction
  • the processor executes the instructions stored in the memory to implement the following steps:
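  • obtaining a standard training data set, the standard training data set including anomaly detection data and missing data;
  • acquiring a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower limit function;
  • adjusting the variational lower limit function by using the missing data to obtain an optimized variational lower limit function;
  • training the anomaly detection model framework including the optimized variational lower limit function with the standard training data set to obtain an anomaly detection model;
  • acquiring a data set to be detected, and detecting it with the anomaly detection model to obtain a reconstruction probability of the data to be detected in the data set to be detected;
  • if there is target to-be-detected data whose reconstruction probability is greater than or equal to the reconstruction threshold, determining that the target to-be-detected data is abnormal data.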
  • the present application also provides a computer-readable storage medium in which at least one instruction is stored, and the at least one instruction is executed by a processor in an electronic device to implement the following steps:
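  • obtaining a standard training data set, the standard training data set including anomaly detection data and missing data;
  • acquiring a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower limit function;
  • adjusting the variational lower limit function by using the missing data to obtain an optimized variational lower limit function;
  • training the anomaly detection model framework including the optimized variational lower limit function with the standard training data set to obtain an anomaly detection model;
  • acquiring a data set to be detected, and detecting it with the anomaly detection model to obtain a reconstruction probability of the data to be detected in the data set to be detected;
  • if there is target to-be-detected data whose reconstruction probability is greater than or equal to the reconstruction threshold, determining that the target to-be-detected data is abnormal data.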
  • FIG. 1 is a schematic flowchart of a data anomaly detection method provided by an embodiment of this application
  • FIG. 2 is a diagram of functional modules of a data anomaly detection device provided by an embodiment of the application.
  • FIG. 3 is a schematic structural diagram of an electronic device that implements the data abnormality detection method provided by an embodiment of the application.
  • the execution subject of the data abnormality detection method provided in the embodiment of the present application includes but is not limited to at least one of the electronic devices that can be configured to execute the method provided in the embodiment of the present application, such as a server and a terminal.
  • the data anomaly detection method may be executed by software or hardware installed on a terminal device or a server device, and the software may be a blockchain platform.
  • the server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, etc.
  • the data abnormality detection method includes:
  • the standard training data set may include various KPI (Key Performance Indicators) values.
  • the KPI refers to the monitoring indicators (such as latency, throughput, etc.) of operation and maintenance objects such as services and systems.
  • the standard training data set contains the same or different KPIs arranged as a sequence in the order of the monitored time.
  • the missing data is data with a value of 0, and the abnormality detection data is KPI abnormal data.
  • for example, the standard training data set contains hardware resource consumption collected at different times, where the consumption at some times is 0 or abnormal; or the number of online users collected at different times, where the number of online users at some times is 0 or abnormal; or the number of concurrent users collected at different times, where the number of concurrent users at some times is 0 or abnormal.
  • the standard training data set can be stored in the blockchain, and in specific implementation, the standard training data set is directly obtained from the nodes of the blockchain.
  • said obtaining a standard training data set includes:
  • the normalized data set is input into a preset sliding window to obtain the standard training data set.
  • the embodiment of the present application performs normalization processing on the original training data set through the following formula:
  • $n$ is the number of data in the original training data set
  • $x_i$ is the $i$-th data in the original training data set
  • $y_i$ is the $i$-th data in the normalized data set
  • $y_i \in [0, 1]$.
  • a preset proportion of the normal data (that is, the non-zero KPI data) is randomly set to 0 and regarded as missing data, which enhances the effect of model training.
  • the original training data set is time series data
  • inputting a normalized data set into the sliding window can ensure the sequence of the original training data set and improve the usability and consistency of the data.
  • the size of the sliding window is $W$
  • there are $W$ pieces of data in the standard training data set, that is, the data in the standard training data set are $x_W, \ldots, x_1$.
  • normalizing the original training data set can standardize the data in the standard training data set, and by using the sliding window, the time sequence of the data in the standard training data set is ensured.
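The preprocessing described above (normalization into [0, 1], random zeroing of normal points, and a sliding window of size W) can be sketched as follows. The normalization formula itself is not reproduced in this text, so min-max scaling is assumed here because it yields values in [0, 1]; the function names, the `ratio` parameter, the stride of 1, and the sample values are all hypothetical.

```python
import random

def min_max_normalize(values):
    """Scale values into [0, 1]; min-max scaling is an assumption, not the source formula."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def inject_missing(values, ratio, rng):
    """Randomly set the given ratio of non-zero points to 0 (treated as missing data)."""
    out = list(values)
    nonzero_idx = [i for i, v in enumerate(out) if v != 0]
    k = int(len(nonzero_idx) * ratio)
    for i in rng.sample(nonzero_idx, k):
        out[i] = 0.0
    return out

def sliding_windows(values, window_size):
    """Return consecutive windows of length W (stride 1), preserving time order."""
    return [values[i:i + window_size] for i in range(len(values) - window_size + 1)]

# Made-up KPI readings; the 0.0 entry stands for an already-missing point.
raw_kpi = [120.0, 130.0, 125.0, 400.0, 128.0, 0.0, 131.0, 127.0]
normalized = min_max_normalize(raw_kpi)
augmented = inject_missing(normalized, ratio=0.25, rng=random.Random(0))
windows = sliding_windows(augmented, window_size=4)
```

Each element of `windows` is one time-ordered training sample of length W.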
  • the pre-built anomaly detection model framework may be a VAE (Variational Autoencoder) anomaly detection model framework.
  • the VAE includes an encoder, a decoder, and a variational lower limit function.
  • the encoder calculates the hidden variable distribution parameters (mean and variance) in the standard training data set, and samples the hidden variables.
  • the decoder restores the hidden variable to obtain an output result; the reconstruction probability of the input KPI data can be calculated by using the output result and the variational lower limit function, and whether the KPI data is abnormal is judged according to the reconstruction probability.
  • adjusting the variational lower limit function by using the missing data to obtain an optimized variational lower limit function includes:
  • the optimized value is added to the variational lower limit function to obtain the optimized variational lower limit function.
  • the optimized value can be calculated based on the missing data in the following manner:
  • the optimized value $\beta$ is calculated according to the missing coefficients obtained from the missing data, where:
  • $W$ is the number of data in the standard training data set;
  • $x_w$ is the $w$-th data in the standard training data set;
  • $a_w$ is the missing coefficient of the $w$-th data, whose value depends on whether $x_w$ is missing data;
  • $\beta$ is the optimized value;
  • $z$ is the hidden variable of the standard training data set;
  • $\log p_\theta(x \mid z)$ is the logarithm of $p_\theta(x \mid z)$;
  • $p_\theta(z)$ is the distribution of the hidden variable $z$ under the standard training data set, and $\log p_\theta(z)$ is its logarithm;
  • $q_\phi(z \mid x)$ is the distribution of the hidden variable $z$ under the sample $x$, which corresponds to the encoder part, and $\log q_\phi(z \mid x)$ is its logarithm.
  • the optimized variational lower limit function is adjusted according to the missing data, so that the missing data can be used to train the anomaly detection model framework, which enhances the stability of the framework against abnormal data and improves the robustness of the model.
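The formulas for the missing coefficient and the optimized value are not reproduced in this text, so the following is only one possible reading of the definitions above. It assumes the missing coefficient is 0 for a missing point and 1 otherwise, that the optimized value is the share of non-missing points in the window, and that the optimized value scales the prior term of a single-sample lower-limit estimate. All function names are hypothetical.

```python
def missing_coefficients(window):
    """Assumed convention: coefficient 0 for a missing point (value 0), 1 otherwise."""
    return [0.0 if x == 0 else 1.0 for x in window]

def optimized_value(coeffs):
    """Assumed definition: share of non-missing points among the W points."""
    return sum(coeffs) / len(coeffs)

def adjusted_lower_limit(log_px_given_z, log_pz, log_qz_given_x, coeffs):
    """Single-sample estimate of an adjusted variational lower limit (assumed form).

    log_px_given_z: per-point values of log p(x_w | z), length W
    log_pz, log_qz_given_x: scalars evaluated at the sampled z
    """
    beta = optimized_value(coeffs)
    # Missing points contribute nothing to the reconstruction term.
    recon = sum(a * lp for a, lp in zip(coeffs, log_px_given_z))
    return recon + beta * log_pz - log_qz_given_x

window = [0.3, 0.0, 0.5, 0.4]          # the second point is missing
coeffs = missing_coefficients(window)   # [1.0, 0.0, 1.0, 1.0]
estimate = adjusted_lower_limit([-1.0, -9.0, -2.0, -1.0], -2.0, -1.5, coeffs)
```

Under these assumptions the poorly reconstructed missing point (log-likelihood -9.0) does not penalize the model, which is the intended effect of training with missing data.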
  • S4 includes:
  • Step A: Input the standard training data set into the anomaly detection model framework for calculation to obtain an output result;
  • Step B: Calculate the loss value of the optimized variational lower limit function according to the output result;
  • Step C: If the loss value is greater than the preset loss threshold, adjust the parameters in the anomaly detection model framework and return to Step A; once the loss value is less than or equal to the loss threshold, stop adjusting the parameters to obtain the anomaly detection model.
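The control flow of Steps A to C can be illustrated with a toy loop. This sketch does not train a VAE: a single scalar parameter and a mean-squared-error loss stand in for the model framework and the optimized variational lower limit function, and the `max_rounds` guard is an added assumption so the loop cannot run forever if the threshold is unreachable.

```python
def train_until_threshold(data, loss_threshold, learning_rate=0.1, max_rounds=10_000):
    """Toy stand-in: fit one parameter by gradient descent on mean squared error."""
    param = 0.0
    for round_no in range(1, max_rounds + 1):
        # Step A: run the data through the (toy) model to get outputs.
        outputs = [param for _ in data]
        # Step B: compute the loss value from the outputs.
        loss = sum((o - x) ** 2 for o, x in zip(outputs, data)) / len(data)
        # Step C: stop once the loss is at or below the threshold ...
        if loss <= loss_threshold:
            return param, loss, round_no
        # ... otherwise adjust the parameter and return to Step A.
        grad = sum(2 * (o - x) for o, x in zip(outputs, data)) / len(data)
        param -= learning_rate * grad
    raise RuntimeError("loss threshold not reached within max_rounds")

param, final_loss, rounds_used = train_until_threshold([0.4, 0.5, 0.6], loss_threshold=0.01)
```

A threshold below the smallest achievable loss would never be met, which is why the sketch raises an error after `max_rounds` instead of looping indefinitely.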
  • the inputting the standard training data set into the anomaly detection model framework for calculation to obtain an output result includes:
  • the output result is calculated by using the decoder in the anomaly detection model framework and the hidden variable.
  • the hidden variable distribution parameters describe the distribution of the hidden variables of all data in the standard training data set.
  • the hidden variable is calculated by using the following formula:
  • z is the hidden variable
  • ⁇ (X) are the mean and variance in the hidden variable distribution parameters
  • ⁇ (x) is the mean of the standard training data set.
  • the output result is calculated by using the following formula:
  • p(x) is the output result
  • z is the point in the hidden variable space Z
  • $p(z)$ is the probability of getting the hidden variable $z$
  • $\theta$ is a point in the parameter space $\Theta$
  • the range is the preset range.
  • $I$ represents the identity matrix
  • $\sigma$ is a hyperparameter
  • $f$ is a function that maps $z$ and $\theta$ to $x$, that is, $f: Z \times \Theta \to X$.
  • the standard training data set can be reused to train the anomaly detection model framework, which improves the data utilization rate.
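The hidden-variable sampling step described above relies on a formula that is not reproduced in this text. A common way to sample a hidden variable from a mean and a variance in a VAE is the reparameterization trick, which is assumed in the sketch below; the function name and the numeric values are illustrative only.

```python
import math
import random

def sample_hidden_variable(mean, variance, rng):
    """Draw z = mean + sqrt(variance) * eps with eps ~ N(0, 1), per dimension."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, variance)]

rng = random.Random(42)
mean = [0.0, 1.0]        # illustrative encoder outputs
variance = [1.0, 0.25]
samples = [sample_hidden_variable(mean, variance, rng) for _ in range(20000)]
empirical_mean = [sum(s[d] for s in samples) / len(samples) for d in range(2)]
```

Writing the sample as a deterministic function of the mean, the variance, and independent noise is what lets gradients flow back to the encoder parameters during training.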
  • when the anomaly detection model is used to detect the data set to be detected, for each piece of data to be detected in the data set, the decoder in the anomaly detection model outputs the mean and variance parameters.
  • the encoder in the anomaly detection model uses the mean and variance parameters output by the decoder to calculate the average probability that data close to the data to be detected is generated from the hidden variable distribution z.
  • the average probability is used as an anomaly score, called the reconstruction probability, which is used to evaluate the possibility that the data to be detected is abnormal.
  • before detecting the data set to be detected by using the anomaly detection model, the method further includes:
  • the missing values in the data set to be detected are filled in by Monte Carlo interpolation.
  • the Monte Carlo interpolation method can be obtained from the prior art, and will not be repeated here.
  • the missing values in the data set to be detected will cause deviations in the encoding process of the encoder in the anomaly detection model, thereby affecting the results of data anomaly detection.
  • filling in the missing values in the data set to be detected by the Monte Carlo interpolation method can improve the accuracy of data anomaly detection.
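The text leaves the filling method to the prior art, so the following is not that method's definition. It sketches one iterative reconstruct-and-replace scheme sometimes paired with autoencoders: missing points are overwritten with the model's reconstruction for several rounds while observed points are kept. The `reconstruct` callable is a hypothetical stand-in for a trained model, and the deterministic neighbor average used here replaces the random sampling a real Monte Carlo procedure would involve.

```python
def fill_missing(window, missing_mask, reconstruct, rounds=10):
    """Replace missing points with reconstructions, repeated for several rounds."""
    current = list(window)
    for _ in range(rounds):
        rebuilt = reconstruct(current)
        # Only missing positions are overwritten; observed points stay untouched.
        current = [r if miss else x for x, r, miss in zip(current, rebuilt, missing_mask)]
    return current

def neighbor_average(values):
    """Hypothetical stand-in for a trained model's reconstruction."""
    n = len(values)
    return [(values[max(i - 1, 0)] + values[min(i + 1, n - 1)]) / 2 for i in range(n)]

window = [0.30, 0.32, 0.0, 0.36, 0.38]   # made-up window; 0.0 marks a missing value
mask = [v == 0 for v in window]
filled = fill_missing(window, mask, neighbor_average)
```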
  • the reconstruction threshold is preset.
  • data with a high reconstruction probability is determined as abnormal data.
  • when it is determined that abnormal data exists in the data set to be detected, a warning message reminder is sent, and the warning message reminder includes the running time corresponding to the abnormal data point.
  • including the running time corresponding to the abnormal data point in the warning message reminder helps improve the efficiency of operation and maintenance.
  • the data to be detected whose reconstruction probability is less than the reconstruction threshold is normal data.
  • the monitoring is continued, and the abnormal detection result and detection time of this time are recorded.
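The decision and alerting logic described above can be sketched as a simple threshold comparison. The sketch follows the rule exactly as stated in this text (a reconstruction probability greater than or equal to the preset threshold is treated as abnormal); the record layout, the message format, and the sample values are hypothetical.

```python
def detect_anomalies(records, reconstruction_threshold):
    """Split (running_time, reconstruction_probability) records per the stated rule."""
    abnormal, normal = [], []
    for running_time, probability in records:
        # Rule as stated in the text: probability >= threshold means abnormal.
        if probability >= reconstruction_threshold:
            abnormal.append((running_time, probability))
        else:
            normal.append((running_time, probability))
    return abnormal, normal

def build_warnings(abnormal):
    """One warning message per abnormal point, including its running time."""
    return [f"KPI anomaly at {t} (reconstruction probability {p:.2f})" for t, p in abnormal]

# Made-up sample records for illustration only.
records = [("2020-11-26T10:00", 0.12), ("2020-11-26T10:01", 0.91), ("2020-11-26T10:02", 0.50)]
abnormal, normal = detect_anomalies(records, reconstruction_threshold=0.5)
warnings = build_warnings(abnormal)
```

Carrying the running time through to the warning lets operations staff locate the affected interval directly from the message.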
  • the embodiment of the application adjusts the variational lower limit function in the anomaly detection model framework according to the missing data in the standard training data set, and uses the optimized variational lower limit function to train the model.
  • a more robust anomaly detection model can thus be obtained, which improves the stability of the anomaly detection model, avoids inaccurate detection, and improves the accuracy of KPI anomaly detection.
  • at the same time, the embodiment of the application performs detection with the anomaly detection model obtained from the optimized variational lower limit function. No labels are generated in this process, which reduces the dependence on labels, avoids occupying too many computer resources, and improves detection efficiency. Therefore, the data anomaly detection method proposed in this application can improve the efficiency and accuracy of KPI anomaly detection.
  • FIG. 2 is a functional block diagram of a data abnormality detection device provided by an embodiment of the present application.
  • the data abnormality detection device 100 described in this application can be installed in an electronic device.
  • the data anomaly detection device 100 may include a data processing module 101, a model acquisition module 102, a function adjustment module 103, a model training module 104, a reconstruction probability acquisition module 105, and an anomaly detection module 106.
  • the module described in this application can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of an electronic device and can complete fixed functions, and are stored in the memory of the electronic device.
  • each module/unit is as follows:
  • the data processing module 101 is configured to obtain a standard training data set, and the standard training data set includes anomaly detection data and missing data.
  • the standard training data set may include various KPI (Key Performance Indicators) values.
  • the KPI refers to the monitoring indicators (such as latency, throughput, etc.) of operation and maintenance objects such as services and systems.
  • the standard training data set contains the same or different KPIs arranged as a sequence in the order of the monitored time.
  • the missing data is data with a value of 0, and the abnormality detection data is KPI abnormal data.
  • for example, the standard training data set contains hardware resource consumption collected at different times, where the consumption at some times is 0 or abnormal; or the number of online users collected at different times, where the number of online users at some times is 0 or abnormal; or the number of concurrent users collected at different times, where the number of concurrent users at some times is 0 or abnormal.
  • the standard training data set can be stored in the blockchain, and in specific implementation, the standard training data set is directly obtained from the nodes of the blockchain.
  • data processing module 101 is specifically configured to:
  • the normalized data set is input into a preset sliding window to obtain the standard training data set.
  • the embodiment of the present application performs normalization processing on the original training data set through the following formula:
  • $n$ is the number of data in the original training data set
  • $x_i$ is the $i$-th data in the original training data set
  • $y_i$ is the $i$-th data in the normalized data set
  • $y_i \in [0, 1]$.
  • a preset proportion of the normal data (that is, the non-zero KPI data) is randomly set to 0 and regarded as missing data, which enhances the effect of model training.
  • the original training data set is time series data
  • inputting a normalized data set into the sliding window can ensure the sequence of the original training data set and improve the usability and consistency of the data.
  • the size of the sliding window is $W$
  • there are $W$ pieces of data in the standard training data set, that is, the data in the standard training data set are $x_W, \ldots, x_1$.
  • normalizing the original training data set can standardize the data in the standard training data set, and by using the sliding window, the time sequence of the data in the standard training data set is ensured.
  • the model acquisition module 102 is configured to acquire a pre-built anomaly detection model framework, and the anomaly detection model framework includes a variational lower limit function.
  • the pre-built anomaly detection model framework may be a VAE (Variational Autoencoder) anomaly detection model framework.
  • the VAE includes an encoder, a decoder, and a variational lower limit function.
  • the encoder calculates the hidden variable distribution parameters (mean and variance) in the standard training data set, and samples the hidden variables.
  • the decoder restores the hidden variable to obtain an output result; the reconstruction probability of the input KPI data can be calculated by using the output result and the variational lower limit function, and whether the KPI data is abnormal is judged according to the reconstruction probability.
  • the function adjustment module 103 is configured to adjust the variational lower limit function by using the missing data to obtain an optimized variational lower limit function.
  • the function adjustment module 103 is specifically configured to:
  • the optimized value is added to the variational lower limit function to obtain the optimized variational lower limit function.
  • the optimized value can be calculated based on the missing data in the following manner:
  • the optimized value $\beta$ is calculated according to the missing coefficients obtained from the missing data, where:
  • $W$ is the number of data in the standard training data set;
  • $x_w$ is the $w$-th data in the standard training data set;
  • $a_w$ is the missing coefficient of the $w$-th data, whose value depends on whether $x_w$ is missing data;
  • $\beta$ is the optimized value;
  • $z$ is the hidden variable of the standard training data set;
  • $\log p_\theta(x \mid z)$ is the logarithm of $p_\theta(x \mid z)$;
  • $p_\theta(z)$ is the distribution of the hidden variable $z$ under the standard training data set, and $\log p_\theta(z)$ is its logarithm;
  • $q_\phi(z \mid x)$ is the distribution of the hidden variable $z$ under the sample $x$, which corresponds to the encoder part, and $\log q_\phi(z \mid x)$ is its logarithm.
  • the optimized variational lower limit function is adjusted according to the missing data, so that the missing data can be used to train the anomaly detection model framework, which enhances the stability of the framework against abnormal data and improves the robustness of the model.
  • the model training module 104 is configured to use the standard training data set to train the anomaly detection model framework including the optimized variational lower limit function to obtain an anomaly detection model.
  • the model training module 104 includes:
  • the first calculation unit is configured to input the standard training data set into the anomaly detection model framework for calculation to obtain an output result
  • a second calculation unit configured to calculate the loss value of the optimized variational lower limit function according to the output result
  • the model acquisition and adjustment unit is configured to: if the loss value is greater than the preset loss threshold, adjust the parameters in the anomaly detection model framework and start the first calculation unit again to input the standard training data set into the anomaly detection model framework and obtain an output result; and, once the loss value is less than or equal to the loss threshold, stop adjusting the parameters in the anomaly detection model framework to obtain the anomaly detection model.
  • the first calculation unit is specifically configured to:
  • the output result is calculated by using the decoder in the anomaly detection model framework and the hidden variable.
  • the hidden variable distribution parameters describe the distribution of the hidden variables of all data in the standard training data set.
  • the hidden variable is calculated by using the following formula:
  • z is the hidden variable
  • ⁇ (X) are the mean and variance in the hidden variable distribution parameters
  • ⁇ (x) is the mean of the standard training data set.
  • the output result is calculated by using the following formula:
  • p(x) is the output result
  • z is the point in the hidden variable space Z
  • $p(z)$ is the probability of getting the hidden variable $z$
  • $\theta$ is a point in the parameter space $\Theta$
  • the range is the preset range.
  • $I$ represents the identity matrix
  • $\sigma$ is a hyperparameter
  • $f$ is a function that maps $z$ and $\theta$ to $x$, that is, $f: Z \times \Theta \to X$.
  • the standard training data set can be repeatedly used to train the anomaly detection model framework, which improves the data utilization rate.
  • the reconstruction probability acquisition module 105 is configured to acquire a data set to be detected, use the anomaly detection model to detect the data set to be detected, and obtain a reconstruction probability of the data to be detected in the data set to be detected.
  • when the anomaly detection model is used to detect the data set to be detected, for each piece of data to be detected in the data set, the decoder in the anomaly detection model outputs the mean and variance parameters.
  • the encoder in the anomaly detection model uses the mean and variance parameters output by the decoder to calculate the average probability that data close to the data to be detected is generated from the hidden variable distribution z.
  • the average probability is used as an anomaly score, called the reconstruction probability, which is used to evaluate the possibility that the data to be detected is abnormal.
  • the device further includes a judgment module, and the judgment module is configured to:
  • the missing values in the data set to be detected are filled in by Monte Carlo interpolation.
  • Monte Carlo interpolation method can be obtained from the prior art, and will not be repeated here.
  • the missing values in the data set to be detected will cause deviations in the encoder encoding process in the anomaly detection model, thereby affecting the result of data anomaly detection.
  • filling in the missing values in the data set to be detected by the Monte Carlo interpolation method can improve the accuracy of data anomaly detection; at the same time, using the anomaly detection model to output the reconstruction probability improves the speed of data anomaly detection.
  • the anomaly detection module 106 is configured to determine that the target to-be-detected data is abnormal data if there is target to-be-detected data whose reconstruction probability is greater than or equal to a reconstruction threshold.
  • the reconstruction threshold is preset.
  • data with a high reconstruction probability is determined as abnormal data.
  • when it is determined that abnormal data exists in the data set to be detected, a warning message reminder is sent, and the warning message reminder includes the running time corresponding to the abnormal data point.
  • including the running time corresponding to the abnormal data point in the warning message reminder helps improve the efficiency of operation and maintenance.
  • the data to be detected whose reconstruction probability is less than the reconstruction threshold is normal data.
  • the monitoring is continued, and the abnormal detection result and detection time of this time are recorded.
  • the embodiment of the application adjusts the variational lower limit function in the anomaly detection model framework according to the missing data in the standard training data set, and uses the optimized variational lower limit function to train the model.
  • a more robust anomaly detection model can thus be obtained, which improves the stability of the anomaly detection model, avoids inaccurate detection, and improves the accuracy of KPI anomaly detection.
  • at the same time, the embodiment of the application performs detection with the anomaly detection model obtained from the optimized variational lower limit function. No labels are generated in this process, which reduces the dependence on labels, avoids occupying too many computer resources, and improves detection efficiency. Therefore, the data anomaly detection device proposed in this application can improve the efficiency and accuracy of KPI anomaly detection.
  • FIG. 3 is a schematic structural diagram of an electronic device that implements a data abnormality detection method provided by an embodiment of the present application.
  • the electronic device 1 may include a processor 10, a memory 11, and a bus, and may also include a computer program stored in the memory 11 and running on the processor 10, such as a data abnormality detection program 12.
  • the memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, mobile hard disk, multimedia card, card-type memory (such as SD or DX memory, etc.), magnetic memory, magnetic disk, CD etc.
  • the memory 11 may be an internal storage unit of the electronic device 1 in some embodiments, for example, a mobile hard disk of the electronic device 1.
  • the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a smart media card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 1.
  • the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device.
  • the memory 11 can be used not only to store application software and various data installed in the electronic device 1, such as the code of the data abnormality detection program 12, etc., but also to temporarily store data that has been output or will be output.
  • the processor 10 may be composed of integrated circuits in some embodiments, for example, a single packaged integrated circuit or multiple integrated circuits with the same or different functions, including one or more combinations of a central processing unit (CPU), a microprocessor, a digital processing chip, a graphics processor, and various control chips.
  • the processor 10 is the control unit of the electronic device; it uses various interfaces and lines to connect the components of the entire electronic device, runs or executes programs or modules stored in the memory 11 (such as the data abnormality detection program), and calls data stored in the memory 11 to execute the various functions of the electronic device 1 and process data.
  • the bus may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the bus is configured to implement connection and communication between the memory 11 and at least one processor 10 and the like.
  • FIG. 3 only shows an electronic device with some components. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.
  • the electronic device 1 may also include a power source (such as a battery) for supplying power to various components.
  • the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device.
  • the power supply may also include any components such as one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, and power status indicators.
  • the electronic device 1 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the electronic device 1 may also include a network interface.
  • the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
  • the electronic device 1 may also include a user interface.
  • the user interface may be a display and an input unit (such as a keyboard).
  • the user interface may also be a standard wired interface or a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, etc.
  • the display can also be appropriately called a display screen or a display unit, which is used to display the information processed in the electronic device 1 and to display a visualized user interface.
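  • the data abnormality detection program 12 stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run on the processor 10, can implement:
  • obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
  • obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
  • adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
  • training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
  • obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
  • if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.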
  • if the modules/units integrated in the electronic device 1 are implemented as software functional units and sold or used as independent products, they may be stored in a non-volatile or volatile computer-readable storage medium.
  • the computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, or read-only memory (ROM).
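  • the computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps:
  • obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
  • obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
  • adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
  • training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
  • obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
  • if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.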
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.
  • the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another using cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Hardware Design (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Quality & Reliability (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Debugging And Monitoring (AREA)

Abstract

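A data abnormality detection method, apparatus, electronic device, and computer-readable storage medium. The method includes: obtaining a standard training data set that contains missing data, and an anomaly detection model framework that includes a variational lower bound function; adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function; training the anomaly detection model framework with the standard training data set to obtain an anomaly detection model; detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected; and, if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data. The method can improve the efficiency and accuracy of key performance indicator (KPI) anomaly detection.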

Description

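Data abnormality detection method, apparatus, electronic device, and storage medium
This application claims priority to Chinese patent application No. CN202011074730.8, entitled "Data abnormality detection method, apparatus, electronic device, and storage medium", filed with the Chinese Patent Office on October 9, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to artificial intelligence technology, and in particular to a data abnormality detection method, apparatus, electronic device, and computer-readable storage medium.
Background
KPI (Key Performance Indicator) anomaly detection is a very important part of intelligent operations and maintenance. To keep services uninterrupted, it is usually necessary to detect whether various KPIs (such as application KPIs and operating system KPIs) are abnormal, so as to determine whether the software or hardware of a system has failed and to troubleshoot in time.
The inventor realized that, in prior-art KPI anomaly detection methods, the trained anomaly detection model has low robustness and low stability, so the detection results are not accurate enough; in addition, prior-art detection methods generate a large number of labels, which consumes computing resources and lowers detection efficiency.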
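Summary
A data abnormality detection method provided by this application includes:
obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.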
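This application also provides a data abnormality detection apparatus, the apparatus including:
a data processing module, configured to obtain a standard training data set, the standard training data set containing anomaly detection data and missing data;
a model acquisition module, configured to obtain a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
a function adjustment module, configured to adjust the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
a model training module, configured to train the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
a reconstruction probability acquisition module, configured to obtain a data set to be detected, and to detect the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
an abnormality detection module, configured to determine, if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, that the target data to be detected is abnormal data.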
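This application also provides an electronic device, the electronic device including:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement the following steps:
obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.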
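This application also provides a computer-readable storage medium storing at least one instruction, the at least one instruction being executed by a processor in an electronic device to implement the following steps:
obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.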
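Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a data abnormality detection method provided by an embodiment of this application;
FIG. 2 is a functional module diagram of a data abnormality detection apparatus provided by an embodiment of this application;
FIG. 3 is a schematic structural diagram of an electronic device that implements the data abnormality detection method, provided by an embodiment of this application.
The realization of the purpose, the functional characteristics, and the advantages of this application are further described below with reference to the embodiments and the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
The data abnormality detection method provided by the embodiments of this application may be executed by, without limitation, at least one of the electronic devices, such as a server or a terminal, that can be configured to execute the method. In other words, the method may be executed by software or hardware installed on a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, or a cloud server cluster.
FIG. 1 is a schematic flowchart of a data abnormality detection method provided by an embodiment of this application. In this embodiment, the data abnormality detection method includes: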
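S1. Obtain a standard training data set, the standard training data set containing anomaly detection data and missing data.
In this embodiment of the application, the standard training data set may contain the values of various KPIs (Key Performance Indicators).
A KPI is a monitoring metric (such as latency or throughput) of an operations and maintenance object such as a service or a system. Specifically, the standard training data set contains numerical sequences formed by arranging the same or different KPIs in the chronological order in which they were monitored.
In this embodiment, the missing data is data whose value is 0, and the anomaly detection data is abnormal KPI data.
For example, the standard training data set contains CPU utilization collected at different times, where the CPU utilization is 0 at some times or abnormal at some times; or it contains hardware resource consumption collected at different times, where the consumption is 0 at some times or abnormal at some times; or it contains the number of online users collected at different times, where the number of online users is 0 at some times or abnormal at some times; or it contains the number of concurrent users collected at different times, where the number of concurrent users is 0 at some times or abnormal at some times.
Preferably, the standard training data set may be stored in a blockchain, in which case, in a specific implementation, the standard training data set is obtained directly from a node of the blockchain.
Storing the standard training data set in a blockchain can improve the privacy and security of the KPI data.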
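Further, in an optional embodiment of this application, obtaining the standard training data set includes:
obtaining an original training data set;
setting a preset proportion of the data in the original training data set as missing data;
normalizing the original training data set that includes the missing data using a preset normalization formula to obtain a normalized data set;
inputting the normalized data set into a preset sliding window to obtain the standard training data set.
In detail, this embodiment normalizes the original training data set using the following formula:
Figure PCTCN2020131984-appb-000001
where n is the number of data points in the original training data set, x_i is the i-th data point in the original training data set, y_i is the i-th data point in the normalized data set, and y_i ∈ [0, 1].
In this embodiment, a proportion λ of the normal data (that is, non-zero KPI data) is randomly set to 0 and treated as missing data, which strengthens the effect of model training.
In this embodiment, the original training data set is time-series data; inputting the normalized data set into the sliding window preserves the sequential nature of the original training data set and improves the usability and consistency of the data.
Specifically, if the size of the sliding window is W, the standard training data set contains W data points, that is, the data in the standard training data set is: x_W, …, x_1.
In this embodiment, normalizing the original training data set standardizes the data in the standard training data set, and using the sliding window preserves the time-series nature of the data in the standard training data set.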
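The preprocessing described above (inject missing points, normalize, slide a window) can be sketched as follows. This is only an illustration under stated assumptions: the normalization formula is an image that is not reproduced in this text, so min-max scaling is assumed here because it yields y_i ∈ [0, 1]; the function names, the sample series, and the ratio are all hypothetical.

```python
import random

def inject_missing(series, ratio, rng):
    """Set a random `ratio` of the non-zero points to 0 (treated as missing)."""
    out = list(series)
    candidates = [i for i, v in enumerate(out) if v != 0]
    k = int(len(candidates) * ratio)
    for i in rng.sample(candidates, k):
        out[i] = 0.0
    return out

def min_max_normalize(series):
    """Assumed normalization: rescale all values into [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]

def sliding_windows(series, W):
    """Return every length-W window, preserving time order."""
    return [series[i:i + W] for i in range(len(series) - W + 1)]

rng = random.Random(0)  # seeded so the sketch is repeatable
raw = [12.0, 15.0, 11.0, 40.0, 13.0, 14.0, 12.0, 16.0]  # made-up KPI values
with_missing = inject_missing(raw, ratio=0.25, rng=rng)
normalized = min_max_normalize(with_missing)
windows = sliding_windows(normalized, W=4)
```

Because the injected zeros become the minimum of this non-negative series, they stay at 0 after min-max scaling, so missing points remain identifiable in each window.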
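S2. Obtain a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function.
In this embodiment, the pre-built anomaly detection model framework may be a VAE (Variational Autoencoder) anomaly detection model framework.
Specifically, the VAE includes an encoder, a decoder, and a variational lower bound function. The encoder computes the latent-variable distribution parameters (mean and variance) for the standard training data set and samples latent variables from them; the decoder restores the latent variables to obtain an output result. Using the output result and the variational lower bound function, a reconstruction probability can be computed for the input KPI data, and whether the KPI data is abnormal is judged from the reconstruction probability.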
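S3. Adjust the variational lower bound function using the missing data to obtain an optimized variational lower bound function.
Specifically, adjusting the variational lower bound function using the missing data to obtain the optimized variational lower bound function includes:
computing an optimization value based on the missing data;
adding the optimization value to the variational lower bound function to obtain the optimized variational lower bound function.
Further, the optimization value may be computed from the missing data as follows:
Missing coefficient
Figure PCTCN2020131984-appb-000002
The optimization value β is computed from the missing coefficients obtained from the missing data; specifically,
Figure PCTCN2020131984-appb-000003
where x_w is the w-th data point in the standard training data set.
Further, the optimized variational lower bound function is:
Figure PCTCN2020131984-appb-000004
where W is the number of data points in the standard training data set, x_w is the w-th data point in the standard training data set, and a_w is the missing coefficient of the w-th data point: a_w = 1 when x_w is missing data and a_w = 0 when x_w is not missing data; β is the optimization value, and
Figure PCTCN2020131984-appb-000005
z denotes the latent variable z in the standard training data set.
Here,
Figure PCTCN2020131984-appb-000006
denotes the expectation over the distribution of the latent variable z corresponding to x; log p_θ(x|z) is the logarithm of p(x|z; θ), where p_θ(x|z) means restoring the latent variable z to x and corresponds to the decoder; p_θ(z) is the distribution of the latent variable z under the standard training data set, and log p_θ(z) is its logarithm; log q_φ(z|x) is the logarithm of q_φ(z|x), where q_φ(z|x) is the distribution of the latent variable z given sample x and corresponds to the encoder.
Further, in this embodiment, the optimized variational lower bound function is adjusted according to the missing data, so that the missing data can be used to train the anomaly detection model framework; this enhances the stability of the framework when facing abnormal data and thus improves the robustness of the model.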
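To make the role of the coefficients a_w and the value β concrete, the sketch below evaluates one plausible weighted lower bound for a single sampled z. The formula images are not reproduced in this text, so the form used here is an assumption built only from the symbols named above: the per-point terms log p_θ(x_w|z) are weighted by a_w, the prior term log p_θ(z) is scaled by β, and β is assumed to be the mean of the a_w. The function name and the numbers are hypothetical, and the coefficient convention should be taken from the original figures.

```python
def weighted_lower_bound(log_px_given_z, log_pz, log_qz_given_x, a):
    """Single-sample estimate of an assumed weighted lower bound.

    log_px_given_z : list of log p(x_w | z), one entry per point in the window
    log_pz         : log p(z) for the sampled z
    log_qz_given_x : log q(z | x) for the sampled z
    a              : per-point coefficients a_w (0 or 1)
    """
    W = len(a)
    beta = sum(a) / W  # assumption: beta is the mean coefficient
    weighted_recon = sum(a_w * lp for a_w, lp in zip(a, log_px_given_z))
    return weighted_recon + beta * log_pz - log_qz_given_x

# Made-up numbers: the third point has coefficient 0, so its very low
# log-likelihood does not contribute to the bound.
value = weighted_lower_bound(
    log_px_given_z=[-1.0, -2.0, -100.0, -3.0],
    log_pz=-4.0,
    log_qz_given_x=-1.0,
    a=[1, 1, 0, 1],
)
```

In training, the negative of such a bound, averaged over several samples of z, would typically serve as the loss; that detail is likewise an assumption rather than something the text states.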
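S4. Train the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model.
Preferably, S4 includes:
Step A: input the standard training data set into the anomaly detection model framework for computation to obtain an output result;
Step B: compute the loss value of the optimized variational lower bound function from the output result;
Step C: if the loss value is greater than a preset loss threshold, adjust the parameters of the anomaly detection model framework and return to Step A; when the loss value is less than or equal to the loss threshold, stop adjusting the parameters to obtain the anomaly detection model.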
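The control flow of Steps A to C can be sketched with a deliberately tiny stand-in model. Everything here is an assumption for illustration: a single parameter `theta` replaces the VAE, mean squared error replaces the loss derived from the optimized variational lower bound, and plain gradient descent replaces whatever optimizer is actually used. A `max_iters` guard is added because a threshold-only stopping rule never terminates if the threshold is unreachable.

```python
def train_until_threshold(data, loss_threshold, lr=0.1, max_iters=10_000):
    """Repeat Steps A-C until the loss is at or below the threshold."""
    theta = 0.0  # the only "model parameter" in this toy stand-in
    for step in range(max_iters):
        # Step A: run the model on the training data to get outputs.
        outputs = [theta for _ in data]
        # Step B: compute the loss from the outputs.
        loss = sum((o - x) ** 2 for o, x in zip(outputs, data)) / len(data)
        # Step C: stop when the loss is small enough, otherwise adjust
        # the parameter and go back to Step A.
        if loss <= loss_threshold:
            return theta, loss, step
        grad = sum(2 * (o - x) for o, x in zip(outputs, data)) / len(data)
        theta -= lr * grad
    raise RuntimeError("loss threshold not reached within max_iters")

theta, loss, steps = train_until_threshold([0.4, 0.5, 0.6], loss_threshold=0.01)
```

The threshold has to be chosen above the smallest loss the model can reach (about 0.0067 for this toy data); otherwise the loop ends only through the iteration guard.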
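Further, inputting the standard training data set into the anomaly detection model framework for computation to obtain the output result includes:
computing the latent-variable distribution parameters of the data in the standard training data set using the encoder in the anomaly detection model framework;
sampling from the latent-variable distribution parameters to obtain a latent variable;
computing the output result using the decoder in the anomaly detection model framework and the latent variable.
Specifically, the latent-variable distribution parameters describe the latent variables of all the data in the standard training data set.
In this embodiment, the latent variable is computed with the following formula:
Figure PCTCN2020131984-appb-000007
where z is the latent variable,
Figure PCTCN2020131984-appb-000008
μ(X) and Σ(X) are the mean and variance in the latent-variable distribution parameters, and μ(x) is the mean of the standard training data set.
In this embodiment, the output result is computed with the following formula:
p(x) = ∫ p(x, z | θ) dz = ∫ p(x | z; θ) p(z) dz
where p(x) is the output result, z is a point in the latent variable space Z, p(z) is the probability of drawing the latent variable z, θ is a point in the parameter space Θ, and the range of the parameter space is a preset range.
p(x | z; θ) = N(x | f(z; θ), σ²·I)
where I is the identity matrix and σ is a hyperparameter; f is the function that maps z and θ to x, that is, f: Z × Θ → X.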
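The sampling step and the Gaussian decoder likelihood p(x | z; θ) = N(x | f(z; θ), σ²·I) can be sketched as below. The sampling formula itself is an image not reproduced in this text, so the usual reparameterized form z = μ + sqrt(variance)·ε with ε drawn from a standard normal distribution is assumed, with a diagonal variance. The decoder `f` shown here is a hypothetical linear map chosen only to keep the sketch runnable.

```python
import math
import random

def sample_latent(mu, var, rng):
    """Assumed reparameterized sample: z = mu + sqrt(var) * eps, eps ~ N(0, 1)."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, var)]

def f(z, theta):
    """Hypothetical decoder mean: scales each latent coordinate by theta."""
    return [theta * zi for zi in z]

def gaussian_log_likelihood(x, mean, sigma):
    """log N(x | mean, sigma^2 * I) for a d-dimensional x."""
    d = len(x)
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * d * math.log(2 * math.pi * sigma ** 2) - sq / (2 * sigma ** 2)

rng = random.Random(0)
z = sample_latent(mu=[0.2, 0.4], var=[0.01, 0.01], rng=rng)
log_px_given_z = gaussian_log_likelihood([0.2, 0.4], f(z, theta=1.0), sigma=0.5)
```

Averaging exp(log_px_given_z) over many samples of z gives a Monte Carlo estimate of the integral for p(x) shown above.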
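In this embodiment, because the missing data is randomly selected before each round of training, the standard training data set can be reused to train the anomaly detection model framework, which improves data utilization.
S5. Obtain a data set to be detected, and detect the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected.
In this embodiment, when the anomaly detection model is used to detect the data set to be detected, the decoder in the anomaly detection model outputs mean and variance parameters for each data point to be detected. The encoder in the anomaly detection model uses the mean and variance parameters output by the decoder to compute the average probability that data close to the data point to be detected is generated from the latent-variable distribution z. This average probability is used as the anomaly score and is called the reconstruction probability; it is used to evaluate how likely the data point is to be abnormal.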
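The "average probability" described above can be sketched as a Monte Carlo average of the decoder density over several latent samples. This is only an illustration: `encode` and `decode` are hypothetical stand-ins for a trained model, the fixed σ and the number of samples are assumptions, and the value returned is a probability density rather than a probability in the strict sense.

```python
import math
import random

def gaussian_density(x, mean, sigma):
    """N(x | mean, sigma^2 * I) evaluated at x."""
    d = len(x)
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-sq / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2) ** (d / 2)

def reconstruction_probability(x, encode, decode, sigma, n_samples, rng):
    """Average p(x | z) over n_samples latent samples drawn for x."""
    mu, var = encode(x)
    total = 0.0
    for _ in range(n_samples):
        z = [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, var)]
        total += gaussian_density(x, decode(z), sigma)
    return total / n_samples

# Toy stand-ins: an encoder with zero variance and an identity decoder,
# so every sample reconstructs x exactly.
def encode(x):
    return list(x), [0.0] * len(x)

def decode(z):
    return list(z)

score = reconstruction_probability(
    [0.3], encode, decode, sigma=1.0, n_samples=5, rng=random.Random(0)
)
```

With these stand-ins the score equals the peak of a one-dimensional unit-variance Gaussian, which makes the function easy to sanity-check before plugging in a real encoder and decoder.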
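Preferably, in an embodiment of this application, before the data set to be detected is detected with the anomaly detection model, the method further includes:
determining whether the data set to be detected contains missing values;
if the data set to be detected contains missing values, filling them in using Monte Carlo imputation.
Specifically, Monte Carlo imputation is available from the prior art and is not described in detail here. In this embodiment, missing values in the data set to be detected would introduce bias during the encoding performed by the encoder in the anomaly detection model and thereby affect the result of data abnormality detection; filling in the missing values by Monte Carlo imputation improves the accuracy of data abnormality detection.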
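Since the text defers Monte Carlo imputation to the prior art, the following is only one possible shape of such a procedure, given as an assumption: repeatedly pass the window through a stochastic reconstruction step and overwrite only the missing positions, keeping observed values fixed. The `reconstruct` argument is a hypothetical stand-in for sampling a latent variable from the trained encoder and decoding it; here it is a noisy neighbor average so the sketch runs without a model.

```python
import random

def mc_impute(window, missing, reconstruct, n_iters=10):
    """Iteratively replace missing entries with reconstructed values.

    window  : list of values (missing entries hold a placeholder such as 0)
    missing : list of booleans, True where the value is missing
    """
    x = list(window)
    for _ in range(n_iters):
        x_hat = reconstruct(x)
        # Observed points are always restored from the original window.
        x = [x_hat[i] if missing[i] else window[i] for i in range(len(window))]
    return x

rng = random.Random(0)

def noisy_neighbour_mean(x):
    """Stand-in for a stochastic model reconstruction (an assumption)."""
    out = []
    for i in range(len(x)):
        left = x[i - 1] if i > 0 else x[i + 1]
        right = x[i + 1] if i < len(x) - 1 else x[i - 1]
        out.append((left + right) / 2 + rng.gauss(0.0, 0.01))
    return out

window = [0.2, 0.0, 0.4]
missing = [False, True, False]
filled = mc_impute(window, missing, noisy_neighbour_mean)
```

The filled window, rather than the raw one, would then be passed to the detection step.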
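S6. If there is target data to be detected whose reconstruction probability is greater than or equal to the reconstruction threshold, determine that the target data to be detected is abnormal data.
Specifically, the reconstruction threshold is preset.
In this embodiment, data with a high reconstruction probability is determined to be abnormal data.
Optionally, in this embodiment, when it is determined that abnormal data exists in the data set to be detected, a warning message is sent; the warning message includes the running time corresponding to the abnormal data point. Including the running time of the abnormal data point in the warning message helps improve the efficiency of operations and maintenance.
In this embodiment, data to be detected whose reconstruction probability is less than the reconstruction threshold is determined to be normal data.
Optionally, in this embodiment, when it is determined that no abnormal data exists in the data set to be detected, monitoring continues, and the result and time of this detection are recorded.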
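The decision rule of S6 and the two follow-up actions can be sketched as below. The comparison direction follows the text as written (a score greater than or equal to the threshold is treated as abnormal); the function name, the record layout, and the sample timestamps are hypothetical.

```python
def detect(scores, timestamps, threshold):
    """Split scored points into alerts (abnormal) and a log of normal points."""
    alerts, normal_log = [], []
    for t, s in zip(timestamps, scores):
        record = {"time": t, "score": s}
        if s >= threshold:
            alerts.append(record)      # would be sent as a warning message
        else:
            normal_log.append(record)  # would be recorded while monitoring continues
    return alerts, normal_log

alerts, normal_log = detect(
    scores=[0.1, 0.9, 0.5],
    timestamps=["10:00", "10:01", "10:02"],
    threshold=0.5,
)
```

Each alert carries the time of the abnormal point, which is the piece of information the text says the warning message should include.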
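The embodiment of this application adjusts the variational lower bound function in the anomaly detection model framework according to the missing data in the standard training data set, which optimizes the variational lower bound function. Training the model with the optimized variational lower bound function yields a more robust anomaly detection model, which helps improve the stability of the model, avoids inaccurate detection, and in turn helps improve the accuracy of KPI anomaly detection. In addition, detection is performed with the anomaly detection model obtained from the optimized variational lower bound function; no labels are generated in this process, which reduces the dependence on labels, avoids consuming excessive computing resources, and improves detection efficiency. Therefore, the data abnormality detection method proposed in this application can improve the efficiency and accuracy of KPI anomaly detection.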
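FIG. 2 is a functional module diagram of the data abnormality detection apparatus provided by an embodiment of this application.
The data abnormality detection apparatus 100 of this application may be installed in an electronic device. According to the functions implemented, the data abnormality detection apparatus 100 may include a data processing module 101, a model acquisition module 102, a function adjustment module 103, a model training module 104, a reconstruction probability acquisition module 105, and an abnormality detection module 106. A module in this application, which may also be called a unit, is a series of computer program segments that can be executed by the processor of the electronic device and that perform a fixed function; the segments are stored in the memory of the electronic device.
In this embodiment, the functions of the modules/units are as follows:
The data processing module 101 is configured to obtain a standard training data set, the standard training data set containing anomaly detection data and missing data.
In this embodiment of the application, the standard training data set may contain the values of various KPIs (Key Performance Indicators).
A KPI is a monitoring metric (such as latency or throughput) of an operations and maintenance object such as a service or a system. Specifically, the standard training data set contains numerical sequences formed by arranging the same or different KPIs in the chronological order in which they were monitored.
In this embodiment, the missing data is data whose value is 0, and the anomaly detection data is abnormal KPI data.
For example, the standard training data set contains CPU utilization collected at different times, where the CPU utilization is 0 at some times or abnormal at some times; or it contains hardware resource consumption collected at different times, where the consumption is 0 at some times or abnormal at some times; or it contains the number of online users collected at different times, where the number of online users is 0 at some times or abnormal at some times; or it contains the number of concurrent users collected at different times, where the number of concurrent users is 0 at some times or abnormal at some times.
Preferably, the standard training data set may be stored in a blockchain, in which case, in a specific implementation, the standard training data set is obtained directly from a node of the blockchain.
Storing the standard training data set in a blockchain can improve the privacy and security of the KPI data.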
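Further, the data processing module 101 is specifically configured to:
obtain an original training data set;
set a preset proportion of the data in the original training data set as missing data;
normalize the original training data set that includes the missing data using a preset normalization formula to obtain a normalized data set;
input the normalized data set into a preset sliding window to obtain the standard training data set.
In detail, this embodiment normalizes the original training data set using the following formula:
Figure PCTCN2020131984-appb-000009
where n is the number of data points in the original training data set, x_i is the i-th data point in the original training data set, y_i is the i-th data point in the normalized data set, and y_i ∈ [0, 1].
In this embodiment, a proportion λ of the normal data (that is, non-zero KPI data) is randomly set to 0 and treated as missing data, which strengthens the effect of model training.
In this embodiment, the original training data set is time-series data; inputting the normalized data set into the sliding window preserves the sequential nature of the original training data set and improves the usability and consistency of the data.
Specifically, if the size of the sliding window is W, the standard training data set contains W data points, that is, the data in the standard training data set is: x_W, …, x_1.
In this embodiment, normalizing the original training data set standardizes the data in the standard training data set, and using the sliding window preserves the time-series nature of the data in the standard training data set.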
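The model acquisition module 102 is configured to obtain a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function.
In this embodiment, the pre-built anomaly detection model framework may be a VAE (Variational Autoencoder) anomaly detection model framework.
Specifically, the VAE includes an encoder, a decoder, and a variational lower bound function. The encoder computes the latent-variable distribution parameters (mean and variance) for the standard training data set and samples latent variables from them; the decoder restores the latent variables to obtain an output result. Using the output result and the variational lower bound function, a reconstruction probability can be computed for the input KPI data, and whether the KPI data is abnormal is judged from the reconstruction probability.
The function adjustment module 103 is configured to adjust the variational lower bound function using the missing data to obtain an optimized variational lower bound function.
Specifically, the function adjustment module 103 is configured to:
compute an optimization value based on the missing data;
add the optimization value to the variational lower bound function to obtain the optimized variational lower bound function.
Further, the optimization value may be computed from the missing data as follows:
Missing coefficient
Figure PCTCN2020131984-appb-000010
The optimization value β is computed from the missing coefficients obtained from the missing data; specifically,
Figure PCTCN2020131984-appb-000011
where x_w is the w-th data point in the standard training data set.
Further, the optimized variational lower bound function is:
Figure PCTCN2020131984-appb-000012
where W is the number of data points in the standard training data set, x_w is the w-th data point in the standard training data set, and a_w is the missing coefficient of the w-th data point: a_w = 1 when x_w is missing data and a_w = 0 when x_w is not missing data; β is the optimization value, and
Figure PCTCN2020131984-appb-000013
z denotes the latent variable z in the standard training data set.
Here,
Figure PCTCN2020131984-appb-000014
denotes the expectation over the distribution of the latent variable z corresponding to x; log p_θ(x|z) is the logarithm of p(x|z; θ), where p_θ(x|z) means restoring the latent variable z to x and corresponds to the decoder; p_θ(z) is the distribution of the latent variable z under the standard training data set, and log p_θ(z) is its logarithm; log q_φ(z|x) is the logarithm of q_φ(z|x), where q_φ(z|x) is the distribution of the latent variable z given sample x and corresponds to the encoder.
Further, in this embodiment, the optimized variational lower bound function is adjusted according to the missing data, so that the missing data can be used to train the anomaly detection model framework; this enhances the stability of the framework when facing abnormal data and thus improves the robustness of the model.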
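The model training module 104 is configured to train the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model.
Preferably, the model training module 104 includes:
a first computing unit, configured to input the standard training data set into the anomaly detection model framework for computation to obtain an output result;
a second computing unit, configured to compute the loss value of the optimized variational lower bound function from the output result;
a model acquisition and adjustment unit, configured to, if the loss value is greater than a preset loss threshold, adjust the parameters of the anomaly detection model framework and trigger the first computing unit to input the standard training data set into the anomaly detection model framework for computation to obtain an output result, until the loss value is less than or equal to the loss threshold, at which point parameter adjustment stops and the anomaly detection model is obtained.
Further, the first computing unit is specifically configured to:
compute the latent-variable distribution parameters of the data in the standard training data set using the encoder in the anomaly detection model framework;
sample from the latent-variable distribution parameters to obtain a latent variable;
compute the output result using the decoder in the anomaly detection model framework and the latent variable.
Specifically, the latent-variable distribution parameters describe the latent variables of all the data in the standard training data set.
In this embodiment, the latent variable is computed with the following formula:
Figure PCTCN2020131984-appb-000015
where z is the latent variable,
Figure PCTCN2020131984-appb-000016
μ(X) and Σ(X) are the mean and variance in the latent-variable distribution parameters, and μ(x) is the mean of the standard training data set.
In this embodiment, the output result is computed with the following formula:
p(x) = ∫ p(x, z | θ) dz = ∫ p(x | z; θ) p(z) dz
where p(x) is the output result, z is a point in the latent variable space Z, p(z) is the probability of drawing the latent variable z, θ is a point in the parameter space Θ, and the range of the parameter space is a preset range.
p(x | z; θ) = N(x | f(z; θ), σ²·I)
where I is the identity matrix and σ is a hyperparameter; f is the function that maps z and θ to x, that is, f: Z × Θ → X.
In this embodiment, because the missing data is randomly selected before each round of training, the standard training data set can be reused to train the anomaly detection model framework, which improves data utilization.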
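The reconstruction probability acquisition module 105 is configured to obtain a data set to be detected, and to detect the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected.
In this embodiment, when the anomaly detection model is used to detect the data set to be detected, the decoder in the anomaly detection model outputs mean and variance parameters for each data point to be detected. The encoder in the anomaly detection model uses the mean and variance parameters output by the decoder to compute the average probability that data close to the data point to be detected is generated from the latent-variable distribution z. This average probability is used as the anomaly score and is called the reconstruction probability; it is used to evaluate how likely the data point is to be abnormal.
Preferably, in an embodiment of this application, the apparatus further includes a judgment module, the judgment module being configured to:
before the data set to be detected is detected with the anomaly detection model, determine whether the data set to be detected contains missing values;
if the data set to be detected contains missing values, fill them in using Monte Carlo imputation.
Specifically, Monte Carlo imputation is available from the prior art and is not described in detail here.
In this embodiment, missing values in the data set to be detected would introduce bias during the encoding performed by the encoder in the anomaly detection model and thereby affect the result of data abnormality detection; filling in the missing values by Monte Carlo imputation improves the accuracy of data abnormality detection, and using the anomaly detection model to output the reconstruction probability greatly increases the speed of data abnormality detection.
The abnormality detection module 106 is configured to determine, if there is target data to be detected whose reconstruction probability is greater than or equal to the reconstruction threshold, that the target data to be detected is abnormal data.
Specifically, the reconstruction threshold is preset.
In this embodiment, data with a high reconstruction probability is determined to be abnormal data.
Optionally, in this embodiment, when it is determined that abnormal data exists in the data set to be detected, a warning message is sent; the warning message includes the running time corresponding to the abnormal data point. Including the running time of the abnormal data point in the warning message helps improve the efficiency of operations and maintenance.
In this embodiment, data to be detected whose reconstruction probability is less than the reconstruction threshold is determined to be normal data.
Optionally, in this embodiment, when it is determined that no abnormal data exists in the data set to be detected, monitoring continues, and the result and time of this detection are recorded.
The embodiment of this application adjusts the variational lower bound function in the anomaly detection model framework according to the missing data in the standard training data set, which optimizes the variational lower bound function. Training the model with the optimized variational lower bound function yields a more robust anomaly detection model, which helps improve the stability of the model, avoids inaccurate detection, and in turn helps improve the accuracy of KPI anomaly detection. In addition, detection is performed with the anomaly detection model obtained from the optimized variational lower bound function; no labels are generated in this process, which reduces the dependence on labels, avoids consuming excessive computing resources, and improves detection efficiency. Therefore, the data abnormality detection apparatus proposed in this application can improve the efficiency and accuracy of KPI anomaly detection.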
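FIG. 3 is a schematic structural diagram of an electronic device that implements the data abnormality detection method, provided by an embodiment of this application.
The electronic device 1 may include a processor 10, a memory 11, and a bus, and may further include a computer program, such as a data abnormality detection program 12, that is stored in the memory 11 and can run on the processor 10.
The memory 11 includes at least one type of readable storage medium, including flash memory, removable hard disk, multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disk, optical disc, and so on. In some embodiments the memory 11 may be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. In other embodiments the memory 11 may be an external storage device of the electronic device 1, for example a plug-in removable hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or flash card fitted to the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store application software installed on the electronic device 1 and various kinds of data, such as the code of the data abnormality detection program 12, but also to temporarily store data that has been output or will be output.
In some embodiments the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit or multiple packaged integrated circuits with the same or different functions, including a combination of one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and various control chips. The processor 10 is the control unit of the electronic device: it connects the components of the entire electronic device through various interfaces and lines, and it executes the functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 11 (such as the data abnormality detection program) and by calling the data stored in the memory 11.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to implement connection and communication between the memory 11, the at least one processor 10, and other components.
FIG. 3 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the electronic device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.
For example, although not shown, the electronic device 1 may also include a power supply (such as a battery) that powers the components. Preferably, the power supply may be logically connected to the at least one processor 10 through a power management apparatus, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management apparatus. The power supply may also include any of one or more DC or AC power sources, a recharging apparatus, a power failure detection circuit, a power converter or inverter, a power status indicator, and similar components. The electronic device 1 may also include various sensors, a Bluetooth module, a Wi-Fi module, and so on, which are not described in detail here.
Further, the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may also include a user interface. The user interface may be a display and an input unit (such as a keyboard); optionally, it may also be a standard wired or wireless interface. Optionally, in some embodiments the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be called a display screen or display unit, and is used to display the information processed in the electronic device 1 and a visual user interface.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.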
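The data abnormality detection program 12 stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run on the processor 10, can implement:
obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.
Specifically, for how the processor 10 implements the above instructions, refer to the description of the relevant steps in the embodiment corresponding to FIG. 1; details are not repeated here.
Further, if the modules/units integrated in the electronic device 1 are implemented as software functional units and sold or used as independent products, they may be stored in a non-volatile or volatile computer-readable storage medium. The computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, or read-only memory (ROM).
The computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps:
obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.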
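In the several embodiments provided in this application, it should be understood that the disclosed device, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It is obvious to those skilled in the art that this application is not limited to the details of the above exemplary embodiments, and that this application can be implemented in other specific forms without departing from its spirit or essential characteristics.
Therefore, from every point of view, the embodiments should be regarded as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be embraced in this application. No reference sign in a claim should be construed as limiting the claim concerned.
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another using cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
In addition, the word "including" obviously does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or apparatuses stated in a system claim may also be implemented by one unit or apparatus through software or hardware. Words such as "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application and not to limit them. Although this application has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of this application may be modified or equivalently replaced without departing from the spirit and scope of those technical solutions.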

Claims (20)

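  1. A data abnormality detection method, wherein the method includes:
    obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
    obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
    adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
    training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
    obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
    if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.
  2. The data abnormality detection method according to claim 1, wherein obtaining the standard training data set includes:
    obtaining an original training data set;
    setting a preset proportion of the data in the original training data set as missing data;
    normalizing the original training data set that includes the missing data using a preset normalization formula to obtain a normalized data set;
    inputting the normalized data set into a preset sliding window to obtain the standard training data set.
  3. The data abnormality detection method according to claim 1, wherein adjusting the variational lower bound function using the missing data to obtain the optimized variational lower bound function includes:
    computing an optimization value based on the missing data;
    adding the optimization value to the variational lower bound function to obtain the optimized variational lower bound function.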
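  4. The data abnormality detection method according to any one of claims 1 to 3, wherein the optimized variational lower bound function is:
    Figure PCTCN2020131984-appb-100001
    where W is the number of data points in the standard training data set, x_w is the w-th data point in the standard training data set, and a_w is the missing coefficient of the w-th data point: a_w = 1 when x_w is missing data and a_w = 0 when x_w is not missing data; β is the optimization value, and
    Figure PCTCN2020131984-appb-100002
    z denotes the latent variable z in the standard training data set;
    where
    Figure PCTCN2020131984-appb-100003
    denotes the expectation over the distribution of the latent variable z corresponding to x; log p_θ(x|z) is the logarithm of p(x|z; θ), where p_θ(x|z) means restoring the latent variable z to x and corresponds to the decoder; p_θ(z) is the distribution of the latent variable z under the standard training data set, and log p_θ(z) is its logarithm; log q_φ(z|x) is the logarithm of q_φ(z|x), where q_φ(z|x) is the distribution of the latent variable z given sample x and corresponds to the encoder.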
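  5. The data abnormality detection method according to claim 1, wherein training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain the anomaly detection model includes:
    Step A: inputting the standard training data set into the anomaly detection model framework for computation to obtain an output result;
    Step B: computing the loss value of the optimized variational lower bound function from the output result;
    Step C: if the loss value is greater than a preset loss threshold, adjusting the parameters of the anomaly detection model framework and returning to Step A; when the loss value is less than or equal to the loss threshold, stopping the adjustment of the parameters to obtain the anomaly detection model.
  6. The data abnormality detection method according to claim 5, wherein inputting the standard training data set into the anomaly detection model framework for computation to obtain the output result includes:
    computing the latent-variable distribution parameters of the data in the standard training data set using the encoder in the anomaly detection model framework;
    sampling from the latent-variable distribution parameters to obtain a latent variable;
    computing the output result using the decoder in the anomaly detection model framework and the latent variable.
  7. The data abnormality detection method according to claim 1, wherein before the data set to be detected is detected with the anomaly detection model, the method further includes:
    determining whether the data set to be detected contains missing values;
    if the data set to be detected contains missing values, filling them in using Monte Carlo imputation.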
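  8. A data abnormality detection apparatus, wherein the apparatus includes:
    a data processing module, configured to obtain a standard training data set, the standard training data set containing anomaly detection data and missing data;
    a model acquisition module, configured to obtain a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
    a function adjustment module, configured to adjust the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
    a model training module, configured to train the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
    a reconstruction probability acquisition module, configured to obtain a data set to be detected, and to detect the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
    an abnormality detection module, configured to determine, if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, that the target data to be detected is abnormal data.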
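  9. An electronic device, wherein the electronic device includes:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the following steps:
    obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
    obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
    adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
    training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
    obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
    if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.
  10. The electronic device according to claim 9, wherein obtaining the standard training data set includes:
    obtaining an original training data set;
    setting a preset proportion of the data in the original training data set as missing data;
    normalizing the original training data set that includes the missing data using a preset normalization formula to obtain a normalized data set;
    inputting the normalized data set into a preset sliding window to obtain the standard training data set.
  11. The electronic device according to claim 9, wherein adjusting the variational lower bound function using the missing data to obtain the optimized variational lower bound function includes:
    computing an optimization value based on the missing data;
    adding the optimization value to the variational lower bound function to obtain the optimized variational lower bound function.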
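  12. The electronic device according to any one of claims 9 to 11, wherein the optimized variational lower bound function is:
    Figure PCTCN2020131984-appb-100004
    where W is the number of data points in the standard training data set, x_w is the w-th data point in the standard training data set, and a_w is the missing coefficient of the w-th data point: a_w = 1 when x_w is missing data and a_w = 0 when x_w is not missing data; β is the optimization value, and
    Figure PCTCN2020131984-appb-100005
    z denotes the latent variable z in the standard training data set;
    where
    Figure PCTCN2020131984-appb-100006
    denotes the expectation over the distribution of the latent variable z corresponding to x; log p_θ(x|z) is the logarithm of p(x|z; θ), where p_θ(x|z) means restoring the latent variable z to x and corresponds to the decoder; p_θ(z) is the distribution of the latent variable z under the standard training data set, and log p_θ(z) is its logarithm; log q_φ(z|x) is the logarithm of q_φ(z|x), where q_φ(z|x) is the distribution of the latent variable z given sample x and corresponds to the encoder.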
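  13. The electronic device according to claim 9, wherein training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain the anomaly detection model includes:
    Step A: inputting the standard training data set into the anomaly detection model framework for computation to obtain an output result;
    Step B: computing the loss value of the optimized variational lower bound function from the output result;
    Step C: if the loss value is greater than a preset loss threshold, adjusting the parameters of the anomaly detection model framework and returning to Step A; when the loss value is less than or equal to the loss threshold, stopping the adjustment of the parameters to obtain the anomaly detection model.
  14. The electronic device according to claim 13, wherein inputting the standard training data set into the anomaly detection model framework for computation to obtain the output result includes:
    computing the latent-variable distribution parameters of the data in the standard training data set using the encoder in the anomaly detection model framework;
    sampling from the latent-variable distribution parameters to obtain a latent variable;
    computing the output result using the decoder in the anomaly detection model framework and the latent variable.
  15. The electronic device according to claim 9, wherein before the data set to be detected is detected with the anomaly detection model, the instructions, when executed by the at least one processor, further implement the following steps:
    determining whether the data set to be detected contains missing values;
    if the data set to be detected contains missing values, filling them in using Monte Carlo imputation.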
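  16. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:
    obtaining a standard training data set, the standard training data set containing anomaly detection data and missing data;
    obtaining a pre-built anomaly detection model framework, the anomaly detection model framework including a variational lower bound function;
    adjusting the variational lower bound function using the missing data to obtain an optimized variational lower bound function;
    training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain an anomaly detection model;
    obtaining a data set to be detected, and detecting the data set to be detected with the anomaly detection model to obtain the reconstruction probability of the data to be detected in the data set to be detected;
    if there is target data to be detected whose reconstruction probability is greater than or equal to a reconstruction threshold, determining that the target data to be detected is abnormal data.
  17. The computer-readable storage medium according to claim 16, wherein obtaining the standard training data set includes:
    obtaining an original training data set;
    setting a preset proportion of the data in the original training data set as missing data;
    normalizing the original training data set that includes the missing data using a preset normalization formula to obtain a normalized data set;
    inputting the normalized data set into a preset sliding window to obtain the standard training data set.
  18. The computer-readable storage medium according to claim 16, wherein adjusting the variational lower bound function using the missing data to obtain the optimized variational lower bound function includes:
    computing an optimization value based on the missing data;
    adding the optimization value to the variational lower bound function to obtain the optimized variational lower bound function.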
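  19. The computer-readable storage medium according to any one of claims 16 to 18, wherein the optimized variational lower bound function is:
    Figure PCTCN2020131984-appb-100007
    where W is the number of data points in the standard training data set, x_w is the w-th data point in the standard training data set, and a_w is the missing coefficient of the w-th data point: a_w = 1 when x_w is missing data and a_w = 0 when x_w is not missing data; β is the optimization value, and
    Figure PCTCN2020131984-appb-100008
    z denotes the latent variable z in the standard training data set;
    where
    Figure PCTCN2020131984-appb-100009
    denotes the expectation over the distribution of the latent variable z corresponding to x; log p_θ(x|z) is the logarithm of p(x|z; θ), where p_θ(x|z) means restoring the latent variable z to x and corresponds to the decoder; p_θ(z) is the distribution of the latent variable z under the standard training data set, and log p_θ(z) is its logarithm; log q_φ(z|x) is the logarithm of q_φ(z|x), where q_φ(z|x) is the distribution of the latent variable z given sample x and corresponds to the encoder.
  20. The computer-readable storage medium according to claim 16, wherein training the anomaly detection model framework containing the optimized variational lower bound function with the standard training data set to obtain the anomaly detection model includes:
    Step A: inputting the standard training data set into the anomaly detection model framework for computation to obtain an output result;
    Step B: computing the loss value of the optimized variational lower bound function from the output result;
    Step C: if the loss value is greater than a preset loss threshold, adjusting the parameters of the anomaly detection model framework and returning to Step A; when the loss value is less than or equal to the loss threshold, stopping the adjustment of the parameters to obtain the anomaly detection model.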
PCT/CN2020/131984 2020-10-09 2020-11-27 数据异常检测方法、装置、电子设备及存储介质 WO2021189904A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011074730.8 2020-10-09
CN202011074730.8A CN112148577B (zh) 2020-10-09 2020-10-09 数据异常检测方法、装置、电子设备及存储介质

Publications (1)

Publication Number Publication Date
WO2021189904A1 true WO2021189904A1 (zh) 2021-09-30

Family

ID=73952717

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/131984 WO2021189904A1 (zh) 2020-10-09 2020-11-27 数据异常检测方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN112148577B (zh)
WO (1) WO2021189904A1 (zh)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113971513A (zh) * 2021-10-22 2022-01-25 河南鑫安利安全科技股份有限公司 一种企业安全风险管理平台的数据存储与优化方法
US20220060235A1 (en) * 2020-08-18 2022-02-24 Qualcomm Incorporated Federated learning for client-specific neural network parameter generation for wireless communication
CN114493291A (zh) * 2022-01-28 2022-05-13 中铁北京工程局集团有限公司 一种高填方质量智能检测方法及系统
CN114881157A (zh) * 2022-05-17 2022-08-09 中国南方电网有限责任公司超高压输电公司广州局 换流阀工作状态的检测方法、装置、设备和存储介质
CN114880384A (zh) * 2022-07-11 2022-08-09 杭州宇谷科技有限公司 一种无监督二轮电动车充电时序异常检测方法及系统
CN115034286A (zh) * 2022-04-24 2022-09-09 国家计算机网络与信息安全管理中心 一种基于自适应损失函数的异常用户识别方法和装置
CN116049157A (zh) * 2023-01-04 2023-05-02 北京京航计算通讯研究所 一种质量数据分析方法及系统
CN116956637A (zh) * 2023-09-06 2023-10-27 湖南光华防务科技集团有限公司 一种灭火弹覆盖面鲁棒性检测方法
CN117041018A (zh) * 2023-10-09 2023-11-10 中电科大数据研究院有限公司 一种数据中心远程智能运维管理方法及相关设备
CN117849700A (zh) * 2024-03-07 2024-04-09 南京国网电瑞电力科技有限责任公司 可控制测量的模块化电能计量系统
CN117849700B (zh) * 2024-03-07 2024-05-24 南京国网电瑞电力科技有限责任公司 可控制测量的模块化电能计量系统

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463646B (zh) * 2021-01-25 2021-05-11 北京工业大数据创新中心有限公司 一种传感器异常检测方法及装置
CN113114529B (zh) * 2021-03-25 2022-05-24 清华大学 基于条件变分自动编码器的kpi异常检测方法、装置和计算机存储介质
CN113204569A (zh) * 2021-03-30 2021-08-03 联想(北京)有限公司 一种信息处理方法及装置
CN113592019B (zh) * 2021-08-10 2023-09-15 平安银行股份有限公司 基于多模型融合的故障检测方法、装置、设备及介质
CN113705684B (zh) * 2021-08-30 2023-11-24 平安科技(深圳)有限公司 反向迭代的异常检测方法、装置、电子设备及介质
CN114185881A (zh) * 2021-12-14 2022-03-15 中国平安财产保险股份有限公司 异常数据自动修复方法、装置、设备及存储介质
CN114190897B (zh) * 2021-12-15 2024-04-05 中国科学院空天信息创新研究院 睡眠分期模型的训练方法、睡眠分期方法及装置
CN114722061B (zh) * 2022-04-08 2023-11-14 中国电信股份有限公司 数据处理方法及装置、设备、计算机可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190124045A1 (en) * 2017-10-24 2019-04-25 Nec Laboratories America, Inc. Density estimation network for unsupervised anomaly detection
CN109978379A (zh) * 2019-03-28 2019-07-05 北京百度网讯科技有限公司 时序数据异常检测方法、装置、计算机设备和存储介质
CN111562996A (zh) * 2020-04-11 2020-08-21 北京交通大学 一种关键性能指标数据的时序异常检测方法及系统
CN111598881A (zh) * 2020-05-19 2020-08-28 西安电子科技大学 基于变分自编码器的图像异常检测方法
CN111652278A (zh) * 2020-04-30 2020-09-11 中国平安财产保险股份有限公司 用户行为检测方法、装置、电子设备及介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110581834A (zh) * 2018-06-11 2019-12-17 中国移动通信集团浙江有限公司 一种通信能力开放异常检测方法和装置
EP3748545A1 (en) * 2019-06-07 2020-12-09 Tata Consultancy Services Limited Sparsity constraints and knowledge distillation based learning of sparser and compressed neural networks
CN110851338B (zh) * 2019-09-23 2022-06-24 平安科技(深圳)有限公司 异常检测方法、电子设备及存储介质
CN115903741B (zh) * 2022-11-18 2024-03-15 南京信息工程大学 一种工业控制系统数据异常检测方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190124045A1 (en) * 2017-10-24 2019-04-25 Nec Laboratories America, Inc. Density estimation network for unsupervised anomaly detection
CN109978379A (zh) * 2019-03-28 2019-07-05 北京百度网讯科技有限公司 时序数据异常检测方法、装置、计算机设备和存储介质
CN111562996A (zh) * 2020-04-11 2020-08-21 北京交通大学 一种关键性能指标数据的时序异常检测方法及系统
CN111652278A (zh) * 2020-04-30 2020-09-11 中国平安财产保险股份有限公司 用户行为检测方法、装置、电子设备及介质
CN111598881A (zh) * 2020-05-19 2020-08-28 西安电子科技大学 基于变分自编码器的图像异常检测方法

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11909482B2 (en) * 2020-08-18 2024-02-20 Qualcomm Incorporated Federated learning for client-specific neural network parameter generation for wireless communication
US20220060235A1 (en) * 2020-08-18 2022-02-24 Qualcomm Incorporated Federated learning for client-specific neural network parameter generation for wireless communication
CN113971513A (zh) * 2021-10-22 2022-01-25 河南鑫安利安全科技股份有限公司 一种企业安全风险管理平台的数据存储与优化方法
CN114493291A (zh) * 2022-01-28 2022-05-13 中铁北京工程局集团有限公司 一种高填方质量智能检测方法及系统
CN115034286A (zh) * 2022-04-24 2022-09-09 国家计算机网络与信息安全管理中心 一种基于自适应损失函数的异常用户识别方法和装置
CN114881157A (zh) * 2022-05-17 2022-08-09 中国南方电网有限责任公司超高压输电公司广州局 换流阀工作状态的检测方法、装置、设备和存储介质
CN114880384A (zh) * 2022-07-11 2022-08-09 杭州宇谷科技有限公司 一种无监督二轮电动车充电时序异常检测方法及系统
CN114880384B (zh) * 2022-07-11 2022-09-23 杭州宇谷科技有限公司 一种无监督二轮电动车充电时序异常检测方法及系统
CN116049157A (zh) * 2023-01-04 2023-05-02 北京京航计算通讯研究所 一种质量数据分析方法及系统
CN116049157B (zh) * 2023-01-04 2024-05-07 北京京航计算通讯研究所 一种质量数据分析方法及系统
CN116956637A (zh) * 2023-09-06 2023-10-27 湖南光华防务科技集团有限公司 一种灭火弹覆盖面鲁棒性检测方法
CN116956637B (zh) * 2023-09-06 2024-03-05 湖南光华防务科技集团有限公司 一种灭火弹覆盖面鲁棒性检测方法
CN117041018B (zh) * 2023-10-09 2024-01-02 中电科大数据研究院有限公司 一种数据中心远程智能运维管理方法及相关设备
CN117041018A (zh) * 2023-10-09 2023-11-10 中电科大数据研究院有限公司 一种数据中心远程智能运维管理方法及相关设备
CN117849700A (zh) * 2024-03-07 2024-04-09 南京国网电瑞电力科技有限责任公司 可控制测量的模块化电能计量系统
CN117849700B (zh) * 2024-03-07 2024-05-24 南京国网电瑞电力科技有限责任公司 可控制测量的模块化电能计量系统

Also Published As

Publication number Publication date
CN112148577A (zh) 2020-12-29
CN112148577B (zh) 2024-05-07

Similar Documents

Publication Publication Date Title
WO2021189904A1 (zh) 数据异常检测方法、装置、电子设备及存储介质
WO2021189906A1 (zh) 基于联邦学习的目标检测方法、装置、设备及存储介质
WO2021189826A1 (zh) 报文生成方法、装置、电子设备及计算机可读存储介质
CN110852374B (zh) 数据检测方法、装置、电子设备以及存储介质
WO2021218336A1 (zh) 用户信息判别方法、装置、设备及计算机可读存储介质
WO2021189855A1 (zh) 基于ct序列的图像识别方法、装置、电子设备及介质
WO2021238563A1 (zh) 基于配置算法的企业运行数据分析方法、装置、电子设备及介质
WO2022142013A1 (zh) 基于人工智能的ab测试方法、装置、计算机设备及介质
WO2021151291A1 (zh) 疾病风险的分析方法、装置、电子设备及计算机存储介质
CN113918361A (zh) 基于物联网规则引擎的终端控制方法、装置、设备及介质
WO2019056496A1 (zh) 图片复审概率区间生成方法及图片复审判定方法
CN112084486A (zh) 用户信息验证方法、装置、电子设备及存储介质
CN113627160B (zh) 文本纠错方法、装置、电子设备及存储介质
WO2022088632A1 (zh) 用户数据监控分析方法、装置、设备及介质
WO2022227192A1 (zh) 图像分类方法、装置、电子设备及介质
WO2022134348A1 (zh) 一种监控软件开发过程的方法、装置、终端及存储介质
CN112486957B (zh) 数据库迁移检测方法、装置、设备及存储介质
CN111756760B (zh) 基于集成分类器的用户异常行为检测方法及相关设备
CN113176968A (zh) 基于接口参数分类的安全测试方法、装置及存储介质
WO2022227191A1 (zh) 非主动活体检测方法、装置、电子设备及存储介质
CN111651652B (zh) 基于人工智能的情感倾向识别方法、装置、设备及介质
WO2022110647A1 (zh) 风控数据生成方法、装置、设备及计算机可读存储介质
CN115328724B (zh) 一种基于大数据平台的监测方法和系统
CN114840531B (zh) 基于血缘关系的数据模型重构方法、装置、设备及介质
CN117171056B (zh) 基于自动化接口的测试方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20927570

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20927570

Country of ref document: EP

Kind code of ref document: A1