CN111027679A - Abnormal data detection method and system

Info

Publication number: CN111027679A
Application number: CN201911240576.4A
Authority: CN
Inventors: 牛昕宇; 蔡权雄
Original assignee: Shenzhen Corerain Technologies Co Ltd
Current assignee: Shenzhen Corerain Technologies Co Ltd
Priority date: 2019-12-06
Filing date: 2019-12-06
Publication date: 2020-04-17

Abstract

The embodiment of the invention provides an abnormal data detection method and system. The abnormal data detection method comprises the following steps: receiving first detection data sent by an external system, wherein the first detection data are first-dimension data; converting the first detection data into second detection data, wherein the second detection data are second-dimension data, and the dimension of the first-dimension data is larger than that of the second-dimension data; predicting the second detection data based on the trained abnormal prediction model to obtain prediction data; and comparing the predicted data with the actual data to judge whether the actual data is abnormal or not. By carrying out prediction after dimensionality reduction on the first detection data, the effect of improving the accuracy of abnormal data prediction is achieved.

Description

Abnormal data detection method and system

Technical Field

The embodiment of the invention relates to the technical field of machine learning, in particular to an abnormal data detection method and system.

Background

Anomaly detection is an important research direction in the field of machine learning. In the fields of finance, network security and the like, the abnormal detection algorithm can distinguish normal data from abnormal data by learning a large amount of historical data, so that early warning is carried out on abnormal problems.

Time series anomaly detection is usually realized by using historical data to establish a time-series prediction model and then comparing the difference between the predicted data and the actual data to judge whether an abnormality has occurred. A typical algorithm is a clustering algorithm, which divides a time series into a plurality of states and establishes simple state transitions to reflect state switching in the historical data, so as to predict higher-order time series data.

However, using a clustering algorithm to partition the time series into states loses a lot of data information, such as the numerical relationships between different states and the differences between different data in the same state. Because the state transitions used are too simple, the data cannot be predicted accurately; in particular, when high-dimensional data is predicted, the prediction result is not accurate enough.

Disclosure of Invention

The embodiment of the invention provides an abnormal data detection method and system, which aim to achieve the effect of improving the accuracy of abnormal data prediction.

In a first aspect, an embodiment of the present invention provides an abnormal data detection method, including:

receiving first detection data sent by an external system, wherein the first detection data are first-dimension data;

converting the first detection data into second detection data, wherein the second detection data are second-dimension data, and the dimension of the first-dimension data is larger than that of the second-dimension data;

predicting the second detection data based on the trained abnormal prediction model to obtain prediction data;

and comparing the predicted data with the actual data to judge whether the actual data is abnormal or not.

Optionally, the comparing the predicted data with the actual data to determine whether the actual data is abnormal includes:

calculating a data difference value between the predicted data and the actual data;

judging whether the data difference value is larger than a difference value threshold value or not;

and if the data difference value is larger than the difference threshold value, judging that the actual data is abnormal.

Optionally, before the converting the first detection data into the second detection data, the method includes:

training an automatic encoder (autoencoder) with normal historical data, wherein the automatic encoder comprises an encoder and a decoder;

and removing the decoder and retaining the trained encoder, wherein the trained encoder is used for converting the first detection data into the second detection data.

Optionally, a batch normalization process is added to the encoder.

Optionally, the predicting the second detection data based on the trained abnormal prediction model to obtain prediction data includes:

performing initial calculation on the second detection data to obtain a second intermediate result, wherein the second intermediate result is stored in a second intermediate result cache region;

and acquiring the second intermediate result from the second intermediate result cache region to perform final prediction calculation so as to obtain the prediction data.

Optionally, the auto-encoder and the anomaly prediction model share the same multiplication kernel.

In a second aspect, an embodiment of the present invention provides an abnormal data detection system, including:

the data interface is used for receiving first detection data sent by an external system, and the first detection data are first-dimension data;

the dimension reduction prediction module comprises a data dimension reduction unit and a prediction data calculation unit, the data dimension reduction unit is used for converting the first detection data into second detection data, the second detection data is second dimension data, the dimension of the first dimension data is larger than that of the second dimension data, and the prediction data calculation unit is used for predicting the second detection data based on a trained abnormal prediction model to obtain prediction data;

and the abnormity judgment module is used for comparing the predicted data with the actual data so as to judge whether the actual data is abnormal or not.

Optionally, the abnormality determining module includes:

a data difference value calculation unit for calculating a data difference value between the prediction data and the actual data;

and the abnormity judging unit is used for judging whether the data difference value is larger than a difference threshold value, and if the data difference value is larger than the difference threshold value, judging that the actual data is abnormal.

Optionally, the data dimension reduction unit includes a first computation subunit and a dimension reduction final processing unit;

the first calculation subunit is configured to perform initial calculation on the first detection data to obtain a first intermediate result;

and the dimension reduction final processing unit is electrically connected with the output end of the first calculation subunit and is used for performing final dimension reduction calculation on the first intermediate result to obtain the second detection data.

Optionally, the prediction data calculation unit includes a second calculation subunit and a final prediction processing unit;

the second calculation subunit is configured to perform initial calculation on the second detection data to obtain a second intermediate result;

and the final prediction processing unit is electrically connected with the output end of the second calculation subunit and is used for performing final prediction calculation on the second intermediate result to obtain the prediction data.

Optionally, the first calculating subunit and the second calculating subunit are the same calculating subunit.

Optionally, the second calculating subunit includes a second intermediate result cache region, configured to store the second intermediate result.

Optionally, the dimension reduction final processing unit includes:

the dimensionality reduction processing subunit is used for carrying out dimensionality reduction calculation on the first intermediate result to obtain final dimensionality reduction data;

and the batch normalization processing unit is used for carrying out standardization processing on the final dimension reduction data to obtain the second detection data.

The embodiment of the invention receives first detection data sent by an external system, wherein the first detection data is first-dimension data; converting the first detection data into second detection data, wherein the second detection data are second-dimension data, and the dimension of the first-dimension data is larger than that of the second-dimension data; predicting the second detection data based on the trained abnormal prediction model to obtain prediction data; and comparing the predicted data with the actual data to judge whether the actual data is abnormal or not, so that the problem that the predicted result is not accurate enough when the high-dimensional data is predicted is solved, and the effect of improving the accuracy of abnormal data prediction is realized.

Drawings

Fig. 1 is a schematic flowchart of an abnormal data detection method according to a first embodiment of the present invention;

Fig. 2 is a schematic flowchart of an abnormal data detection method according to a second embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an abnormal data detection system according to a third embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an abnormal data detection system according to a fourth embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. A process may be terminated when its operations are completed, but may have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc.

Furthermore, the terms "first," "second," and the like may be used herein to describe various orientations, actions, steps, elements, or the like, but the orientations, actions, steps, or elements are not limited by these terms. These terms are only used to distinguish one direction, action, step or element from another direction, action, step or element. For example, the first detection data may be referred to as second detection data, and similarly, the second detection data may be referred to as first detection data, without departing from the scope of the present application. Both the first detection data and the second detection data are detection data, but they are not the same detection data. The terms "first", "second", etc. are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

Example one

Fig. 1 is a schematic flowchart of an abnormal data detection method according to an embodiment of the present invention, which is applicable to a scenario of detecting high-dimensional abnormal data.

As shown in fig. 1, a method for detecting abnormal data according to an embodiment of the present invention includes:

S110, receiving first detection data sent by an external system, wherein the first detection data are first-dimension data.

The external system refers to a system other than the abnormal data detection system, such as a sensor system on an airplane, and is not limited herein. The first detection data is the data for which it must be determined whether an abnormality exists. Specifically, the first detection data is time series data; for example, the data generated by a plurality of vibration sensors on an aircraft in response to vibration during a flight test is related to a time series and can be used as the first detection data of this embodiment. The first-dimension data refers to high-dimensional data. This embodiment does not limit the specific dimension, which may be, for example, 1024 or 512. Specifically, the dimension of the first-dimension data is an n-th power of 2. Optionally, data of more than 32 dimensions may be regarded as high-dimensional data, i.e., first-dimension data, and data of 32 dimensions or fewer is regarded as low-dimensional data, i.e., second-dimension data.

S120, converting the first detection data into second detection data, wherein the second detection data are second-dimension data, and the dimension of the first-dimension data is larger than that of the second-dimension data.

The second detection data is the data obtained after dimensionality reduction. Specifically, relative to the first detection data, the second detection data is second-dimension data, i.e., low-dimensional data. Optionally, data of 32 dimensions or fewer is regarded as low-dimensional data.

In an optional embodiment, before converting the first detection data into the second detection data, the method further includes:

training an automatic encoder with normal historical data, wherein the automatic encoder comprises an encoder and a decoder;

and removing the decoder and retaining the trained encoder, wherein the trained encoder is used for converting the first detection data into the second detection data.

Specifically, the automatic encoder comprises an encoder and a decoder. After the automatic encoder is trained, the decoder is removed and only the trained encoder is retained to perform dimensionality reduction on the first detection data and obtain the second detection data. Optionally, a batch normalization process is added to the encoder; that is, after the first detection data is reduced in dimension, the dimension-reduced data is batch-normalized and the result is output as the second detection data. Optionally, the automatic encoder and the abnormal prediction model share the same multiplication kernel. Specifically, the matrix-vector multiplication performed by the automatic encoder for dimension reduction and the matrix-vector multiplication performed when the prediction data is inferred with the abnormal prediction model share the same calculation subunit and are computed with the same multiplication kernel, which saves digital signal processing (DSP) resources in a field-programmable gate array (FPGA).

S130, predicting the second detection data based on the trained abnormal prediction model to obtain prediction data.

The abnormality prediction model is a model for predicting the second detection data. Optionally, the anomaly prediction model is a Long Short Term Memory (LSTM) model. The prediction data refers to data obtained by predicting the abnormal prediction model and is used for comparing with the actual data so as to judge whether the actual data is abnormal or not. Optionally, the model parameters of the trained LSTM model may be stored, so that the model parameters of the LSTM model may be obtained for calculation when performing calculation deduction on the second detection data, thereby obtaining the prediction data. Optionally, the second detection data may be calculated by the following formula to obtain the prediction data:

i_t = sigmoid(W_i [x_t, h_{t-1}] + b_i)

f_t = sigmoid(W_f [x_t, h_{t-1}] + b_f)

u_t = tanh(W_u [x_t, h_{t-1}] + b_u)

c_t = f_t ⊙ c_{t-1} + i_t ⊙ u_t

o_t = sigmoid(W_o [x_t, h_{t-1}] + b_o)

h_t = o_t ⊙ tanh(c_t);

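wherein i, f, u and o denote the input gate, the forget gate, the update gate and the output gate respectively, x_t is the input at time t (here the second detection data), h_t and c_t are the hidden state and the cell state at time t, ⊙ denotes element-wise multiplication, W denotes the weight matrix applied to the concatenated input and hidden elements, and b is the bias term; W and b are the model parameters of the LSTM model.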
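The gate equations above can be illustrated with a minimal pure-Python sketch of a single time step. The function names `affine` and `lstm_step` and the `params` layout are assumptions made for illustration only; the embodiment does not prescribe a software implementation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Return W v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following the six gate equations.

    params (hypothetical layout) maps each gate name 'i', 'f', 'u', 'o' to a
    (W, b) pair; W has one row per hidden unit and
    len(x_t) + len(h_prev) columns.
    """
    xh = list(x_t) + list(h_prev)  # the concatenation [x_t, h_{t-1}]
    i_t = [sigmoid(v) for v in affine(*params["i"], xh)]
    f_t = [sigmoid(v) for v in affine(*params["f"], xh)]
    u_t = [math.tanh(v) for v in affine(*params["u"], xh)]
    # c_t = f_t * c_{t-1} + i_t * u_t, element by element
    c_t = [f * c + i * u for f, c, i, u in zip(f_t, c_prev, i_t, u_t)]
    o_t = [sigmoid(v) for v in affine(*params["o"], xh)]
    # h_t = o_t * tanh(c_t), element by element
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every sigmoid gate evaluates to 0.5 and u_t to 0, so the step simply halves the previous cell state, which gives a quick sanity check of the wiring.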

S140, comparing the predicted data with the actual data to judge whether the actual data is abnormal or not.

Specifically, the predicted data at time T can be obtained from the first detection data at a plurality of times before time T, and then compared with the actual data generated at time T, so as to determine whether the actual data at time T is abnormal.

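In an optional embodiment, comparing the predicted data with the actual data to determine whether the actual data is abnormal may include:

calculating a data difference value between the predicted data and the actual data;

judging whether the data difference value is larger than a difference threshold;

and if the data difference value is larger than the difference threshold, judging that the actual data is abnormal.

The difference threshold is the condition for determining whether the actual data is abnormal. Specifically, the difference threshold is determined when the long short-term memory model is trained offline: the deviations obtained on the validation data set are fitted with a Gaussian distribution N(μ, σ^2), and the threshold is the value corresponding to a 99% confidence level.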
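As a hedged sketch of how such a threshold might be derived and applied, the following uses Python's standard `statistics` module. Two points are assumptions not stated in the source: the data difference value is taken here as the mean absolute difference between the predicted and actual vectors, and the 99% confidence level is read as the one-sided 0.99 quantile of the fitted Gaussian. The function names are illustrative.

```python
from statistics import NormalDist, mean, stdev


def fit_threshold(validation_errors, confidence=0.99):
    """Fit N(mu, sigma^2) to validation deviations and return the quantile."""
    mu = mean(validation_errors)
    sigma = stdev(validation_errors)  # sample standard deviation
    return NormalDist(mu, sigma).inv_cdf(confidence)


def data_difference(predicted, actual):
    """Assumed difference measure: mean absolute difference."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def is_abnormal(predicted, actual, threshold):
    """Actual data is judged abnormal when the difference exceeds the threshold."""
    return data_difference(predicted, actual) > threshold
```

The threshold is computed once offline and stored; at run time only the comparison in `is_abnormal` is needed for each new sample.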

According to the technical scheme of the embodiment of the invention, first detection data sent by an external system is received, wherein the first detection data is first-dimension data; the first detection data is converted into second detection data, wherein the second detection data are second-dimension data, and the dimension of the first-dimension data is larger than that of the second-dimension data; the second detection data is predicted based on the trained abnormal prediction model to obtain prediction data; and the predicted data is compared with the actual data to judge whether the actual data is abnormal or not. By carrying out prediction after dimensionality reduction on the first detection data, the effect of improving the accuracy of abnormal data prediction is achieved.

Example two

Fig. 2 is a schematic flowchart of an abnormal data detection method according to a second embodiment of the present invention. The embodiment is further refined in the technical scheme, and is suitable for a scene of detecting high-dimensional abnormal data. The method may be performed by an anomaly data detection system, which is implemented in hardware.

As shown in fig. 2, the abnormal data detection method provided by the second embodiment of the present invention includes:

S210, receiving first detection data sent by an external system, wherein the first detection data are first-dimension data.

The external system refers to a system other than the abnormal data detection system, such as a sensor system on an airplane, and is not limited herein. The first detection data is the data for which it must be determined whether an abnormality exists. Specifically, the first detection data is time series data; for example, the data generated by a plurality of vibration sensors on an aircraft in response to vibration during a flight test is related to a time series and can be used as the first detection data of this embodiment. The first-dimension data refers to high-dimensional data. This embodiment does not limit the specific dimension, which may be, for example, 1024 or 512. Specifically, the dimension of the first-dimension data is an n-th power of 2.

S220, converting the first detection data into second detection data, wherein the second detection data are second-dimension data, and the dimension of the first-dimension data is larger than that of the second-dimension data.

Specifically, the automatic encoder comprises an encoder and a decoder. After the automatic encoder is trained, the decoder is removed and only the trained encoder is retained to perform dimensionality reduction on the first detection data and obtain the second detection data. Optionally, a batch normalization process is added to the encoder; that is, after the first detection data is reduced in dimension, the dimension-reduced data is batch-normalized and the result is output as the second detection data.
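The source does not give a formula for the batch normalization step. The sketch below assumes the commonly used inference-time form, in which the mean, variance, scale and shift obtained during training are stored and applied element-wise to the dimension-reduced vector; all parameter names are illustrative.

```python
import math


def batch_norm_inference(z, running_mean, running_var, gamma, beta, eps=1e-5):
    """Normalize each element of z with statistics stored from training.

    Assumed form: y = gamma * (z - mean) / sqrt(var + eps) + beta.
    """
    return [
        g * (v - m) / math.sqrt(s + eps) + b
        for v, m, s, g, b in zip(z, running_mean, running_var, gamma, beta)
    ]
```

Because the statistics are fixed after training, this step reduces to one multiplication and one addition per element, which suits a hardware implementation.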

S230, performing initial calculation on the second detection data to obtain a second intermediate result, and storing the second intermediate result in a second intermediate result cache region.

Specifically, the second detection data may be calculated by the following formulas to obtain the prediction data:

i_t = sigmoid(W_i [x_t, h_{t-1}] + b_i)

f_t = sigmoid(W_f [x_t, h_{t-1}] + b_f)

u_t = tanh(W_u [x_t, h_{t-1}] + b_u)

c_t = f_t ⊙ c_{t-1} + i_t ⊙ u_t

o_t = sigmoid(W_o [x_t, h_{t-1}] + b_o)

h_t = o_t ⊙ tanh(c_t)

In the calculation of the above formulas, for i_t = sigmoid(W_i [x_t, h_{t-1}] + b_i), f_t = sigmoid(W_f [x_t, h_{t-1}] + b_f), and c_t, the calculation is done in the second calculation subunit and the result is taken as the second intermediate result. The second intermediate result cache region is a cache region for storing the second intermediate result. In particular, the step size of the LSTM calculation is based on a time series. The step size refers to the amount of detection data that the LSTM needs to compute. For example, if the detection data at time T, time T-1 and time T-2 need to be calculated, the step size of the LSTM is 3. In this embodiment, the first detection data is generated by the sensor, and each step corresponds to one sample vector from the sensor. Because the data used for inference at two adjacent moments are partially the same, for each calculation of the prediction data only the matrix-vector multiplication of the newly input second detection data needs to be computed. The second intermediate result of this matrix-vector multiplication is stored in the second intermediate result cache region and is read back when the prediction data is calculated, which saves system resources.
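The reuse of intermediate results across overlapping windows can be sketched as follows. The sketch assumes that the weight matrix is split by columns so that W [x_t, h_{t-1}] = W_x x_t + W_h h_{t-1}; the input-side product W_x x_t then depends only on the sample itself and can be cached. The class name `InputProjectionCache` and its fields are hypothetical, and a software deque stands in for the hardware cache region.

```python
from collections import deque


def matvec(W, v):
    """Matrix-vector product for W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class InputProjectionCache:
    """Keeps W_x x_t for the most recent `steps` samples (illustrative only)."""

    def __init__(self, W_x, steps):
        self.W_x = W_x
        self.buffer = deque(maxlen=steps)  # oldest entry is dropped automatically
        self.multiplications = 0

    def push(self, x_t):
        """Multiply only the new sample and return the cached window."""
        self.buffer.append(matvec(self.W_x, x_t))
        self.multiplications += 1
        return list(self.buffer)
```

With a step size of 3, each new sample costs one input-side multiplication instead of three, because the two older products in the window are read from the cache.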

S240, obtaining the second intermediate result from the second intermediate result cache region to perform final prediction calculation so as to obtain the prediction data.

Specifically, the calculation of h_t = o_t ⊙ tanh(c_t) is performed in the final prediction processing unit, thereby obtaining the prediction data.

S250, comparing the predicted data with the actual data to judge whether the actual data is abnormal or not.

Example three

Fig. 3 is a schematic structural diagram of an abnormal data detection system according to a third embodiment of the present invention. As shown in fig. 3, an embodiment of the present invention provides an abnormal data detection system, which includes a data interface 310, a dimension reduction prediction module 320, and an abnormal judgment module 330. The abnormal data detection system of the present embodiment is used to detect abnormal data of high dimensions. Wherein:

the data interface 310 is configured to receive first detection data sent by an external system, where the first detection data is first-dimension data;

the dimension reduction prediction module 320 includes a data dimension reduction unit 321 and a prediction data calculation unit 322, where the data dimension reduction unit 321 is configured to convert the first detection data into second detection data, the second detection data is second dimension data, a dimension of the first dimension data is greater than that of the second dimension data, and the prediction data calculation unit 322 is configured to predict the second detection data based on a trained abnormal prediction model to obtain prediction data;

the anomaly determination module 330 is configured to compare the predicted data with actual data to determine whether the actual data is anomalous.

In the present embodiment, the data interface 310 refers to an interface for communicating with an external system. Specifically, the external system and the abnormal data detection system of this embodiment use different transmission modes; the data interface 310 makes the system compatible with the external system and carries out the data communication with it. Optionally, the data interface 310 is connected to an AXI4 bus, a PCIe (PCI Express, a high-speed serial computer expansion bus standard) bus, or an Ethernet port. When the data interface 310 is an ordinary AXI4 interface with a DMA (direct memory access) engine, the input data comes from external storage. The external system refers to a system other than the abnormal data detection system, such as a sensor system on an aircraft, and is not limited herein. The first detection data is the data for which it must be determined whether an abnormality exists. Specifically, the first detection data is time series data; for example, the data generated by a plurality of vibration sensors on an aircraft in response to vibration during a flight test is related to a time series and can be used as the first detection data of this embodiment. The first-dimension data refers to high-dimensional data. This embodiment does not limit the specific dimension, which may be, for example, 1024 or 512. Specifically, the dimension of the first-dimension data is an n-th power of 2. Optionally, data of more than 32 dimensions may be regarded as high-dimensional data, i.e., first-dimension data, and data of 32 dimensions or fewer is regarded as low-dimensional data, i.e., second-dimension data.

The dimension reduction prediction module 320 converts the high-dimensional first detection data into the low-dimensional second detection data, and predicts the second detection data based on the trained abnormal prediction model, thereby obtaining the prediction data. The data dimension reduction unit 321 refers to a unit for performing dimension reduction on the first detection data. In this embodiment, optionally, an automatic encoder (autoencoder) is trained with normal data, the decoder in the automatic encoder is then removed, and the remaining trained encoder serves as the data dimension reduction unit 321 to reduce the dimension of the high-dimensional data. The automatic encoder learns to convert data from the input layer into latent-space features, and then uses the latent-space features to reconstruct an output as close as possible to the original input. Specifically, the dimension reduction parameters of the trained encoder are stored, so that they can be read for calculation when the first detection data is reduced in dimension, thereby obtaining the low-dimensional second detection data. Optionally, the first detection data may be calculated by the following formula to obtain the low-dimensional second detection data:

z = σ(Wx + b), where z is the output second detection data, W is the weight matrix applied to the first detection data, x is the vector of the first detection data, b is a dimensionality reduction parameter of the encoder, and σ is the sigmoid function.
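A minimal sketch of this single-layer encoder is given below, assuming W is stored as a list of rows with one row per output dimension. The function name `encode` is illustrative; a trained encoder would load W and b from the stored dimension reduction parameters.

```python
import math


def encode(x, W, b):
    """Compute z = sigmoid(W x + b) for a single input vector x."""
    z = []
    for row, bias in zip(W, b):
        pre_activation = sum(w * xi for w, xi in zip(row, x)) + bias
        z.append(1.0 / (1.0 + math.exp(-pre_activation)))
    return z
```

The output dimension equals the number of rows of W, so a 32-row matrix applied to a 1024-dimensional input yields 32-dimensional second detection data.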

In this embodiment, the anomaly prediction model is optionally a Long Short Term Memory (LSTM) model. Specifically, the model parameters of the trained LSTM model are stored, so that the model parameters of the LSTM model are obtained for calculation when the calculation deduction is performed on the second detection data, thereby obtaining the prediction data. Optionally, the second detection data may be calculated by the following formula to obtain the prediction data:

i_t = sigmoid(W_i [x_t, h_{t-1}] + b_i)

f_t = sigmoid(W_f [x_t, h_{t-1}] + b_f)

u_t = tanh(W_u [x_t, h_{t-1}] + b_u)

c_t = f_t ⊙ c_{t-1} + i_t ⊙ u_t

o_t = sigmoid(W_o [x_t, h_{t-1}] + b_o)

h_t = o_t ⊙ tanh(c_t)

The anomaly determination module 330 is configured to compare the predicted data with the actual data, so as to determine whether the actual data is anomalous. Specifically, the predicted data at time T can be obtained from the first detection data at a plurality of times before time T, and then compared with the actual data generated at time T, so as to determine whether the actual data at time T is abnormal. Optionally, the abnormality determining module 330 includes: a data difference value calculation unit for calculating a data difference value between the prediction data and the actual data; and an abnormity judging unit for judging whether the data difference value is larger than a difference threshold, and if the data difference value is larger than the difference threshold, judging that the actual data is abnormal. Specifically, the difference threshold is determined when the long short-term memory model is trained offline: the deviations obtained on the validation data set are fitted with a Gaussian distribution N(μ, σ^2), and the threshold is the value corresponding to a 99% confidence level.

According to the technical scheme of the embodiment of the invention, the abnormal data detection system comprises a data interface, wherein the data interface is used for receiving first detection data sent by an external system, and the first detection data is first-dimension data; the dimension reduction prediction module comprises a data dimension reduction unit and a prediction data calculation unit, the data dimension reduction unit is used for converting the first detection data into second detection data, the second detection data is second dimension data, the dimension of the first dimension data is larger than that of the second dimension data, and the prediction data calculation unit is used for predicting the second detection data based on a trained abnormal prediction model to obtain prediction data; and the abnormity judgment module is used for comparing the predicted data with the actual data so as to judge whether the actual data is abnormal or not. Because the dimension reduction prediction module reduces the dimension of the first detection data before prediction, the prediction is more accurate than predicting directly on the high-dimensional data, and the technical effect of improving the accuracy of abnormal data prediction is achieved.

Example four

Fig. 4 is a schematic structural diagram of an abnormal data detection system according to a fourth embodiment of the present invention. The present embodiment is further detailed in the above technical solution, and the abnormal data detection system of the present embodiment is used for detecting high-dimensional abnormal data. As shown in fig. 4, an abnormal data detecting system according to an embodiment of the present invention includes a data interface 410, a dimension reduction predicting module 420, and an abnormal determining module 430. Wherein:

the dimension reduction prediction module 420 comprises a data dimension reduction unit 421 and a prediction data calculation unit 422, wherein the data dimension reduction unit 421 comprises a first calculation subunit 4211 and a dimension reduction final processing unit 4212; the first calculating subunit 4211 is configured to perform initial calculation on the first detection data to obtain a first intermediate result; the dimension reduction final processing unit 4212 is electrically connected to an output end of the first calculating subunit 4211, and is configured to perform final dimension reduction calculation on the first intermediate result to obtain the second detection data.

The prediction data calculation unit 422 includes a second calculation sub-unit 4221 and a final prediction processing unit 4222; the second calculating subunit 4221 is configured to perform initial calculation on the second detection data to obtain a second intermediate result; the final prediction processing unit 4222 is electrically connected to an output end of the second calculating subunit 4221, and configured to perform final prediction calculation on the second intermediate result to obtain the prediction data. Optionally, the final prediction processing unit 4222 includes two LSTM layers, and first introduces one LSTM network to perform vector representation on the time sequence of the second detection data, and performs reverse order reconstruction on the time sequence by using the other LSTM network to obtain the prediction data.

Specifically, the first detection data is calculated by the following formula to obtain the second detection data with low dimension:

$$z = \sigma(Wx + b)$$

where z is the output second detection data, W is the weight matrix applied to the first detection data, x is the first detection data in vector form, b is a dimension reduction parameter of the encoder, and σ is the sigmoid function. The calculation of Wx + b is performed in the first calculation subunit 4211 to obtain the first intermediate result, which is the result of the matrix-vector operation in the dimension reduction calculation. The sigmoid function σ is then evaluated in the dimension reduction final processing unit 4212 to obtain the dimension-reduced second detection data.
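The two-stage split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`first_intermediate_result`, `sigmoid_stage`, `reduce_dimension`) and the 4-to-2 example weights are hypothetical and do not come from the embodiment, which targets hardware rather than software.

```python
import math

def first_intermediate_result(W, x, b):
    """Stage 1 (role of the first calculation subunit): compute W x + b."""
    return [sum(w * e for w, e in zip(row, x)) + b_k for row, b_k in zip(W, b)]

def sigmoid_stage(v):
    """Stage 2 (role of the dimension reduction final processing unit): apply sigmoid."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def reduce_dimension(W, x, b):
    """z = sigmoid(W x + b), evaluated as the two stages above."""
    return sigmoid_stage(first_intermediate_result(W, x, b))

# Hypothetical example: a 4-dimensional sample reduced to 2 dimensions.
W = [[0.5, -0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, -1.0]]
b = [0.0, 0.0]
x = [1.0, 1.0, 2.0, 2.0]
z = reduce_dimension(W, x, b)
```

Keeping the matrix-vector product separate from the activation mirrors the hardware partition: the product dominates the arithmetic cost, while the sigmoid is a cheap element-wise step.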

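The prediction data is calculated by the following LSTM formulas:

$$i_t = \mathrm{sigmoid}(W_i[x_t, h_{t-1}] + b_i)$$

$$f_t = \mathrm{sigmoid}(W_f[x_t, h_{t-1}] + b_f)$$

$$u_t = \tanh(W_u[x_t, h_{t-1}] + b_u)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot u_t$$

$$o_t = \mathrm{sigmoid}(W_o[x_t, h_{t-1}] + b_o)$$

$$h_t = o_t \odot \tanh(c_t)$$

The preceding terms and $c_t$ are calculated in the second calculation subunit 4221, and $h_t = o_t \odot \tanh(c_t)$ is calculated in the final prediction processing unit 4222 to obtain the prediction data.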
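A single LSTM time step following these formulas can be sketched as below. The division into `gates_and_cell` and `hidden_state` is an assumption made for illustration (gates and cell state in one stage, the final element-wise product in the other); the function names, the parameter dictionary, and the all-zero example weights are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def affine(W, v, b):
    """Return W v + b for a list-of-rows matrix W."""
    return [sum(w * e for w, e in zip(row, v)) + b_k for row, b_k in zip(W, b)]

def gates_and_cell(p, x_t, h_prev, c_prev):
    """Assumed first stage: gates i_t, f_t, u_t, o_t and cell state c_t."""
    v = x_t + h_prev  # list concatenation, i.e. [x_t, h_{t-1}]
    i_t = [sigmoid(a) for a in affine(p["W_i"], v, p["b_i"])]
    f_t = [sigmoid(a) for a in affine(p["W_f"], v, p["b_f"])]
    u_t = [math.tanh(a) for a in affine(p["W_u"], v, p["b_u"])]
    o_t = [sigmoid(a) for a in affine(p["W_o"], v, p["b_o"])]
    c_t = [f * c + i * u for f, c, i, u in zip(f_t, c_prev, i_t, u_t)]
    return o_t, c_t

def hidden_state(o_t, c_t):
    """Assumed second stage: h_t = o_t * tanh(c_t), element-wise."""
    return [o * math.tanh(c) for o, c in zip(o_t, c_t)]

# Hypothetical parameters: input size 2, hidden size 1, all weights zero.
zeros = [[0.0, 0.0, 0.0]]
params = {name: zeros for name in ("W_i", "W_f", "W_u", "W_o")}
params.update({name: [0.0] for name in ("b_i", "b_f", "b_u", "b_o")})
o_t, c_t = gates_and_cell(params, [1.0, -1.0], [0.0], [2.0])
h_t = hidden_state(o_t, c_t)
```

With all-zero weights every sigmoid gate evaluates to 0.5 and the candidate `u_t` to 0, so the cell state is simply half of the previous one, which makes the example easy to check by hand.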

In one embodiment, optionally, the first calculation subunit 4211 and the second calculation subunit 4221 are the same calculation subunit. That is, the matrix-vector multiplication of the dimension reduction processing and the matrix-vector multiplication of the prediction data inference share the same calculation subunit and use the same multiplication kernel, which saves DSP resources in the FPGA. In this embodiment, the dimension reduction prediction module 420 further includes a second detection data buffer. The shared calculation subunit performs the matrix-vector operation of the dimension reduction processing on the first detection data and passes the result to the dimension reduction final processing unit 4212, which calculates and outputs the dimension-reduced second detection data. The second detection data is stored in the second detection data buffer; the shared calculation subunit then reads the second detection data from the buffer, performs the matrix-vector operation for the prediction data, and sends the result to the final prediction processing unit 4222 to obtain the prediction data.
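As a rough software analogy for this sharing scheme, the sketch below routes both matrix-vector products through one kernel object and passes the dimension-reduced data through a buffer. It does not model FPGA DSP allocation; `SharedMatVecKernel`, `second_detection_buffer`, and the example weights are hypothetical names and values chosen for illustration.

```python
import math

class SharedMatVecKernel:
    """Single matrix-vector multiply routine, standing in for the shared multiplication kernel."""

    def __init__(self):
        self.calls = 0  # counts how often the one kernel is used

    def __call__(self, W, x):
        self.calls += 1
        return [sum(w * e for w, e in zip(row, x)) for row in W]

kernel = SharedMatVecKernel()
second_detection_buffer = []  # hypothetical stand-in for the second detection data buffer

def reduce_dimension(W_enc, b_enc, x):
    """Dimension reduction: shared kernel for W x, then bias and sigmoid."""
    pre = kernel(W_enc, x)
    z = [1.0 / (1.0 + math.exp(-(p + b))) for p, b in zip(pre, b_enc)]
    second_detection_buffer.append(z)
    return z

def prediction_matvec(W_pred):
    """Prediction: the same kernel reads its input from the buffer."""
    z = second_detection_buffer.pop(0)
    return kernel(W_pred, z)

W_enc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b_enc = [0.0, 0.0]
W_pred = [[2.0, 2.0]]
reduce_dimension(W_enc, b_enc, [0.0, 0.0, 5.0])
second_intermediate = prediction_matvec(W_pred)
```

The buffer decouples the two uses of the kernel in time: the kernel finishes the dimension reduction product before it is reused for the prediction product.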

Optionally, the second calculation subunit 4221 includes a second intermediate result cache area configured to store the second intermediate result. Specifically, the step size of the LSTM calculation is based on the time series: the step size is the number of detection data items that the LSTM needs to process. For example, if the detection data at time T, time T-1 and time T-2 need to be calculated, the step size of the LSTM is 3. In this embodiment, the first detection data is generated by a sensor, and each step corresponds to one sample vector from the sensor. Because the inference data at two adjacent moments partially overlap, each matrix-vector calculation for the prediction data only needs to compute the matrix-vector multiplication for the newly input second detection data. The second intermediate results of these multiplications are stored in the second intermediate result cache area and retrieved when the prediction data is calculated, which saves system resources.
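The caching idea can be illustrated with a sliding window of cached products. This is a minimal sketch that assumes a step size of 3 and caches only the input-dependent product for each sample; `IntermediateCache`, `matvec`, and the example data are hypothetical.

```python
from collections import deque

def matvec(W, x):
    return [sum(w * e for w, e in zip(row, x)) for row in W]

class IntermediateCache:
    """Keeps the matrix-vector products of the last `step_size` samples."""

    def __init__(self, W, step_size):
        self.W = W
        self.window = deque(maxlen=step_size)  # oldest product is dropped automatically
        self.multiplications = 0

    def push(self, z_new):
        """Multiply only the newly arrived sample; reuse cached products for the rest."""
        self.window.append(matvec(self.W, z_new))
        self.multiplications += 1
        return list(self.window)

STEP_SIZE = 3  # e.g. times T-2, T-1 and T
cache = IntermediateCache([[1.0, 1.0]], STEP_SIZE)
for t in range(1, 6):
    products = cache.push([float(t), float(t)])
```

Without the cache, each of the five arrivals would recompute up to three products; with it, each sample is multiplied exactly once.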

Optionally, the dimension reduction final processing unit 4212 includes a dimension reduction processing subunit and a Batch Normalization (BN) unit.

Specifically, the dimension reduction processing subunit performs the dimension reduction calculation on the first intermediate result to obtain final dimension reduction data, which is then standardized by a Batch Normalization (BN) unit, so that the prediction data calculated by the prediction data calculation unit 422 is more accurate.
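The standardization step can be sketched as inference-mode batch normalization, assuming per-feature statistics and scale/shift parameters have already been obtained during training. The function name `batch_normalize` and all numeric values below are hypothetical; the embodiment does not specify them.

```python
import math

def batch_normalize(z, mean, var, gamma, beta, eps=1e-5):
    """Per-feature normalization: gamma * (z - mean) / sqrt(var + eps) + beta."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, m, s, g, b in zip(z, mean, var, gamma, beta)]

# Hypothetical dimension-reduced sample and stored statistics.
z = [0.5, 0.9]
normalized = batch_normalize(z,
                             mean=[0.5, 0.5],
                             var=[1.0, 1.0],
                             gamma=[1.0, 1.0],
                             beta=[0.0, 0.0])
```

Normalizing the reduced features keeps them on a comparable scale before they enter the prediction stage, which is the stated motivation for the BN unit.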

It should be noted that the foregoing describes only the preferred embodiments of the present invention and the technical principles employed. Those skilled in the art will understand that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made without departing from the scope of protection of the invention. Therefore, although the present invention has been described in detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from its spirit; the scope of the present invention is determined by the scope of the appended claims.

Claims

1. An abnormal data detection method, comprising:

2. The abnormal data detecting method according to claim 1, wherein the comparing the predicted data with the actual data to determine whether the actual data is abnormal comprises:

3. The abnormal data detection method according to claim 1, wherein before said converting the first detected data into the second detected data, comprising:

4. The abnormal data detection method of claim 3, wherein the encoder incorporates a batch normalization process.

5. The abnormal data detection method of claim 1, wherein predicting the second detection data based on the trained abnormal prediction model to obtain prediction data comprises:

6. The abnormal data detection method of claim 3, wherein the auto-encoder and the abnormal prediction model share a same multiplication kernel.

7. An abnormal data detection system, comprising:

8. The abnormal data detection system of claim 7, wherein the abnormality determination module comprises:

9. The abnormal data detection system of claim 7, wherein said data dimension reduction unit comprises a first calculation subunit and a dimension reduction final processing unit;

10. The abnormal data detection system according to claim 9, wherein the predicted data calculation unit includes a second calculation subunit and a final prediction processing unit;

11. The abnormal data detection system of claim 10, wherein the first calculation subunit and the second calculation subunit are the same calculation subunit.

12. The abnormal data detection system of claim 10, wherein the second calculation subunit includes a second intermediate result buffer for storing the second intermediate result.

13. The abnormal data detection system of claim 9, wherein the dimension reduction final processing unit comprises: