CN111737099B

CN111737099B - Data center anomaly detection method and device based on Gaussian distribution

Info

Publication number: CN111737099B
Application number: CN202010515936.3A
Authority: CN
Inventors: 许明杰; 俞俊; 陈琰; 卢士达; 王琳; 梅竹; 陈海洋; 庞恒茂
Original assignee: NARI Group Corp; Nari Technology Co Ltd; State Grid Shanghai Electric Power Co Ltd; State Grid Electric Power Research Institute
Current assignee: NARI Group Corp; Nari Technology Co Ltd; State Grid Shanghai Electric Power Co Ltd; State Grid Electric Power Research Institute
Priority date: 2020-06-09
Filing date: 2020-06-09
Publication date: 2021-04-16
Anticipated expiration: 2040-06-09
Also published as: CN111737099A

Abstract

The invention provides a data center anomaly detection method and device based on Gaussian distribution. The method comprises the following steps: acquiring features of the hardware level, the software level and the physical environment of a data center server to form a multi-dimensional feature data set; performing dimensionality reduction on the acquired multi-dimensional feature data set; and, based on the dimension-reduced data, performing computation with an anomaly detection model based on Gaussian distribution to obtain an anomaly detection result. Building on a Gaussian-distribution-based anomaly detection algorithm, the invention provides an anomaly monitoring method suited to high-density data centers, which can improve the efficiency of anomaly monitoring and reduce the management cost of data centers under high-density design.

Description

Data center anomaly detection method and device based on Gaussian distribution

Technical Field

The invention relates to an anomaly detection method and device for a data center, and belongs to the technical field of data centers.

Background

With the advent of the big data era, Internet data centers (IDC) have developed rapidly [1]. According to the Data Center White Paper (2018) [2], global data centers show a trend of fewer facilities with larger capacity. Since 2017, driven by large-scale and intensive construction, data centers have kept growing in scale, while the problems of inefficient operation and maintenance (O&M) management and loss of skilled staff have become prominent. Many data centers suffer from a shortage of O&M personnel and from O&M capability that cannot keep pace with the speed of construction. In the big data era, large volumes of network-generated data flood into data centers, which are therefore required to be high-density, green and easy to manage [3]. As data centers move toward these goals, however, they become increasingly difficult to manage. Monitoring the equipment in IDC machine rooms and troubleshooting faulty devices has been a popular research topic in recent years, yet academia has not provided a good solution. Most data centers still rely on manual inspection and physical sensors to monitor their machine rooms, which is inefficient and costly.

In recent years, anomaly detection for data centers has been a hotspot in academia [4][5]. Two main strategies are currently adopted: anomaly detection based on machine learning models and anomaly detection based on statistical models.

Anomaly detection based on machine learning models clusters the sample set so that each data point falls into some cluster, and the relatedness of the data is then judged by computing the Euclidean distance or the Manhattan distance. If a sample point is far from every cluster, or the points of a cluster are sparse, the point or cluster is judged to be abnormal. Yin et al. [6] proposed a robust one-class support vector machine with an adaptive penalty factor designed from the Euclidean distance between each normal data point and the center of the data set, so that outliers have little influence on the support vector machine. Although this algorithm avoids overfitting to some extent, it tends to converge prematurely, so the resulting model cannot classify well. F. Xiao [7] learned from normal data samples using linear discriminant analysis and logistic regression, identified acceptable network behavior from what was learned, and performed intrusion detection; an alarm is raised when abnormal data outside the data set is observed. Although the accuracy of that algorithm is excellent, its regression-based model consumes a large amount of time and memory during computation, which reduces its efficiency.

Anomaly detection based on statistical models extracts a feature set from the state or behavior of the observed object and builds a corresponding statistical model. By collecting the distributions of normal and abnormal samples, anomalies can be judged quickly from the distribution of newly collected samples. This approach does not consume much computation time and is suited to large data flows. Huorong Ren et al. [8] segmented sequence data with a sliding window, defined the data states from the values in the window, and built a high-order Markov model whose states adapt in real time for anomaly detection. That algorithm adapts to changes in the data set in real time, but it cannot take into account the sample sets of all different periods, so it is not suitable for a data center anomaly detection system. Xianda Chen et al. [9] used a hierarchical structure to integrate the spatial and temporal correlations of the sensors, and combined sensor weights with spatial information to detect sensor anomalies in the network through a Markov chain. Extracting the effective temporal correlation after the spatial correlation has been determined improves detection accuracy and reduces communication cost. However, that algorithm is designed only at the sensor level, cannot fully reflect server anomalies, and performs poorly when monitoring servers that run abnormally.

As the above analysis shows, existing anomaly detection methods for data centers, although excellent in some respects, each have their own shortcomings. How to improve both the execution efficiency of the algorithm and the accuracy of anomaly monitoring remains a problem to be solved.

The cited references are as follows:

[1]2019 cloud computing industry depth report [ N ]. China information weekly report, 2019-12-09(012).

[2] Data Center White Paper (2018). http://www.caict.ac.cn/kxyj/qwfb/bps/201810/t20181016_186900.htm

[3] Tengqing cloud-based data center platform research and design [ J/OL ] electronic technology and software engineering, 2019(23): 173-.

[4] Zhan, network anomaly detection research and application [ D ]. Beijing post and telecommunications university, 2019.

[5] The virtual machine anomaly detection strategy and algorithm for operating environment perception under the Zhouyun cloud platform are researched [ D ]. Chongqing university, 2015.

[6]Yin S,Zhu X,Jing C.Fault detection based on a robust one class support vector machine[J].Neurocomputing,2014,145:263-268

[7]Subba B,Biswas S,Karmakar S.Intrusion Detection Systems using Linear Discriminant Analysis and Logistic Regression[C].India Conference.IEEE,2016:1-6

[8]Ren H,Ye Z,Li Z.Anomaly detection based on a dynamic Markov model[J].Information Sciences,2017,411:52-65.

[9]Chen X,Kim K T.Youn H Y.Integration of Markov random field with Markov chain for efficient event detection using wireless sensor network[J].Computer Communications,2008,31(17):4018-4025.

[10]Tingting Pan,Junhong Zhao,Wei Wu,Jie Yang.Learning imbalanced datasets based on SMOTE and Gaussian distribution[J].Information Sciences,2020,512.

Disclosure of Invention

Purpose of the invention: in view of the shortcomings of the prior art, the invention provides a data center anomaly detection method and device based on Gaussian distribution, which can significantly improve the accuracy of detecting abnormal servers in a data center while maintaining high algorithm execution efficiency.

The technical scheme is as follows: in a first aspect, a data center anomaly detection method based on gaussian distribution includes the following steps:

acquiring the characteristics of a hardware level, a software level and a physical environment of a data center server to form a multi-dimensional characteristic data set;

performing dimensionality reduction processing on the acquired multi-dimensional feature data set;

and according to the data subjected to the dimension reduction processing, performing operation by using an abnormality detection model based on Gaussian distribution to obtain an abnormality detection result.

Further, the multi-dimensional feature data set is represented in the form of a matrix as follows:

where n represents the feature dimension, and each matrix element $X_d$ (1 ≤ d ≤ n) represents a vector formed by several physical quantities and is one of X_cpu, X_gpu, X_memory, X_disk, X_net, X_thread and X_phy, where X_cpu is a series of features representing the working state of the CPU, X_gpu is a series of features representing the working state of the GPU, X_memory is a series of features representing the working state of the memory, X_disk is a series of features representing the working state of the disk, X_net is a series of features representing the working state of the network, X_thread is a series of features representing the state of process resources, and X_phy is a series of features representing the physical environment.

Further, the performing dimension reduction processing on the acquired multi-dimensional feature data set includes:

S21: for the j-th element $X_{dj}$ of the d-th dimension feature $X_d$, calculate the mean of each feature $X_{dj}$ according to equation (1):

$\mu_{dj} = \frac{1}{m}\sum_{i=1}^{m} x_{dj}^{(i)}$ (1)

where the superscript i is the sample index of the feature, and m is the number of samples taken for that element feature;

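S22: replace each sample $x_{dj}^{(i)}$ with its mean-centred value $x_{dj}^{(i)} - \mu_{dj}$ and scale each feature according to equation (2):

$x_{dj}^{(i)} := \dfrac{x_{dj}^{(i)} - \mu_{dj}}{max\_x_{dj} - min\_x_{dj}}$ (2)

where $max\_x_{dj}$ represents the maximum value of the j-th element feature of the d-th dimension, and $min\_x_{dj}$ represents the minimum value of the j-th element feature of the d-th dimension;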

S23: substitute the scaled features obtained in step S22 into equation (3) to calculate the covariance matrix:

S24: sort the covariance matrix elements in descending order and take the first k columns to form a new covariance matrix $U_{reduce}$; then calculate the new feature values according to equation (4) to obtain a new feature matrix dataset_z:

$z = U_{reduce}^{T} x$ (4)
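As an illustration of steps S21 and S22, the following minimal Python sketch computes the mean of one element feature and scales its samples. It assumes that equation (2) is the usual mean normalization, (x - mean) / (max - min); the function name `mean_normalize`, the sample values and the handling of a constant feature are illustrative assumptions, not part of the method as described.

```python
def mean_normalize(samples: list[float]) -> list[float]:
    """Return (x - mean) / (max - min) for the m samples of one feature."""
    m = len(samples)
    mu = sum(samples) / m                    # per-feature mean, as in equation (1)
    spread = max(samples) - min(samples)     # max_x_dj - min_x_dj
    if spread == 0:
        # Assumption: a constant feature carries no information, map it to zeros.
        return [0.0] * m
    return [(x - mu) / spread for x in samples]


# Hypothetical cpu_load samples (in percent) collected for one server.
cpu_load = [10.0, 20.0, 30.0, 40.0]
scaled = mean_normalize(cpu_load)
```

After this step every feature is centred on zero and spans a range of width one, so features measured in different units (percentages, degrees, bytes per second) become comparable before the covariance matrix is computed.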

further, the anomaly detection model based on the gaussian distribution is generated as follows:

denote the set of the k features after dimensionality reduction as the set χ, select the first element of χ and place it in an empty set κ, and then execute the following operations in a loop until χ is empty:

a) calculate, according to the Gaussian distribution, the distribution of the first column of feature values in χ, denoted $P_{first}(x)$, and calculate the correlation coefficient r between $P_{first}(x)$ and each distribution in the set κ;

b) when |r| is greater than a specified threshold, calculate the η matrix and the s matrix corresponding to the two distributions to form a multivariate high-density data center distribution, denoted the Hdd distribution, remove $P_{first}(x)$ from the set χ, and end the current iteration;

c) otherwise, put $P_{first}(x)$ into the set κ and return to step a);

the calculation mode of the eta matrix and the s matrix is as follows:

where $\eta \in \mathbb{R}^{n}$, $s \in \mathbb{R}^{n \times n}$ and $f \in \mathbb{R}^{n}$; η is the mean vector of the Hdd multivariate distribution, s is the covariance matrix of the Hdd multivariate distribution, f is the intermediate parameter vector of the Hdd multivariate distribution, formed by dividing the corresponding elements of η and s, p(x) is the probability density function of the Hdd multivariate distribution, $x^{(i)}$ represents the i-th feature, and m represents the number of samples of that feature.
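A minimal sketch of how a mean vector η and a covariance matrix s could be estimated for a group of correlated features is given below. It uses the standard sample mean and the maximum-likelihood covariance estimate as a stand-in, because the exact Hdd formulas are not reproduced in this text; the intermediate vector f is not computed, and the function names and sample values are hypothetical.

```python
def mean_vector(rows: list[list[float]]) -> list[float]:
    """Column-wise mean of m samples (rows) with n features (columns)."""
    m, n = len(rows), len(rows[0])
    return [sum(row[j] for row in rows) / m for j in range(n)]


def covariance_matrix(rows: list[list[float]], eta: list[float]) -> list[list[float]]:
    """n x n covariance estimate (1/m) * sum((x - eta)(x - eta)^T)."""
    m, n = len(rows), len(eta)
    return [
        [sum((row[a] - eta[a]) * (row[b] - eta[b]) for row in rows) / m
         for b in range(n)]
        for a in range(n)
    ]


# Hypothetical samples of two strongly correlated reduced features.
rows = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
eta = mean_vector(rows)            # assumed stand-in for the eta vector
s = covariance_matrix(rows, eta)   # assumed stand-in for the s matrix
```

The off-diagonal entries of s capture how the two features move together, which is the information a single-feature Gaussian cannot express.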

Further, the obtaining an anomaly detection result by performing an operation using the anomaly detection model based on gaussian distribution includes:

after all distributions in the set κ have been obtained according to the anomaly detection model, obtaining the multivariate probability density function of each distribution, calculating the probability value of each distribution from the dimension-reduced data, and, when a probability value is greater than a specified threshold, determining that an anomaly has occurred and identifying the dimension in which it occurred.

In a second aspect, a data center anomaly detection device based on gaussian distribution includes:

the data acquisition module is used for acquiring the characteristics of a hardware layer, a software layer and a physical environment of the data center server to form a multi-dimensional characteristic data set;

the preprocessing module is used for performing dimensionality reduction processing on the acquired multi-dimensional feature data set;

and the anomaly detection module is used for calculating by using an anomaly detection model based on Gaussian distribution according to the data subjected to the dimension reduction processing to obtain an anomaly detection result.
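The three modules can be pictured as a simple pipeline. The sketch below is only one possible arrangement, assuming each module is a plain callable; the class name `AnomalyDetectionDevice`, the field names and the toy lambdas are hypothetical and stand in for real acquisition, dimensionality reduction and detection logic.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Matrix = Sequence[Sequence[float]]


@dataclass
class AnomalyDetectionDevice:
    acquire: Callable[[], Matrix]            # data acquisition module
    preprocess: Callable[[Matrix], Matrix]   # preprocessing (dimension reduction) module
    detect: Callable[[Matrix], list[bool]]   # anomaly detection module

    def run(self) -> list[bool]:
        raw = self.acquire()
        reduced = self.preprocess(raw)
        return self.detect(reduced)


# Toy stand-ins: two samples with two features each, keep the first feature,
# flag a sample when that feature exceeds 0.5.
device = AnomalyDetectionDevice(
    acquire=lambda: [[0.1, 40.0], [0.9, 85.0]],
    preprocess=lambda rows: [[row[0]] for row in rows],
    detect=lambda rows: [row[0] > 0.5 for row in rows],
)
flags = device.run()
```

Keeping the modules as separate callables means each one can be replaced independently, for example when a data center collects a different set of features.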

In a third aspect, a computer device comprises:

one or more processors;

a memory; and

one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, which when executed by the processors implement the steps of the gaussian distribution based data center anomaly detection method according to the first aspect of the present invention.

Advantageous effects: building on a Gaussian-distribution-based anomaly detection algorithm, the invention provides an anomaly monitoring method suited to high-density data centers. The method acquires the running features of the physical devices and the software layer of a server, captures potentially abnormal data objects in real time, reduces the dimensionality of the data to extract the factors that most influence anomalies, and applies an improved Gaussian probability model that measures several factors jointly, avoiding the detection errors caused by single-factor detection. The method effectively improves the accuracy of detecting abnormal servers in a high-density data center, executes efficiently, and helps reduce the management cost of data centers under high-density design.

Drawings

Fig. 1 is a flowchart of an anomaly detection method for a data center according to an embodiment of the present invention.

Detailed Description

The technical scheme of the invention is further explained by combining the attached drawings.

Traditional data centers carry out routine inspection and risk analysis with manual labor and physical sensors, which consumes a large amount of manpower and material resources. Building on a Gaussian-distribution-based anomaly detection algorithm, the invention provides an anomaly monitoring method suited to high-density data centers, which can improve the efficiency of anomaly monitoring and reduce the management cost of data centers under high-density design.

As shown in fig. 1, in one embodiment, the data center anomaly detection method based on gaussian distribution includes the following steps:

Step S10: acquire the features of the physical devices and the software layer of the data center server.

For feature selection, this embodiment selects 300 features from the physical devices and the software level of the server. The features mainly come from the CPU, the GPU, the hard disk, the memory, the motherboard, the power supply, the physical environment in which the hardware is located, computation, storage, processes, network throughput, and some other composite features. Examples of the extracted features are as follows:

(1) A series of features extracted from the CPU, such as the CPU load, the share of CPU time spent waiting for IO operations and the share of CPU time spent idle, is set as:

X_cpu = (cpu_load, cpu_iowait, cpu_free, ......, cpu_sys)

(2) A series of features extracted from the GPU, such as the GPU load, the share of GPU time spent waiting for IO operations and the share of GPU time spent idle, is set as:

X_gpu = (gpu_load, gpu_iowait, gpu_free, ......, gpu_sys)

(3) A series of features extracted from the memory, such as the amount of free memory, the read rate from memory per second, the write rate to memory per second and the memory access rate, is set as:

X_memory = (memory_free, memory_read, memory_write, ......, memory_visit)

(4) A series of features extracted from the disk, such as the disk IO throughput, the number of hard disk accesses, the read rate from disk per second and the write rate to disk per second, is set as:

X_disk = (disk_io, num_of_disk_acc/sec, ......, disk_read)

(5) A series of features extracted from the physical environment, such as temperature, humidity, temperature difference and fan speed, is set as:

X_phy = (tem, hum, tem_dval, ......, cpu_fan_rate)

(6) A series of features extracted from the network throughput of the server, such as the data volume received per second, the data volume sent per second, the network load rate, the number of packets received and the number of packets lost, is set as:

X_net = (net_re, net_send, net_pac_re, ......, net_load)

(7) A series of features extracted from process resources, such as the memory, shared memory and CPU share occupied by a process, is set as:

X_thread = (thread_mem_size, thread_share_size, thread_cpu, ......, thread_time)

For the features obtained above, the unlabeled feature sample is defined as:

X ≡ (X_cpu, X_gpu, X_memory, X_disk, X_net, X_thread, X_phy).

It should be understood that the seven feature dimensions described above are for illustration only and do not limit the method of the invention to these particular features. Because hardware facilities, physical environments and maintenance priorities differ between data centers, the corresponding feature items can be selected according to the actual situation.
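One way to hold such an unlabeled feature sample in code is a nested mapping from feature dimension to named features, flattened into a numeric vector before preprocessing. The sketch below is illustrative only: it covers three of the seven dimensions, the readings are invented, and the helper `flatten` is a hypothetical name.

```python
# Hypothetical readings for one server at one sampling instant.
sample = {
    "cpu": {"cpu_load": 0.72, "cpu_iowait": 0.05, "cpu_free": 0.23},
    "memory": {"memory_free": 2048.0, "memory_read": 310.0},
    "phy": {"tem": 41.5, "hum": 38.0},
}


def flatten(sample: dict[str, dict[str, float]]) -> tuple[list[str], list[float]]:
    """Flatten the per-dimension groups into parallel name and value lists."""
    names: list[str] = []
    values: list[float] = []
    for group in sample.values():
        for name, value in group.items():
            names.append(name)
            values.append(float(value))
    return names, values


names, values = flatten(sample)
```

Keeping the feature names next to the values makes it possible to report which feature or dimension contributed to a detected anomaly later on.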

In step S20, the acquired feature data set is subjected to dimension reduction processing.

The acquired feature data sets form a matrix, which is recorded as:

where each element $X_d$ of the matrix represents a vector, i.e. one of the components of X ≡ (X_cpu, X_gpu, X_memory, X_disk, X_net, X_thread, X_phy), which is a collection of several physical quantities. n is the number of server feature dimensions acquired in step S10; in this embodiment, n = 7.

The dimensionality reduction is carried out according to the following steps:

S21: for the j-th element $X_{dj}$ of the d-th (1 ≤ d ≤ n) dimension feature $X_d$, calculate the mean of each feature $X_{dj}$ according to equation (1):

$\mu_{dj} = \frac{1}{m}\sum_{i=1}^{m} x_{dj}^{(i)}$ (1)

where the superscript i is the sample index of the feature. For example, following step S10, let the CPU dimension feature X_cpu be $X_1$, whose first feature $X_{11}$ is cpu_load; then m is the number of samples taken for this element feature, and $\mu_{dj}$ (here $\mu_{11}$) represents the mean of the m collected cpu_load values;

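S22: replace each sample $x_{dj}^{(i)}$ with its mean-centred value $x_{dj}^{(i)} - \mu_{dj}$ and scale each feature according to equation (2):

$x_{dj}^{(i)} := \dfrac{x_{dj}^{(i)} - \mu_{dj}}{max\_x_{dj} - min\_x_{dj}}$ (2)

where $max\_x_{dj}$ and $min\_x_{dj}$ are the maximum and minimum values of the j-th element feature of the d-th dimension;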

S23: substitute the scaled features obtained in step S22 into equation (3) to calculate the covariance matrix:

For the different features $X_{dj}$, the covariance matrix is calculated from the matrix formed by their sample values.

S24: sort the covariance matrix elements in descending order and take the first k columns to form a new covariance matrix $U_{reduce}$; then calculate the new feature values according to equation (4), giving the new feature matrix dataset_z shown in equation (5).

$z = U_{reduce}^{T} x$ (4)
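Steps S23 and S24 resemble principal component analysis. The sketch below shows one common way to realize them with NumPy, under stated assumptions: the input is already mean-normalized, the covariance matrix is estimated as (1/m) XᵀX, and "taking the first k columns" is interpreted as keeping the k eigenvectors with the largest eigenvalues. The function name `reduce_dimensions` and the sample data are hypothetical.

```python
import numpy as np


def reduce_dimensions(x_scaled: np.ndarray, k: int) -> tuple[np.ndarray, np.ndarray]:
    """Project m mean-normalized samples (rows) onto k principal directions."""
    m = x_scaled.shape[0]
    sigma = (x_scaled.T @ x_scaled) / m        # covariance matrix (assumed form)
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh: eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort from large to small
    u_reduce = eigvecs[:, order[:k]]           # first k columns
    z = x_scaled @ u_reduce                    # row-wise form of z = U_reduce^T x
    return u_reduce, z


# Hypothetical mean-normalized samples that lie on the line x2 = x1.
x_scaled = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])
u_reduce, z = reduce_dimensions(x_scaled, k=1)
```

Because the two features in this toy data are perfectly correlated, a single reduced feature preserves all of the variation; the sign of the eigenvector, and therefore of z, is arbitrary.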

Step S30, an anomaly detection result is obtained by performing an operation using an anomaly detection model based on gaussian distribution based on the data subjected to the dimension reduction processing.

Because applying the ordinary Gaussian distribution directly to anomaly detection for data center servers yields large errors and unsatisfactory results, the invention proposes a new probability distribution function based on the Gaussian distribution.
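For reference, the ordinary Gaussian baseline that this passage refers to can be sketched as follows. This is the textbook per-feature formulation, not the Hdd distribution of the invention: each feature is fitted with its own mean and variance, features are treated as independent, and a sample with a low joint density is considered suspicious. All names and values are illustrative assumptions.

```python
import math


def fit_gaussian(values) -> tuple[float, float]:
    """Maximum-likelihood mean and variance of one feature."""
    m = len(values)
    mu = sum(values) / m
    var = sum((v - mu) ** 2 for v in values) / m
    return mu, var


def gaussian_pdf(x: float, mu: float, var: float) -> float:
    """Density of the univariate normal distribution N(mu, var) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def joint_density(x, params) -> float:
    """Product of per-feature densities (independence is assumed)."""
    p = 1.0
    for value, (mu, var) in zip(x, params):
        p *= gaussian_pdf(value, mu, var)
    return p


# Hypothetical training samples: (cpu_load in percent, temperature in Celsius).
train = [[50.0, 40.0], [52.0, 41.0], [48.0, 39.0], [51.0, 40.5], [49.0, 39.5]]
params = [fit_gaussian(column) for column in zip(*train)]
normal = joint_density([50.0, 40.0], params)
outlier = joint_density([58.0, 44.0], params)
```

The independence assumption is the weakness of this baseline: correlated features such as CPU load and temperature are scored as if they varied separately, which is one plausible source of the large errors mentioned above.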

The general distribution for a high-density data center (Hdd) is defined as follows:

$X \sim Hdd(\mu, \sigma^{2}, t)$ (6)

where $\mu_j$ denotes the mean, $\sigma_j$ denotes the standard deviation, t is an intermediate value, and f(x) denotes the probability density function. The multivariate distribution for Hdd is defined as follows:

$X \sim MultHdd(\eta, s, f)$ (11)

where $\eta \in \mathbb{R}^{n}$, $s \in \mathbb{R}^{n \times n}$ and $f \in \mathbb{R}^{n}$; η is the mean vector of the Hdd multivariate distribution, s is the covariance matrix of the Hdd multivariate distribution, f is the t-parameter vector of the Hdd multivariate distribution, formed by dividing η by the corresponding elements of s, and p(x) is the probability density function of the Hdd multivariate distribution.

Assuming x is a k-dimensional feature vector, then:

$P_{Hddad}$ denotes the probability distribution function of the multivariate distribution, where Hddad stands for high-density data center anomaly detection, and the univariate Hdd term and $P_{MultHdd}(x; \eta, s, f)$ denote the general distribution and the multivariate distribution defined above, respectively.

Because several factors are considered, detection amounts to integrating multiple dimensions that reflect abnormal data, rather than relying on a single element. For example, when an anomaly occurs, the CPU, GPU or memory may be faulty while the hard disk is not; the principal component analysis described above removes the influence of the hard disk, and measuring several elements jointly avoids the detection errors that would arise from relying on the CPU alone.

The generation of the anomaly detection model of the invention needs to calculate the correlation among all characteristic variables and then generate the model.

The following algorithm is performed:

1) let the set χ be the set of the k features obtained after dimensionality reduction, and let the set κ be an empty set;

2) select the first element of χ and put it into the set κ;

3) while the set χ contains elements, execute the following operations in a loop until χ is empty:

3.1) select the first distribution $P_{first}(x)$ in the set χ (the first distribution is the distribution of the first column of feature values, obtained according to the general distribution formula above), and calculate the correlation coefficient r between it and each distribution in the set κ, as follows:

3.2) if |r| > 0.25, calculate the η matrix and the s matrix of the two distributions to form a multivariate Hdd distribution, remove $P_{first}(x)$ from the set χ, and end the current iteration;

3.3) otherwise, put $P_{first}(x)$ into the set κ.

End of loop.

The set κ is a reference comparison set that stores all mutually uncorrelated distributions. The effect of the loop is to compare the $P_{first}(x)$ currently taken from χ with each distribution in the comparison set κ. If a correlation is greater than 0.25, $P_{first}(x)$ and that distribution in κ are combined into a multivariate distribution because they are strongly correlated; if none is greater than 0.25, $P_{first}(x)$ is not strongly correlated with any distribution in κ, and $P_{first}(x)$ is put into κ. In this embodiment, the correlation threshold of 0.25 was obtained through experimental statistics; it is relatively reasonable, gives a small error, and can be adjusted as needed in practice.

Here dis is an abbreviation of distribution, hddis denotes an Hdd distribution, and $hddis_i$ denotes the i-th multivariate distribution constructed by the above loop.

This completes the generation of the anomaly detection model.
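The grouping loop described above can be sketched in Python as follows. Several points are assumptions made for illustration: the correlation coefficient r is taken to be the Pearson coefficient computed on the sample columns, an element is removed from χ both when it is merged and when it is moved into κ (otherwise the loop would not terminate), no column is constant, and the multivariate Hdd fit is represented only by recording which features belong together. The names `pearson` and `group_features` are hypothetical.

```python
import math


def pearson(a: list[float], b: list[float]) -> float:
    """Pearson correlation coefficient of two equally long, non-constant columns."""
    m = len(a)
    mean_a, mean_b = sum(a) / m, sum(b) / m
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def group_features(columns: dict[str, list[float]], threshold: float = 0.25) -> dict[str, list[str]]:
    """Group each reduced feature with the first kappa member it correlates with."""
    chi = list(columns)                   # set chi: features still to process
    kappa = [chi.pop(0)]                  # set kappa: reference comparison set
    groups = {kappa[0]: [kappa[0]]}       # kappa member -> features sharing its distribution
    while chi:
        first = chi.pop(0)
        for ref in kappa:
            if abs(pearson(columns[first], columns[ref])) > threshold:
                groups[ref].append(first)  # strongly correlated: joint distribution
                break
        else:
            kappa.append(first)            # uncorrelated with all of kappa
            groups[first] = [first]
    return groups


# Hypothetical reduced features: z2 follows z1, z3 is uncorrelated with z1.
columns = {
    "z1": [1.0, 2.0, 3.0, 4.0],
    "z2": [2.0, 4.0, 6.0, 8.5],
    "z3": [1.0, -1.0, -1.0, 1.0],
}
groups = group_features(columns)
```

Each resulting group would then receive its own multivariate distribution, while features that remain alone in κ keep a univariate one.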

After all the distributions in the set κ have been obtained, the multivariate probability density function of each distribution can be obtained, and the probability value of each distribution can then be calculated from the values measured in actual operation; when a probability value is greater than a certain threshold, an anomaly is detected. The threshold generally depends on the specific problem and on the features involved, so it cannot be fixed uniformly in advance and is configured at detection time.
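The final check can be sketched as a per-distribution comparison against configurable thresholds. The sketch follows the wording above, flagging a distribution when its computed value is greater than its threshold; how the value itself is computed from the Hdd density is not shown, and the function name `detect_anomalies`, the distribution names and the numbers are hypothetical.

```python
def detect_anomalies(scores: dict[str, float], thresholds: dict[str, float]) -> list[str]:
    """Return the names of the distributions whose value exceeds their threshold."""
    return [name for name, value in scores.items() if value > thresholds[name]]


# Hypothetical values computed for two distributions, with per-distribution thresholds.
scores = {"hddis_1": 0.9, "hddis_2": 0.1}
thresholds = {"hddis_1": 0.5, "hddis_2": 0.5}
abnormal = detect_anomalies(scores, thresholds)
```

Returning the names of the offending distributions, rather than a single yes or no, is what allows the abnormal dimension to be identified.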

The method was verified over a period of time in a data center deployed by an enterprise. The experiments show that it improves the accuracy of detecting abnormal servers by nearly 20%, while the algorithm also executes efficiently.

According to another embodiment of the present invention, a data center abnormality detection apparatus based on gaussian distribution includes:

the data acquisition module is used for acquiring the characteristics of physical devices and software layers of the data center server to form a multi-dimensional characteristic data set;

The multidimensional characteristic data set obtained by the data acquisition module is expressed in a matrix form as follows:

The preprocessing module comprises:

a mean value calculation unit for calculating, for the j-th element $X_{dj}$ of the d-th dimension feature $X_d$, the mean of each feature $X_{dj}$ as follows:

a feature scaling unit for replacing each sample $x_{dj}^{(i)}$ with $x_{dj}^{(i)} - \mu_{dj}$ and scaling each feature by substituting it into the following equation:

a covariance matrix calculation unit for substituting the scaled features obtained by the feature scaling unit into the following equation to calculate the covariance matrix:

a new feature matrix calculation unit for sorting the covariance matrix elements in descending order, taking the first k columns to form a new covariance matrix $U_{reduce}$, and then calculating the new feature values as follows,

$z = U_{reduce}^{T} x$ (21)

to obtain a new feature matrix dataset_z:

the abnormality detection module includes:

the model building unit is used for generating the data center anomaly detection model based on Gaussian distribution, the generation method comprising: denoting the set of the k features after dimensionality reduction as the set χ, selecting the first element of χ and placing it in an empty set κ, and then executing the following operations in a loop until χ is empty:

a) calculating, according to the Gaussian distribution, the distribution of the first column of feature values in χ, denoted $P_{first}(x)$, and calculating the correlation coefficient r between $P_{first}(x)$ and each distribution in the set κ;

b) when |r| is greater than a specified threshold, calculating the η matrix and the s matrix corresponding to the two distributions to form a multivariate high-density data center distribution, denoted the Hdd distribution, removing $P_{first}(x)$ from the set χ, and ending the current iteration;

c) otherwise, putting $P_{first}(x)$ into the set κ and returning to step a);

the calculation mode of the eta matrix and the s matrix is as follows:

where $\eta \in \mathbb{R}^{n}$, $s \in \mathbb{R}^{n \times n}$ and $f \in \mathbb{R}^{n}$; η is the mean vector of the Hdd multivariate distribution, s is the covariance matrix of the Hdd multivariate distribution, f is the intermediate parameter vector of the Hdd multivariate distribution, formed by dividing the corresponding elements of η and s, p(x) is the probability density function of the Hdd multivariate distribution, $x^{(i)}$ represents the i-th feature, and m represents the number of samples of that feature;

and the anomaly detection unit is used for obtaining a multi-element distribution probability density function of each distribution after obtaining all the distributions in the set kappa according to the anomaly detection model, calculating the probability value of each distribution by using the data subjected to dimension reduction processing, and when the probability value is greater than a specified threshold value, considering that an anomaly occurs and identifying the dimension of the anomaly.

Based on the same technical concept as the method embodiment, according to another embodiment of the present invention, there is provided a computer apparatus including: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, which when executed by the processors implement the steps in the method embodiments.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that modifications and equivalent substitutions may be made to the embodiments without departing from the spirit and scope of the invention, and all such modifications and substitutions are covered by the claims.

Claims

1. A data center anomaly detection method based on Gaussian distribution, characterized by comprising the following steps:

acquiring features of the hardware layer, the software layer, and the physical environment of a data center server to form a multi-dimensional feature data set, wherein the multi-dimensional feature data set is expressed in matrix form as follows:

where n represents the feature dimension, and each matrix element X_d (1 ≤ d ≤ n) represents a vector formed by a plurality of physical quantities and is one of X_CPU, X_GPU, X_memory, X_disk, X_net, X_thread, and X_phy, wherein X_CPU is a set of features representing the working state of the CPU, X_GPU is a set of features representing the working state of the GPU, X_memory is a set of features representing the working state of the memory, X_disk is a set of features representing the working state of the disk, X_net is a set of features representing the working state of the network, X_thread is a set of features representing the state of process resources, and X_phy is a set of features representing the physical environment;

and performing dimensionality reduction processing on the acquired multi-dimensional feature data set, wherein the dimensionality reduction processing comprises the following steps:

S22, using

Replace each

Substituting equation (2) for feature scaling for each feature:

S23, the step S22

Substituting equation (3) to calculate the covariance matrix:

$$z = U_{\text{reduce}}^{T}\, x \tag{4}$$

and performing computation on the dimension-reduced data using an anomaly detection model based on Gaussian distribution to obtain an anomaly detection result, wherein the anomaly detection model based on Gaussian distribution is generated as follows: the set of the k features after dimensionality reduction is denoted as set χ, the first element of χ is selected and placed into an empty set κ, and the following operations are then repeated until χ is empty:

c) otherwise, put P_first(x) into the set κ and return to step a.
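For illustration, the dimensionality reduction recited above (mean calculation, feature scaling, covariance matrix, and the projection of equation (4)) can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. Equations (1) to (3) are not reproduced in this text, so the sketch assumes that features are scaled by their standard deviation and that the leading columns of U_reduce are obtained by singular value decomposition of the covariance matrix. The function name `pca_reduce` and its parameters are hypothetical.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the rows of X (m samples x n features) onto k directions.

    Assumptions (not taken from the source): scaling uses the standard
    deviation, and the directions come from an SVD of the covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    m = X.shape[0]

    mu = X.mean(axis=0)              # per-feature mean
    sigma = X.std(axis=0)            # per-feature scale (assumed: std)
    sigma[sigma == 0] = 1.0          # leave constant features unscaled
    X_scaled = (X - mu) / sigma      # mean normalization and feature scaling

    cov = X_scaled.T @ X_scaled / m  # n x n covariance matrix

    # Columns of U are ordered by decreasing singular value.
    U, S, _ = np.linalg.svd(cov)
    U_reduce = U[:, :k]              # first k columns

    # Row i of Z equals U_reduce^T applied to the i-th scaled sample.
    Z = X_scaled @ U_reduce
    return Z, U_reduce, S
```

Each row of the returned `Z` corresponds to one sample expressed in the k new features, and `S` can be inspected to judge how much variance the first k directions retain.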

2. The data center anomaly detection method based on Gaussian distribution according to claim 1, characterized in that the η matrix and the s matrix are calculated as follows:

where $\eta \in \mathbb{R}^{n}$, $s \in \mathbb{R}^{n \times n}$, and $f \in \mathbb{R}^{n}$; $\eta$ is the mean vector of the Hdd multivariate distribution, $s$ is the covariance matrix of the Hdd multivariate distribution, $f$ is an intermediate parameter vector of the Hdd multivariate distribution formed by dividing corresponding elements of $\eta$ and $s$, $p(x)$ is the probability density function of the Hdd multivariate distribution, $x^{(i)}$ represents the i-th feature, and $m$ represents the number of samples of that feature.

3. The data center anomaly detection method based on Gaussian distribution according to claim 2, wherein performing computation using the anomaly detection model based on Gaussian distribution to obtain the anomaly detection result comprises:

4. A data center anomaly detection device based on Gaussian distribution, characterized by comprising:

a data acquisition module, configured to acquire features of the hardware layer, the software layer, and the physical environment of a data center server to form a multi-dimensional feature data set, wherein the multi-dimensional feature data set is expressed in matrix form as follows:

a preprocessing module, configured to perform dimensionality reduction processing on the acquired multi-dimensional feature data set, the preprocessing module specifically comprising:

a mean value calculation unit, configured to, for the j-th element X_dj of the d-th dimension feature X_d, calculate the mean value of each feature X_dj according to equation (1):

a feature scaling unit, configured to use

Replace each

Substituting equation (2) for feature scaling for each feature:

Substituting equation (3) to calculate the covariance matrix:

a new feature matrix calculation unit, configured to sort the covariance matrix elements in descending order, take the first k columns to form a new covariance matrix U_reduce, and then calculate new feature values according to equation (4) to obtain a new feature matrix dataset_z:

$$z = U_{\text{reduce}}^{T}\, x \tag{4}$$

an anomaly detection module, configured to perform computation on the dimension-reduced data using an anomaly detection model based on Gaussian distribution to obtain an anomaly detection result, the anomaly detection module comprising a model construction unit configured to generate the data center anomaly detection model based on Gaussian distribution as follows: the set of the k features after dimensionality reduction is denoted as set χ, the first element of χ is selected and placed into an empty set κ, and the following operations are then repeated until χ is empty:

5. The data center anomaly detection device based on Gaussian distribution according to claim 4, wherein the η matrix and the s matrix are calculated as follows:

6. The data center anomaly detection device according to claim 4, wherein the anomaly detection module further comprises an anomaly detection unit, configured to: after all distributions in the set κ have been obtained according to the anomaly detection model, obtain the multivariate distribution probability density function of each distribution; calculate the probability value of each distribution using the dimension-reduced data; and, when a probability value is greater than a specified threshold, determine that an anomaly has occurred and identify the dimension in which the anomaly occurred.
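For illustration, the per-distribution check described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The formulas for η, s, and p(x) are not reproduced in this text, so the sketch assumes a standard multivariate Gaussian density with a non-singular covariance matrix. It further assumes that the value compared against the threshold is the negative log-density, so that a larger value indicates a less likely sample. The function names, the `models` dictionary standing in for the set κ, and the group names are all hypothetical.

```python
import numpy as np


def fit_gaussian(Z):
    """Estimate a mean vector (eta) and covariance matrix (s) from rows of Z."""
    Z = np.asarray(Z, dtype=float)
    eta = Z.mean(axis=0)
    centered = Z - eta
    s = centered.T @ centered / Z.shape[0]
    return eta, s


def log_density(x, eta, s):
    """Log of the multivariate Gaussian density at x (s assumed non-singular)."""
    x = np.asarray(x, dtype=float)
    n = eta.shape[0]
    diff = x - eta
    _, logdet = np.linalg.slogdet(s)
    maha = diff @ np.linalg.solve(s, diff)  # squared Mahalanobis distance
    return -0.5 * (n * np.log(2 * np.pi) + logdet + maha)


def detect(x, models, threshold):
    """Return the names of the distributions whose score exceeds the threshold.

    models maps a name to (feature_indices, eta, s). The score is the
    negative log-density, which is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    flagged = []
    for name, (idx, eta, s) in models.items():
        score = -log_density(x[idx], eta, s)
        if score > threshold:
            flagged.append(name)
    return flagged
```

In this sketch, the names returned by `detect` indicate which groups of reduced features look anomalous, which corresponds to identifying the dimension in which the anomaly occurred.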

7. A computer device, the device comprising:

one or more processors;

a memory; and

one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the programs, when executed by the processors, implement the steps of the method of any one of claims 1-3.