CN114238952A - Abnormal behavior detection method, device and system and computer readable storage medium - Google Patents

Abnormal behavior detection method, device and system and computer readable storage medium

Info

Publication number
CN114238952A
CN114238952A CN202111381886.5A CN202111381886A CN114238952A CN 114238952 A CN114238952 A CN 114238952A CN 202111381886 A CN202111381886 A CN 202111381886A CN 114238952 A CN114238952 A CN 114238952A
Authority
CN
China
Prior art keywords
value
cluster
matrix
operation characteristic
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111381886.5A
Other languages
Chinese (zh)
Inventor
许云风
马振
邹武
王启凡
陶景龙
殷钱安
夏玉明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information and Data Security Solutions Co Ltd
Original Assignee
Information and Data Security Solutions Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information and Data Security Solutions Co Ltd
Priority to CN202111381886.5A
Publication of CN114238952A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/552 Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an abnormal behavior detection method, device, and system, and a computer-readable storage medium. The method comprises: selecting, by human intelligence drawing on expert experience knowledge, multi-dimensional operation characteristic values corresponding to operation behaviors; selecting an optimal K value; normalizing the operation characteristic values; performing K-means clustering; applying the peer group concept to the clustering result to detect abnormal behaviors; and visualizing the abnormal behaviors. The invention performs thorough and rapid feature selection; normalizes all features to improve detection sensitivity; obtains cluster centers and cluster categories with a K-means clustering algorithm that uses the optimal K value, which facilitates the detection of abnormal behaviors; judges abnormal behavior by peer group with an adjustable threshold parameter, which improves detection speed and accuracy; and visualizes the detection result after dimensionality reduction, making it more accessible, intuitive, and clear.

Description

Abnormal behavior detection method, device and system and computer readable storage medium
Technical Field
The present invention relates to the field of data security, and in particular, to a method, an apparatus, a system, and a computer-readable storage medium for detecting abnormal behavior.
Background
In recent years, data security has gradually risen to the level of national security, and institutions and departments in mobile communications, telecommunications, power grids, real estate, big data centers, education, and other fields urgently need security protection for big data. However, the complexity of Internet topology, the stealth of network attacks, the diversity of attack methods, the irregularity of hacker activity, and the growing sophistication of hacker capabilities are making the security environment increasingly tense. Targeted product technologies such as situation awareness and User Entity Behavior Analysis (UEBA) have emerged in response. Almost all of these emerging product technologies contain three technical elements: a big data processing engine or framework as the cornerstone, a rule set formed from the consensus of expert experience as the guideline, and various machine learning algorithms as the guarantee. Data processing allows data to be mined and used well, so that it becomes quality data; the rule set embodies consensus, and solid statistical evidence leaves abnormal behavior with nowhere to hide; the algorithms make irregular user entity behavior regular, make behavior characteristic values that could not be described describable and quantifiable, and make results that could not be quantified understandable and accurate.
At present, anomaly detection of user entity behavior is usually performed with model training algorithms, and there is still room for improvement, mainly in the following respects: 1) features cannot be fully mined and extracted, and multi-dimensional features cannot be fully utilized; 2) most abnormal behavior detection methods, such as authentication-type (brute-force cracking, login at abnormal times, and the like) and time-series-type detection models, use only a single feature or a single rule; 3) data quality is largely not considered, and even multi-dimensional, multi-rule anomaly detection models handle the raw data fed into the algorithm model in a way that is neither satisfactory nor convincing; 4) they are complicated to use and slow to run; 5) the accuracy of the detection result is insufficient, or the accuracy cannot be tuned flexibly enough.
In view of the above problems in the prior art, there is no effective solution at present.
Disclosure of Invention
To solve the above problems, the present invention provides an abnormal behavior detection method, apparatus, and system, and a computer-readable storage medium. Multi-dimensional data features are extracted for each user entity behavior, that is, a plurality of operation characteristic values are extracted for each user entity behavior, and every operation characteristic value participates in the K-means clustering algorithm, so that the multi-dimensional features are fully utilized and the features of all dimensions contribute to the detection and judgment of abnormal behavior, which improves detection accuracy. The characteristic values participating in clustering are normalized, which puts all operation characteristic values on the same scale, avoids the influence of the absolute magnitude of each feature, and improves the reliability of the features after data processing. Each cluster obtained by clustering is treated as a peer group, the sample points in all peer groups are sorted, and abnormal behaviors are determined by setting a threshold parameter; the clustering algorithm combined with the peer group concept runs fast, and the threshold parameter is adjustable, so accuracy can be tuned flexibly.
To achieve the above object, an embodiment of the present invention provides an abnormal behavior detection method, including: S1, obtaining operation behaviors within a preset period and a plurality of operation characteristic values corresponding to each operation behavior, the plurality of operation characteristic values being selected by human intelligence according to expert experience knowledge; S2, normalizing all the operation characteristic values, and taking each operation behavior, with its plurality of operation characteristic values, as a sample point; S3, determining a clustering K value by calculating the CH scores corresponding to a plurality of preset K values; S4, performing K-means clustering on the sample points according to the clustering K value to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster; and S5, taking each final cluster as a peer group, sorting the sample points by the distance between each sample point and the center of its final cluster, and judging the operation behaviors corresponding to the sample points within a preset range to be abnormal behaviors.
Further optionally, determining the clustering K value by calculating the CH scores corresponding to a plurality of preset K values includes: S301, setting a maximum K value $K_{\max}$, the plurality of preset K values being the consecutive integers in $[2, K_{\max}]$; S302, calculating, for each preset K value, the CH score of all sample points after K-means clustering with that K value; and S303, selecting the preset K value with the largest CH score as the clustering K value.
Further optionally, performing K-means clustering on the sample points according to the clustering K value to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster includes: S401, randomly selecting, from all the sample points, as many sample points as the clustering K value to serve as the current cluster center points; S402, calculating the distance from each sample point to each current cluster center point, taking the current cluster center point closest to each sample point as the cluster center of that sample point, and taking the sample points that share the same current cluster center point as one cluster; S403, taking the center of the sample points in each cluster as the latest cluster center point; and S404, repeating steps S402 and S403 until the distance between the current cluster center and the previous cluster center is smaller than a preset distance threshold, then taking the current clusters as the final clusters and their cluster centers as the final cluster centers.
Further optionally, after judging the operation behavior corresponding to a sample point greater than the preset threshold to be abnormal behavior, the method further includes: S6, arranging the operation characteristic values as elements of a first matrix, wherein the elements in each row of the matrix are the operation characteristic values corresponding to the same operation behavior; S7, calculating the mean of each row of the first matrix, and subtracting the mean of each row from each element in that row to obtain a second matrix; S8, calculating the covariance matrix of the second matrix, together with the eigenvalues of the covariance matrix and the corresponding eigenvectors; S9, arranging the eigenvectors as the rows of a matrix from top to bottom in descending order of their eigenvalues, and taking the first two rows of that matrix to form a third matrix; and S10, multiplying the third matrix by the first matrix to obtain a two-dimensional matrix.
Further optionally, normalizing all the operation characteristic values includes: S201, obtaining the maximum operation characteristic value, within the preset period, of the operation feature of the same kind as a target operation characteristic value; S202, obtaining the minimum operation characteristic value, within the preset period, of the operation feature of the same kind as the target operation characteristic value; S203, calculating a first difference between the target operation characteristic value and the minimum operation characteristic value; S204, calculating a second difference between the maximum operation characteristic value and the minimum operation characteristic value; and S205, calculating the ratio of the first difference to the second difference as the normalized operation characteristic value corresponding to the target operation characteristic value.
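Steps S6 to S10 amount to a principal component projection onto two dimensions. As a compact restatement, assume the first matrix $A$ has $r$ rows and $c$ columns and that the covariance in S8 is taken between rows with a $1/c$ scaling. The symbols $A$, $\tilde{A}$, $C$, $P$, $Y$ and the scaling factor are not given in the text and are introduced here only for illustration:

$$
\tilde{A}_{ij} = A_{ij} - \frac{1}{c}\sum_{j'=1}^{c} A_{ij'}, \qquad
C = \frac{1}{c}\,\tilde{A}\tilde{A}^{\mathsf{T}}, \qquad
C v_i = \lambda_i v_i, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_r
$$

$$
P = \begin{pmatrix} v_1^{\mathsf{T}} \\ v_2^{\mathsf{T}} \end{pmatrix} \in \mathbb{R}^{2 \times r}, \qquad
Y = P A \in \mathbb{R}^{2 \times c}
$$

Under these assumptions $\tilde{A}$ plays the role of the second matrix, $P$ the third matrix, and $Y$ the two-dimensional matrix, each column of which holds the two coordinates used to plot the corresponding column of $A$.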
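In symbols, steps S201 to S205 compute the following, where $x$ is the target operation characteristic value and $x_{\min}$, $x_{\max}$ are the minimum and maximum values of the same kind of feature within the preset period (the symbol names are chosen here for illustration):

$$
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
$$

The result lies in $[0, 1]$ whenever $x_{\max} > x_{\min}$; the text does not say how a feature with $x_{\max} = x_{\min}$ is handled.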
In another aspect, the present invention further provides an abnormal behavior detection apparatus, including: a data acquisition module, configured to obtain operation behaviors within a preset period and a plurality of operation characteristic values corresponding to each operation behavior, the plurality of operation characteristic values being selected by human intelligence according to expert experience knowledge; a normalization module, configured to normalize all the operation characteristic values and take each operation behavior, with its plurality of operation characteristic values, as a sample point; a clustering K value determining module, configured to determine a clustering K value by calculating the CH scores corresponding to a plurality of preset K values; a cluster generation module, configured to perform K-means clustering on the sample points according to the clustering K value to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster; and an abnormal behavior judging module, configured to take each final cluster as a peer group, sort the sample points by the distance between each sample point and the center of its final cluster, and judge the operation behaviors corresponding to the sample points within a preset range to be abnormal behaviors.
Further optionally, the clustering K value determining module includes: a preset K value determining submodule, configured to set a maximum K value $K_{\max}$, the plurality of preset K values being the consecutive integers in $[2, K_{\max}]$; a CH score calculating submodule, configured to calculate, for each preset K value, the CH score of all sample points after K-means clustering with that K value; and a selection submodule, configured to select the preset K value with the largest CH score as the clustering K value.
Further optionally, the cluster generation module includes: an initial cluster center point determining module, configured to randomly select, from all the sample points, as many sample points as the clustering K value to serve as the current cluster center points; and a loop iteration module, configured to calculate the distance from each sample point to each current cluster center point, take the current cluster center point closest to each sample point as the cluster center of that sample point, and take the sample points that share the same current cluster center point as one cluster; to take the center of the sample points in each cluster as the latest cluster center point; and to repeat these steps until the distance between the current cluster center and the previous cluster center is smaller than a preset distance threshold, then take the current clusters as the final clusters and their cluster centers as the final cluster centers.
Further optionally, the apparatus further includes: a first matrix generation module, configured to arrange the operation characteristic values as elements of a first matrix, wherein the elements in each row of the matrix are the operation characteristic values corresponding to the same operation behavior; a second matrix generation module, configured to calculate the mean of each row of the first matrix and subtract the mean of each row from each element in that row to obtain a second matrix; a covariance matrix generation module, configured to calculate the covariance matrix of the second matrix, together with the eigenvalues of the covariance matrix and the corresponding eigenvectors; a third matrix generation module, configured to arrange the eigenvectors as the rows of a matrix from top to bottom in descending order of their eigenvalues, and take the first two rows of that matrix to form a third matrix; and a two-dimensional matrix generation module, configured to multiply the third matrix by the first matrix to obtain a two-dimensional matrix.
Further optionally, the normalization module includes: a maximum operation characteristic value calculation submodule, configured to obtain the maximum operation characteristic value, within the preset period, of the operation feature of the same kind as a target operation characteristic value; a minimum operation characteristic value calculation submodule, configured to obtain the minimum operation characteristic value, within the preset period, of the operation feature of the same kind as the target operation characteristic value; a first difference calculation submodule, configured to calculate a first difference between the target operation characteristic value and the minimum operation characteristic value; a second difference calculation submodule, configured to calculate a second difference between the maximum operation characteristic value and the minimum operation characteristic value; and a ratio calculation submodule, configured to calculate the ratio of the first difference to the second difference as the normalized operation characteristic value corresponding to the target operation characteristic value.
In another aspect, the present invention further provides an abnormal behavior detection system, which includes the abnormal behavior detection apparatus described above.
In another aspect, the present invention further provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the above abnormal behavior detection method when executed by a processor.
The above technical solution has the following beneficial effects: the invention uses human intelligence formed from expert experience knowledge to fully mine multi-dimensional features, and feature selection is very fast; the features are processed with min-max scaling (MinMaxScale), a data processing technique that is very sensitive to abnormal data, which improves the performance of the anomaly detection algorithm and ensures data quality; the core algorithm is a clustering peer group algorithm composed of K value (number of cluster categories) optimization, clustering, and peer groups, which is more convenient to use than traditional machine learning algorithms, supports real-time detection, and allows detection accuracy to be tuned flexibly through a threshold parameter (threshold) set on the peer groups; and visualization built on dimensionality reduction and peer groups greatly enhances the intuitiveness, clarity, and understandability of the anomaly detection method.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flowchart of an abnormal behavior detection method provided by an embodiment of the present invention;
FIG. 2 is a flowchart of a method for determining a clustering K value according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for determining a final cluster and a final cluster center according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for visualizing a detection result according to an embodiment of the present invention;
FIG. 5 is a flowchart of a normalization method provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an abnormal behavior detection apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a clustering K value determining module according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a cluster generation module according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a module for visualizing a detection result according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a normalization module according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To solve the problems in the prior art of insufficient feature extraction, low reliability of processed data, low detection speed, and insufficient detection accuracy, the invention provides an abnormal behavior detection method. FIG. 1 is a flowchart of the abnormal behavior detection method provided by an embodiment of the invention. As shown in FIG. 1, the method includes:
S1, obtaining operation behaviors within a preset period and a plurality of operation characteristic values corresponding to each operation behavior, the plurality of operation characteristic values being selected by human intelligence according to expert experience knowledge;
In the data acquisition stage, the user entity behaviors within a predetermined period, that is, the operation behaviors in this embodiment, are acquired. Specifically, operation compliance audit logs or sensitive data leakage logs are parsed, converted, and collected to obtain each user entity behavior. The operation compliance audit logs come from the login or operation logs of business systems such as SSO, 4A, bastion hosts, CRM, and reporting systems; the sensitive data leakage logs come from the operation logs of systems or devices related to sensitive data leakage, such as DLP (data loss prevention) and database audit systems.
Human intelligence formed from expert experience knowledge in the field of big data security is used to fully select and mine the feature data used for algorithm analysis, namely the operation characteristic values in this embodiment. Taking the scenario of analyzing abnormal user entity behavior in bastion host SQL request logs as an example, once the data has been collected and stored in the database, the features that may affect the detection result can be selected by human intelligence according to expert experience knowledge. Each feature is defined for an object, which is generally a user (dst_account), although other entities such as the source IP (src_device_IP) are also possible.
In this embodiment, the features extracted for the clustering peer group analysis of abnormal user entity behavior in bastion host SQL request logs are shown in Table 1.
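[Table 1: features selected for the clustering peer group analysis scenario; provided in the original publication as images BDA0003365905020000051 and BDA0003365905020000061]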
S2, normalizing all the operation characteristic values, and taking the operation behavior corresponding to a plurality of operation characteristic values as a sample point;
Because different operation characteristic values have different value ranges, processing the raw values directly would let large differences in absolute magnitude distort the validity of the result. To solve this problem, the feature data to be fed into the clustering peer group algorithm is normalized, which puts all operation characteristic values on the same scale and avoids the influence of their absolute magnitudes.
Each operation behavior is considered as a sample point, and each sample point is represented by an operation characteristic value of multiple dimensions.
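A minimal Python sketch of this normalization, applied per feature across all sample points. The function name `min_max_normalize`, the sample values, and the handling of a constant feature (mapped to 0.0) are assumptions made for illustration and are not specified in the text.

```python
from typing import List


def min_max_normalize(samples: List[List[float]]) -> List[List[float]]:
    """Scale every feature column to [0, 1] with (x - min) / (max - min)."""
    if not samples:
        return []
    n_features = len(samples[0])
    # Minimum and maximum of each feature over all sample points.
    mins = [min(row[j] for row in samples) for j in range(n_features)]
    maxs = [max(row[j] for row in samples) for j in range(n_features)]
    normalized = []
    for row in samples:
        scaled = []
        for j, value in enumerate(row):
            span = maxs[j] - mins[j]
            # Assumption: a constant feature (max == min) is mapped to 0.0.
            scaled.append((value - mins[j]) / span if span else 0.0)
        normalized.append(scaled)
    return normalized


# Hypothetical sample points: one row per operation behavior, one column per feature.
raw = [[10.0, 200.0], [20.0, 400.0], [30.0, 1000.0]]
normalized = min_max_normalize(raw)
print(normalized)
```

After scaling, both features span the same range, so neither dominates the Euclidean distances used by the later clustering step.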
S3, determining a clustering K value by calculating CH scores corresponding to a plurality of preset K values;
The CH scores corresponding to all the preset K values are calculated, and the preset K value with the highest CH score is selected as the clustering K value for the subsequent K-means clustering.
S4, carrying out K-means clustering on the sample points according to the clustering K values to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster;
K-means clustering is performed on all sample points according to the clustering K value, and after multiple loop iterations a plurality of clusters (classes) and the center point of each cluster are obtained. Specifically, the cluster center coordinate values (characteristic values) calculated in this embodiment represent the center point of each cluster, and a category label (k value) representing each cluster is also obtained.
S5, taking each final cluster as a peer group, sorting the sample points by the distance between each sample point and the center of its final cluster, and judging the operation behaviors corresponding to the sample points within a preset range to be abnormal behaviors.
After clustering is completed, the resulting classes (clusters) are regarded as peer groups, that is, every cluster is treated equally (clusters are not treated differently according to their size). All sample points are sorted by the distance between each sample point and the center of its cluster, and the operation behaviors corresponding to a certain number or proportion of sample points are then selected globally as abnormal behaviors according to a threshold parameter (threshold).
The threshold parameter can be set in two ways:
1) Setting by proportion. The proportion can be set according to the historical proportion of abnormal user entity behaviors in similar broad categories of scenarios, or an appropriate proportion can be set by human intelligence formed from expert experience knowledge.
For example, the sample points are sorted in descending order of their distance from the center of their cluster, and the operation behaviors corresponding to the sample points in the top 1% are judged to be abnormal behaviors.
2) Setting by number. The threshold parameter can be adjusted flexibly according to feedback on the detection results: if there are too many false alarms, the threshold parameter can be decreased, and if there are missed detections, it can be increased appropriately. If feedback on the detection results cannot be received in time, an appropriate numerical parameter can be set from experience using human intelligence.
For example, the sample points are sorted in descending order of their distance from the center of their cluster, and the operation behaviors corresponding to the first 5 sample points are judged to be abnormal behaviors.
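The global ranking and the two threshold modes can be sketched in Python as follows. The function name `flag_anomalies`, its `ratio` and `count` parameters, the rounding-up of a fractional count, and the sample data are all assumptions made for illustration; the text only states that a proportion or a number of the farthest sample points is selected.

```python
import math
from typing import List, Optional, Sequence


def flag_anomalies(
    points: Sequence[Sequence[float]],
    labels: Sequence[int],
    centers: Sequence[Sequence[float]],
    ratio: Optional[float] = None,
    count: Optional[int] = None,
) -> List[int]:
    """Return indices of the sample points farthest from their own cluster center."""
    if (ratio is None) == (count is None):
        raise ValueError("set exactly one of ratio or count")
    # Distance of every sample point to the center of its own peer group.
    distances = [math.dist(p, centers[lab]) for p, lab in zip(points, labels)]
    # One global ranking across all peer groups, farthest first.
    ranked = sorted(range(len(points)), key=lambda i: distances[i], reverse=True)
    if count is None:
        # Assumption: a fractional number of points is rounded up.
        count = math.ceil(ratio * len(points))
    return ranked[:count]


# Hypothetical clustered data: cluster 0 contains one point far from its center.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [10.0, 10.0], [10.0, 11.0]]
labels = [0, 0, 0, 1, 1]
centers = [[5.0 / 3.0, 2.0], [10.0, 10.5]]  # centroids of the two clusters

by_count = flag_anomalies(points, labels, centers, count=1)
by_ratio = flag_anomalies(points, labels, centers, ratio=0.2)
print(by_count, by_ratio)
```

Because the ranking is global, a small cluster and a large cluster compete on the same distance scale, which matches the description of treating every cluster equally.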
As an optional implementation, FIG. 2 is a flowchart of a method for determining the clustering K value according to an embodiment of the present invention. As shown in FIG. 2, determining the clustering K value by calculating the CH scores corresponding to a plurality of preset K values includes:
S301, setting a maximum K value $K_{\max}$, the plurality of preset K values being the consecutive integers in $[2, K_{\max}]$;
A parameter value, namely the maximum K value $K_{\max}$, is set, and the K value is traversed from 2 to $K_{\max}$ to obtain the plurality of preset K values.
S302, calculating, for each preset K value, the CH score of all sample points after K-means clustering with that K value;
Clustering is performed with the K-means clustering algorithm under each preset K value, and the CH score of the K-means clustering under each preset K value is obtained.
The CH score is calculated as follows:
$$\mathrm{CH}(K) = \frac{\operatorname{tr}(B_K)}{\operatorname{tr}(W_K)} \times \frac{N - K}{K - 1}$$
where $N$ represents the total number of data samples, $K$ represents the number of cluster classes, and $\operatorname{tr}(B_K)$ and $\operatorname{tr}(W_K)$ respectively represent the traces of the matrices $B_K$ and $W_K$. $B_K$ is the between-class covariance matrix and represents the degree of dispersion between classes; $W_K$ is the within-class covariance matrix and represents the degree of compactness within a class. They are calculated as follows:
$$B_K = \sum_{k=1}^{K} n_k \,(c_k - c_e)(c_k - c_e)^{\mathsf{T}}$$
$$W_K = \sum_{k=1}^{K} \sum_{x \in C_k} (x - c_k)(x - c_k)^{\mathsf{T}}$$
where $C_k$ is the set of all data in class $k$, $c_k$ is the center point of class $k$, $c_e$ is the center point of all sample data, and $n_k$ is the total number of data points in class $k$.
S303, selecting the preset K value with the largest CH score as the clustering K value.
The preset K value with the largest CH score is determined to be the optimal clustering K value, which is then used for the final K-means clustering.
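The K value selection in S301 to S303 can be sketched in Python as below. The score uses the traces directly, since $\operatorname{tr}(B_K) = \sum_k n_k \lVert c_k - c_e \rVert^2$ and $\operatorname{tr}(W_K) = \sum_k \sum_{x \in C_k} \lVert x - c_k \rVert^2$. The names `ch_score` and `select_k`, the `cluster_fn` callback, and the return of infinity when the within-class trace is zero are assumptions for illustration. In the demo, hand-written labelings stand in for the output of K-means so that the block stays short.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def ch_score(points: List[Point], labels: List[int]) -> float:
    """CH score: [tr(B_K) / (K - 1)] / [tr(W_K) / (N - K)]."""
    clusters: Dict[int, List[Point]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    n, k = len(points), len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more samples than clusters")
    c_e = centroid(points)  # center of all sample data
    tr_b = tr_w = 0.0
    for members in clusters.values():
        c_k = centroid(members)  # center of class k
        tr_b += len(members) * sq_dist(c_k, c_e)
        tr_w += sum(sq_dist(x, c_k) for x in members)
    if tr_w == 0.0:
        return float("inf")  # assumption: perfectly tight clusters score highest
    return (tr_b / (k - 1)) / (tr_w / (n - k))


def select_k(
    points: List[Point],
    k_max: int,
    cluster_fn: Callable[[List[Point], int], List[int]],
) -> Tuple[int, Dict[int, float]]:
    """Score every K in [2, k_max] and return the K with the largest CH score."""
    scores = {k: ch_score(points, cluster_fn(points, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores


# Hypothetical data with three well-separated groups.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0], [20.0, 0.0], [20.0, 2.0]]
# Hand-written labelings standing in for K-means results at K = 2 and K = 3.
hand_labels = {2: [0, 0, 0, 0, 1, 1], 3: [0, 0, 1, 1, 2, 2]}
best_k, scores = select_k(points, 3, lambda pts, k: hand_labels[k])
print(best_k, scores)
```

In practice `cluster_fn` would run K-means with the given K and return the resulting class labels.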
As an optional implementation, FIG. 3 is a flowchart of a method for determining the final clusters and final cluster centers according to an embodiment of the present invention. As shown in FIG. 3, performing K-means clustering on the sample points according to the clustering K value to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster includes:
S401, randomly selecting, from all the sample points, as many sample points as the clustering K value to serve as the current cluster center points;
S402, calculating the distance from each sample point to each current cluster center point, taking the current cluster center point closest to each sample point as the cluster center of that sample point, and taking the sample points that share the same current cluster center point as one cluster;
S403, taking the center of the sample points in each cluster as the latest cluster center point;
S404, repeating steps S402 and S403 until the distance between the current cluster center and the previous cluster center is smaller than a preset distance threshold, then taking the current clusters as the final clusters and their cluster centers as the final cluster centers.
The calculation process of the K-means clustering algorithm is as follows:
1) K sample points are randomly selected from the sample points as the initial clustering centers, where K equals the clustering K value;
2) The distance from each sample point to each clustering center is calculated using the Euclidean distance function:
$$d(x, y_k) = \sqrt{\sum_{i=1}^{n} (x_i - y_{k,i})^2}$$
where n is the total number of dimensions (i.e., features), x denotes the coordinates of the sample point, $y_k$ denotes the coordinates of the k-th cluster center, i = 1, 2, 3, ..., n, and k = 1, 2, 3, ..., K;
3) Classification. The k value of the cluster center closest to each sample point is the category to which that point belongs, i.e., $k = \arg\min_k d(x, y_k)$. In this way, all sample points are divided into K categories;
4) Updating the K clustering centers. For the K classes (clusters) formed in the previous step, the centroid of all sample points in each cluster is taken as the new cluster center point.
5) Ending the iteration and obtaining the clustering result. Steps 2) to 4) are repeated until the distance between each updated clustering center and the clustering center generated in the previous iteration is smaller than a preset distance threshold, at which point the iteration stops. The final cluster center coordinate values (feature values) and class labels (k values) are obtained for subsequent use. The clusters generated at this point are the final clusters, and their cluster centers are the final cluster centers.
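The five steps above can be sketched in plain Python as follows. This is an illustrative sketch only: the function names, the `tol`, `max_iter`, and `seed` parameters, and the rule of keeping the previous center when a cluster becomes empty are assumptions that the text does not specify.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(x: Point, y: Point) -> float:
    """Euclidean distance d(x, y) over all n feature dimensions."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def k_means(points: List[Point], k: int, tol: float = 1e-6,
            max_iter: int = 100, seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    rng = random.Random(seed)
    # Step 1: pick K sample points at random as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Steps 2-3: assign every sample point to its nearest center.
        labels = [min(range(k), key=lambda j: euclidean(p, centers[j]))
                  for p in points]
        # Step 4: move each center to the centroid of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centers.append(
                    [sum(m[i] for m in members) / len(members) for i in range(dim)])
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[j])
        # Step 5: stop once no center moved farther than the distance threshold.
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return labels, centers


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = k_means(points, 2)
```

For this toy input the first three points end up in one cluster and the last three in the other. The returned labels and centers correspond to the class labels (k values) and final cluster center coordinates mentioned in step 5).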
As an optional implementation, FIG. 4 is a flowchart of a detection result visualization method provided by an embodiment of the present invention. As shown in FIG. 4, after the operation behavior corresponding to a sample point greater than a preset threshold is determined to be an abnormal behavior, the method further includes:
S6, arranging the operation characteristic values as elements into a first matrix, wherein the elements in each row of the matrix are the operation characteristic values corresponding to the same operation behavior;
The feature data (m sample points, each corresponding to n features) are arranged into a matrix X of m rows and n columns (i.e., the first matrix).
S7, calculating the mean value of the elements in each row of the first matrix, and subtracting the mean value of the row from each element in that row to obtain a second matrix;
The mean value of each row of the first matrix X is subtracted from the elements of that row to obtain the second matrix.
S8, calculating a covariance matrix of the second matrix, an eigenvalue of the covariance matrix and a corresponding eigenvector;
The covariance matrix C is solved according to the following formula:
[Formula image BDA0003365905020000091: covariance matrix C of the second matrix]
There are generally two methods for solving the eigenvalues and eigenvectors of a covariance matrix: eigenvalue decomposition and singular value decomposition. This embodiment adopts eigenvalue decomposition, as follows:
$$C\nu = \lambda\nu$$
where ν is an eigenvector of the matrix C, and λ is the eigenvalue corresponding to ν and is a real number.
$$C = Q \Sigma Q^{-1}$$
where Q is the matrix formed by the eigenvectors of the matrix C, and Σ is a diagonal matrix whose diagonal elements are the eigenvalues.
S9, arranging corresponding eigenvectors into a matrix from top to bottom according to the sequence of the eigenvalues from big to small, and taking the first two rows of the matrix to form a third matrix;
The eigenvectors are arranged as rows into a matrix from top to bottom in descending order of their corresponding eigenvalues, and the first two rows are taken to form a matrix H (i.e., the third matrix).
S10, multiplying the third matrix by the first matrix to obtain a two-dimensional matrix.
Y = HX is the data reduced to two dimensions, i.e., the two-dimensional matrix used for visualization.
Through dimensionality reduction, the multiple features of each point in the multi-dimensional space are reduced to two features, so that the points can be plotted in a two-dimensional space and the clustered peer groups can be displayed visually on a screen.
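Steps S6 to S10 can be sketched in plain Python as shown below, under several stated assumptions. First, the sketch stores one feature per row and one sample per column (an n × m layout), because Y = HX with a two-row H only yields one two-dimensional point per sample in that layout; the text itself describes X as m rows by n columns. Second, the covariance is taken as the centered matrix times its transpose divided by m, since the original formula survives only as an image. Third, instead of a full eigenvalue decomposition, the sketch obtains the two leading eigenvectors by power iteration with deflation, a substitute technique chosen to avoid a linear algebra library. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def mat_vec(a: Matrix, v: List[float]) -> List[float]:
    """Matrix-vector product."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def top_eigenpairs(c: Matrix, count: int, iters: int = 500) -> List[Tuple[float, List[float]]]:
    """Leading eigenpairs of a symmetric positive semi-definite matrix.

    Uses power iteration with deflation. The fixed start vector is an
    assumption; it fails if it is orthogonal to a wanted eigenvector.
    """
    n = len(c)
    work = [row[:] for row in c]
    pairs = []
    for _ in range(count):
        v = [1.0 / math.sqrt(n)] * n
        for _step in range(iters):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # remaining variance is numerically zero
            v = [x / norm for x in w]
        lam = sum(vi * wi for vi, wi in zip(v, mat_vec(work, v)))
        pairs.append((lam, v))
        # Deflation: remove the component that was just found.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


def pca_2d(x: Matrix) -> Matrix:
    """Project an n x m feature-by-sample matrix onto its top two components."""
    n, m = len(x), len(x[0])
    means = [sum(row) / m for row in x]
    centered = [[val - means[i] for val in row] for i, row in enumerate(x)]
    cov = [[sum(centered[i][t] * centered[j][t] for t in range(m)) / m
            for j in range(n)] for i in range(n)]
    h = [vec for _lam, vec in top_eigenpairs(cov, 2)]  # third matrix H (2 x n)
    # Y = H X, multiplying by the first matrix as in step S10.
    return [[sum(h[r][i] * x[i][t] for i in range(n)) for t in range(m)]
            for r in range(2)]


# Three features, four samples; the third feature is constant.
x = [[-2.0, -2.0, 2.0, 2.0],
     [-1.0, 1.0, -1.0, 1.0],
     [5.0, 5.0, 5.0, 5.0]]
y = pca_2d(x)
```

Each column of `y` is then the two-dimensional coordinate of one sample point, ready to be plotted. Projecting the centered matrix instead of the first matrix would only shift the plot by a constant offset.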
To further visualize the detection result, this embodiment displays the sample points corresponding to all abnormal behaviors and, at the same time, the current abnormal sample point. Normal points are displayed by sampling: for the sake of display performance, a certain number of points are randomly selected, peer group by peer group, from the normal sample points in all clusters. To make the points easy to tell apart, abnormal sample points, the current abnormal sample point, and normal sample points can be distinguished by color, shape, and size.
Further, this embodiment shows the important features of each outlier. Specifically, for the top-ranked important features that together account for more than 90% of the contribution, the feature values, contribution degrees, and corresponding cluster center values are shown. In this way, the abnormal behaviors (features) of the user entity can be clearly recognized.
Further, this embodiment assigns an anomaly score to each operation behavior. A simple scoring index is constructed so that the anomaly detection result (alarm) is easier to understand. The anomaly score is calculated as follows: for each abnormal point, the top-ranked features whose combined contribution degree is not lower than 0.9 are taken, and the average of the ratios of each feature value to the corresponding cluster center coordinate value is calculated. If the average is greater than 10, one point is added; if it is greater than 20, another point is added; if it is greater than 50, another point is added; and if it is greater than 100, another point is added. The base score is 5 points, so the maximum score is 9 points. The greater the final score, the greater the confidence that the user or entity is behaving abnormally.
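The scoring rule can be sketched as a small Python function. This is a hedged illustration: the function name `anomaly_score` and its two list arguments are hypothetical, the selection of the top-contributing features is assumed to have been done already, and the cluster center values are assumed to be non-zero.

```python
from typing import Sequence


def anomaly_score(feature_values: Sequence[float],
                  center_values: Sequence[float]) -> int:
    """Score one abnormal point from its top-contributing features.

    feature_values: the point's values for the selected features.
    center_values: the matching coordinates of its cluster center
                   (assumed non-zero in this sketch).
    """
    ratios = [f / c for f, c in zip(feature_values, center_values)]
    mean_ratio = sum(ratios) / len(ratios)
    score = 5  # base score
    for threshold in (10, 20, 50, 100):
        if mean_ratio > threshold:
            score += 1  # one extra point per threshold exceeded
    return score  # always between 5 and 9


low = anomaly_score([0.5, 0.4], [0.1, 0.1])  # mean ratio 4.5
high = anomaly_score([0.9, 0.8], [0.005, 0.005])  # mean ratio 170
```

A mean ratio of 4.5 exceeds none of the thresholds and keeps the base score of 5, while a mean ratio of 170 exceeds all four and reaches the maximum of 9.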
As an optional implementation, FIG. 5 is a flowchart of a normalization method provided in an embodiment of the present invention. As shown in FIG. 5, normalizing all the operation characteristic values includes:
S201, obtaining the maximum operation characteristic value among the operation characteristics of the same kind as a target operation characteristic value in a preset period;
S202, obtaining the minimum operation characteristic value among the operation characteristics of the same kind as the target operation characteristic value in the preset period;
S203, calculating a first difference between the target operation characteristic value and the minimum operation characteristic value;
S204, calculating a second difference between the maximum operation characteristic value and the minimum operation characteristic value;
S205, calculating the ratio of the first difference to the second difference as the normalized operation characteristic value corresponding to the target operation characteristic value.
In order to put all the operation characteristic values on the same scale, this embodiment uses min-max scaling (MinMaxScale) for normalization. Min-max scaling processes the data as follows: each operation characteristic value is calculated according to $x' = (x - \min_A)/(\max_A - \min_A)$, where x is the operation characteristic value before normalization, $\min_A$ is the minimum operation characteristic value among the operation characteristic values of the same kind in the preset period, $\max_A$ is the maximum operation characteristic value among the operation characteristic values of the same kind in the preset period, and x' is the operation characteristic value after normalization.
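Steps S201 to S205 amount to the following short Python sketch. The function name `min_max_normalize` is hypothetical, and returning 0.0 when all values of a feature are equal is an assumption, since the text does not say how a zero second difference should be handled.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale the values of one operation feature into [0, 1]."""
    lo, hi = min(values), max(values)  # S202 and S201
    span = hi - lo                     # S204: second difference
    if span == 0:
        # Assumption: a constant feature maps to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    # S203 and S205: (x - min) / (max - min) for every target value x.
    return [(v - lo) / span for v in values]


normalized = min_max_normalize([2.0, 4.0, 6.0])
```

Here `normalized` is `[0.0, 0.5, 1.0]`. The function would be applied separately to each kind of operation feature, using the values observed in the preset period.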
An embodiment of the present invention further provides an abnormal behavior detection apparatus. FIG. 6 is a schematic structural diagram of the abnormal behavior detection apparatus provided in the embodiment of the present invention. As shown in FIG. 6, the apparatus includes:
A data acquisition module 100, configured to acquire operation behaviors in a preset period and a plurality of operation characteristic values corresponding to the operation behaviors, the plurality of operation characteristic values being selected by human intelligence based on expert experience and knowledge;
In the data acquisition stage, the user entity behaviors within a preset period, i.e., the operation behaviors in this embodiment, are acquired. Specifically, operation compliance audit logs or sensitive data leakage logs are parsed, converted, and collected to obtain each user entity behavior. The operation compliance audit logs come from the login or operation logs of business systems such as SSO, 4A, bastion hosts, CRM, and reporting systems; the sensitive data leakage logs come from the operation logs of systems or devices related to sensitive data leakage, such as DLP (data loss prevention) and database audit systems.
Human intelligence formed from expert experience and knowledge in the field of big data security is used to select and mine the feature data used for algorithm analysis, i.e., the operation characteristic values in this embodiment. Taking the scenario of analyzing abnormal user entity behavior in the SQL request logs of a bastion host as an example, after the data has been collected and stored in the database, the features that can influence the detection result are selected by human intelligence based on expert experience and knowledge. Each feature is defined with respect to an object, which is generally a user (dst_account) but may also be another entity such as a source IP (src_device_ip).
In this embodiment, the features extracted for the clustering peer group analysis scenario of abnormal user entity behavior in the SQL request logs of the bastion host are selected; refer to Table 1 above.
A normalization module 200, configured to normalize all the operation characteristic values, and use an operation behavior corresponding to a plurality of operation characteristic values as a sample point;
Because different operation characteristic values have different value ranges, performing subsequent data processing on the raw values would let large differences in absolute magnitude distort the result. To solve this problem, the feature data to be fed to the clustering peer group algorithm is normalized so that all operation characteristic values are on the same scale and the influence of their absolute magnitudes is avoided.
Each operation behavior is considered as a sample point, and each sample point is represented by an operation characteristic value of multiple dimensions.
A clustering K value determining module 300, configured to determine a clustering K value by calculating CH scores corresponding to a plurality of preset K values;
The CH scores corresponding to all the preset K values are calculated, and the preset K value with the highest CH score is selected as the clustering K value for the subsequent K-means clustering.
A cluster generation module 400, configured to perform K-means clustering on the sample points according to the clustering K value to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster;
K-means clustering is performed on all sample points according to the clustering K value, and multiple clusters (classes) and the center point of each cluster are obtained through repeated iterations. Specifically, the cluster center coordinate values (feature values) calculated in this embodiment represent the center point of each cluster, and a category label (k value) representing each cluster is also obtained.
An abnormal behavior determination module 500, configured to take each final cluster as a peer group, sort the sample points according to the distance between each sample point and the center of its corresponding final cluster, and determine the operation behaviors corresponding to the sample points within a preset range to be abnormal behaviors.
After clustering is completed, the formed classes (clusters) are regarded as peer groups, that is, every cluster is treated equally (clusters are not treated differently according to their size). All sample points are sorted according to the distance between each sample point and the center of its cluster, and the operation behaviors corresponding to a certain number or proportion of sample points are then selected globally as abnormal behaviors according to a threshold parameter (threshold).
The threshold parameter can be set in two ways:
1) Setting by proportion. The proportion can be set according to the historical statistical proportion of abnormal user entity behaviors in similar broad categories of scenarios, or an appropriate proportion can be set by human intelligence based on expert experience and knowledge.
For example, the sample points are sorted in descending order of their distance to the center of the corresponding cluster, and the operation behaviors corresponding to the sample points in the top 1% are determined to be abnormal behaviors.
2) Setting by number. The threshold parameter can be adjusted flexibly according to feedback on the detection results: if there are too many false alarms, the threshold parameter can be reduced, and if there are missed detections, it can be increased appropriately. Of course, if feedback on the detection results cannot be received in time, an appropriate numerical parameter can be set by human intelligence based on experience.
For example, the sample points are sorted in descending order of their distance to the center of the corresponding cluster, and the operation behaviors corresponding to the first 5 sample points are determined to be abnormal behaviors.
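Both ways of setting the threshold parameter can be sketched with one Python function that ranks all sample points globally by their distance to their own cluster center. The function name `select_anomalies`, its `proportion` and `count` parameters, and the choice to round a fractional count up are assumptions made for illustration.

```python
import math
from typing import List, Optional, Sequence


def select_anomalies(distances: Sequence[float],
                     proportion: Optional[float] = None,
                     count: Optional[int] = None) -> List[int]:
    """Return the indices of the sample points flagged as abnormal.

    distances[i] is the distance from sample point i to the center of
    the cluster (peer group) that the point belongs to.
    """
    if (proportion is None) == (count is None):
        raise ValueError("set exactly one of proportion or count")
    # Sort globally, largest distance first, regardless of cluster size.
    order = sorted(range(len(distances)),
                   key=lambda i: distances[i], reverse=True)
    if proportion is not None:
        # Assumption: round up so that a small proportion still flags a point.
        count = math.ceil(proportion * len(distances))
    return order[:count]


distances = [0.1, 0.9, 0.3, 0.7]
by_count = select_anomalies(distances, count=1)
by_proportion = select_anomalies(distances, proportion=0.5)
```

With these toy distances, `by_count` holds the index of the single farthest point and `by_proportion` holds the indices of the farthest half, in descending order of distance.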
As an optional implementation, FIG. 7 is a schematic structural diagram of the clustering K value determining module provided in an embodiment of the present invention. As shown in FIG. 7, the clustering K value determining module 300 includes:
A preset K value determining submodule 3001, configured to set a maximum K value $K_{\max}$, the plurality of preset K values being the consecutive integers in $[2, K_{\max}]$;
A parameter value, namely the maximum K value $K_{\max}$, is set, and K is traversed from 2 to $K_{\max}$ to obtain the plurality of preset K values.
A CH score calculating submodule 3002, configured to calculate the CH score of all sample points after K-means clustering under each preset K value;
K-means clustering is performed under each preset K value, and the CH score of the resulting clustering is obtained for each preset K value.
The CH score is calculated as follows:
$$CH(K) = \frac{\operatorname{tr}(B_K)}{\operatorname{tr}(W_K)} \times \frac{N - K}{K - 1}$$
where N represents the total number of data samples, K represents the number of cluster classes, and $\operatorname{tr}(B_K)$ and $\operatorname{tr}(W_K)$ represent the traces of the matrices $B_K$ and $W_K$, respectively. $B_K$ is the between-class covariance matrix and represents the degree of dispersion between classes; $W_K$ is the within-class covariance matrix and represents the degree of compactness within a class. They are calculated as follows:
$$B_K = \sum_{k=1}^{K} n_k (c_k - c_e)(c_k - c_e)^{T}$$
$$W_K = \sum_{k=1}^{K} \sum_{x \in C_k} (x - c_k)(x - c_k)^{T}$$
where $C_k$ is the set of all data points in class k, $c_k$ is the center point of class k, $c_e$ is the center point of all sample data, and $n_k$ is the total number of data points in class k.
A selection submodule 3003, configured to select the preset K value with the largest CH score as the clustering K value.
The preset K value with the largest CH score is determined to be the optimal clustering K value, and this clustering K value is used in the subsequent K-means clustering.
As an optional implementation, FIG. 8 is a schematic structural diagram of the cluster generation module provided in an embodiment of the present invention. As shown in FIG. 8, the cluster generation module 400 includes:
An initial clustering center point determining module 4001, configured to randomly select, from all the sample points, as many sample points as the clustering K value to serve as the current clustering center points;
A cyclic iteration module 4002, configured to: calculate the distance from each sample point to each current clustering center point, take the current clustering center point closest to each sample point as the cluster center of that sample point, and take the sample points that share the same current clustering center point as one cluster; take the center of the sample points in each cluster as the latest clustering center point; and repeat the above steps until the distance between the current cluster center and the previous cluster center is smaller than a preset distance threshold, then take the current clusters as the final clusters and their cluster centers as the final cluster centers.
The calculation process of the K-means clustering algorithm is as follows:
1) K sample points are randomly selected from the sample points as the initial clustering centers, where K equals the clustering K value;
2) The distance from each sample point to each clustering center is calculated using the Euclidean distance function:
$$d(x, y_k) = \sqrt{\sum_{i=1}^{n} (x_i - y_{k,i})^2}$$
where n is the total number of dimensions (i.e., features), x denotes the coordinates of the sample point, $y_k$ denotes the coordinates of the k-th cluster center, i = 1, 2, 3, ..., n, and k = 1, 2, 3, ..., K;
3) Classification. The k value of the cluster center closest to each sample point is the category to which that point belongs, i.e., $k = \arg\min_k d(x, y_k)$. In this way, all sample points are divided into K categories;
4) Updating the K clustering centers. For the K classes (clusters) formed in the previous step, the centroid of all sample points in each cluster is taken as the new cluster center point.
5) Ending the iteration and obtaining the clustering result. Steps 2) to 4) are repeated until the distance between each updated clustering center and the clustering center generated in the previous iteration is smaller than a preset distance threshold, at which point the iteration stops. The final cluster center coordinate values (feature values) and class labels (k values) are obtained for subsequent use. The clusters generated at this point are the final clusters, and their cluster centers are the final cluster centers.
As an optional implementation, FIG. 9 is a schematic structural diagram of the modules for visualizing the detection result according to an embodiment of the present invention. As shown in FIG. 9, the apparatus further includes:
A first matrix generating module 600, configured to arrange the operation characteristic values as elements into a first matrix, where the elements in each row of the matrix are the operation characteristic values corresponding to the same operation behavior;
The feature data (m sample points, each corresponding to n features) are arranged into a matrix X of m rows and n columns (i.e., the first matrix).
A second matrix generation module 700, configured to calculate the mean value of the elements in each row of the first matrix, and subtract the mean value of the row from each element in that row to obtain a second matrix;
The mean value of each row of the first matrix X is subtracted from the elements of that row to obtain the second matrix.
A covariance matrix generation module 800, configured to calculate a covariance matrix of the second matrix, an eigenvalue of the covariance matrix, and a corresponding eigenvector;
The covariance matrix C is solved according to the following formula:
[Formula image BDA0003365905020000141: covariance matrix C of the second matrix]
There are generally two methods for solving the eigenvalues and eigenvectors of a covariance matrix: eigenvalue decomposition and singular value decomposition. This embodiment adopts eigenvalue decomposition, as follows:
$$C\nu = \lambda\nu$$
where ν is an eigenvector of the matrix C, and λ is the eigenvalue corresponding to ν and is a real number.
$$C = Q \Sigma Q^{-1}$$
where Q is the matrix formed by the eigenvectors of the matrix C, and Σ is a diagonal matrix whose diagonal elements are the eigenvalues.
A third matrix generating module 900, configured to arrange the corresponding eigenvectors into a matrix from top to bottom in rows according to a descending order of the eigenvalues, and form a third matrix by taking the first two rows of the matrix;
The eigenvectors are arranged as rows into a matrix from top to bottom in descending order of their corresponding eigenvalues, and the first two rows are taken to form a matrix H (i.e., the third matrix).
A two-dimensional matrix generating module 1000, configured to multiply the third matrix by the first matrix to obtain a two-dimensional matrix.
Y = HX is the data reduced to two dimensions, i.e., the two-dimensional matrix used for visualization.
Through dimensionality reduction, the multiple features of each point in the multi-dimensional space are reduced to two features, so that the points can be plotted in a two-dimensional space and the clustered peer groups can be displayed visually on a screen.
To further visualize the detection result, this embodiment displays the sample points corresponding to all abnormal behaviors and, at the same time, the current abnormal sample point. Normal points are displayed by sampling: for the sake of display performance, a certain number of points are randomly selected, peer group by peer group, from the normal sample points in all clusters. To make the points easy to tell apart, abnormal sample points, the current abnormal sample point, and normal sample points can be distinguished by color, shape, and size.
Further, this embodiment shows the important features of each outlier. Specifically, for the top-ranked important features that together account for more than 90% of the contribution, the feature values, contribution degrees, and corresponding cluster center values are shown. In this way, the abnormal behaviors (features) of the user entity can be clearly recognized.
Further, this embodiment assigns an anomaly score to each operation behavior. A simple scoring index is constructed so that the anomaly detection result (alarm) is easier to understand. The anomaly score is calculated as follows: for each abnormal point, the top-ranked features whose combined contribution degree is not lower than 0.9 are taken, and the average of the ratios of each feature value to the corresponding cluster center coordinate value is calculated. If the average is greater than 10, one point is added; if it is greater than 20, another point is added; if it is greater than 50, another point is added; and if it is greater than 100, another point is added. The base score is 5 points, so the maximum score is 9 points. The greater the final score, the greater the confidence that the user or entity is behaving abnormally.
As an optional implementation, FIG. 10 is a schematic structural diagram of the normalization module provided in an embodiment of the present invention. As shown in FIG. 10, the normalization module includes:
A maximum operation characteristic value calculation submodule 2001, configured to obtain the maximum operation characteristic value among the operation characteristics of the same kind as a target operation characteristic value in a preset period;
A minimum operation characteristic value calculation submodule 2002, configured to obtain the minimum operation characteristic value among the operation characteristics of the same kind as the target operation characteristic value in the preset period;
A first difference calculation submodule 2003, configured to calculate a first difference between the target operation characteristic value and the minimum operation characteristic value;
A second difference calculation submodule 2004, configured to calculate a second difference between the maximum operation characteristic value and the minimum operation characteristic value;
A ratio calculation submodule 2005, configured to calculate the ratio of the first difference to the second difference as the normalized operation characteristic value corresponding to the target operation characteristic value.
In order to put all the operation characteristic values on the same scale, this embodiment uses min-max scaling (MinMaxScale) for normalization. Min-max scaling processes the data as follows: each operation characteristic value is calculated according to $x' = (x - \min_A)/(\max_A - \min_A)$, where x is the operation characteristic value before normalization, $\min_A$ is the minimum operation characteristic value among the operation characteristic values of the same kind in the preset period, $\max_A$ is the maximum operation characteristic value among the operation characteristic values of the same kind in the preset period, and x' is the operation characteristic value after normalization.
The embodiment of the invention also provides an abnormal behavior detection system which comprises the abnormal behavior detection device.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the above abnormal behavior detection method.
The storage medium stores the software, and the storage medium includes but is not limited to: optical disks, floppy disks, hard disks, erasable memory, etc.
The above technical solution has the following beneficial effects: the invention uses human intelligence formed from expert experience and knowledge to fully mine multi-dimensional features, so features are selected very quickly; the features are processed by min-max scaling (MinMaxScale), a data processing technique that is very sensitive to abnormal data, which improves the running performance of the anomaly detection algorithm and ensures data quality; the core algorithm is a clustering peer group algorithm consisting of K value (number of cluster classes) optimization, clustering, and peer groups, which, compared with traditional machine learning algorithms, is more convenient to use and can perform real-time detection, and whose detection accuracy can be flexibly adjusted through the threshold parameter (threshold) set for the peer groups; and the visualization built around dimensionality reduction and peer groups greatly enhances the intuitiveness, clarity, and understandability of the anomaly detection method.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (12)

1. An abnormal behavior detection method, comprising:
S1, obtaining operation behaviors in a preset period and a plurality of operation characteristic values corresponding to the operation behaviors, the plurality of operation characteristic values being selected by human intelligence based on expert experience and knowledge;
S2, normalizing all the operation characteristic values, and taking the operation behavior corresponding to a plurality of operation characteristic values as a sample point;
S3, determining a clustering K value by calculating CH scores corresponding to a plurality of preset K values;
S4, performing K-means clustering on the sample points according to the clustering K value to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster;
S5, taking each final cluster as a peer group, sorting the sample points according to the distance between each sample point and the center of its corresponding final cluster, and determining the operation behaviors corresponding to the sample points within a preset range to be abnormal behaviors.
2. The abnormal behavior detection method according to claim 1, wherein the determining a clustering K value by calculating CH scores corresponding to a plurality of preset K values comprises:
S301, setting a maximum K value $K_{\max}$, the plurality of preset K values being the consecutive integers in $[2, K_{\max}]$;
S302, calculating the CH score of all sample points after K-means clustering under each preset K value;
S303, selecting the preset K value with the largest CH score as the clustering K value.
3. The abnormal behavior detection method according to claim 1, wherein performing K-means clustering on the sample points according to the clustering K value to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster comprises:
S401, randomly selecting, from all the sample points, as many sample points as the clustering K value to serve as the current clustering center points;
S402, calculating the distance from each sample point to each current clustering center point, taking the current clustering center point closest to each sample point as the cluster center of that sample point, and taking the sample points that share the same current clustering center point as one cluster;
S403, taking the center of the sample points in each cluster as the latest clustering center point;
S404, repeating steps S402 and S403 until the distance between the current cluster center and the previous cluster center is smaller than a preset distance threshold, then taking the current clusters as the final clusters and their cluster centers as the final cluster centers.
4. The abnormal behavior detection method according to claim 1, wherein after the operation behavior corresponding to a sample point greater than a preset threshold is determined to be an abnormal behavior, the method further comprises:
S6, arranging the operation characteristic values as elements into a first matrix, wherein the elements in each row of the matrix are the operation characteristic values corresponding to the same operation behavior;
S7, calculating the mean value of the elements in each row of the first matrix, and subtracting the mean value of the row from each element in that row to obtain a second matrix;
S8, calculating a covariance matrix of the second matrix, and the eigenvalues and corresponding eigenvectors of the covariance matrix;
S9, arranging the eigenvectors as rows into a matrix from top to bottom in descending order of their eigenvalues, and taking the first two rows of the matrix to form a third matrix;
S10, multiplying the third matrix by the first matrix to obtain a two-dimensional matrix.
5. The abnormal behavior detection method according to claim 1, wherein the normalizing all the operation feature values comprises:
S201, acquiring, within the preset period, the maximum operation characteristic value among the operation characteristics of the same kind as a target operation characteristic value;
S202, acquiring, within the preset period, the minimum operation characteristic value among the operation characteristics of the same kind as the target operation characteristic value;
S203, calculating a first difference between the target operation characteristic value and the minimum operation characteristic value;
S204, calculating a second difference between the maximum operation characteristic value and the minimum operation characteristic value;
and S205, calculating the ratio of the first difference to the second difference as the normalized operation characteristic value corresponding to the target operation characteristic value.
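Steps S201 to S205 describe min-max normalization, x' = (x - x_min) / (x_max - x_min), applied separately to each kind of operation characteristic. A minimal sketch follows, assuming one row per operation behavior and one column per kind of characteristic. The claim does not say what happens when the maximum equals the minimum; mapping such a constant column to 0 is an assumption made here to avoid division by zero. The name min_max_normalize is illustrative.

```python
import numpy as np

def min_max_normalize(values):
    """Scale each column of `values` into the range [0, 1]."""
    values = np.asarray(values, dtype=float)
    hi = values.max(axis=0)    # S201: maximum per kind of characteristic
    lo = values.min(axis=0)    # S202: minimum per kind of characteristic
    first_diff = values - lo   # S203: target value minus minimum
    second_diff = hi - lo      # S204: maximum minus minimum
    # S205: ratio of the two differences. A constant column has
    # second_diff == 0; dividing by 1 instead leaves it at 0 (assumption).
    safe = np.where(second_diff == 0, 1.0, second_diff)
    return first_diff / safe
```

For example, a column with values 1, 3 and 5 is mapped to 0, 0.5 and 1, so characteristics measured on different scales contribute comparably to the distances used in clustering.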
6. An abnormal behavior detection apparatus, comprising:
the data acquisition module is used for acquiring operation behaviors in a preset period and a plurality of operation characteristic values corresponding to the operation behaviors; the plurality of operation characteristic values are selected manually according to expert experience and knowledge;
the normalization module is used for normalizing all the operation characteristic values and taking the operation behavior corresponding to a plurality of operation characteristic values as a sample point;
the clustering K value determining module is used for determining a clustering K value by calculating CH scores corresponding to a plurality of preset K values;
the cluster generation module is used for carrying out K-means clustering on the sample points according to the clustering K values to obtain a plurality of final clusters and a final cluster center corresponding to each final cluster;
and the abnormal behavior judging module is used for taking each final cluster as a peer-to-peer group, sorting the sample points according to the distance between each sample point and the center of its corresponding final cluster, and judging the operation behaviors corresponding to the sample points within the preset range to be abnormal behaviors.
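The judging step can be illustrated as follows. This sketch assumes Euclidean distance and interprets "within the preset range" as the top_n sample points farthest from their own cluster center; the claim does not fix that interpretation, and a distance threshold would be an equally plausible reading. The names flag_abnormal and top_n are hypothetical.

```python
import numpy as np

def flag_abnormal(points, labels, centers, top_n):
    """Return indices of the top_n points farthest from their own cluster center."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    centers = np.asarray(centers, dtype=float)
    # Distance of each sample point to the center of the final cluster
    # (peer-to-peer group) it belongs to.
    dist = np.linalg.norm(points - centers[labels], axis=1)
    # Sort from farthest to nearest and keep the assumed preset range.
    order = np.argsort(dist)[::-1]
    return order[:top_n], dist
```

Because each point is compared only with the center of its own group, a behavior that is ordinary for one group of operators can still be flagged when it occurs in a group where it is unusual.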
7. The abnormal behavior detection device according to claim 6, wherein the clustering K-value determination module comprises:
a preset K value determining submodule, used for setting a maximum K value Kmax, wherein the plurality of preset K values are respectively the consecutive integers in [2, Kmax];
the CH score calculating submodule is used for calculating the CH scores of all the sample points subjected to K-means clustering under each preset K value;
and the selection submodule is used for selecting the preset K value with the largest CH score as the clustering K value.
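The K value selection performed by these three submodules can be sketched as follows. The CH (Calinski-Harabasz) score is computed here from its standard definition, the between-cluster dispersion divided by (k - 1) over the within-cluster dispersion divided by (n - k). The parameter cluster_fn is a hypothetical callable standing in for the K-means step: it takes the sample points and a K value and returns one cluster label per sample point. The names ch_score and choose_k are illustrative, and the sketch assumes Kmax is smaller than the number of sample points.

```python
import numpy as np

def ch_score(points, labels):
    """Calinski-Harabasz score of a labelling (higher is better)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    n = len(points)
    ids = np.unique(labels)
    k = len(ids)
    overall = points.mean(axis=0)
    between = 0.0
    within = 0.0
    for j in ids:
        members = points[labels == j]
        center = members.mean(axis=0)
        # Dispersion of cluster centers around the overall mean.
        between += len(members) * np.sum((center - overall) ** 2)
        # Dispersion of sample points around their own cluster center.
        within += np.sum((members - center) ** 2)
    if within == 0:
        return float("inf")  # perfectly tight clusters (assumed convention)
    return (between / (k - 1)) / (within / (n - k))

def choose_k(points, k_max, cluster_fn):
    """Score every K in [2, k_max] and return the K with the largest CH score."""
    points = np.asarray(points, dtype=float)
    scores = {k: ch_score(points, cluster_fn(points, k))
              for k in range(2, k_max + 1)}
    best_k = max(scores, key=scores.get)
    return best_k, scores
```

Returning the full score table alongside the chosen K makes it possible to check how clearly the best K stands out from its neighbours.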
8. The abnormal behavior detection apparatus according to claim 6, wherein the cluster generation module comprises:
the initial clustering center point determining module is used for randomly selecting sample points with the same number as the clustering K values from all the sample points as current clustering center points;
the cyclic iteration module is used for calculating the distance from each sample point to each current clustering center point, taking the current clustering center point closest to each sample point as the cluster center of that sample point, and taking the sample points that share the same current clustering center point as one cluster; taking the center of the sample points in each cluster as the latest clustering center point; and repeating these steps until the distance between the current cluster center and the previous cluster center is smaller than a preset distance threshold, then taking the current cluster as the final cluster and the cluster center of the current cluster as the final cluster center.
9. The abnormal behavior detection device according to claim 6, further comprising:
the first matrix generation module is used for arranging each operation characteristic value as an element into a first matrix, wherein the elements in each row of the matrix are the operation characteristic values corresponding to the same operation behavior;
the second matrix generation module is used for calculating the mean value of each row of elements in the first matrix and subtracting the mean value of each row from each element in each row to obtain a second matrix;
the covariance matrix generation module is used for calculating a covariance matrix of the second matrix, an eigenvalue of the covariance matrix and a corresponding eigenvector;
the third matrix generation module is used for arranging the eigenvectors as rows of a matrix from top to bottom in descending order of their eigenvalues, and taking the first two rows of the matrix to form a third matrix;
and the two-dimensional matrix generation module is used for multiplying the third matrix and the first matrix to obtain a two-dimensional matrix.
10. The abnormal behavior detection device according to claim 6, wherein the normalization module comprises:
the maximum operation characteristic value acquisition submodule is used for acquiring, within the preset period, the maximum operation characteristic value among the operation characteristics of the same kind as the target operation characteristic value;
the minimum operation characteristic value acquisition submodule is used for acquiring, within the preset period, the minimum operation characteristic value among the operation characteristics of the same kind as the target operation characteristic value;
a first difference calculation submodule, configured to calculate a first difference between the target operation characteristic value and the minimum operation characteristic value;
a second difference calculation submodule for calculating a second difference between the maximum operation characteristic value and the minimum operation characteristic value;
and the ratio calculation submodule is used for calculating the ratio of the first difference value to the second difference value to serve as the normalized operation characteristic value corresponding to the target operation characteristic value.
11. An abnormal behavior detection system comprising the abnormal behavior detection apparatus according to any one of claims 6 to 8.
12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the abnormal behavior detection method according to any one of claims 1 to 5.
CN202111381886.5A 2021-11-22 2021-11-22 Abnormal behavior detection method, device and system and computer readable storage medium Pending CN114238952A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111381886.5A CN114238952A (en) 2021-11-22 2021-11-22 Abnormal behavior detection method, device and system and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111381886.5A CN114238952A (en) 2021-11-22 2021-11-22 Abnormal behavior detection method, device and system and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114238952A true CN114238952A (en) 2022-03-25

Family

ID=80750215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111381886.5A Pending CN114238952A (en) 2021-11-22 2021-11-22 Abnormal behavior detection method, device and system and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114238952A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116755532A (en) * 2023-08-14 2023-09-15 聊城市洛溪信息科技有限公司 Intelligent regulation and control system for ventilation device of computing server
CN116755532B (en) * 2023-08-14 2023-10-31 聊城市洛溪信息科技有限公司 Intelligent regulation and control system for ventilation device of computing server

Similar Documents

Publication Publication Date Title
CN110880019B (en) Method for adaptively training target domain classification model through unsupervised domain
CN111339297B (en) Network asset anomaly detection method, system, medium and equipment
WO2019105163A1 (en) Target person search method and apparatus, device, program product and medium
CN109726735A (en) A kind of mobile applications recognition methods based on K-means cluster and random forests algorithm
US20220207397A1 (en) Artificial Intelligence (AI) Model Evaluation Method and System, and Device
CN111639497A (en) Abnormal behavior discovery method based on big data machine learning
CN111507385B (en) Extensible network attack behavior classification method
CN109558298B (en) Alarm execution frequency optimization method based on deep learning model and related equipment
CN111598179A (en) Power monitoring system user abnormal behavior analysis method, storage medium and equipment
CN109886146B (en) Flood information remote sensing intelligent acquisition method and device based on machine vision detection
CN106610977B (en) Data clustering method and device
CN108256449A (en) A kind of Human bodys' response method based on subspace grader
CN112598054A (en) Power transmission and transformation project quality general-purpose prevention and control detection method based on deep learning
CN114580572B (en) Abnormal value identification method and device, electronic equipment and storage medium
CN104835174B (en) Robust Model approximating method based on Hypergraph model search
CN114238952A (en) Abnormal behavior detection method, device and system and computer readable storage medium
CN115952067A (en) Database operation abnormal behavior detection method and readable storage medium
CN115577357A (en) Android malicious software detection method based on stacking integration technology
Zhang et al. Data anomaly detection based on isolation forest algorithm
CN116365519B (en) Power load prediction method, system, storage medium and equipment
CN108510483A (en) A kind of calculating using VLAD codings and SVM generates color image tamper detection method
CN112422546A (en) Network anomaly detection method based on variable neighborhood algorithm and fuzzy clustering
CN114708264B (en) Light spot quality judging method, device, equipment and storage medium
CN116702132A (en) Network intrusion detection method and system
CN112966732B (en) Multi-factor interactive behavior anomaly detection method with periodic attribute

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination