CN114266914A - Abnormal behavior detection method and device - Google Patents

Publication number
CN114266914A
Authority
CN
China
Prior art keywords
detected
samples
behavior
clustering
abnormal behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111657903.3A
Other languages
Chinese (zh)
Inventor
崔景洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Original Assignee
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Topsec Technology Co Ltd, Beijing Topsec Network Security Technology Co Ltd, Beijing Topsec Software Co Ltd filed Critical Beijing Topsec Technology Co Ltd
Priority to CN202111657903.3A priority Critical patent/CN114266914A/en
Publication of CN114266914A publication Critical patent/CN114266914A/en

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an abnormal behavior detection method and device for the field of network security. In the method, a Dirichlet process mixed model is used to perform fuzzy clustering on a plurality of groups of samples to be detected. Because the number of employee behavior categories is determined automatically while the Dirichlet process mixed model is constructed, the prior-art difficulty of determining that number is avoided. In addition, fuzzy clustering allows the behavior of one employee to be assigned to a plurality of categories and evaluated from multiple angles, which addresses the prior-art difficulty of determining which category an employee's behavior belongs to. The accuracy of detecting abnormal employee behavior can therefore be improved.

Description

Abnormal behavior detection method and device
Technical Field
The present application relates to the field of network security, and in particular, to a method and an apparatus for detecting abnormal behavior.
Background
With the development of internet technology, managers pay increasing attention to network security within enterprises, but the complex and changeable forms that abnormal events take make detection very difficult. At present, enterprises handle external network security problems mainly through prevention, and internal network security problems mainly through analysis and evaluation. However, many external security problems are rooted in internal violations, so more and more enterprises find that they need to analyze the behavior of internal employees to find anomalies.
In the prior art, detection generally targets network intrusion. For employee behavior detection, because employee behavior is complex, both the number of employee behavior categories and the category to which a given behavior belongs are difficult to determine, so the accuracy of detecting abnormal employee behavior is low.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method and an apparatus for detecting abnormal behavior, so as to solve the technical problem that the accuracy of detecting abnormal behavior of an employee is low.
In a first aspect, an embodiment of the present application provides an abnormal behavior detection method, including: acquiring clustering results corresponding to a plurality of groups of samples to be detected, where each group of samples to be detected includes behavior data of an object to be detected within a preset time period, the clustering results are obtained by performing fuzzy clustering on the plurality of groups of samples to be detected using a Dirichlet process mixed model, and the clustering results include the probabilities of a plurality of categories corresponding to each group of samples to be detected; and, for a group of samples to be detected, performing abnormal behavior detection on a plurality of behaviors in the clustering results, where the presence of an abnormal behavior among the plurality of behaviors indicates that the object to be detected behaves abnormally. In this scheme, fuzzy clustering can be performed on the plurality of groups of samples to be detected using the Dirichlet process mixed model. Because the number of employee behavior categories is determined automatically while the Dirichlet process mixed model is constructed, the prior-art difficulty of determining that number is avoided. In addition, fuzzy clustering allows the behavior of one employee to be assigned to a plurality of categories and evaluated from multiple angles, which addresses the prior-art difficulty of determining which category an employee's behavior belongs to. The accuracy of detecting abnormal employee behavior can therefore be improved.
In an optional implementation manner, before the obtaining of the clustering results corresponding to the multiple groups of samples to be detected, the method further includes: acquiring a plurality of groups of samples to be detected; coding each group of samples to be detected according to a behavior library so as to convert a plurality of groups of samples to be detected into a plurality of behavior vectors; and carrying out fuzzy clustering on each behavior vector according to the Dirichlet process mixed model to obtain clustering results corresponding to a plurality of groups of samples to be detected. In the scheme, the samples to be detected can be converted into behavior vectors based on the behavior library, so that the behavior data of the staff can be converted into vector data which can be calculated; then, a Dirichlet process mixed model is utilized to perform fuzzy clustering on a plurality of groups of samples to be detected, so that the problems that the number of the employee behavior categories is difficult to determine and the attribution of the employee behavior categories is difficult to determine in the prior art are solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
In an optional embodiment, performing fuzzy clustering on each behavior vector according to the Dirichlet process mixed model to obtain the clustering results corresponding to the plurality of groups of samples to be detected includes: randomly initializing a membership matrix, where the membership matrix includes initial probabilities of a plurality of categories corresponding to each group of samples to be detected; and updating the membership matrix according to the Dirichlet process mixed model and the behavior vectors to obtain the clustering results. In this scheme, fuzzy clustering can be performed on the plurality of groups of samples to be detected using the Dirichlet process mixed model. Because the number of employee behavior categories is determined automatically while the Dirichlet process mixed model is constructed, the prior-art difficulty of determining that number is avoided. In addition, fuzzy clustering allows the behavior of one employee to be assigned to a plurality of categories and evaluated from multiple angles, which addresses the prior-art difficulty of determining which category an employee's behavior belongs to. The accuracy of detecting abnormal employee behavior can therefore be improved.
In an optional embodiment, before performing abnormal behavior detection on multiple behaviors in the clustering result for a group of samples to be detected, the method further includes: acquiring a fuzzy granularity adjusting parameter; and extracting the probability of partial categories from the clustering result according to the fuzzy granularity adjusting parameters to obtain a new clustering result. In the above scheme, different fuzzy granularity adjusting parameters can be obtained according to different actual requirements, so that different fuzzy category partitions can be obtained according to different fuzzy granularity parameters, and different clustering results can be further obtained.
In an optional embodiment, the performing, for a group of samples to be detected, abnormal behavior detection on multiple behaviors in the clustering result includes: determining a plurality of behavior categories to which the samples to be detected belong according to the probability of the category corresponding to the samples to be detected in the new clustering result; and carrying out abnormal behavior detection on the sample to be detected according to the probability related to the plurality of behavior categories in the new clustering result. In the scheme, the behavior of one employee can be divided into a plurality of categories through fuzzy clustering, and the behavior of the employee is evaluated from multiple angles, so that the problem that the attribute of the behavior category of the employee is difficult to determine in the prior art is solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
In an optional implementation manner, before the obtaining of the clustering results corresponding to the multiple groups of samples to be detected, the method further includes: acquiring an operation log; and processing the operation log to obtain a behavior library comprising a plurality of behaviors. In the above scheme, a behavior library including a plurality of behaviors may be obtained by acquiring and processing an operation log. The operation log can be processed in various ways, various behaviors can be sorted, and the dimensionality of the behaviors can be increased, so that the detection accuracy can be improved when the behavior library is used for detecting abnormal behaviors.
In a second aspect, an embodiment of the present application provides an abnormal behavior detection apparatus, including: a first acquisition module, configured to acquire clustering results corresponding to a plurality of groups of samples to be detected, where each group of samples to be detected includes behavior data of an object to be detected within a preset time period, the clustering results are obtained by performing fuzzy clustering on the plurality of groups of samples to be detected using a Dirichlet process mixed model, and the clustering results include the probabilities of a plurality of categories corresponding to each group of samples to be detected; and a detection module, configured to perform, for a group of samples to be detected, abnormal behavior detection on a plurality of behaviors of the samples to be detected in the clustering results, where the presence of an abnormal behavior among the plurality of behaviors indicates that the object to be detected behaves abnormally. In this scheme, fuzzy clustering can be performed on the plurality of groups of samples to be detected using the Dirichlet process mixed model. Because the number of employee behavior categories is determined automatically while the Dirichlet process mixed model is constructed, the prior-art difficulty of determining that number is avoided. In addition, fuzzy clustering allows the behavior of one employee to be assigned to a plurality of categories and evaluated from multiple angles, which addresses the prior-art difficulty of determining which category an employee's behavior belongs to. The accuracy of detecting abnormal employee behavior can therefore be improved.
In an optional embodiment, the abnormal behavior detection apparatus further includes: the second acquisition module is used for acquiring a plurality of groups of samples to be detected; the coding module is used for coding each group of samples to be detected according to the behavior library so as to convert a plurality of groups of samples to be detected into a plurality of behavior vectors; and the clustering module is used for carrying out fuzzy clustering on each behavior vector according to the Dirichlet process mixed model to obtain clustering results corresponding to a plurality of groups of samples to be detected. In the scheme, the samples to be detected can be converted into behavior vectors based on the behavior library, so that the behavior data of the staff can be converted into vector data which can be calculated; then, a Dirichlet process mixed model is utilized to perform fuzzy clustering on a plurality of groups of samples to be detected, so that the problems that the number of the employee behavior categories is difficult to determine and the attribution of the employee behavior categories is difficult to determine in the prior art are solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
In an optional embodiment, the clustering module is specifically configured to: randomly initialize a membership matrix, where the membership matrix includes initial probabilities of a plurality of categories corresponding to each group of samples to be detected; and update the membership matrix according to the Dirichlet process mixed model and the behavior vectors to obtain the clustering results. In this scheme, fuzzy clustering can be performed on the plurality of groups of samples to be detected using the Dirichlet process mixed model. Because the number of employee behavior categories is determined automatically while the Dirichlet process mixed model is constructed, the prior-art difficulty of determining that number is avoided. In addition, fuzzy clustering allows the behavior of one employee to be assigned to a plurality of categories and evaluated from multiple angles, which addresses the prior-art difficulty of determining which category an employee's behavior belongs to. The accuracy of detecting abnormal employee behavior can therefore be improved.
In an optional embodiment, the abnormal behavior detection apparatus further includes: the third acquisition module is used for acquiring fuzzy granularity adjusting parameters; and the extracting module is used for extracting the probability of partial categories from the clustering result according to the fuzzy granularity adjusting parameters to obtain a new clustering result. In the above scheme, different fuzzy granularity adjusting parameters can be obtained according to different actual requirements, so that different fuzzy category partitions can be obtained according to different fuzzy granularity parameters, and different clustering results can be further obtained.
In an optional embodiment, the detection module is specifically configured to: determining a plurality of behavior categories to which the samples to be detected belong according to the probability of the category corresponding to the samples to be detected in the new clustering result; and carrying out abnormal behavior detection on the sample to be detected according to the probability related to the plurality of behavior categories in the new clustering result. In the scheme, the behavior of one employee can be divided into a plurality of categories through fuzzy clustering, and the behavior of the employee is evaluated from multiple angles, so that the problem that the attribute of the behavior category of the employee is difficult to determine in the prior art is solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
In an optional embodiment, the abnormal behavior detection apparatus further includes: the fourth acquisition module is used for acquiring the operation log; and the processing module is used for processing the operation log to obtain a behavior library comprising a plurality of behaviors. In the above scheme, a behavior library including a plurality of behaviors may be obtained by acquiring and processing an operation log. The operation log can be processed in various ways, various behaviors can be sorted, and the dimensionality of the behaviors can be increased, so that the detection accuracy can be improved when the behavior library is used for detecting abnormal behaviors.
In a third aspect, an embodiment of the present application provides a computer program product, which includes computer program instructions, and when the computer program instructions are read and executed by a processor, the abnormal behavior detection method according to any one of the first aspect is performed.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory, and a bus; the processor and the memory are communicated with each other through the bus; the memory stores computer program instructions executable by the processor, the processor invoking the computer program instructions capable of performing the abnormal behavior detection method of any of the first aspects.
In a fifth aspect, the present application provides a computer-readable storage medium, which stores computer program instructions, and when the computer program instructions are executed by a computer, the computer executes the abnormal behavior detection method according to any one of the first aspect.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a flowchart of an abnormal behavior detection method according to an embodiment of the present application;
Fig. 2 is a block diagram of an abnormal behavior detection apparatus according to an embodiment of the present application;
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
Referring to fig. 1, fig. 1 is a flowchart of an abnormal behavior detection method according to an embodiment of the present disclosure, where the abnormal behavior detection method may include the following steps:
Step S101: acquiring clustering results corresponding to a plurality of groups of samples to be detected.
Step S102: for a group of samples to be detected, performing abnormal behavior detection on a plurality of behaviors in the clustering results.
Specifically, in the embodiment of the present application, a plurality of groups of samples to be detected may be obtained in advance, and then a dirichlet process mixed model is used to perform fuzzy clustering on the plurality of groups of samples to be detected, so as to obtain corresponding clustering results. When abnormal detection needs to be performed on the behaviors in the samples to be detected, the clustering results can be obtained, and then abnormal behavior detection is performed on the clustering results for each group of samples to be detected.
Each group of samples to be detected comprises behavior data of an object to be detected in a preset time period; in other words, the data in a set of samples to be detected characterizes at least one behavior of a certain object to be detected operating within a certain time period. It is understood that, in the embodiment of the present application, the specific size of the preset time period is not specifically limited, for example: five minutes, one hour, etc.
For example, the plurality of groups of samples to be detected may cover all behaviors of each employee of an enterprise over one hour, divided into five-minute windows; that is, for one employee of the enterprise, a group of samples to be detected includes all of that employee's behaviors within five minutes, and the employee corresponds to twelve groups of samples to be detected.
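The five-minute grouping in this example can be sketched as follows. This is only an illustration under assumptions: the `(employee, timestamp, behavior)` event layout, the function name `group_into_windows` and the sample data are hypothetical and do not come from the application.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def group_into_windows(events, window=timedelta(minutes=5)):
    """Group (employee, timestamp, behavior) events into per-employee windows.

    Returns a dict mapping (employee, window_start) to the list of behaviors
    seen in that window; each entry plays the role of one group of samples
    to be detected. The event layout is an assumption made for this sketch.
    """
    epoch = datetime(1970, 1, 1)
    groups = defaultdict(list)
    for employee, ts, behavior in events:
        # Floor the timestamp to the start of the window that contains it.
        whole_windows = (ts - epoch) // window
        window_start = epoch + whole_windows * window
        groups[(employee, window_start)].append(behavior)
    return dict(groups)


# Hypothetical events for two employees.
events = [
    ("alice", datetime(2021, 12, 30, 9, 1), "login"),
    ("alice", datetime(2021, 12, 30, 9, 3), "read_file"),
    ("alice", datetime(2021, 12, 30, 9, 7), "send_mail"),
    ("bob", datetime(2021, 12, 30, 9, 2), "login"),
]
samples = group_into_windows(events)
```

With a one-hour log and the default five-minute window, each employee yields at most twelve groups, matching the example in the text.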
According to the above description, it can be known that a group of samples to be detected includes data corresponding to at least one behavior, so that in order to achieve that all the behaviors in the samples to be detected can be subjected to abnormal detection, the samples to be detected can be subjected to fuzzy clustering, and a clustering result is obtained. The clustering result includes probabilities of multiple categories corresponding to each group of samples to be detected, that is, the clustering result includes attribution probabilities of the categories corresponding to the samples to be detected.
It should be noted that, the specific implementation of obtaining multiple groups of samples to be detected and obtaining a clustering result through fuzzy clustering in the foregoing embodiment will be described in detail in the following embodiments, which will not be described here.
Next, for each group of samples to be detected in the multiple groups of samples to be detected, the abnormal behavior detection method provided in the embodiment of the present application may be used to perform abnormal detection. It can be understood that, according to the above description, since a group of samples to be detected includes data corresponding to at least one behavior, when performing abnormal behavior detection, all behaviors in the group of samples to be detected need to be detected.
As an implementation manner, when it is found that one or more abnormal behaviors exist in the behaviors of the group of samples to be detected after abnormal behavior detection, it may be considered that the abnormal behavior exists in the period of time for the employee corresponding to the group of samples to be detected. As another embodiment, in the process of detecting the abnormal behavior of the behavior in the group of samples to be detected, the number of the abnormal behavior may be counted, and the behavior of the employee within the period of time may be evaluated according to the counting result.
It is to be understood that, in the embodiment of the present application, a specific implementation manner of evaluating the behavior of the employee according to the counting result is not particularly limited, and a person skilled in the art may appropriately select the evaluation according to the actual situation. For example, when there is no abnormal behavior in the group of samples to be detected, the employee may be considered to have no abnormal behavior; when the abnormal behaviors in the group of samples to be detected are less than half of all the behaviors, the employee can be considered to have slight abnormal behaviors; when the abnormal behaviors in the group of samples to be detected are more than half of all the behaviors, the employee can be considered to have serious abnormal behaviors.
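The example grading rule above can be written as a small function. This is a sketch of that one example only, not a prescribed implementation; the function name and the labels are hypothetical, and the text does not say how to grade a group in which exactly half of the behaviors are abnormal, so that case is handled here by an explicit assumption.

```python
def evaluate_group(abnormal_flags):
    """Grade one group of samples from per-behavior abnormality flags.

    abnormal_flags holds one boolean per behavior in the group, True when
    that behavior was detected as abnormal.
    """
    total = len(abnormal_flags)
    abnormal = sum(1 for flag in abnormal_flags if flag)
    if abnormal == 0:
        return "none"
    if 2 * abnormal < total:
        return "slight"
    # More than half is "serious" per the example in the text. Exactly half
    # is not covered by the text and is also graded "serious" here
    # (assumption made for this sketch).
    return "serious"
```

For instance, one abnormal behavior out of three is graded "slight", while two out of three is graded "serious".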
It should be noted that, the specific implementation of detecting abnormal behaviors in the clustering result in the foregoing embodiment will also be described in detail in the following embodiment, and will not be described here for the time being.
In the scheme, fuzzy clustering can be performed on a plurality of groups of samples to be detected by using the Dirichlet process mixed model, and the problem that the number of the employee behavior categories is difficult to determine in the prior art is solved because the number of the employee behavior categories can be automatically determined in the process of constructing the Dirichlet process mixed model. In addition, through fuzzy clustering, the behavior of one employee can be divided into a plurality of categories, and the behavior of the employee is evaluated from multiple angles, so that the problem that the attribute of the behavior category of the employee is difficult to determine in the prior art is solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
Further, before the step S101, the abnormal behavior detection method provided in the embodiment of the present application may further include the following steps:
Step 1): acquiring a plurality of groups of samples to be detected.
Step 2): encoding each group of samples to be detected according to the behavior library, so as to convert the plurality of groups of samples to be detected into a plurality of behavior vectors.
Step 3): performing fuzzy clustering on each behavior vector according to the Dirichlet process mixed model to obtain the clustering results corresponding to the plurality of groups of samples to be detected.
Specifically, before performing anomaly detection on employee behaviors, a corresponding sample to be detected needs to be obtained first. The specific meaning of the sample to be detected has been described in detail in the above embodiments, and is not described herein again.
It is to be understood that, the embodiment of the present application is not particularly limited to the specific implementation of obtaining the sample to be detected, and those skilled in the art may appropriately select the implementation manner according to the actual situation. For example: the electronic equipment can receive a sample to be detected sent by external equipment; or the electronic device can read the operation log stored in the cloud as a sample to be detected, and the like.
Next, the electronic device may encode the sample to be detected according to a predetermined behavior library, so as to convert the sample to be detected into a behavior vector. It is understood that each set of samples to be detected corresponds to a behavior vector during the conversion process.
As one implementation, the samples to be detected can be encoded using one-hot encoding. In this encoding, the behavior vector may be a string of 0s and 1s whose length equals the number of behaviors in the behavior library. For example, if the behavior library contains 100 behaviors, the behavior vector contains 100 characters, i.e. it is a 100-dimensional vector.
Specifically, the behaviors in the predetermined behavior library are first numbered sequentially: the first behavior in the library is number one, the second is number two, and so on. Then, for each group of samples to be detected, the first bit of the behavior vector is encoded as 1 if the group includes the first behavior and as 0 otherwise; the second bit is encoded as 1 if the group includes the second behavior and as 0 otherwise; and so on, until every behavior in the behavior library has been encoded.
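The encoding rule just described can be sketched in a few lines. The function name, the four-entry behavior library and the sample below are hypothetical; silently ignoring behaviors that are absent from the library is also an assumption of this sketch, since the text does not cover that case.

```python
def encode_behaviors(sample_behaviors, behavior_library):
    """Encode one group of samples as a 0/1 behavior vector.

    Position j of the result is 1 if the j-th behavior of the library occurs
    in the group and 0 otherwise, so the vector length equals the number of
    behaviors in the library. Behaviors not in the library are ignored
    (assumption made for this sketch).
    """
    observed = set(sample_behaviors)
    return [1 if behavior in observed else 0 for behavior in behavior_library]


# Hypothetical behavior library, numbered in list order.
behavior_library = ["login", "read_file", "send_mail", "usb_copy"]
vector = encode_behaviors(["send_mail", "login", "login"], behavior_library)
# vector is [1, 0, 1, 0]
```

Repeated behaviors within one group do not change the result, because each bit only records whether the behavior occurred.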
After the conversion of the behavior vectors is completed, fuzzy clustering can be performed on each behavior vector obtained in the embodiment according to a predetermined Dirichlet process mixed model, so that clustering results corresponding to a plurality of groups of samples to be detected are obtained. Then, detection of abnormal behavior may be performed based on the above-described clustering result.
In the scheme, the samples to be detected can be converted into behavior vectors based on the behavior library, so that the behavior data of the staff can be converted into vector data which can be calculated; then, a Dirichlet process mixed model is utilized to perform fuzzy clustering on a plurality of groups of samples to be detected, so that the problems that the number of the employee behavior categories is difficult to determine and the attribution of the employee behavior categories is difficult to determine in the prior art are solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
Further, the step of performing fuzzy clustering on each behavior vector according to the Dirichlet process mixed model to obtain the clustering results corresponding to the plurality of groups of samples to be detected may include the following steps:
Step 1): randomly initializing a membership matrix.
Step 2): updating the membership matrix according to the Dirichlet process mixed model and the behavior vectors to obtain the clustering result.
Specifically, a membership matrix Q may first be initialized. The membership matrix Q is a matrix with N rows and K columns that holds the probabilities of the plurality of categories corresponding to each group of samples to be detected. When Q is initialized, it may be filled as an all-zero matrix, that is, the initial probability of every category for every group of samples to be detected is 0. Because the subsequent algorithm updates the matrix element by element, the all-zero matrix serves as a placeholder that makes it easy to locate the element being updated.
Here, N is the total number of samples to be detected and K is the total number of behavior categories in the samples to be detected. K is obtained from the repeated Dirichlet process, and subsequent iterations of the Dirichlet process change the total number of categories (for example, on the first pass K may be set to 1, so that there is only one category; further categories may appear during subsequent sampling).
Next, the initialized membership matrix Q may be updated according to the predetermined Dirichlet process mixed model and the behavior vectors obtained in the above embodiment, so as to obtain the clustering result. Since this process continuously updates the membership matrix Q, the clustering result finally obtained in the embodiment of the present application is the updated membership matrix Q.
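A minimal sketch of the bookkeeping around the membership matrix Q follows: an all-zero N-by-K placeholder, a way to add a column when sampling introduces a new category, and a row update. The helper names are hypothetical, and normalizing a row of non-negative weights so that it sums to 1 is an assumption of this sketch; the actual update rule comes from the Dirichlet process mixed model and is not reproduced here.

```python
def init_membership(n_samples, n_categories=1):
    """Return an all-zero N x K membership matrix as a list of rows."""
    return [[0.0] * n_categories for _ in range(n_samples)]


def add_category(Q):
    """Append one all-zero column, for when a new category appears."""
    for row in Q:
        row.append(0.0)


def set_row(Q, i, weights):
    """Store normalized category weights for sample i (assumed update form)."""
    total = sum(weights)
    Q[i] = [w / total for w in weights]


Q = init_membership(3)   # N = 3 samples, K = 1 category on the first pass
add_category(Q)          # K grows to 2 during later sampling
set_row(Q, 0, [1, 3])    # sample 0: probability 0.25 and 0.75
```

Keeping Q as a list of rows makes growing K cheap, which matters because the number of categories is not fixed in advance.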
A specific embodiment of determining a hybrid model of the dirichlet process is described below: constructed by the chinese restaurant process.
The first step, supposing that an unlimited number of tables are placed in a Chinese restaurant;
secondly, each person eating the table selects one table, and the first person eating the table selects the first table;
third, each subsequent customer selects a table according to the following rule: the nth customer sits at an already occupied table k with probability
$$\frac{n_k}{\alpha_0 + n - 1}$$
where $n_k$ is the number of customers already at the kth table, $n - 1$ is the total number of customers already seated before this customer, and $\alpha_0$ controls the degree of aggregation of the Chinese restaurant process: the larger $\alpha_0$ is, the more scattered the customers sit;
fourth, the nth customer sits at a new table with probability
$$\frac{\alpha_0}{\alpha_0 + n - 1}$$
where $n - 1$ is the total number of customers already seated before this customer, and $\alpha_0$ controls the degree of aggregation of the Chinese restaurant process: the larger $\alpha_0$ is, the more scattered the customers sit.
Fifth, according to the Gibbs sampling method, given a set of behavior vectors to be detected $X = \{x_1, x_2, \ldots, x_N\}$ and a set of category assignments $Z = \{z_1, z_2, \ldots, z_K\}$, and letting $z_{-i}$ denote the set of categories of all behavior vectors other than the ith behavior vector to be detected, the following relationship holds:
$$p(z_i = k \mid z_{-i}, x_{1:n}, \alpha_0) \propto p(z_i \mid z_{-i}, \alpha_0)\, p(x_i \mid z_i = k, z_{-i}, x_{-i});$$
in connection with the Chinese restaurant process, the first term on the right-hand side of the above equation is as follows:
Figure BDA0003448866720000131
With collapsed Gibbs sampling, samples can be drawn from the marginalized posterior:
$$p(Z \mid X) = \int_{\theta} p(\theta, Z \mid X)\, d\theta;$$
the calculation method is derived from the above formula, and the final result is as follows:
if $z_i$ is an existing class, then:
Figure BDA0003448866720000132
if $z_i$ is a new class, then:
Figure BDA0003448866720000133
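The table-selection rule of the Chinese restaurant process described above can be illustrated with a minimal Python sketch. It assumes the standard form of the rule (an occupied table k is chosen with probability n_k / (α0 + n − 1), a new table with probability α0 / (α0 + n − 1)); the function names `crp_table_probabilities` and `simulate_crp` and the `seed` parameter are hypothetical and are not part of the application.

```python
import random


def crp_table_probabilities(table_counts, alpha0):
    """Seating probabilities for the next customer.

    table_counts[k] is n_k, the number of customers already at table k.
    The last entry of the returned list is the probability of a new table.
    """
    seated = sum(table_counts)            # n - 1 customers already seated
    denom = alpha0 + seated
    probs = [n_k / denom for n_k in table_counts]
    probs.append(alpha0 / denom)          # probability of opening a new table
    return probs


def simulate_crp(num_customers, alpha0, seed=0):
    """Seat customers one by one and return (assignments, table_counts)."""
    rng = random.Random(seed)
    table_counts = []
    assignments = []
    for _ in range(num_customers):
        if not table_counts:
            choice = 0                    # the first customer takes the first table
        else:
            probs = crp_table_probabilities(table_counts, alpha0)
            choice = rng.choices(range(len(probs)), weights=probs)[0]
        if choice == len(table_counts):   # a new table was selected
            table_counts.append(0)
        table_counts[choice] += 1
        assignments.append(choice)
    return assignments, table_counts
```

Running `simulate_crp` with a larger `alpha0` tends to open more tables, which matches the statement that a larger α0 makes the customers sit more scattered.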
The following describes a specific embodiment of updating the membership matrix according to the dirichlet process mixture model and the behavior vector.
After the membership matrix Q is initialized, an iteration counter may first be set (t = 1; updating stops when t > T, where T is the number of cycles input by the user to control the iteration process).
Then, each behavior vector (the ith of the N behavior vectors) is updated in turn: for each category (the kth of the K categories) corresponding to the behavior vector, the corresponding probability in the membership matrix Q may be updated based on the Dirichlet process mixed model probability $p(z_i = k)$. In this process, K class probabilities are computed for each behavior vector.
For example, assume that the membership matrix Q is a five-row three-column matrix: firstly, aiming at a first behavior vector, updating three probabilities of a first row in a membership degree matrix Q according to the first behavior vector; then, aiming at a second behavior vector, updating three probabilities of a second row in the membership degree matrix Q according to the second behavior vector; then, aiming at the third behavior vector, updating the three probabilities of the third row in the membership degree matrix Q according to the third behavior vector; then, aiming at the fourth behavior vector, updating the three probabilities in the fourth row of the membership degree matrix Q according to the fourth behavior vector; and then, aiming at the fifth behavior vector, updating the three probabilities of the fifth row in the membership degree matrix Q according to the fifth behavior vector.
Then, Gibbs sampling may be performed on sample i. If the sampling result is a new class, the new class is created and the membership matrix Q is expanded by one column, and the value of K changes accordingly. At this point, the membership matrix Q has completed one complete update, which is recorded as
Figure BDA0003448866720000141
Then t is incremented and the above process is repeated to perform the next complete update of the membership matrix Q, which after this second complete update is recorded as
Figure BDA0003448866720000142
The above process is repeated again, and after the third update the membership matrix is recorded as
Figure BDA0003448866720000143
Repeating the above steps until the T-th update is completed, wherein the membership matrix Q is
Figure BDA0003448866720000144
The membership matrix
Figure BDA0003448866720000145
Namely the clustering result.
That is, the above-described process can be expressed as the following steps:
Step one: randomly initialize the membership matrix Q;
Step two: set t = 1;
Step three: repeat the following steps:
Step 1: for the ith behavior vector of the N behavior vectors and the kth category of the K categories, calculate $p(z_i = k)$ and update the membership matrix
Figure BDA0003448866720000151
Step 2: end the loop over K;
Step 3: sample $z_i$; if a new class is selected, set K = K + 1, calculate $p(z_i = K + 1)$, and extend the membership matrix to
Figure BDA0003448866720000152
Step 4: end the loop over N;
Step 5: set t = t + 1;
Step four: stop when t > T or when the sampling result has converged, and output the membership matrix
Figure BDA0003448866720000153
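The update loop above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the likelihood term of the application is given in equation images that are not reproduced here, so `gaussian_predictive` is a hypothetical stand-in; the names `gibbs_membership`, `sigma`, `prior_sigma` and `seed` are likewise assumptions. For simplicity the sketch never removes a class that has become empty, and when no new class is selected it renormalizes the row over the existing K classes.

```python
import math
import random


def gaussian_predictive(x, members, sigma=1.0, prior_sigma=3.0):
    # Hypothetical stand-in for p(x_i | z_i = k, ...): an isotropic Gaussian
    # centred on the class mean, or a wider one at the origin for a new class.
    dim = len(x)
    if members:
        centre = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        s = sigma
    else:
        centre, s = [0.0] * dim, prior_sigma
    sq = sum((x[d] - centre[d]) ** 2 for d in range(dim))
    return math.exp(-sq / (2 * s * s)) / (2 * math.pi * s * s) ** (dim / 2)


def gibbs_membership(X, alpha0, T, predictive=gaussian_predictive, seed=0):
    rng = random.Random(seed)
    N, K = len(X), 1                       # first pass starts with one class
    z = [0] * N                            # current class of each behavior vector
    Q = [[0.0] * K for _ in range(N)]      # all-zero placeholder matrix
    for _t in range(1, T + 1):             # stop once t > T
        for i in range(N):
            weights = []
            for k in range(K):             # loop over the K existing classes
                members = [X[j] for j in range(N) if j != i and z[j] == k]
                prior = len(members) / (alpha0 + N - 1)
                weights.append(prior * predictive(X[i], members) if members else 0.0)
            # weight of opening a new class
            weights.append(alpha0 / (alpha0 + N - 1) * predictive(X[i], []))
            total = sum(weights)
            if total > 0:
                probs = [w / total for w in weights]
            else:                          # numerical underflow: fall back to uniform
                probs = [1.0 / (K + 1)] * (K + 1)
            z[i] = rng.choices(range(K + 1), weights=probs)[0]
            if z[i] == K:                  # new class sampled: add one column to Q
                K += 1
                for row in Q:
                    row.append(0.0)
                Q[i] = probs
            else:                          # keep only the K existing classes
                kept = sum(probs[:K])
                Q[i] = [p / kept for p in probs[:K]]
    return Q, z, K
```

Each row of the returned `Q` then holds the class probabilities of one behavior vector, and the number of columns equals the number of classes discovered during sampling.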
In the scheme, fuzzy clustering can be performed on a plurality of groups of samples to be detected by using the Dirichlet process mixed model, and the problem that the number of the employee behavior categories is difficult to determine in the prior art is solved because the number of the employee behavior categories can be automatically determined in the process of constructing the Dirichlet process mixed model. In addition, through fuzzy clustering, the behavior of one employee can be divided into a plurality of categories, and the behavior of the employee is evaluated from multiple angles, so that the problem that the attribute of the behavior category of the employee is difficult to determine in the prior art is solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
Further, before the step S102, the abnormal behavior detection method provided in the embodiment of the present application may further include the following steps:
Step 1): acquiring a fuzzy granularity adjusting parameter.
Step 2): extracting the probabilities of some of the categories from the clustering result according to the fuzzy granularity adjusting parameter to obtain a new clustering result.
Specifically, the main function of fuzzy granularity adjustment is to obtain different fuzzy category partitions by selecting different category extraction methods. Different fuzzy category partitions yield different clustering results, and performing multi-dimensional anomaly detection on different clustering results yields different detection results.
Four modes corresponding to different fuzzy granularity adjusting parameters are introduced below: first, the TopN mode; second, the custom threshold mode; third, the N-class random selection mode; fourth, the fully fuzzy mode.
In the TopN mode, the N values with the highest membership degree are extracted for each behavior vector from the membership matrix, and the corresponding class labels are returned. Here, N in TopN is the number of class labels per sample desired by the user. It will be appreciated that in the TopN mode, the number of rows of the new clustering result equals the number of behavior vectors and the number of columns equals N.
In the custom threshold mode, the values in the membership matrix are segmented according to a threshold input by the user, and the class information whose value is larger than the threshold is retained as the result of fuzzy clustering. If the probabilities of a sample for all categories are less than the threshold, the max function is used to select the category with the highest probability as the category of the sample. It will be appreciated that in the custom threshold mode, the number of rows of the new clustering result equals the number of behavior vectors and the number of columns equals 1.
In the N-class random selection mode, N is similar to the N in TopN and is also input by the user. After the TopN class labels are obtained, one of them is randomly selected as the clustering result of the sample. It will be appreciated that in the N-class random selection mode, the number of rows of the new clustering result equals the number of behavior vectors and the number of columns equals 1.
In the fully fuzzy mode, all class membership probabilities of all samples in the membership matrix are retained. It will be appreciated that in the fully fuzzy mode, the number of rows of the new clustering result equals the number of behavior vectors and the number of columns equals the number of behavior categories.
In the above scheme, different fuzzy granularity adjusting parameters can be obtained according to different actual requirements, so that different fuzzy category partitions can be obtained according to different fuzzy granularity parameters, and different clustering results can be further obtained.
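The four extraction modes can be sketched in Python as follows, treating the membership matrix as a list of rows of probabilities. The function names and the `seed` parameter are hypothetical. In this sketch the custom threshold mode returns, for each sample, the list of all class labels above the threshold, falling back to the single most probable class when none exceeds it; this is one possible reading of the description above.

```python
import random


def top_n(Q, n):
    """TopN mode: labels of the n classes with the highest membership per row."""
    return [sorted(range(len(row)), key=lambda k: row[k], reverse=True)[:n]
            for row in Q]


def custom_threshold(Q, threshold):
    """Custom threshold mode: keep classes whose membership exceeds the threshold."""
    result = []
    for row in Q:
        kept = [k for k, p in enumerate(row) if p > threshold]
        if not kept:
            # no class exceeds the threshold: fall back to the most probable class
            kept = [max(range(len(row)), key=lambda k: row[k])]
        result.append(kept)
    return result


def random_from_top_n(Q, n, seed=0):
    """N-class random selection mode: pick one of the TopN labels at random."""
    rng = random.Random(seed)
    return [rng.choice(labels) for labels in top_n(Q, n)]


def fully_fuzzy(Q):
    """Fully fuzzy mode: retain every membership probability."""
    return [list(row) for row in Q]
```

For example, with `Q = [[0.1, 0.7, 0.2], [0.5, 0.1, 0.4]]`, `top_n(Q, 2)` keeps classes 1 and 2 for the first sample and classes 0 and 2 for the second.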
Further, the step S102 may include the following steps:
Step 1): determining a plurality of behavior categories to which the samples to be detected belong according to the probabilities of the categories corresponding to the samples to be detected in the new clustering result.
Step 2): performing abnormal behavior detection on the samples to be detected according to the probabilities related to the plurality of behavior categories in the new clustering result.
Specifically, for a group of samples to be detected, the behavior category included in the samples to be detected can be determined according to the new clustering result corresponding to the samples to be detected. For example: the total number of categories is 100, and the group of samples to be tested includes 20 behavior categories.
Then, from the updated membership matrix obtained in the above embodiments
Figure BDA0003448866720000171
all probabilities associated with the 20 behavior categories may be extracted. For example: if the number of behavior vectors is 3 and the first behavior vector corresponds to 10 of the 20 behavior categories, the 10 categories corresponding to that behavior vector and the samples in those categories are extracted; if the second behavior vector corresponds to 18 of the 20 behavior categories, the 18 categories corresponding to that behavior vector and the samples in those categories are extracted; if the third behavior vector corresponds to 12 of the 20 behavior categories, the 12 categories corresponding to that behavior vector and the samples in those categories are extracted.
And finally, based on the extracted samples in the fuzzy categories to which all the samples to be detected belong, the abnormal behavior of the samples to be detected can be detected.
As an implementation manner, an embodiment of the present application provides an algorithm (ADAMCA) for detecting multiple cluster attribution anomalies.
In ADAMCA, among all data points in the same cluster, the distance between the point O to be analyzed and the kth point closest to it is called the k-nearest-neighbor distance (K-Nearest Neighbor Distance) of point O. If point O belongs to M clusters in the fuzzy clustering result, point O has M k-nearest-neighbor distances. The larger the k-nearest-neighbor distance of O, the sparser the surrounding points and the further O is from the mainstream data distribution.
All points whose distance to point O does not exceed the k-nearest-neighbor distance of point O form the k-neighborhood of point O, denoted $N_k(O)$. The neighbor density (NND) of point O is defined as the reciprocal of the average distance from point O to all points in its k-neighborhood, i.e.:
$$\mathrm{NND}_k(O) = \frac{|N_k(O)|}{\sum_{P \in N_k(O)} d(O, P)}$$
where $d(O, P)$ denotes the distance between point O and point P.
Here, the higher the average distance, the lower the neighbor density; a low neighbor density means that the neighborhood of the point is sparse. If point O belongs to M clusters, point O has M neighbor densities, denoted $\mathrm{NND}_M(O)$. In practice, it is difficult to determine whether a point is abnormal from the neighbor density around that point alone, so its abnormality relative to its neighbors needs to be analyzed.
The degree of abnormality of point O can be measured by the Near Neighbor Relative Abnormal Factor (NNRAF), which is defined as the average of the ratios of the neighbor density of each sample point in the neighborhood of point O to the neighbor density of point O, that is:
$$\mathrm{NNRAF}_k(O) = \frac{1}{|N_k(O)|} \sum_{P \in N_k(O)} \frac{\mathrm{NND}_k(P)}{\mathrm{NND}_k(O)}$$
According to the definition of NNRAF, if the near neighbor relative abnormal factor of a point is less than 1, the point lies in a dense region with many nearby sample points and is a normal point. If the NNRAF value of the point is larger than 1, the local neighbor density of the point is smaller than that of the points in its neighborhood, and the point is an abnormal point.
Thus, the ADAMCA algorithm described above may include the following steps:
Step one: traverse all samples, taking each sample in turn as point O;
Step two: for point O, extract the points in the k-neighborhood of point O in each cluster to which it belongs according to the new clustering result;
Step three: calculate the reciprocal of the average distance between point O and all points in its k-neighborhood, where $d(O, P)$ denotes the distance between points O and P:
$$\mathrm{NND}_k(O) = \frac{|N_k(O)|}{\sum_{P \in N_k(O)} d(O, P)}$$
Step four: calculate the near neighbor relative abnormal factor of point O:
$$\mathrm{NNRAF}_k(O) = \frac{1}{|N_k(O)|} \sum_{P \in N_k(O)} \frac{\mathrm{NND}_k(P)}{\mathrm{NND}_k(O)}$$
Step five: judge whether the NNRAF of point O is larger than 1; if it is, point O is determined to be an abnormal point.
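The ADAMCA steps above can be sketched in Python as follows. The sketch assumes Euclidean distance, takes the k closest other points of the cluster as the k-neighborhood, and assumes that all points in a cluster are distinct (otherwise the average distance could be zero). The function names `k_nearest`, `nnd`, `nnraf` and `is_abnormal` are hypothetical, and flagging a point when its NNRAF exceeds 1 in any of the clusters it belongs to is one possible reading of the multi-cluster case.

```python
import math


def k_nearest(point, cluster_points, k):
    """The k points of the cluster closest to `point` (excluding the point itself)."""
    others = [p for p in cluster_points if p is not point]
    return sorted(others, key=lambda p: math.dist(point, p))[:k]


def nnd(point, cluster_points, k):
    """Neighbor density: reciprocal of the mean distance to the k-neighborhood."""
    neighbors = k_nearest(point, cluster_points, k)
    mean_dist = sum(math.dist(point, p) for p in neighbors) / len(neighbors)
    return 1.0 / mean_dist


def nnraf(point, cluster_points, k):
    """Mean ratio of each neighbor's density to the density of `point`."""
    neighbors = k_nearest(point, cluster_points, k)
    own_density = nnd(point, cluster_points, k)
    ratios = [nnd(p, cluster_points, k) / own_density for p in neighbors]
    return sum(ratios) / len(ratios)


def is_abnormal(point, clusters_of_point, k):
    """Flag the point if its NNRAF exceeds 1 in any cluster it belongs to."""
    return any(nnraf(point, cluster, k) > 1.0 for cluster in clusters_of_point)
```

A point far from a tight group of samples has a much lower neighbor density than its neighbors, so its NNRAF is well above 1, while points inside the group have values close to 1.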
In the scheme, the behavior of one employee can be divided into a plurality of categories through fuzzy clustering, and the behavior of the employee is evaluated from multiple angles, so that the problem that the attribute of the behavior category of the employee is difficult to determine in the prior art is solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
Further, before the step S101, an embodiment of the present application may further provide a behavior library generating method, where the behavior library generating method may include the following steps:
Step 1): obtaining an operation log.
Step 2): processing the operation log to obtain a behavior library comprising a plurality of behaviors.
Specifically, the electronic device may collect and process behavior data in order to generate the behavior library. First, the electronic device may obtain an operation log. The specific type of the operation log is not specifically limited in the embodiments of the present application, for example, the operation log may include a database operation log, a host operation log, and the like.
It can be understood that, in the embodiment of the present application, neither the time length nor the time interval of the obtained operation log is specifically limited. For example: the electronic equipment can acquire the operation log once a day, and the acquired operation log can be an operation log of a whole day; alternatively, the operation log may be acquired once every month, the acquired operation log may be an operation log of a half month, or the like.
In addition, the embodiment of the present application also does not specifically limit the specific implementation of obtaining the operation log, and those skilled in the art can appropriately select the operation log according to actual situations. For example: the electronic equipment can receive an operation log sent by the external equipment; or the electronic device may read an operation log stored in the cloud.
As an embodiment, the operation log obtained by the electronic device may generally include the following contents: the master account Host_Account, used for distinguishing different employees; the operation behavior Behavior, used for distinguishing different behaviors of the employees; the operation object Object, used for distinguishing the objects operated on by the employees; and the operation time Operation_Time, used for distinguishing the times at which the employees performed operations.
Next, after acquiring the operation log, the electronic device needs to perform certain processing on the operation log in order to obtain a behavior library required in the abnormal behavior detection method. The specific implementation manner of processing the operation log in the embodiment of the present application is also not limited in particular, and a person skilled in the art may select one or more processing manners to process the operation log according to actual situations.
Three ways of handling the operation log are exemplified below.
First, as an embodiment, since the user operation data stored in an operation log is usually very cluttered, the operation log can be cleaned. There are various ways to clean the operation log, for example: cleaning can resolve incomplete data, detect and resolve erroneous values, detect and eliminate duplicate records, resolve inconsistencies among data, and the like.
Second, the behaviors in the operation log can be extracted and spliced. For example, the number of host operation commands or database operation commands is very limited, and the commands in frequent use are fewer still, so the operation command alone does not sufficiently distinguish the behavior of employees. In general, a log record contains "operation ID", "operation time", "operation behavior", "operation object", and the like, so an employee's data operation behavior can be described in the form of the operation behavior spliced with the operation object.
Taking Table 1 below as an example, where Table 1 is a database operation behavior sample table, splicing the operation behaviors of the three employees in Table 1 gives "select NG1_TBCS_A2", "select cjb2_60_98" and "insert NG1_ACCT_A1", which clearly distinguish the behavior of the employees better than "select", "select" and "insert", and also increase the dimension of the behavior.
Table 1: Database operation behavior sample table

| Operation ID | Operation time | Operation behavior | Operation object |
| --- | --- | --- | --- |
| kdyyd1 | 2021/6/9 17:49:56 | select | NG1_TBCS_A2 |
| kmkf3 | 2021/8/28 10:45:39 | select | cjb2_60_98 |
| kdyyb2 | 2021/6/9 17:49:56 | insert | NG1_ACCT_A1 |
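The splicing of operation behavior and operation object can be illustrated with a short Python sketch that uses the three records of Table 1. The dictionary keys and the function name `splice_behavior` are hypothetical and chosen only for illustration.

```python
# Records taken from Table 1; the key names are illustrative assumptions.
RECORDS = [
    {"operation_id": "kdyyd1", "time": "2021/6/9 17:49:56",
     "behavior": "select", "object": "NG1_TBCS_A2"},
    {"operation_id": "kmkf3", "time": "2021/8/28 10:45:39",
     "behavior": "select", "object": "cjb2_60_98"},
    {"operation_id": "kdyyb2", "time": "2021/6/9 17:49:56",
     "behavior": "insert", "object": "NG1_ACCT_A1"},
]


def splice_behavior(record):
    """Join the operation behavior and the operation object with a space."""
    return f"{record['behavior']} {record['object']}"


spliced = [splice_behavior(r) for r in RECORDS]
```

Here `spliced` contains one string per record, so two "select" operations on different objects become two distinct behaviors.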
Thirdly, the times of actions in the operation log can be counted, and actions with too low occurrence frequency can be deleted. It is understood that, in the process, if the operation log is spliced before, the process can also delete the splicing error at the same time.
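The frequency-based filtering can be sketched in Python as follows. The function name `build_behavior_library` and the cutoff parameter `min_count` are hypothetical; the application does not specify how low a frequency must be for a behavior to be deleted.

```python
from collections import Counter


def build_behavior_library(spliced_behaviors, min_count=2):
    """Count each behavior and keep those seen at least `min_count` times."""
    counts = Counter(spliced_behaviors)
    # Sorting gives the retained behaviors a stable, reproducible order.
    return sorted(b for b, c in counts.items() if c >= min_count)
```

A behavior produced by a splicing error, such as a misspelled command that appears only once, falls below the cutoff and is dropped together with other rare behaviors.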
It should be noted that the three processing methods in the foregoing embodiments are only three examples provided in the embodiments of the present application, and in an actual process, a person skilled in the art may select one or more processing methods to process the operation log, or may select other processing methods to process the operation log.
In the above scheme, a behavior library including a plurality of behaviors may be obtained by acquiring and processing an operation log. The operation log can be processed in various ways, various behaviors can be sorted, and the dimensionality of the behaviors can be increased, so that the detection accuracy can be improved when the behavior library is used for detecting abnormal behaviors.
Referring to fig. 2, fig. 2 is a block diagram of an abnormal behavior detection apparatus according to an embodiment of the present disclosure, where the abnormal behavior detection apparatus 200 may include: a first obtaining module 201, configured to obtain clustering results corresponding to multiple groups of samples to be detected; each group of samples to be detected comprises behavior data of an object to be detected in a preset time period, the clustering result is obtained by carrying out fuzzy clustering on a plurality of groups of samples to be detected by using a Dirichlet process mixed model, and the clustering result comprises the probability of a plurality of classes corresponding to each group of samples to be detected; the detection module 202 is configured to perform abnormal behavior detection on multiple behaviors of a group of samples to be detected in the clustering result; and if the abnormal behavior exists in the behaviors, the abnormal behavior of the object to be detected is represented.
In the embodiment of the application, a Dirichlet process mixed model can be used for carrying out fuzzy clustering on a plurality of groups of samples to be detected, and the problem that the number of the employee behavior categories is difficult to determine in the prior art is solved because the number of the employee behavior categories can be automatically determined in the process of constructing the Dirichlet process mixed model. In addition, through fuzzy clustering, the behavior of one employee can be divided into a plurality of categories, and the behavior of the employee is evaluated from multiple angles, so that the problem that the attribute of the behavior category of the employee is difficult to determine in the prior art is solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
Further, the abnormal behavior detection apparatus 200 further includes: the second acquisition module is used for acquiring a plurality of groups of samples to be detected; the coding module is used for coding each group of samples to be detected according to the behavior library so as to convert a plurality of groups of samples to be detected into a plurality of behavior vectors; and the clustering module is used for carrying out fuzzy clustering on each behavior vector according to the Dirichlet process mixed model to obtain clustering results corresponding to a plurality of groups of samples to be detected.
In the embodiment of the application, the samples to be detected can be converted into the behavior vectors based on the behavior library, so that the behavior data of the staff can be converted into vector data which can be calculated; then, a Dirichlet process mixed model is utilized to perform fuzzy clustering on a plurality of groups of samples to be detected, so that the problems that the number of the employee behavior categories is difficult to determine and the attribution of the employee behavior categories is difficult to determine in the prior art are solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
Further, the clustering module is specifically configured to: randomly initializing a membership matrix; the membership matrix comprises initial probabilities of a plurality of classes corresponding to each group of samples to be detected; and updating the membership matrix according to the Dirichlet process mixed model and the behavior vector to obtain the clustering result.
In the embodiment of the application, a Dirichlet process mixed model can be used for carrying out fuzzy clustering on a plurality of groups of samples to be detected, and the problem that the number of the employee behavior categories is difficult to determine in the prior art is solved because the number of the employee behavior categories can be automatically determined in the process of constructing the Dirichlet process mixed model. In addition, through fuzzy clustering, the behavior of one employee can be divided into a plurality of categories, and the behavior of the employee is evaluated from multiple angles, so that the problem that the attribute of the behavior category of the employee is difficult to determine in the prior art is solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
Further, the abnormal behavior detection apparatus 200 further includes: the third acquisition module is used for acquiring fuzzy granularity adjusting parameters; and the extracting module is used for extracting the probability of partial categories from the clustering result according to the fuzzy granularity adjusting parameters to obtain a new clustering result.
In the embodiment of the application, different fuzzy granularity adjusting parameters can be obtained according to different actual requirements, so that different fuzzy category divisions can be obtained according to different fuzzy granularity parameters, and different clustering results can be further obtained.
Further, the detection module 202 is specifically configured to: determining a plurality of behavior categories to which the samples to be detected belong according to the probability of the category corresponding to the samples to be detected in the new clustering result; and carrying out abnormal behavior detection on the sample to be detected according to the probability related to the plurality of behavior categories in the new clustering result.
In the embodiment of the application, the behavior of one employee can be divided into a plurality of categories through fuzzy clustering, and the behavior of the employee is evaluated from multiple angles, so that the problem that the behavior category of the employee is difficult to determine in the prior art is solved. Therefore, the accuracy of detecting the abnormal behavior of the employee can be improved.
The abnormal behavior detection apparatus 200 further includes: the fourth acquisition module is used for acquiring the operation log; and the processing module is used for processing the operation log to obtain a behavior library comprising a plurality of behaviors.
In the embodiment of the application, a behavior library including a plurality of behaviors can be obtained by acquiring and processing the operation log. The operation log can be processed in various ways, various behaviors can be sorted, and the dimensionality of the behaviors can be increased, so that the detection accuracy can be improved when the behavior library is used for detecting abnormal behaviors.
Referring to fig. 3, fig. 3 is a block diagram of an electronic device according to an embodiment of the present disclosure, where the electronic device 300 includes: at least one processor 301, at least one communication interface 302, at least one memory 303, and at least one communication bus 304. Wherein the communication bus 304 is used for realizing direct connection communication of these components, the communication interface 302 is used for communicating signaling or data with other node devices, and the memory 303 stores machine readable instructions executable by the processor 301. When the electronic device 300 is operating, the processor 301 communicates with the memory 303 via the communication bus 304, and the machine-readable instructions, when called by the processor 301, perform the above-described abnormal behavior detection method.
For example, the processor 301 of the embodiment of the present application may implement the following method by reading the computer program from the memory 303 through the communication bus 304 and executing the computer program: step S101: and acquiring clustering results corresponding to a plurality of groups of samples to be detected. Step S102: and aiming at a group of samples to be detected, carrying out abnormal behavior detection on a plurality of behaviors in the clustering result.
There may be one or more processors 301, each of which may be an integrated circuit chip having signal processing capabilities. The processor 301 may be a general-purpose processor, including a Central Processing Unit (CPU), a Micro Control Unit (MCU), a Network Processor (NP), or another conventional processor; it may also be a dedicated processor, including a Neural-Network Processing Unit (NPU), a Graphics Processing Unit (GPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. When there are a plurality of processors 301, some of them may be general-purpose processors and the others may be dedicated processors.
The Memory 303 includes one or more of, but is not limited to, a Random Access Memory (RAM), a Read Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
It will be appreciated that the configuration shown in fig. 3 is merely illustrative and that electronic device 300 may include more or fewer components than shown in fig. 3 or have a different configuration than shown in fig. 3. The components shown in fig. 3 may be implemented in hardware, software, or a combination thereof. In the embodiment of the present application, the electronic device 300 may be, but is not limited to, an entity device such as a desktop, a notebook computer, a smart phone, an intelligent wearable device, and a vehicle-mounted device, and may also be a virtual device such as a virtual machine. In addition, the electronic device 300 is not necessarily a single device, but may also be a combination of multiple devices, such as a server cluster, and the like.
Embodiments of the present application further provide a computer program product, including a computer program stored on a computer-readable storage medium, where the computer program includes computer program instructions, and when the computer program instructions are executed by a computer, the computer can perform the steps of the abnormal behavior detection method in the foregoing embodiments, for example, including: acquiring clustering results corresponding to a plurality of groups of samples to be detected; each group of samples to be detected comprises behavior data of an object to be detected in a preset time period, the clustering result is obtained by carrying out fuzzy clustering on a plurality of groups of samples to be detected by using a Dirichlet process mixed model, and the clustering result comprises the probability of a plurality of classes corresponding to each group of samples to be detected; carrying out abnormal behavior detection on a plurality of behaviors in the clustering result aiming at a group of samples to be detected; and if the abnormal behavior exists in the behaviors, the abnormal behavior of the object to be detected is represented.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
It should be noted that, if the functions are implemented in the form of software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, or the portion of it that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. An abnormal behavior detection method, comprising:
acquiring clustering results corresponding to a plurality of groups of samples to be detected, wherein each group of samples to be detected comprises behavior data of an object to be detected within a preset time period, the clustering results are obtained by performing fuzzy clustering on the plurality of groups of samples to be detected using a Dirichlet process mixture model, and the clustering results comprise, for each group of samples to be detected, probabilities of a plurality of classes;
for a group of samples to be detected, performing abnormal behavior detection on a plurality of behaviors in the clustering results, wherein, if an abnormal behavior exists among the plurality of behaviors, the object to be detected is characterized as exhibiting abnormal behavior.
2. The abnormal behavior detection method according to claim 1, wherein before the acquiring of the clustering results corresponding to the plurality of groups of samples to be detected, the method further comprises:
acquiring the plurality of groups of samples to be detected;
encoding each group of samples to be detected according to a behavior library, so as to convert the plurality of groups of samples to be detected into a plurality of behavior vectors;
and performing fuzzy clustering on each behavior vector according to the Dirichlet process mixture model to obtain the clustering results corresponding to the plurality of groups of samples to be detected.
3. The abnormal behavior detection method according to claim 2, wherein the performing fuzzy clustering on each behavior vector according to the Dirichlet process mixture model to obtain the clustering results corresponding to the plurality of groups of samples to be detected comprises:
randomly initializing a membership matrix, wherein the membership matrix comprises initial probabilities of a plurality of classes corresponding to each group of samples to be detected;
and updating the membership matrix according to the Dirichlet process mixture model and the behavior vectors to obtain the clustering results.
4. The abnormal behavior detection method according to any one of claims 1 to 3, wherein before the abnormal behavior detection is performed on the plurality of behaviors in the clustering result for a group of samples to be detected, the method further comprises:
acquiring a fuzzy granularity adjusting parameter;
and extracting the probabilities of a subset of the classes from the clustering result according to the fuzzy granularity adjusting parameter to obtain a new clustering result.
5. The abnormal behavior detection method according to claim 4, wherein the performing abnormal behavior detection on the plurality of behaviors in the clustering result for a group of samples to be detected comprises:
determining a plurality of behavior categories to which the samples to be detected belong according to the probability of the category corresponding to the samples to be detected in the new clustering result;
and carrying out abnormal behavior detection on the sample to be detected according to the probability related to the plurality of behavior categories in the new clustering result.
6. The abnormal behavior detection method according to any one of claims 1 to 3, wherein before the obtaining of the clustering results corresponding to the plurality of groups of samples to be detected, the method further comprises:
acquiring an operation log;
and processing the operation log to obtain a behavior library comprising a plurality of behaviors.
7. An abnormal behavior detection apparatus, comprising:
a first acquisition module, configured to acquire clustering results corresponding to a plurality of groups of samples to be detected, wherein each group of samples to be detected comprises behavior data of an object to be detected within a preset time period, the clustering results are obtained by performing fuzzy clustering on the plurality of groups of samples to be detected using a Dirichlet process mixture model, and the clustering results comprise, for each group of samples to be detected, probabilities of a plurality of classes;
a detection module, configured to perform, for a group of samples to be detected, abnormal behavior detection on a plurality of behaviors of the samples to be detected in the clustering results, wherein, if an abnormal behavior exists among the plurality of behaviors, the object to be detected is characterized as exhibiting abnormal behavior.
8. A computer program product comprising computer program instructions which, when read and executed by a processor, perform the abnormal behavior detection method according to any one of claims 1 to 6.
9. An electronic device, comprising: a processor, a memory, and a bus;
the processor and the memory are communicated with each other through the bus;
the memory stores computer program instructions executable by the processor, the processor invoking the computer program instructions to perform the abnormal behavior detection method of any of claims 1-6.
10. A computer-readable storage medium storing computer program instructions which, when executed by a computer, cause the computer to perform the abnormal behavior detection method of any one of claims 1 to 6.
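The pre-processing and post-processing steps recited in claims 2 to 4 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the encoding is assumed to be a per-behavior occurrence count, the fuzzy granularity adjusting parameter is assumed to be an integer `k` that keeps the `k` most probable classes and renormalizes them, and all function names and sample values are hypothetical. The Dirichlet process mixture model update of the membership matrix is not shown.

```python
import random
from typing import List, Sequence


def encode(sample: Sequence[str], behavior_library: Sequence[str]) -> List[int]:
    """Turn one group of samples into a behavior vector of occurrence counts (assumed encoding)."""
    return [sum(1 for b in sample if b == name) for name in behavior_library]


def init_membership(n_samples: int, n_classes: int, seed: int = 0) -> List[List[float]]:
    """Randomly initialize a membership matrix whose rows are probability vectors."""
    rng = random.Random(seed)
    matrix = []
    for _ in range(n_samples):
        raw = [rng.random() + 1e-9 for _ in range(n_classes)]  # avoid an all-zero row
        total = sum(raw)
        matrix.append([x / total for x in raw])
    return matrix


def keep_top_k(row: Sequence[float], k: int) -> List[float]:
    """Keep the k largest class probabilities, zero the rest, and renormalize (assumed rule)."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
    total = sum(row[i] for i in top)
    return [row[i] / total if i in top else 0.0 for i in range(len(row))]


# Hypothetical behavior library and one group of samples for one object.
library = ["login", "file_download", "usb_copy"]
sample = ["login", "file_download", "file_download"]

print(encode(sample, library))            # [1, 2, 0]
print(init_membership(2, 3))              # two random rows, each summing to 1
print(keep_top_k([0.5, 0.3, 0.2], k=2))   # third class dropped, first two renormalized
```

A smaller `k` gives a coarser view in which each object belongs to fewer classes, while a larger `k` retains more of the fuzzy membership information for the detection step.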
CN202111657903.3A 2021-12-30 2021-12-30 Abnormal behavior detection method and device Pending CN114266914A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111657903.3A CN114266914A (en) 2021-12-30 2021-12-30 Abnormal behavior detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111657903.3A CN114266914A (en) 2021-12-30 2021-12-30 Abnormal behavior detection method and device

Publications (1)

Publication Number Publication Date
CN114266914A true CN114266914A (en) 2022-04-01

Family

ID=80832097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111657903.3A Pending CN114266914A (en) 2021-12-30 2021-12-30 Abnormal behavior detection method and device

Country Status (1)

Country Link
CN (1) CN114266914A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114943865A (en) * 2022-06-17 2022-08-26 平安科技(深圳)有限公司 Target detection sample optimization method based on artificial intelligence and related equipment
CN114943865B (en) * 2022-06-17 2024-05-07 平安科技(深圳)有限公司 Target detection sample optimization method based on artificial intelligence and related equipment

Similar Documents

Publication Publication Date Title
US11868856B2 (en) Systems and methods for topological data analysis using nearest neighbors
US10367888B2 (en) Cloud process for rapid data investigation and data integrity analysis
US7437308B2 (en) Methods for estimating the seasonality of groups of similar items of commerce data sets based on historical sales date values and associated error information
WO2019144066A1 (en) Systems and methods for preparing data for use by machine learning algorithms
CN111612041A (en) Abnormal user identification method and device, storage medium and electronic equipment
CN110647995A (en) Rule training method, device, equipment and storage medium
WO2021111540A1 (en) Evaluation method, evaluation program, and information processing device
CN115545103A (en) Abnormal data identification method, label identification method and abnormal data identification device
Qin et al. Evaluation of goaf stability based on transfer learning theory of artificial intelligence
CN111581969A (en) Medical term vector representation method, device, storage medium and electronic equipment
CN114266914A (en) Abnormal behavior detection method and device
Cai et al. MWFP-outlier: Maximal weighted frequent-pattern-based approach for detecting outliers from uncertain weighted data streams
CN112395881A (en) Material label construction method and device, readable storage medium and electronic equipment
CN107609110B (en) Mining method and device for maximum multiple frequent patterns based on classification tree
CN112686312A (en) Data classification method, device and system
Xavier et al. A comparative analysis of dissimilarity measures for clustering categorical data
JP6950505B2 (en) Discrimination program, discrimination method and discrimination device
Wang et al. A layout-based classification method for visualizing time-varying graphs
Tsay Interestingness measures for actionable patterns
US11928123B2 (en) Systems and methods for network explainability
WO2023050461A1 (en) Data clustering method and system, and storage medium
US20240054187A1 (en) Information processing apparatus, analysis method, and storage medium
CN113032628A (en) Method, device, equipment and medium for determining content ecological index segmentation threshold
Shah et al. A Hybrid approach to improving clustering accuracy using SVM
CN115238816A (en) User classification method based on multi-metadata fusion and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination