CN112995331A - User behavior threat detection method and device and computing equipment - Google Patents


Info

Publication number
CN112995331A
CN112995331A
Authority
CN
China
Prior art keywords
behavior
sequence
short sequence
behaviors
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110319137.3A
Other languages
Chinese (zh)
Other versions
CN112995331B (en)
Inventor
杜凤珠
黄�俊
袁帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nsfocus Technologies Inc
Nsfocus Technologies Group Co Ltd
Original Assignee
Nsfocus Technologies Inc
Nsfocus Technologies Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nsfocus Technologies Inc, Nsfocus Technologies Group Co Ltd filed Critical Nsfocus Technologies Inc
Priority to CN202110319137.3A priority Critical patent/CN112995331B/en
Publication of CN112995331A publication Critical patent/CN112995331A/en
Application granted granted Critical
Publication of CN112995331B publication Critical patent/CN112995331B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/50: Network services
    • H04L 67/535: Tracking the activity of the user
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00: Network architectures or network communication protocols for network security
    • H04L 63/20: Network architectures or network communication protocols for network security for managing network security; network security policies in general

Abstract

The embodiments of the invention relate to the technical field of network security, and in particular to a user behavior threat detection method, a user behavior threat detection device, a computing device, and a computer-readable storage medium. The method comprises the following steps: determining a plurality of behavior short sequences of a user within a set time period; determining the behavior abnormal condition of each category of behavior within the set time period; for the ith behavior short sequence, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence; and determining the comprehensive abnormal condition of the ith behavior short sequence according to the behavior abnormal conditions of the behavior categories within it and its sequence abnormal condition. The user portrait is extracted through ensemble learning from multiple angles, so the abnormal condition of each behavior short sequence is determined more comprehensively and the false alarm rate is reduced. Because the sequence abnormal condition of the current behavior short sequence is determined according to the preceding behavior short sequence, the influence of historical behaviors on the abnormal condition is taken into account, and the accuracy of user behavior threat detection is improved.

Description

User behavior threat detection method and device and computing equipment
Technical Field
The embodiments of the invention relate to the technical field of network security, and in particular to a user behavior threat detection method and device, a computing device, and a computer-readable storage medium.
Background
Network attacks have become commonplace, so how to detect them effectively has become a focus of attention. Some attacks originate inside the enterprise and are buried in large amounts of normal data, which increases the difficulty of data mining and analysis. Meanwhile, attackers often know the relevant security defense mechanisms and can take targeted measures to evade them, further increasing the difficulty of identification. If a detection model can be built on user behavior, recognition accuracy can be improved to a large extent.
However, mature detection systems are still rarely seen in practice, and an excessively high false alarm rate is one of the biggest obstacles restricting their practical application.
To address these problems, the invention provides a detection method based on behavioral portraits, which can greatly reduce the false alarm rate and accurately judge in real time whether a user's anomalous behavior is legitimate.
Disclosure of Invention
The embodiment of the invention provides a user behavior threat detection method, which is used for solving the problem of an excessively high false alarm rate in user behavior threat detection systems.
The embodiment of the invention provides a user behavior threat detection method, which comprises the following steps:
determining a plurality of behavior short sequences of a user in a set time period; the behaviors in each behavior short sequence are arranged according to a time sequence, and the adjacent behaviors meet set conditions;
determining abnormal behavior conditions of various behaviors in the set time period;
for the ith behavior short sequence, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence; the (i-1)th behavior short sequence is the behavior short sequence immediately preceding the ith behavior short sequence in chronological order;
determining the comprehensive abnormal condition of the ith behavior short sequence according to the abnormal conditions of the behaviors of various types in the ith behavior short sequence and the abnormal conditions of the sequence of the ith behavior short sequence; wherein the comprehensive abnormal condition of each behavior short sequence is used for indicating the behavior condition of the user.
The user's behaviors are divided into a plurality of behavior short sequences, and the comprehensive abnormal condition indicating the user's behavior is then determined from the sequence abnormal condition and the abnormal conditions of the behaviors within each sequence. On one hand, the comprehensive abnormal condition of a short sequence is determined based on the abnormal conditions of the various behavior categories among its behaviors; on the other hand, it is determined based on the mutual influence between behavior short sequences. Ensemble learning can therefore be performed from multiple angles to extract the user portrait, the user's behavior is determined more comprehensively, and the false alarm rate is reduced. Because the sequence abnormal condition of the current behavior short sequence is determined according to the preceding behavior short sequence, the influence of historical behaviors on the abnormal condition is fully considered, and the accuracy of user behavior threat detection is improved.
Optionally, determining abnormal behavior conditions of various behaviors in the set time period includes:
aiming at any kind of behaviors, acquiring a plurality of single-kind behavior characteristic models of the behaviors; the single-class behavior feature models are obtained by training behavior data of users in different historical periods;
respectively inputting the behavior data of the behavior in the set time period into the plurality of single-class behavior characteristic models to obtain a plurality of abnormal conditions of the sub-behaviors;
and determining the abnormal behavior condition of the type of behavior according to the plurality of abnormal behavior conditions.
Calculating the behavior abnormal condition per behavior category simplifies the computation. Because each single-class behavior feature model is trained on users' behavior data from a different historical period, a plurality of single-class behavior feature models are obtained for the same category of behavior, and the influence of historical behaviors on abnormal conditions is taken into account. Determining the final behavior abnormal condition of the category from the plurality of sub-behavior abnormal conditions avoids the one-sidedness and bias of a single model's judgment and improves accuracy.
Optionally, the step of inputting the behavior data of the class of behavior in the set time period into the plurality of single-class behavior feature models respectively to obtain a plurality of sub-behavior abnormal conditions includes:
determining various behavior features of the category of behavior according to its behavior data within the set time period, wherein the behavior features comprise the relationship between the user and the device, the relationship between the user, the behavior, and time, and the relationship between the user, the behavior, and the behavior attributes;
and respectively inputting each behavior characteristic of the behavior into the single behavior characteristic models to obtain a plurality of abnormal conditions of the sub-behaviors.
By setting the behavior features of each category of behavior to include the relationship between the user and the device, the relationship between the user, the behavior, and time, and the relationship between the user, the behavior, and the behavior attributes, the behavior features can be counted more comprehensively, yielding a more complete and accurate user portrait.
Optionally, each piece of behavior data includes a timestamp of when the behavior occurred, the user name corresponding to the behavior, the identifier of the device bearing the behavior, the behavior category, and the behavior attribute.
Expressing behaviors in this behavior data format gives each behavior a uniform representation, which facilitates subsequent extraction, calculation, and processing.
Optionally, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence includes:
inputting the plurality of behavior short sequences into a behavior sequence characteristic model to obtain the sequence abnormal condition of each behavior short sequence; the behavior sequence feature model is obtained by training the behavior short sequences of the historical time periods of the users.
Therefore, the behavior sequence characteristic model is obtained through the behavior short sequence training in the historical period, and the accuracy of model detection is improved.
Optionally, determining a plurality of short sequences of behaviors of the user within a set period includes:
arranging all behaviors of the user in the set time period according to a time sequence;
if the time interval between the mth behavior and the (m-1)th behavior is smaller than a preset threshold, classifying the mth behavior into the behavior short sequence containing the (m-1)th behavior; otherwise, creating a new behavior short sequence for the mth behavior; the (m-1)th behavior is the behavior immediately preceding the mth behavior in chronological order.
Arranging the behaviors in chronological order makes the resulting behavior short sequences convenient for subsequent detection of sequence abnormal conditions and improves accuracy.
Optionally, the behavior sequence feature model is a conditional random field CRF model;
the single-Class behavior feature model is a single-Class support vector machine One-Class SVM model.
Through the normalization processing in the CRF model, the bias problem of the model is solved, and the accuracy of the user behavior threat detection is improved. Meanwhile, the CRF model is supervised learning, and the One-Class SVM model is unsupervised learning. The detection results obtained by the two models can be mutually supplemented, the respective defects are made up, the respective advantages are exerted, and a more comprehensive and accurate user behavior threat detection system is constructed.
An embodiment of the present invention further provides a device for detecting a user behavior threat, including:
the determining unit is used for determining a plurality of behavior short sequences of a user in a set time period; the behaviors in each behavior short sequence are arranged according to a time sequence, and the adjacent behaviors meet set conditions;
a processing unit to:
determining abnormal behavior conditions of various behaviors in the set time period;
for the ith behavior short sequence, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence; the (i-1)th behavior short sequence is the behavior short sequence immediately preceding the ith behavior short sequence in chronological order;
determining the comprehensive abnormal condition of the ith behavior short sequence according to the abnormal conditions of the behaviors of various types in the ith behavior short sequence and the abnormal conditions of the sequence of the ith behavior short sequence; wherein the comprehensive abnormal condition of each behavior short sequence is used for indicating the behavior condition of the user.
An embodiment of the present invention further provides a computing device, including:
a memory for storing a computer program;
a processor for calling the computer program stored in the memory and executing the method according to the obtained program.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer-executable program is stored, where the computer-executable program is used to enable a computer to execute any one of the methods described above.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 illustrates an exemplary possible user behavioral threat detection method;
FIG. 2 illustrates a flow of partitioning a short sequence of behaviors;
FIG. 3 illustrates another possible user behavioral threat detection method;
fig. 4 is a block diagram of a hardware configuration of a user behavioral threat detection apparatus 400 according to an embodiment of the present invention.
Detailed Description
To make the objects, embodiments and advantages of the present application clearer, the following description of exemplary embodiments of the present application will clearly and completely describe the exemplary embodiments of the present application with reference to the accompanying drawings in the exemplary embodiments of the present application, and it is to be understood that the described exemplary embodiments are only a part of the embodiments of the present application, and not all of the embodiments.
All other embodiments, which can be derived by a person skilled in the art from the exemplary embodiments described herein without inventive step, are intended to be within the scope of the claims appended hereto. In addition, while the disclosure herein has been presented in terms of one or more exemplary examples, it should be appreciated that aspects of the disclosure may be implemented solely as a complete embodiment.
It should be noted that the brief descriptions of the terms in the present application are only for the convenience of understanding the embodiments described below, and are not intended to limit the embodiments of the present application. These terms should be understood in their ordinary and customary meaning unless otherwise indicated.
The terms "first," "second," "third," and the like in the description and claims of this application and in the above-described drawings are used for distinguishing between similar or analogous objects or entities and are not necessarily intended to limit any particular order or sequence, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein.
Furthermore, the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or device that comprises a list of elements is not necessarily limited to those elements explicitly listed, but may include other elements not expressly listed or inherent to such product or device.
Fig. 1 illustrates an exemplary possible threat detection method for user behavior, including:
step 101, determining a plurality of behavior short sequences of a user in a set time period; the behaviors in each behavior short sequence are arranged according to a time sequence, and the adjacent behaviors meet set conditions;
step 102, determining abnormal behavior conditions of various behaviors in the set time period;
step 103, for the ith behavior short sequence, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence; the (i-1)th behavior short sequence is the behavior short sequence immediately preceding the ith behavior short sequence in chronological order;
step 104, determining the comprehensive abnormal condition of the ith behavior short sequence according to the behavior abnormal conditions of the behavior categories within the ith behavior short sequence and the sequence abnormal condition of the ith behavior short sequence; wherein the comprehensive abnormal condition of each behavior short sequence is used for indicating the user's behavior.
Step 101 and step 102 may be performed simultaneously or in either order; the embodiment of the present invention does not limit their execution order.
The user behavior threat detection process may be performed on behavior data obtained in real time: for example, the behavior data of a certain user is collected in real time and divided into behavior short sequences to obtain a plurality of behavior short sequences of that user. Specifically, the division into behavior short sequences may be conditioned on the time interval between adjacent behaviors, such as a time interval of 1 minute; alternatively, every N consecutive behaviors may form one behavior short sequence, with the number of consecutive behaviors as the set condition.
A user's behavior data usually includes various categories of behavior, such as login behavior, web browsing behavior, and e-mail sending/receiving behavior. To detect the user's behaviors effectively, each category can be detected separately to obtain its behavior abnormal condition. For example, the behavior characteristics of a certain category of behavior are summarized from the historical behaviors of users of the same type, and the detected user behavior data is compared against those historical characteristics to determine the behavior abnormal condition of that category. If, say, the historical web browsing of users of the same type concentrates on entertainment information while this user's behavior data concentrates on sports information, a certain behavioral abnormality can be determined to exist.
The user's behavior data is divided into a plurality of behavior short sequences, and the sequence abnormal condition of each is calculated separately. When calculating a sequence abnormal condition, a behavior short sequence cannot be viewed in isolation; the calculation is performed based on at least the behavior short sequence preceding it, so that the sequence abnormal condition can be determined more accurately and comprehensively. In the embodiment of the present invention, a model that captures state influence between sequences, such as an HMM (Hidden Markov Model) or a CRF (Conditional Random Field), may be used to calculate the sequence abnormal condition, which is not limited herein.
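The text leaves the sequence model open (HMM, CRF, or similar). As a minimal illustrative sketch only, not the patented model, a first-order Markov transition model can score how surprising a short sequence is while conditioning on the preceding short sequence; all function names here are hypothetical:

```python
import math
from collections import defaultdict

def train_transition_model(history_sequences):
    """Estimate first-order transition probabilities between behavior
    categories from historical behavior short sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in history_sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
            for a, nexts in counts.items()}

def sequence_anomaly(model, prev_seq, seq, floor=1e-3):
    """Average negative log-probability of seq's transitions, conditioning
    the first transition on the last behavior of the preceding short
    sequence, so that history influences the current sequence's score."""
    chain = (prev_seq[-1:] if prev_seq else []) + list(seq)
    steps = list(zip(chain, chain[1:]))
    if not steps:
        return 0.0
    return sum(-math.log(max(model.get(a, {}).get(b, 0.0), floor))
               for a, b in steps) / len(steps)
```

A frequent transition (e.g. login followed by FTP connect) then scores lower anomaly than one never seen in the historical short sequences.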
The user's behavior is determined by combining the sequence abnormal condition of each behavior short sequence with the behavior abnormal conditions of the behavior categories within it. For example, if the comprehensive abnormal condition of a behavior short sequence exceeds a set threshold even though none of the individual behaviors in the short sequence does, it can still be determined that the user's behavior sequence is abnormal. In this way, the user's behavior can be assessed from multiple angles.
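The text does not fix a formula for fusing the two signals; one simple hypothetical combination (the function names, the 50/50 weighting, and the threshold are all assumptions, not the patent's method) is a weighted sum with a threshold check:

```python
def comprehensive_anomaly(sequence_score, behavior_scores, w_seq=0.5):
    """Fuse the short sequence's sequence-level anomaly score with the
    per-category behavior anomaly scores of the behaviors it contains.
    The equal weighting is purely illustrative."""
    if not behavior_scores:
        return sequence_score
    behavior_part = sum(behavior_scores) / len(behavior_scores)
    return w_seq * sequence_score + (1.0 - w_seq) * behavior_part

def is_threat(sequence_score, behavior_scores, threshold=0.8):
    """Flag the short sequence when its comprehensive abnormal condition
    exceeds a set threshold, as the surrounding text describes."""
    return comprehensive_anomaly(sequence_score, behavior_scores) > threshold
```

Note that a high sequence-level score can push the combined score over the threshold even when every per-behavior score is low, matching the example in the text.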
Optionally, in step 101, the behavior data of the user within the set time period is parsed by a traffic log analysis module to obtain normalized behavior data, where each piece of behavior data includes a timestamp of when the behavior occurred, the user name corresponding to the behavior, the identifier of the device bearing the behavior, the behavior category, and the behavior attribute. The behavior categories and their corresponding behavior attributes are shown in Table 1, which is an example only.
Behavior category     Behavior attributes
Login log             Login, logout
FTP log               Connect, disconnect
Mail log              Send, receive
HTTP log              Upload, download, access, etc.
File transfer log     Read, write, etc.

TABLE 1
Optionally, the attributes of different behaviors require certain processing during parsing. For e-mail activity, considering the confidentiality of mail content within a company, the mail content and attachment information are discarded directly without processing; for a sent mail, the recipient information is added to the behavior attribute; for a received mail, the sender information is added to the behavior attribute. For file read/write activity, the path and file name are added to the behavior attribute. For web browsing, the URL information is added to the behavior attribute. Finally, each behavior can be parsed into one piece of behavior data: (timestamp, user name corresponding to the behavior, identifier of the device bearing the behavior, behavior category, behavior attribute).
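The normalized five-tuple above can be modeled directly. A small sketch (field and function names are my own, not from the patent) of the parsing step for a sent mail, where the recipient is folded into the behavior attribute and the mail body is discarded as the text describes:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BehaviorRecord:
    timestamp: str   # when the behavior occurred
    user: str        # user name corresponding to the behavior
    device: str      # identifier of the device bearing the behavior
    category: int    # 1=login, 2=FTP, 3=mail, 4=HTTP, 5=file transfer
    attribute: str   # behavior attribute, e.g. "send" plus the recipient

def parse_sent_mail(timestamp, user, device, recipient):
    """Mail content and attachments are dropped for confidentiality;
    only the recipient is kept, appended to the behavior attribute."""
    return BehaviorRecord(timestamp, user, device, 3, f"send:{recipient}")
```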
Optionally, the normalized behavior data is divided according to user names to obtain a plurality of behavior short sequences of each user in a set time period.
Optionally, determining a plurality of short sequences of behaviors of the user within a set period includes:
arranging all behaviors of the user in the set time period according to a time sequence;
if the time interval between the mth behavior and the mth-1 behavior is smaller than a preset threshold value, classifying the mth behavior into a behavior short sequence in which the mth-1 behavior is located; otherwise, creating a new behavior short sequence for the mth behavior; the m-1 th behavior is a behavior preceding the m-th behavior in chronological order.
For example, the behaviors of user A during one day, arranged in chronological order, are shown in Table 2. For convenience of description, 1 denotes a login behavior, 2 an FTP behavior, 3 a mail behavior, 4 an HTTP behavior, and 5 a file transfer behavior.
Timestamp  User name  Device identifier  Behavior category  Behavior attribute
1:00       A          Device a           1                  Login
2:00       A          Device a           2                  Connect
3:00       A          Device d           1                  Login
4:00       A          Device b           3                  Send
7:00       A          Device e           2                  Connect
8:00       A          Device e           3                  Send
9:00       A          Device f           4                  Upload
10:00      A          Device c           4                  Upload
11:00      A          Device f           5                  Read
12:00      A          Device b           5                  Read

TABLE 2
Fig. 2 shows the flow of dividing behaviors into short sequences. According to the table, with a preset time-interval threshold of two hours, the interval between the first and second pieces of behavior data does not exceed two hours, so they are divided into the same behavior short sequence; the interval between the second and third pieces also does not exceed two hours, so the third is classified into the short sequence containing the first two, and so on, yielding the behavior short sequences X = (X1, X2), where X1 includes 1, 2, 1, 3 and X2 includes 2, 3, 4, 4, 5, 5.
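The partition rule of the worked example can be sketched as follows (a minimal reading of the rule; timestamps are plain hour numbers for brevity, and the function name is an assumption):

```python
def split_into_short_sequences(behaviors, threshold):
    """behaviors: (time, category) pairs sorted chronologically. A behavior
    whose gap to the previous behavior is below the threshold joins the
    previous short sequence; otherwise it starts a new one."""
    sequences, prev_time = [], None
    for t, category in behaviors:
        if prev_time is not None and t - prev_time < threshold:
            sequences[-1].append(category)
        else:
            sequences.append([category])
        prev_time = t
    return sequences
```

With user A's behaviors from Table 2 and a two-hour threshold this reproduces X1 = (1, 2, 1, 3) and X2 = (2, 3, 4, 4, 5, 5), since only the 4:00 to 7:00 gap reaches the threshold.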
Optionally, if the time interval between two adjacent pieces of behavior data cannot be determined, the two pieces may be combined into one behavior sequence, which is then compared for similarity with the existing behavior short sequences and merged into the existing behavior short sequence with the highest similarity.
The above are merely examples, and embodiments of the present invention are not limited thereto.
Optionally, the behavior short sequences may be determined from only some of the behavior data fields listed in Table 2, for example from the three items of timestamp, user name corresponding to the behavior, and behavior category.
Optionally, in step 102, determining abnormal behavior conditions of various behaviors in the set time period includes the following steps, as shown in fig. 3:
301, aiming at any type of behaviors, acquiring a plurality of single type behavior characteristic models of the type of behaviors; the single-class behavior feature models are obtained by training behavior data of users in different historical periods;
step 302, respectively inputting the behavior data of the behavior in the set time period into the plurality of single-class behavior characteristic models to obtain a plurality of abnormal conditions of the sub-behaviors;
step 303, determining the abnormal behavior condition of the behavior according to the abnormal behavior conditions of the plurality of children.
Optionally, in step 301, the single-Class behavior feature model is a One-Class SVM (One-Class Support Vector Machine) model.
The training process of the single-Class behavior feature model is explained below by taking an One-Class SVM model as an example.
The various behaviors of historical users are arranged in chronological order and divided into different data blocks at fixed time intervals; to ensure the generality of the trained model, each data block covers both working days and rest days. For example, the behavior data of historical user B over two weeks is shown in Table 3, that of historical user C in Table 4, and that of historical user D in Table 5. In each table, the first five rows of behavior data belong to the first week and the last five to the second week.
Timestamp  User name  Device identifier  Behavior category  Behavior attribute
1:00       B          Device a           1                  Login
2:00       B          Device a           2                  Connect
3:00       B          Device d           1                  Login
4:00       B          Device b           3                  Send
7:00       B          Device e           2                  Connect
8:00       B          Device e           1                  Send
9:00       B          Device f           2                  Upload
10:00      B          Device c           1                  Upload
11:00      B          Device f           3                  Read
12:00      B          Device b           2                  Read

TABLE 3
Timestamp  User name  Device identifier  Behavior category  Behavior attribute
1:00       C          Device a           1                  Login
2:00       C          Device a           2                  Connect
3:00       C          Device d           1                  Login
4:00       C          Device b           3                  Send
7:00       C          Device e           2                  Connect
8:00       C          Device e           1                  Send
9:00       C          Device f           2                  Upload
10:00      C          Device c           1                  Upload
11:00      C          Device f           3                  Read
12:00      C          Device b           2                  Read

TABLE 4
Timestamp  User name  Device identifier  Behavior category  Behavior attribute
1:00       D          Device a           1                  Login
2:00       D          Device a           2                  Connect
3:00       D          Device d           1                  Login
4:00       D          Device b           3                  Send
7:00       D          Device e           2                  Connect
8:00       D          Device e           1                  Send
9:00       D          Device f           2                  Upload
10:00      D          Device c           1                  Upload
11:00      D          Device f           3                  Read
12:00      D          Device b           2                  Read

TABLE 5
The two data blocks divided from the above tables are shown in Tables 6 and 7:
Data block 1:

1:00   B   Device a   1   Login
2:00   B   Device a   2   Connect
3:00   B   Device d   1   Login
4:00   B   Device b   3   Send
7:00   B   Device e   2   Connect
1:00   C   Device a   1   Login
2:00   C   Device a   2   Connect
3:00   C   Device d   1   Login
4:00   C   Device b   3   Send
7:00   C   Device e   2   Connect
1:00   D   Device a   1   Login
2:00   D   Device a   2   Connect
3:00   D   Device d   1   Login
4:00   D   Device b   3   Send
7:00   D   Device e   2   Connect

TABLE 6
Data block 2:

8:00   B   Device e   1   Send
9:00   B   Device f   2   Upload
10:00  B   Device c   1   Upload
11:00  B   Device f   3   Read
12:00  B   Device b   2   Read
8:00   C   Device e   1   Send
9:00   C   Device f   2   Upload
10:00  C   Device c   1   Upload
11:00  C   Device f   3   Read
12:00  C   Device b   2   Read
8:00   D   Device e   1   Send
9:00   D   Device f   2   Upload
10:00  D   Device c   1   Upload
11:00  D   Device f   3   Read
12:00  D   Device b   2   Read

TABLE 7
From the divided data blocks, the behavior features of each behavior of each historical user can be constructed, where the behavior features comprise the relationship between the user and the device, the relationship between the user, the behavior, and time, and the relationship between the user, the behavior, and the behavior attributes.
Taking data block 1 in Table 6 as an example, historical user B has 5 behaviors in total, of which 2 are login behaviors; 2 devices are used across the login behaviors; the number of login behaviors whose time interval is smaller than a preset threshold (for example, 20 minutes) is 1; and the number of devices logged into in the login behaviors is 1. The behavior features of historical user B's login behavior can thus be obtained as: (2/5, 2/5, 1/5, 1/5).
By analogy, since data block 1 in Table 6 covers 3 historical users, 3 behavior features are obtained for login behavior, 3 for FTP behavior, and 3 for mail behavior. Likewise, since data block 2 in Table 7 covers 3 historical users, it also yields 3 behavior features for login behavior, 3 for FTP behavior, and 3 for mail behavior.
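As a hedged sketch of how the first two components of such a feature vector might be computed from a data block (the translation leaves the last two components ambiguous, so only the count and device fractions are shown; the function name and tuple layout are assumptions):

```python
def category_feature_fractions(rows, category):
    """rows: (time, user, device, category, attribute) tuples for one user
    in one data block. Returns (behaviors of this category / total
    behaviors, distinct devices used for this category / total behaviors)."""
    total = len(rows)
    cat_rows = [r for r in rows if r[3] == category]
    devices = {r[2] for r in cat_rows}
    return (len(cat_rows) / total, len(devices) / total)
```

For user B's five rows in data block 1 this yields (2/5, 2/5), matching the first two components of the example feature vector.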
Next, the One-Class SVM model is constructed.

Under the condition that the training data can be separated from the origin, the One-Class SVM maximizes the distance between the separating hyperplane and the origin.

Assume a hyperplane:

    w \cdot \Phi(x) - \rho = 0

Similarly to the binary support vector machine, the geometric distance is maximized, with slack operators added:

    \min_{w,\,\xi,\,\rho} \ \frac{1}{2}\lVert w \rVert^2 + \frac{1}{\nu l}\sum_{i=1}^{l} \xi_i - \rho

    \text{s.t.} \quad w \cdot \Phi(x_i) \ge \rho - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, l

where the \xi_i are slack variables and \nu \in (0, 1] plays a role analogous to C in a binary SVM:

1) \nu sets an upper bound on the fraction of training samples regarded as outliers within the training data set;
2) \nu is a lower bound on the fraction of training samples that serve as support vectors.
The 3 behavior features of each class of behavior in data block 1 are input into the model as sample data: the login-behavior features yield a single-class behavior feature model A1 for login behavior, the FTP-behavior features yield a model B1 for FTP behavior, and the mail-behavior features yield a model C1 for mail behavior;

similarly, the 3 behavior features of each class of behavior in data block 2 are input into the model as sample data, yielding a single-class behavior feature model A2 for login behavior, B2 for FTP behavior, and C2 for mail behavior.
So far, a plurality of single-class behavior feature models are obtained according to the behavior data training of the users in different historical periods.
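As a sketch (not the patent's implementation), each single-class model can be fitted with scikit-learn's `OneClassSVM`, whose `nu` parameter plays exactly the role of ν described above. The toy feature vectors below are illustrative values, not taken from the tables:

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Toy training set: 4-dim login-behavior feature vectors of historical users.
X_train = np.array([
    [0.40, 0.40, 0.20, 0.20],
    [0.38, 0.42, 0.18, 0.22],
    [0.41, 0.39, 0.21, 0.19],
])

# nu upper-bounds the fraction of training points treated as outliers and
# lower-bounds the fraction of support vectors, as in the formulation above.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)

# predict() returns +1 for inliers and -1 for outliers;
# decision_function() gives a signed distance usable as an anomaly score.
print(model.predict(X_train))
print(model.decision_function([[0.90, 0.10, 0.90, 0.05]]))
```

One such model would be trained per behavior class per historical data block (A1, B1, C1, A2, ...), each on the feature vectors of the corresponding class.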
Optionally, in step 301, the behaviors of the user within the set time period are divided by behavior class, and the plurality of single-class behavior feature models corresponding to each class of behavior are obtained; for example, for login behavior, the corresponding single-class behavior feature models A1 and A2 are obtained.
Optionally, in step 302, the step of inputting the behavior data of the class of behaviors in the set time period into the plurality of single-class behavior feature models respectively to obtain a plurality of sub-behavior abnormal conditions includes:
determining various behavior characteristics of the behavior according to the behavior data of the behavior within the set time period, wherein the various behavior characteristics comprise the relationship between the user and the equipment, the relationship between the user and the behavior and the time, and the relationship between the user and the behavior attribute;
and inputting each behavior feature of the class of behavior into the plurality of single-class behavior feature models respectively to obtain a plurality of sub-behavior abnormal conditions.
For example, for the behavior data of user A in Table 2, the behavior features of user A for each behavior class are determined by the feature-construction method described above: say, the behavior feature for login behavior is a, for FTP behavior is b, for mail behavior is c, for HTTP behavior is d, and for file transfer behavior is e.
The behavior feature a is input into its corresponding single-class behavior feature models A1 and A2, giving the sub-behavior abnormal conditions a1 and a2.
Optionally, in step 303, a behavioral exception condition of the type of behavior is determined according to the plurality of child behavioral exception conditions.
For example, the behavior abnormal condition of user A's login behavior is obtained by averaging a1 and a2; other algorithms may also be used, and the embodiment of the present invention is not limited in this respect.
Optionally, in step 103, for the ith behavior short sequence, the sequence abnormal condition of the ith behavior short sequence is determined at least according to the (i-1)th behavior short sequence; the (i-1)th behavior short sequence is the behavior short sequence immediately preceding the ith behavior short sequence in chronological order.

Optionally, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence includes:
inputting the plurality of behavior short sequences into a behavior sequence characteristic model to obtain the sequence abnormal condition of each behavior short sequence; the behavior sequence feature model is obtained by training the behavior short sequences of the historical time periods of the users.
Optionally, the behavior sequence feature model is a CRF model.
The training process of the behavior sequence feature model is described below, taking the CRF model as an example.
As shown in Tables 3, 4 and 5, the behaviors of historical users B, C and D are divided into a plurality of behavior short sequences according to the short-sequence division method described above, and these behavior short sequences serve as the sample data for training the CRF.
Constructing a CRF model:
(1) Define feature functions f_k(x, l_{i-1}, l_i, i), k ∈ {1, 2, ..., M}.

Here x is any behavior short sequence, i denotes the position of the short sequence in the set, and l_i is the abnormal condition of the short sequence at position i. Each feature function expresses the "likelihood" that, at position i, the current behavior short sequence has abnormal condition l_i while the preceding behavior short sequence has abnormal condition l_{i-1}; this "likelihood" is not a probability, and its value is typically 0 or 1.
(2) Given a set of behavior short sequences X = {x_1, x_2, ..., x_n}, the conditional probability that the abnormal conditions of the whole set are l = (l_1, l_2, ..., l_n) is:

P(l = l_1 l_2 ... l_n | X = {x_1, x_2, ..., x_n})
The influence of each feature function on the whole should differ, so each function is given a weight, W = (w_1, w_2, ..., w_M); the larger the weight, the larger that feature function's influence on the labeling result. How should each weight be chosen? In practice the weights W are parameters of the model, and their values are obtained through parameter learning.
The "likelihood" above is not yet a probability and must be normalized; the normalization also resolves the bias problem (without it, the influence of historical behavior would be fixed) and constrains the value to lie between 0 and 1. The normalization term Z(x) is therefore introduced:

$$Z(x) = \sum_{l} \exp\left(\sum_{i=1}^{n}\sum_{k=1}^{M} w_k\, f_k(x, l_{i-1}, l_i, i)\right)$$

The final model is thus the conditional probability:

$$P(l \mid x) = \frac{1}{Z(x)} \exp\left(\sum_{i=1}^{n}\sum_{k=1}^{M} w_k\, f_k(x, l_{i-1}, l_i, i)\right)$$
The sample data is input into the CRF model to determine the model parameters.
Thus, a well-trained behavior sequence characteristic model is obtained.
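The conditional probability above can be made concrete with a tiny linear-chain example in plain Python. The feature functions, weights, and the "longer sequence looks anomalous" rule are made-up illustrations, not the patent's actual trained features:

```python
import itertools
import math

LABELS = (0, 1)  # 0 = normal short sequence, 1 = anomalous short sequence

# Two illustrative binary feature functions f_k(x, l_prev, l_cur, i).
def f1(x, l_prev, l_cur, i):
    # Made-up rule: a long behavior short sequence at position i looks anomalous.
    return 1 if l_cur == 1 and len(x[i]) > 3 else 0

def f2(x, l_prev, l_cur, i):
    # Made-up rule: an anomaly tends to follow an anomaly.
    return 1 if l_prev == 1 and l_cur == 1 else 0

FEATURES = (f1, f2)
W = (1.5, 0.8)  # weights; in practice learned from training data

def score(x, labels):
    """Unnormalized score: sum_i sum_k w_k * f_k(x, l_{i-1}, l_i, i)."""
    s, prev = 0.0, 0  # assume a fixed "normal" start label
    for i, cur in enumerate(labels):
        s += sum(w * f(x, prev, cur, i) for w, f in zip(W, FEATURES))
        prev = cur
    return s

def prob(x, labels):
    """P(l | x) = exp(score(x, l)) / Z(x), Z summing over all labelings."""
    z = sum(math.exp(score(x, ls))
            for ls in itertools.product(LABELS, repeat=len(x)))
    return math.exp(score(x, labels)) / z

# Two behavior short sequences: x2 is longer, so labelings with l2 = 1 score higher.
x = [["login", "ftp"], ["login", "ftp", "mail", "http"]]
print(round(prob(x, (0, 1)), 3))  # → 0.272
```

Because Z(x) sums over every possible labeling, the probabilities of all labelings sum to 1, which is exactly what the normalization term buys over the raw feature scores.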
Once the behavior short sequences of a user have been determined, they are input into the behavior sequence feature model to obtain the sequence abnormal condition of each sequence.

For example, the behavior short sequences x1 and x2 obtained in step 101 are input into the behavior sequence feature model, yielding the sequence abnormal conditions p1 and p2, where the sequence abnormal condition of each behavior short sequence depends on that of the immediately preceding behavior short sequence.
Optionally, in step 104, determining a comprehensive abnormal condition of the ith behavior short sequence according to the abnormal behavior condition of each type of behavior in the ith behavior short sequence and the abnormal sequence condition of the ith behavior short sequence; wherein the comprehensive abnormal condition of each behavior short sequence is used for indicating the behavior condition of the user.
For example, consider the behavior short sequence x1 of user A, where x1 contains login behavior, FTP behavior and mail behavior. The sequence abnormal condition of x1 is p1, and the behavior abnormal conditions of the three behavior classes, computed by the single-class behavior feature models, are s1, s2 and s3. The comprehensive abnormal condition S of x1 is then computed from p1, s1, s2 and s3 by the aggregation formula (given as an image in the original, not reproduced here).

For the behavior short sequence x1 of user A, the larger the probability that an abnormality has occurred, the closer the abnormality score S is to 1; conversely, S approaches 0. Whether the current behavior short sequence is abnormal is finally judged against a selected score threshold. If the short sequence is judged abnormal, the system raises an alarm to security operations staff; if it is judged normal, the current behavior data is stored into the historical data and used as sample data to update the historical user behavior pattern.
Optionally, the comprehensive abnormal condition of the user may be determined from the plurality of behavior short sequences corresponding to that user.

For example, the comprehensive abnormal conditions of the two behavior short sequences of user A are averaged to obtain the comprehensive abnormal condition of user A. This is merely an example, and embodiments of the present invention are not limited thereto.
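Since the exact aggregation formula for S is not reproducible here, the sketch below uses a purely hypothetical stand-in: it averages the sequence anomaly p1 with the mean of the per-class behavior anomaly scores, then applies the score-threshold/alarm logic described above:

```python
def composite_score(p_seq, class_scores):
    """Hypothetical aggregation of the sequence anomaly p_seq with the
    per-class behavior anomaly scores; NOT the patent's exact formula,
    which is given only as an image in the original."""
    return (p_seq + sum(class_scores) / len(class_scores)) / 2

def handle_short_sequence(p_seq, class_scores, threshold=0.8):
    """Decision rule as described: score near 1 -> alarm to security staff,
    otherwise store the behavior data into history for model updates."""
    s = composite_score(p_seq, class_scores)
    return "alarm" if s >= threshold else "store"

print(handle_short_sequence(0.95, [0.90, 0.85, 0.92]))  # alarm
print(handle_short_sequence(0.10, [0.20, 0.15, 0.10]))  # store
```

Any monotone combination of p1 and s1..s3 mapped into [0, 1] would fit the described behavior; the average is chosen here only for simplicity.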
The behavior of the user is divided into a plurality of behavior short sequences, and the comprehensive abnormal condition indicating the user's behavior is then determined from the sequence abnormal condition together with the abnormal conditions of the behaviors within each sequence. On one hand, the comprehensive abnormal condition of a short sequence is determined from the abnormal conditions of the various behavior classes among its behaviors; on the other hand, the sequence abnormal condition is determined from the mutual influence among behavior short sequences. Ensemble learning can therefore be performed from multiple angles to extract a user portrait, determine the user's behavior more comprehensively, and reduce the false-alarm rate. Moreover, determining the sequence abnormal condition of the current behavior short sequence from the preceding short sequence fully accounts for the influence of historical behavior on the abnormal condition and improves the accuracy of user behavior threat detection.
An embodiment of the present invention further provides a device for detecting a user behavior threat, as shown in fig. 4, including:
a determining unit 401, configured to determine a plurality of behavior short sequences of a user within a set time period; the behaviors in each behavior short sequence are arranged according to a time sequence, and the adjacent behaviors meet set conditions;
a processing unit 402 for:
determining abnormal behavior conditions of various behaviors in the set time period;
aiming at the ith behavior short sequence, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence; the (i-1)th behavior short sequence is the behavior short sequence immediately preceding the ith behavior short sequence in chronological order;
determining the comprehensive abnormal condition of the ith behavior short sequence according to the abnormal conditions of the behaviors of various types in the ith behavior short sequence and the abnormal conditions of the sequence of the ith behavior short sequence; wherein the comprehensive abnormal condition of each behavior short sequence is used for indicating the behavior condition of the user.
An embodiment of the present invention further provides a computing device, including:
a memory for storing a computer program;
a processor for calling the computer program stored in the memory and executing the method according to the obtained program.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer-executable program is stored, where the computer-executable program is used to enable a computer to execute any one of the methods described above.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method for detecting a user behavior threat, comprising:
determining a plurality of behavior short sequences of a user in a set time period; the behaviors in each behavior short sequence are arranged according to a time sequence, and the adjacent behaviors meet set conditions;
determining abnormal behavior conditions of various behaviors in the set time period;
aiming at the ith behavior short sequence, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence; the (i-1)th behavior short sequence is the behavior short sequence immediately preceding the ith behavior short sequence in chronological order;
determining the comprehensive abnormal condition of the ith behavior short sequence according to the abnormal conditions of the behaviors of various types in the ith behavior short sequence and the abnormal conditions of the sequence of the ith behavior short sequence; wherein the comprehensive abnormal condition of each behavior short sequence is used for indicating the behavior condition of the user.
2. The method of claim 1,
determining abnormal behavior conditions of various behaviors in the set time period, wherein the abnormal behavior conditions comprise:
aiming at any kind of behaviors, acquiring a plurality of single-kind behavior characteristic models of the behaviors; the single-class behavior feature models are obtained by training behavior data of users in different historical periods;
respectively inputting the behavior data of the behavior in the set time period into the plurality of single-class behavior characteristic models to obtain a plurality of abnormal conditions of the sub-behaviors;
and determining the abnormal behavior condition of the type of behavior according to the plurality of abnormal behavior conditions.
3. The method of claim 2,
respectively inputting the behavior data of the behavior in the set time period into the plurality of single-class behavior characteristic models to obtain a plurality of abnormal conditions of the sub-behaviors, wherein the method comprises the following steps:
determining various behavior characteristics of the behavior according to the behavior data of the behavior within the set time period, wherein the various behavior characteristics comprise the relationship between the user and the equipment, the relationship between the user and the behavior and the time, and the relationship between the user and the behavior attribute;
and respectively inputting each behavior characteristic of the behavior into the single behavior characteristic models to obtain a plurality of abnormal conditions of the sub-behaviors.
4. The method of claim 3,
each behavior data includes a timestamp when the behavior occurs, a user name corresponding to the behavior, a device identifier for bearing the behavior, a behavior category and a behavior attribute.
5. The method of claim 1,
determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence comprises:
inputting the plurality of behavior short sequences into a behavior sequence characteristic model to obtain the sequence abnormal condition of each behavior short sequence; the behavior sequence feature model is obtained by training the behavior short sequences of the historical time periods of the users.
6. The method of any one of claims 1 to 5,
determining a plurality of behavior short sequences of a user in a set period of time, comprising:
arranging all behaviors of the user in the set time period according to a time sequence;
if the time interval between the mth behavior and the (m-1)th behavior is smaller than a preset threshold value, classifying the mth behavior into the behavior short sequence in which the (m-1)th behavior is located; otherwise, creating a new behavior short sequence for the mth behavior; the (m-1)th behavior is the behavior immediately preceding the mth behavior in chronological order.
7. The method of claim 6,
the behavior sequence characteristic model is a conditional random field CRF model;
the single-Class behavior feature model is a single-Class support vector machine One-Class SVM model.
8. A user behavioral threat detection apparatus, comprising:
the determining unit is used for determining a plurality of behavior short sequences of a user in a set time period; the behaviors in each behavior short sequence are arranged according to a time sequence, and the adjacent behaviors meet set conditions;
a processing unit to:
determining abnormal behavior conditions of various behaviors in the set time period;
aiming at the ith behavior short sequence, determining the sequence abnormal condition of the ith behavior short sequence at least according to the (i-1)th behavior short sequence; the (i-1)th behavior short sequence is the behavior short sequence immediately preceding the ith behavior short sequence in chronological order;
determining the comprehensive abnormal condition of the ith behavior short sequence according to the abnormal conditions of the behaviors of various types in the ith behavior short sequence and the abnormal conditions of the sequence of the ith behavior short sequence; wherein the comprehensive abnormal condition of each behavior short sequence is used for indicating the behavior condition of the user.
9. A computing device, comprising:
a memory for storing a computer program;
a processor for calling a computer program stored in said memory, for executing the method of any one of claims 1 to 7 in accordance with the obtained program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer-executable program for causing a computer to execute the method of any one of claims 1 to 7.
CN202110319137.3A 2021-03-25 2021-03-25 User behavior threat detection method and device and computing equipment Active CN112995331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110319137.3A CN112995331B (en) 2021-03-25 2021-03-25 User behavior threat detection method and device and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110319137.3A CN112995331B (en) 2021-03-25 2021-03-25 User behavior threat detection method and device and computing equipment

Publications (2)

Publication Number Publication Date
CN112995331A true CN112995331A (en) 2021-06-18
CN112995331B CN112995331B (en) 2022-11-22

Family

ID=76333649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110319137.3A Active CN112995331B (en) 2021-03-25 2021-03-25 User behavior threat detection method and device and computing equipment

Country Status (1)

Country Link
CN (1) CN112995331B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569949A (en) * 2021-07-28 2021-10-29 广州博冠信息科技有限公司 Abnormal user identification method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102938070A (en) * 2012-09-11 2013-02-20 广西工学院 Behavior recognition method based on action subspace and weight behavior recognition model
CN106951783A (en) * 2017-03-31 2017-07-14 国家电网公司 A kind of Method for Masquerade Intrusion Detection and device based on deep neural network
US20170329314A1 (en) * 2014-11-26 2017-11-16 Shenyang Institute Of Automation, Chinese Academy Of Sciences Modbus tcp communication behaviour anomaly detection method based on ocsvm dual-outline model
CN108616545A (en) * 2018-06-26 2018-10-02 中国科学院信息工程研究所 A kind of detection method, system and electronic equipment that network internal threatens
CN108881194A (en) * 2018-06-07 2018-11-23 郑州信大先进技术研究院 Enterprises user anomaly detection method and device
CN110555182A (en) * 2018-05-31 2019-12-10 中国电信股份有限公司 User portrait determination method and device and computer readable storage medium
US20200285737A1 (en) * 2019-03-05 2020-09-10 Microsoft Technology Licensing, Llc Dynamic cybersecurity detection of sequence anomalies
CN111709028A (en) * 2020-04-21 2020-09-25 中国科学院信息工程研究所 Network security state evaluation and attack prediction method
CN111709765A (en) * 2020-03-25 2020-09-25 中国电子科技集团公司电子科学研究院 User portrait scoring method and device and storage medium

Also Published As

Publication number Publication date
CN112995331B (en) 2022-11-22

Similar Documents

Publication Publication Date Title
CN110417721B (en) Security risk assessment method, device, equipment and computer readable storage medium
US10459827B1 (en) Machine-learning based anomaly detection for heterogenous data sources
US10437945B2 (en) Systems and methods for order-of-magnitude viral cascade prediction in social networks
CN108809745A (en) A kind of user's anomaly detection method, apparatus and system
US11496495B2 (en) System and a method for detecting anomalous patterns in a network
CN103593609B (en) Trustworthy behavior recognition method and device
US11507881B2 (en) Analysis apparatus, analysis method, and analysis program for calculating prediction error and extracting error factor
EP3648433B1 (en) System and method of training behavior labeling model
Kuang et al. An anomaly intrusion detection method using the CSI-KNN algorithm
JP7044117B2 (en) Model learning device, model learning method, and program
Xu et al. A data-driven preprocessing scheme on anomaly detection in big data applications
CN112929381A (en) Detection method, device and storage medium for false injection data
CN112995331B (en) User behavior threat detection method and device and computing equipment
Turchin et al. Tuning complex event processing rules using the prediction-correction paradigm
CN110659807B (en) Risk user identification method and device based on link
Anderka et al. Automatic ATM Fraud Detection as a Sequence-based Anomaly Detection Problem.
Maurya et al. Online anomaly detection via class-imbalance learning
CN111177725A (en) Method, device, equipment and storage medium for detecting malicious click operation
ABID et al. Anomaly detection in WSN: critical study with new vision
Pannell et al. Anomaly detection over user profiles for intrusion detection
Liu et al. Securing online reputation systems through trust modeling and temporal analysis
Yang et al. Effective mobile web user fingerprinting via motion sensors
Ghosh et al. Real time failure prediction of load balancers and firewalls
Khandelwal et al. Machine learning methods leveraging ADFA-LD dataset for anomaly detection in linux host systems
CN114463117A (en) User behavior prediction method, system and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant