CN112488175A - Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium - Google Patents

Publication number
CN112488175A
Authority
CN
China
Prior art keywords
behavior
user
users
detection method
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011347823.3A
Other languages
Chinese (zh)
Other versions
CN112488175B (en
Inventor
李兴国
邹斯达
苗功勋
路冰
孙宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD
Nanjing Zhongfu Information Technology Co Ltd
Zhongfu Information Co Ltd
Zhongfu Safety Technology Co Ltd
Original Assignee
BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD
Nanjing Zhongfu Information Technology Co Ltd
Zhongfu Information Co Ltd
Zhongfu Safety Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD, Nanjing Zhongfu Information Technology Co Ltd, Zhongfu Information Co Ltd, Zhongfu Safety Technology Co Ltd
Priority to CN202011347823.3A
Publication of CN112488175A
Application granted
Publication of CN112488175B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method, a terminal, and a storage medium for detecting abnormal users based on behavior aggregation characteristics. The method acquires user behavior information within a preset time period; aggregates each user's feature attributes over the preset time period based on access address information; flattens each user's matrix into a row vector; calculates the correlation coefficient between every pair of users as their behavior similarity; finds the two users with the greatest similarity and clusters them into one class; calculates the similarity between that class and the other users, updates the similarity matrix for the users merged into the class, and repeats the calculation iteratively; and, once the iteration reaches a preset threshold, stops clustering and determines that any user who remains separated from the intranet group exhibits abnormal behavior. The invention thereby reduces the false alarm rate of anomaly detection, identifies abnormal users hidden within the group, and safeguards data information.

Description

Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, a terminal device, and a storage medium for detecting an abnormal user based on behavior aggregation characteristics.
Background
With the rapid development of computer network technology, data has become an important carrier of information. It holds information about enterprises, users, transactions, and communications, and it matters greatly to every person and every enterprise.
Network architectures based on the TCP/IP protocol reach every corner of the world and bring great convenience to daily life, but they are accompanied by increasingly serious information security problems. Important institutions, large enterprises, and similar organizations defend against the risk of data leakage by deploying firewalls, network intrusion detection systems, antivirus software, and the like within the internal network. Such single-point detection, however, is usually limited to a small set of rules and cannot cope with premeditated data theft, so the security of data information cannot be guaranteed.
Disclosure of Invention
To overcome the defects of the prior art and improve the accuracy of detecting abnormal behavior within an intranet group, the invention provides an abnormal user detection method based on behavior aggregation characteristics, comprising the following steps:
acquiring user behavior information within a preset time period;
aggregating each user's feature attributes over the preset time period based on access address information;
performing an adjacent-element column transformation on the original behavior features to obtain a square behavior matrix whose size equals the number of target servers, and flattening each user's matrix into a row vector;
calculating the correlation coefficient between every pair of users as the behavior similarity of the two users within the intranet group over the time period;
finding the two users with the greatest similarity according to the inter-user similarity matrix, and clustering the two users into one class;
calculating the similarity between that class and the other users, updating the similarity matrix for the users merged into the class, and repeating the calculation iteratively;
and, once the iteration reaches a preset threshold, stopping the clustering process and determining that any user who remains separated from the intranet group exhibits abnormal behavior.
It is further noted that the traffic five-tuple used by each user is extracted from a probe on the communication network, and the traffic data of the user's accesses is captured;
behavior features are obtained by aggregating over the target IPs that the user accesses, at the granularity of a specified time window;
and a behavior feature matrix is generated for each source IP from the behavior features.
It is further noted that the adjacent-element column transformation step includes:
cross-multiplying adjacent elements of the feature dimensions of user IP1 and user IP2 and assembling the products into a new square behavior feature matrix:
[Formula image BDA0002800457770000021 not reproduced]
wherein m and n both lie in the interval [1, num_distip];
m and n denote the element subscript and the column index of the matrix, respectively;
i denotes the row index of the behavior matrix and is a positive integer not exceeding 6;
p denotes an element of the original behavior feature matrix described above.
It is further noted that the step of calculating the correlation coefficient between any two users includes:
to obtain the similarity between two user behavior vectors, flattening each user behavior feature matrix into a row vector and calculating the similarity with the following formula:
[Formula image BDA0002800457770000022 not reproduced]
X and Y are the behavior vectors of the two users;
n denotes the length of the vectors;
EX and EY denote the means of the two vectors.
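The correlation step above can be illustrated with a minimal sketch. It assumes the formula is the standard Pearson correlation coefficient, which matches the symbols X, Y, n, EX, and EY defined above; the function name `pearson`, the sample vectors, and the handling of constant vectors are illustrative assumptions and do not come from the patent text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length behavior vectors."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    n = len(x)
    ex = sum(x) / n  # EX: mean of X
    ey = sum(y) / n  # EY: mean of Y
    cov = sum((a - ex) * (b - ey) for a, b in zip(x, y))
    var_x = sum((a - ex) ** 2 for a in x)
    var_y = sum((b - ey) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant vector has no direction after centering,
        # so its similarity is reported as 0 instead of raising an error.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical flattened behavior vectors for two users.
user_a = [3.0, 1200.0, 5400.0, 2.0, 35.0, 4.0]
user_b = [6.0, 2400.0, 10800.0, 4.0, 70.0, 8.0]
print(pearson(user_a, user_b))
```

Because `user_b` is a scaled copy of `user_a`, the two vectors point in the same direction after centering and the coefficient is at its maximum, which is the situation the method treats as "most similar".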
It should be further noted that, before the step of acquiring the user behavior information within the preset time period, the method further includes:
acquiring behavior information about the intranet user's accesses to target servers within the preset time period.
It should be further noted that the behavior information about accessing target servers includes the traffic information, time information, and application protocol information generated when the intranet user accesses each target server.
Further, within the preset time period, the target servers that the intranet user connects to are counted to form a target server IP list;
the total uplink traffic and total downlink traffic generated by the intranet user's connections to each target server, and the total number of connections to each target server, are counted.
Further, within the preset time period, the total duration of the intranet user's connections to each target server and the time elapsed between the user's last connection to that server and the current time are counted, and together they characterize the user's activity;
and the number of application-layer protocol types used in connections between the intranet user and each target server within the preset time period is counted.
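The statistics listed above can be sketched as a single pass over flow records. This is only an illustration: the `Flow` record layout, its field names, and the `aggregate` function are hypothetical, since the source does not define a record format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Flow:
    # All field names are hypothetical; the source defines no record layout.
    src_ip: str       # intranet user IP
    dst_ip: str       # target server IP
    up_bytes: int     # uplink traffic of this connection
    down_bytes: int   # downlink traffic of this connection
    start: float      # connection start time, in seconds
    end: float        # connection end time, in seconds
    app_proto: str    # application-layer protocol label


def aggregate(flows, now):
    """Return {src_ip: {dst_ip: statistics}} for one preset time period."""
    stats = defaultdict(dict)
    for f in flows:
        s = stats[f.src_ip].setdefault(f.dst_ip, {
            "connections": 0, "up_bytes": 0, "down_bytes": 0,
            "duration": 0.0, "last_end": float("-inf"), "protocols": set(),
        })
        s["connections"] += 1
        s["up_bytes"] += f.up_bytes
        s["down_bytes"] += f.down_bytes
        s["duration"] += f.end - f.start          # end time minus start time
        s["last_end"] = max(s["last_end"], f.end)
        s["protocols"].add(f.app_proto)
    # Turn the helper fields into the two derived statistics.
    for per_dst in stats.values():
        for s in per_dst.values():
            s["since_last"] = now - s.pop("last_end")
            s["protocol_kinds"] = len(s.pop("protocols"))
    return dict(stats)


flows = [
    Flow("10.0.0.5", "10.0.1.1", 100, 900, 0.0, 10.0, "http"),
    Flow("10.0.0.5", "10.0.1.1", 50, 450, 20.0, 25.0, "ssh"),
    Flow("10.0.0.5", "10.0.1.2", 10, 20, 30.0, 31.0, "http"),
]
result = aggregate(flows, now=100.0)
print(sorted(result["10.0.0.5"]))      # the user's target server IP list
print(result["10.0.0.5"]["10.0.1.1"])
```

The keys of each per-user dictionary form that user's target server IP list, and the six values per server correspond to the six feature rows used later in the behavior feature matrix.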
The invention also provides a terminal device for implementing the abnormal user detection method based on behavior aggregation characteristics, comprising:
a memory for storing a computer program and the abnormal user detection method based on behavior aggregation characteristics;
and a processor for executing the computer program and the abnormal user detection method based on behavior aggregation characteristics, so as to implement the steps of the method.
The invention also provides a readable storage medium carrying the abnormal user detection method based on behavior aggregation characteristics, wherein the readable storage medium stores a computer program that, when executed by a processor, implements the steps of the method.
The above technical solution gives the invention the following advantages:
the abnormal user detection method based on behavior aggregation characteristics extracts behavior features from attributes that recur with high frequency and can be reused, such as time, protocol, traffic bytes, and connection frequency. It scales well over time and executes efficiently.
The method can be applied to anomaly detection within an intranet group. Similarity-based outlier detection helps security analysts at enterprises and other organizations trace clues of data theft within the intranet, capture inconspicuous abnormal behavior, handle security incidents as early as possible, and minimize losses.
In real scenarios, the method helps analysts cope with the uneven distribution of user behavior caused by work habits or task assignments, overcomes the inability to associate information across adjacent access nodes, and greatly reduces the false alarm rate of anomaly detection. Abnormal users hidden within the group can be identified, and the security of data information is safeguarded.
Drawings
To illustrate the technical solution of the present invention more clearly, the drawings used in the description are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of the abnormal user detection method based on behavior aggregation characteristics;
FIG. 2 is a diagram illustrating an example of the abnormal user detection method based on behavior aggregation characteristics;
FIG. 3 is a flowchart of an embodiment of the abnormal user detection method based on behavior aggregation characteristics.
Detailed Description
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To illustrate this interchangeability of hardware and software clearly, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
To improve the accuracy of detecting abnormal users within an intranet group, the invention provides an abnormal user detection method based on behavior aggregation characteristics, as shown in FIG. 1. The method is carried out on a terminal device and a readable storage medium. Relevant information is read according to a preset method, and each user's data within the time period is counted; the association information between adjacent target IPs is fully mined; the similarity is calculated by the processor running code stored on the readable storage medium; and, based on the result of that calculation, the anomaly detection module finds potential outlier users using hierarchical clustering.
The users considered by the invention are intranet users. An intranet is a network built on a local area network, such as a corporate office network. The intranet may, of course, be connected to an external network. An intranet based on Ethernet technology may cover a single building served by dedicated optical fiber, a residential community, an office area, or a larger area.
System architecture 100 in an intranet may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The communication network 104 is a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The communication network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
It should be understood that the number of terminal devices, networks, and servers in fig. 2 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
It should be noted that the computer system 200 of the electronic device shown in fig. 2 is only an example, and should not bring any limitation to the functions and the scope of the application of the embodiments of the present disclosure.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may be various electronic devices having display screens including, but not limited to, smart phones, tablets, portable and desktop computers, digital cinema projectors, and the like.
The server 105 may be a server that provides various services. For example, the user sends data information to the server 105 using the terminal device 103 (which may be the terminal device 101 or 102).
The method specifically comprises the following steps:
S101, acquiring user behavior information within a preset time period;
the preset time period may be set according to the actual operating environment and needs, and may be expressed in hours, days, weeks, or months. The user behavior information derives from the terminal device's accesses to target servers and its data operations.
S102, aggregating each user's feature attributes over the preset time period based on access address information;
the access address information includes an access source IP and an access target IP.
S103, performing an adjacent-element column transformation on the original behavior features to obtain a square behavior matrix whose size equals the number of target servers, and flattening each user's matrix into a row vector;
S104, calculating the correlation coefficient between every pair of users as the behavior similarity of the two users within the intranet group over the time period; the larger the coefficient, the higher the similarity.
S105, finding the two users with the greatest similarity according to the inter-user similarity matrix, and clustering the two users into one class;
S106, calculating the similarity between that class and the other users, updating the similarity matrix for the users merged into the class, and repeating the calculation iteratively;
and S107, once the iteration reaches a preset threshold, stopping the clustering process and determining that any user who remains separated from the intranet group exhibits abnormal behavior.
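Steps S103 to S105 can be sketched as follows, under stated assumptions: every user's behavior matrix has already been built over the same ordered list of target servers (so that all flattened vectors have the same length), and the correlation coefficient is the standard Pearson coefficient. The names `flatten`, `pearson`, and `most_similar_pair` and the sample values are illustrative only.

```python
import math
from itertools import combinations


def flatten(matrix):
    """Concatenate the rows of a behavior matrix into one row vector."""
    return [value for row in matrix for value in row]


def pearson(x, y):
    """Pearson correlation; returns 0.0 for constant vectors (assumption)."""
    n = len(x)
    ex, ey = sum(x) / n, sum(y) / n
    cov = sum((a - ex) * (b - ey) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - ex) ** 2 for a in x))
    sy = math.sqrt(sum((b - ey) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def most_similar_pair(matrices):
    """matrices: {user: behavior matrix}; all matrices must share one shape."""
    vectors = {user: flatten(m) for user, m in matrices.items()}
    sim = {(a, b): pearson(vectors[a], vectors[b])
           for a, b in combinations(sorted(vectors), 2)}
    best = max(sim, key=sim.get)  # the pair to cluster first (S105)
    return sim, best


# Hypothetical 2 x 2 behavior matrices for three intranet users.
matrices = {
    "10.0.0.1": [[5, 1], [200, 10]],
    "10.0.0.2": [[6, 1], [220, 12]],
    "10.0.0.3": [[1, 9], [5, 400]],
}
sim, best = most_similar_pair(matrices)
print(best, sim[best])
```

Only the upper triangle of the similarity matrix is stored, because the correlation coefficient is symmetric in its two arguments.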
Based on the above method steps, and as can be seen from FIG. 1, the nature of traffic data makes it possible to extract behavior features from attributes that recur with high frequency and can be reused, such as time, protocol, traffic bytes, and connection frequency. The method scales well over time and executes efficiently.
The method can be applied to anomaly detection within an intranet group. Similarity-based outlier detection helps security analysts at enterprises and other organizations trace clues of data theft within the intranet, capture inconspicuous abnormal behavior, handle security incidents as early as possible, and minimize losses.
In real scenarios, the method helps analysts cope with the uneven distribution of user behavior caused by work habits or task assignments, overcomes the inability to associate information across adjacent access nodes, and greatly reduces the false alarm rate of anomaly detection. Abnormal users hidden within the group can be identified, and the security of data information is safeguarded.
The flowcharts and block diagrams in the figures of the method for detecting abnormal users based on behavior aggregation characteristics provided by the present invention illustrate the architecture, functions and operations of possible implementations of the method, apparatus and computer program product according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The embodiment of the device described in the abnormal user detection method based on behavior aggregation features provided by the present invention is only illustrative, for example, the division of the unit is only a logical function division, and there may be another division manner in actual implementation, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
FIG. 3 schematically shows a flowchart of another abnormal user detection method based on behavior aggregation characteristics according to an embodiment of the present disclosure. The method steps of this embodiment may be executed by the terminal device, by the server (for example, the server 105 in FIG. 2), or interactively by the terminal device and the server, but the present disclosure is not limited thereto.
S201, extracting the traffic five-tuple used by each user from a probe on the communication network, and capturing the traffic data of the user's accesses;
the traffic five-tuple includes the intranet user IP and port and the target server IP and port.
Behavior features are obtained by aggregating over the target IPs that the user accesses, at the granularity of a specified time window;
a behavior feature matrix is generated for each source IP from the behavior features.
S202, acquiring behavior information about the intranet user's accesses to target servers within a preset time period.
S203, the behavior information about accessing target servers includes the traffic information, time information, and application protocol information generated when the intranet user accesses each target server.
S204, counting the target servers that the intranet user connects to within the preset time period, and forming a target server IP list;
S205, counting the total uplink traffic and total downlink traffic generated by the intranet user's connections to each target server, and the total number of connections to each target server.
S206, within the preset time period, counting the total duration of the intranet user's connections to each target server and the time elapsed between the user's last connection to that server and the current time, which together characterize the user's activity;
S207, counting the number of application-layer protocol types used in connections between the intranet user and each target server within the preset time period.
S208, acquiring user behavior information within the preset time period;
S209, aggregating each user's feature attributes over the preset time period based on access address information;
S210, performing an adjacent-element column transformation on the original behavior features to obtain a square behavior matrix whose size equals the number of target servers, and flattening each user's matrix into a row vector;
S211, calculating the correlation coefficient between every pair of users as the behavior similarity of the two users within the intranet group over the time period; the larger the coefficient, the higher the similarity.
The aggregated user behavior feature matrix is shown in Table 1 below, and it is constructed as follows:
TABLE 1
[Table images BDA0002800457770000071 and BDA0002800457770000081 not reproduced]
1) The traffic data of the user's accesses is acquired from the traffic five-tuples captured by the probe;
2) behavior features are obtained by aggregating over the target IPs that the user accesses, at the granularity of a specified time window;
3) a behavior feature matrix is generated for each source IP. For a coarse-grained time window, all behavior information is aggregated directly, and only one feature matrix is generated for each source IP.
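The effect of the time window granularity in the construction steps above can be sketched as follows. The record layout `(src_ip, dst_ip, timestamp)`, the function names, and the idea of modelling the coarse-grained case as a single window covering the whole period are assumptions made for illustration.

```python
from collections import defaultdict


def window_index(timestamp, period_start, window_seconds):
    """Index of the time window that contains `timestamp`."""
    return int((timestamp - period_start) // window_seconds)


def group_by_window(records, period_start, window_seconds=None):
    """Group (src_ip, dst_ip, timestamp) records by source IP and time window.

    window_seconds=None models the coarse-grained case: every record of the
    period falls into window 0, so each source IP yields a single group.
    """
    groups = defaultdict(list)
    for src_ip, dst_ip, ts in records:
        if window_seconds is None:
            w = 0
        else:
            w = window_index(ts, period_start, window_seconds)
        groups[(src_ip, w)].append((dst_ip, ts))
    return dict(groups)


# Hypothetical access records: (source IP, target IP, timestamp in seconds).
records = [
    ("10.0.0.5", "10.0.1.1", 10.0),
    ("10.0.0.5", "10.0.1.2", 3700.0),
    ("10.0.0.6", "10.0.1.1", 20.0),
]
coarse = group_by_window(records, period_start=0.0)
hourly = group_by_window(records, period_start=0.0, window_seconds=3600)
print(sorted(coarse))
print(sorted(hourly))
```

With the coarse setting each source IP maps to exactly one group, and therefore to one feature matrix; with hourly windows the same source IP may produce one group per window.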
The adjacent-element column transformation step of the method includes:
cross-multiplying adjacent elements of the feature dimensions of user IP1 and user IP2 and assembling the products into a new square behavior feature matrix:
[Formula image BDA0002800457770000082 not reproduced]
wherein m and n both lie in the interval [1, num_distip];
m and n denote the element subscript and the column index of the matrix, respectively;
i denotes the row index of the behavior matrix and is a positive integer not exceeding 6;
p denotes an element of the original behavior feature matrix described above.
Calculating the correlation coefficient between any two users includes:
to obtain the similarity between two user behavior vectors, flattening each user behavior feature matrix into a row vector and calculating the similarity with the following formula:
[Formula image BDA0002800457770000083 not reproduced]
X and Y are the behavior vectors of the two users;
n denotes the length of the vectors;
EX and EY denote the means of the two vectors. The correlation coefficient is a centered cosine similarity: it indicates whether the two vectors point in similar directions in space, and it also compensates for the different scales of the components.
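The remark that the correlation coefficient is a centered cosine similarity can be written out explicitly. The expression below assumes that the formula is the standard Pearson coefficient, which is consistent with the symbols X, Y, n, EX, and EY; the summation index k and the all-ones vector are notation introduced here.

```latex
\[
r(X, Y)
  = \frac{\sum_{k=1}^{n} (X_k - EX)(Y_k - EY)}
         {\sqrt{\sum_{k=1}^{n} (X_k - EX)^2}\,\sqrt{\sum_{k=1}^{n} (Y_k - EY)^2}}
  = \cos\bigl(X - EX\,\mathbf{1},\; Y - EY\,\mathbf{1}\bigr),
\qquad
EX = \frac{1}{n}\sum_{k=1}^{n} X_k,\quad
EY = \frac{1}{n}\sum_{k=1}^{n} Y_k .
\]
```

Under this reading, $r(aX + b\,\mathbf{1},\, Y) = r(X, Y)$ for any $a > 0$ and any $b$, so two users whose behavior differs only in overall volume or offset still receive the maximum similarity.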
S212, finding the two users with the greatest similarity according to the inter-user similarity matrix, and clustering the two users into one class;
S213, calculating the similarity between that class and the other users, updating the similarity matrix for the users merged into the class, and repeating the calculation iteratively;
and S214, once the iteration reaches a preset threshold, stopping the clustering process and determining that any user who remains separated from the intranet group exhibits abnormal behavior.
The present invention also provides a further implementation. Its method steps may be executed by the terminal device, by the server (for example, the server 105 in FIG. 2 described above), or interactively by the terminal device and the server, but the present disclosure is not limited thereto.
The traffic five-tuple used by each user is extracted from a probe on the communication network, and the traffic data of the user's accesses is captured; the traffic five-tuple includes the intranet user IP and port and the target server IP and port.
Behavior features are obtained by aggregating over the target IPs that the user accesses, at the granularity of a specified time window;
a behavior feature matrix is generated for each source IP from the behavior features.
Behavior information about the intranet user's accesses to target servers within a preset time period is acquired.
The behavior information about accessing target servers includes the traffic information, time information, and application protocol information generated when the intranet user accesses each target server.
The target servers that the intranet user connects to within the preset time period are counted to form a target server IP list;
for each intranet user within the preset time period, the IP list of connected target servers, the total uplink and downlink traffic generated by connections to each target server, and the total number of connections to each target server are counted;
for each connection of the intranet user within the preset time period, the end time minus the start time is computed, giving the total duration of connections to each target server; the time elapsed between the last connection to each target server and the current time is also counted; together these represent the activity level;
and the number of application-layer protocols commonly used in connections between the intranet user and each target server within the time period is counted.
A matrix is then constructed by aggregating the obtained statistical features by user source IP and by accessed server destination IP. One behavior feature matrix is generated for each user.
The aggregated user behavior feature matrix is shown in Table 1 below:
TABLE 1
Feature                 dstip_1   dstip_2   dstip_3   dstip_4   ……   dstip_n
Connection frequency
Uplink traffic
Downlink traffic
Protocol types
Connection duration
Last connection
Calculating similarity
a) Feature engineering
Starting from the preprocessed data produced by the behavior feature matrix step, feature engineering is applied to the feature matrix so that it meets the requirements of the algorithm:
in the coarse-grained case, an adjacent-element column transformation is applied. As a total aggregation of behavior information over the preset time period, the original feature matrix stores the user's long-term behavior information, but the correlation between different target servers has not yet been mined. Two servers whose addresses are adjacent within a number segment often share more correlated information, so the following transformation is applied to the behavior feature matrix of Table 1:
[Formula image BDA0002800457770000111 not reproduced]
Finally, a behavior feature matrix whose size equals the number of accessed target servers is output.
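The transformation formula itself is an image that is not reproduced in this text, so its exact form is unknown. Purely as an illustration of how cross-multiplying feature columns can turn a 6 x num_distip matrix into a num_distip x num_distip square matrix, the sketch below computes the sum over feature rows i of p[i][m] * p[i][n]. This Gram-matrix form is an assumption chosen because it matches the stated output size and index ranges; it should not be read as the patent's actual formula, and the name `column_cross_products` is hypothetical.

```python
def column_cross_products(p):
    """Cross-multiply the columns of a feature matrix.

    p is a list of feature rows, each with one entry per target server.
    Returns the square matrix g with g[m][n] = sum_i p[i][m] * p[i][n].
    ASSUMPTION: this is one possible reading of the transformation, not the
    formula from the source, which is not available in this text.
    """
    if not p or not p[0]:
        raise ValueError("feature matrix must be non-empty")
    rows, cols = len(p), len(p[0])
    if any(len(row) != cols for row in p):
        raise ValueError("all feature rows must have the same length")
    return [
        [sum(p[i][m] * p[i][n] for i in range(rows)) for n in range(cols)]
        for m in range(cols)
    ]


# Hypothetical matrix: 2 feature rows, 3 target servers.
p = [
    [1, 2, 0],
    [3, 0, 1],
]
for row in column_cross_products(p):
    print(row)
```

In this reading, off-diagonal entries are non-zero only when the user is active on both servers in some feature row, which is one way to encode an association between target servers.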
b) Correlation coefficient between behavior vectors
Assuming two users Q and C, the correlation coefficient between them is calculated:
in order to facilitate the calculation process, the behavior feature matrix obtained in the feature engineering step is straightened into a row of vectors, then the calculation of the following formula is carried out, and under the condition of only using flow data, the similarity of the behaviors between the users Q and C is in a form of a table 3:
Figure BDA0002800457770000112
TABLE 3
Example user similarity matrix
     Q
C    Similarity(Q,C)
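The similarity step can be illustrated with the sketch below. It assumes the correlation coefficient is the standard Pearson coefficient, which matches the quantities named in the claims (two behavior vectors, their length, and their means) but is not confirmed by the formula image itself. The function names and the handling of constant vectors are choices made for the example.

```python
import math


def flatten(matrix):
    """Flatten a 2-D list row by row into a single vector."""
    return [x for row in matrix for x in row]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("vectors must be non-empty and of equal length")
    ex = sum(x) / n
    ey = sum(y) / n
    cov = sum((a - ex) * (b - ey) for a, b in zip(x, y))
    var_x = sum((a - ex) ** 2 for a in x)
    var_y = sum((b - ey) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant vector is treated as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def similarity_matrix(user_matrices):
    """Return (users, sim) where sim[i][j] is the similarity of users i and j."""
    users = sorted(user_matrices)
    vecs = {u: flatten(user_matrices[u]) for u in users}
    sim = [[pearson(vecs[a], vecs[b]) for b in users] for a in users]
    return users, sim
```

The result is symmetric, so in practice only the upper triangle needs to be computed; the full matrix is kept here for readability.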
Locating anomalous users
Abnormal users within the intranet group are located according to the similarity.
a) Based on the similarity matrix between all users obtained in the above steps, each of the N users is initially assumed to form a class of its own.
b) The two users with the maximum similarity in the matrix are merged into one class, and the distance between that class and each of the other users is then calculated. If class K is formed by merging users I and J, the distance between user H and class K is as follows: .
c) The similarity matrix is updated, and step b) is then repeated iteratively.
A threshold is set according to the actual service conditions, and the calculation stops once the expected effect is achieved. Any user who still forms a class alone at that point is inferred to be an abnormal user.
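Steps a) to c) amount to a bottom-up (agglomerative) clustering, which might be sketched as follows. The rule for updating the similarity between a merged class and the remaining classes is not recoverable from the text, so a size-weighted average (average linkage) is used here purely as an assumption. The stopping rule, a minimum similarity `threshold`, and the function name `find_isolated_users` are likewise illustrative.

```python
def find_isolated_users(users, sim, threshold):
    """Merge the most similar classes until no pair reaches `threshold`.

    Returns the users that still form a class on their own.
    Average linkage is an assumption, not the source's formula.
    """
    clusters = {i: [u] for i, u in enumerate(users)}
    # Pairwise similarities keyed by (smaller id, larger id).
    s = {(i, j): sim[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(users)

    while len(clusters) > 1:
        (i, j), best = max(s.items(), key=lambda kv: kv[1])
        if best < threshold:
            break  # no remaining pair is similar enough to merge
        ni, nj = len(clusters[i]), len(clusters[j])
        merged = clusters.pop(i) + clusters.pop(j)
        for h in clusters:
            s_ih = s.pop((min(i, h), max(i, h)))
            s_jh = s.pop((min(j, h), max(j, h)))
            # New ids only grow, so h < next_id always holds.
            s[(h, next_id)] = (ni * s_ih + nj * s_jh) / (ni + nj)
        del s[(i, j)]
        clusters[next_id] = merged
        next_id += 1

    return [members[0] for members in clusters.values() if len(members) == 1]
```

A higher threshold stops merging earlier and therefore flags more users, so the value would need tuning against the actual service conditions mentioned above.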
The invention reproduces multiple types of service scenarios and addresses several pain points. In a real scenario, the degree of similarity between users can be measured from their daily behavior characteristics, which reduces the false alarm rate of anomaly detection. Abnormal users hidden in a group can be identified, and the security of data information is safeguarded.
The invention also provides a terminal device for implementing the abnormal user detection method based on behavior aggregation characteristics, comprising: a memory for storing a computer program and the abnormal user detection method based on behavior aggregation characteristics; and a processor for executing the computer program and the abnormal user detection method based on behavior aggregation characteristics, so as to implement the steps of the method.
Based on the method, the invention also provides a readable storage medium carrying the abnormal user detection method based on behavior aggregation characteristics. The readable storage medium stores a computer program which, when executed by a processor, implements the steps of the abnormal user detection method based on behavior aggregation characteristics.
The terminal device includes a Central Processing Unit (CPU), which can execute the abnormal user detection method based on behavior aggregation characteristics according to a program stored in a Read-Only Memory (ROM) or a program loaded from a storage section into a Random Access Memory (RAM). The RAM also stores various programs and data necessary for system operation. The CPU, ROM, and RAM are connected to each other via a bus. An input/output (I/O) interface is also connected to the bus.
The following components are connected to the I/O interface: an input section including a keyboard, a mouse, and the like; an output section including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker and the like; a storage section including a hard disk and the like; and a communication section including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section performs communication processing via a network such as the Internet. A drive is also connected to the I/O interface as needed. A removable medium such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive as necessary, so that a computer program read from it is installed into the storage section as necessary.
For the terminal device implementing the abnormal user detection method based on behavior aggregation features, the units and algorithm steps of each example described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. In the above description, the components and steps of each example have been described generally in terms of their functions in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An abnormal user detection method based on behavior aggregation characteristics, characterized by comprising the following steps:
acquiring user behavior information in a preset time period;
aggregating the user's feature attributes within the preset time period based on the access address information;
performing adjacent element column transformation on the original behavior features to obtain a behavior square matrix whose size is the number of target servers, and flattening the matrix of each user into a row vector;
calculating the correlation coefficient between every two users as the behavior similarity of the two users in the intranet group within a period of time;
searching for the two users with the maximum similarity according to the similarity matrix between the users, and clustering the two users into one class;
calculating the similarity between the class and the other users, updating the similarity matrix for the users aggregated into the class, and then repeating the iterative calculation;
and stopping the clustering process after the repeated iterative calculation reaches a preset threshold, and determining that a user who is separated from the intranet group has abnormal behavior.
2. The abnormal user detection method based on behavior aggregation feature of claim 1,
extracting the flow five-tuple used by a user from a detector of a communication network, and capturing the traffic data accessed by the user;
aggregating behavior features by the target IPs accessed by the user at the specified time-window granularity;
generating a behavior feature matrix for each source IP based on the behavior features.
3. The abnormal user detection method based on behavior aggregation feature of claim 1,
the step of adjacent element column transformation comprises the following steps:
cross-multiplying the adjacent elements of the feature dimensions of user IP1 and user IP2 and assembling them into a new behavior feature square matrix:
Figure FDA0002800457760000011
wherein m and n both belong to the interval [1, num_distip];
m and n are used to represent subscripts of elements, column indexes of matrix, respectively;
i represents a row index of the behavior matrix, a positive integer not exceeding 6;
p denotes an element in the above-described original behavior feature matrix.
4. The abnormal user detection method based on behavior aggregation feature of claim 1,
calculating a correlation coefficient between any two users;
for the similarity between two user behavior vectors, flattening the user behavior feature matrix into a row vector and calculating the similarity with the following formula:
Figure FDA0002800457760000021
X and Y are the behavior vectors of the two users;
n represents the length of the vector;
EX, EY represent the mean of the two vectors.
5. The abnormal user detection method based on behavior aggregation feature of claim 1,
before the user behavior information in the preset time period is acquired, the method further comprises:
acquiring behavior information of the intranet user accessing the target servers within the preset time period.
6. The abnormal user detection method based on behavior aggregation features according to claim 5,
the behavior information of accessing the target server comprises: traffic information, time information, and application protocol information generated when the intranet user accesses each target server.
7. The abnormal user detection method based on behavior aggregation features according to claim 5,
counting the target servers connected to by an intranet user within a preset time period, and forming a target server IP list;
and counting the respective totals of uplink and downlink traffic generated when the intranet user connects to each target server, and the total number of connections to each target server.
8. The abnormal user detection method based on behavior aggregation features according to claim 5,
counting, within a preset time period, the total duration of the intranet user's connections to each target server and the time elapsed between the intranet user's last connection to the target server and the current time, so as to characterize activity;
and counting the number of types of application-layer protocols used in connections between the intranet user and the target server within the preset time period.
9. A terminal device for implementing an abnormal user detection method based on behavior aggregation characteristics, characterized by comprising:
a memory for storing a computer program and the abnormal user detection method based on behavior aggregation characteristics;
a processor for executing the computer program and the abnormal user detection method based on behavior aggregation characteristics to realize the steps of the abnormal user detection method based on behavior aggregation characteristics as claimed in any one of claims 1 to 8.
10. A readable storage medium having a behavior aggregation feature based abnormal user detection method, wherein the readable storage medium has stored thereon a computer program, which is executed by a processor to implement the steps of the behavior aggregation feature based abnormal user detection method according to any one of claims 1 to 8.
CN202011347823.3A 2020-11-26 2020-11-26 Abnormal user detection method based on behavior aggregation characteristics, terminal and storage medium Active CN112488175B (en)

Publications (2)

Publication Number Publication Date
CN112488175A (en) 2021-03-12
CN112488175B (en) 2023-06-23

Family

ID=74935842

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant