CN115577348A - User abnormal operation behavior identification method and device - Google Patents


Info

Publication number
CN115577348A
Authority
CN
China
Prior art keywords
user
behavior
users
deviation
baseline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110764472.4A
Other languages
Chinese (zh)
Inventor
应越菲
徐良
吴永卫
王珺玮
林昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Zhejiang Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202110764472.4A priority Critical patent/CN115577348A/en
Publication of CN115577348A publication Critical patent/CN115577348A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/554Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Abstract

The invention discloses a method and a device for identifying abnormal operation behaviors of users. The method comprises: constructing a behavior feature vector for each user according to the behavior data of the users within a specified time; clustering the users according to their behavior feature vectors to obtain user groups of different categories; constructing a user behavior baseline and a user group behavior baseline; calculating deviation values of the users according to the behavior feature vectors of the users, the user behavior baselines and/or the user group behavior baselines; and clustering according to the deviation values of each user to identify abnormal operation behaviors of the users. Because the behavior feature vector is constructed from the behavior data of a user within a specified time, different behavior feature vectors can be constructed for different times, so the method adapts to user behavior data at different times and identifies abnormal operation behaviors accurately and flexibly. The deviation values of a user are calculated over different deviation dimensions and then clustered to obtain the abnormal operation behaviors.

Description

User abnormal operation behavior identification method and device
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for identifying abnormal operation behaviors of a user.
Background
Currently, security event analysis systems identify internal threats mainly by detecting internal threat events with rules and thresholds, but in specific scenarios many of the resulting alarms are invalid. For example, in security analysis aimed at preventing leakage of sensitive customer data, if the frequent access of internal business personnel to customer detail records is judged only against a fixed rule or threshold, a large number of false alarms are generated. Suppose a rule states that a salesperson who accesses customer details more than 1000 times a day poses a data leakage risk. Business volume differs between business outlets, some large and some small, and an outlet is busy in some time periods and idle in others, so judging abnormal behavior solely by a fixed rule or threshold causes a large number of false alarms. In addition, existing security event analysis systems need many personnel with advanced security experience to operate, so the required manpower and material resources are substantial.
Disclosure of Invention
In view of the above, the present invention is proposed to provide a user abnormal operation behavior recognition method and apparatus that overcomes or at least partially solves the above problems.
According to an aspect of the present invention, there is provided a method for identifying abnormal operation behavior of a user, including:
constructing a behavior feature vector for each user according to the behavior data of the users within a specified time;
clustering each user according to the behavior feature vector of each user to obtain user groups of different categories;
establishing a user behavior baseline, and establishing a user group behavior baseline;
calculating deviation values of all users with different deviation dimensions according to the behavior feature vectors of all users, the user behavior baselines and/or the user group behavior baselines;
and clustering according to the deviation value of each user, and identifying abnormal operation behaviors of the users.
According to another aspect of the present invention, there is provided a user abnormal operation behavior recognition apparatus including:
a first construction module, adapted to construct a behavior feature vector for each user according to the behavior data of the users within a specified time;
a clustering module, adapted to cluster the users according to their behavior feature vectors to obtain user groups of different categories;
a second construction module, adapted to construct a user behavior baseline and a user group behavior baseline;
a calculation module, adapted to calculate deviation values of each user in different deviation dimensions according to the behavior feature vectors of the users, the user behavior baselines and/or the user group behavior baselines;
and an identification module, adapted to perform clustering according to the deviation values of each user and identify abnormal operation behaviors of the users.
According to still another aspect of the present invention, there is provided an electronic device including a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to the above method for identifying abnormal operation behaviors of users.
According to still another aspect of the present invention, a computer storage medium is provided, where at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to execute an operation corresponding to the method for identifying an abnormal operation behavior of a user as described above.
According to the method and the device for identifying abnormal operation behaviors of users, a behavior feature vector is constructed for each user from the behavior data of the user within a specified time. Different behavior feature vectors can be constructed for different times, so the method and the device adapt to user behavior data at different times and identify abnormal operation behaviors accurately and flexibly. The deviation values of a user are calculated over different deviation dimensions and then clustered to obtain the abnormal operation behaviors.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 shows a flow diagram of a method for identifying abnormal operation behavior of a user according to one embodiment of the invention;
FIG. 2 is a schematic diagram of a framework structure for identifying abnormal operation behaviors of a user;
FIG. 3 shows a functional block diagram of a user abnormal operation behavior recognition apparatus according to an embodiment of the present invention;
FIG. 4 shows a schematic structural diagram of an electronic device according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 shows a flowchart of a method for identifying abnormal operation behavior of a user according to an embodiment of the present invention. As shown in fig. 1, the method for identifying abnormal operation behavior of a user specifically includes the following steps:
Step S101: construct a behavior feature vector for each user according to the behavior data of the users within a specified time.
In this embodiment, the operation behavior of users is analyzed based on UEBA (user and entity behavior analytics). UEBA is essentially a data-driven security analysis technique and requires collecting a large and broad set of user behavior data. In the big data era, data is the basis of all analysis, and scarce or low-quality input necessarily yields low-value output. However, this does not mean that more data is always better: collecting data that is irrelevant to the scenario only increases the system burden.
The basis for analyzing user operation behavior is data. Data collection presupposes that the collected data matches the specific scenario to be analyzed, and high-quality, varied data is the core of user and entity behavior analysis. Data usable for this analysis includes security logs, network traffic, threat intelligence, and identity- and access-related logs; as much data relevant to the user scenario as possible should be ingested.
These data fall roughly into three types: user identity data, entity identity data, and user behavior data. User identity data is divided into two categories: real identity data, such as profile data, and virtual identity data, such as user registration data on the network. Entity identity data is the unique identifier of a user in the network, such as an IP address or a MAC address. User behavior data can be divided into network behavior information and terminal behavior information.
The overall framework of this embodiment can be divided into four layers: a data layer, an ETL (Extract-Transform-Load, a data warehouse technology) layer, an analysis engine layer, and a display layer, as shown in FIG. 2. The data layer and the ETL layer mainly collect data and clean it into high-quality, multi-dimensional data that suits the security analysis scenario. The analysis engine layer processes the data and ultimately identifies abnormal operation behaviors of users. The display layer presents the abnormal operation data of users (not shown in the figure).
The analysis engine layer first constructs the behavior feature vector of each user, that is, it extracts behavior features. This step is the basis of the whole user behavior analysis model, and the relevant behavior data must be identified in light of actual business requirements.
This embodiment acquires the operation behavior data generated by each user within a specified time unit. Taking a day as the time unit, for example, the operation behavior data generated by each user at each time node of the day is extracted. The operation behaviors include account login, device connection, internet access, database access, file transfer, mail sending and receiving, and the like. A day is used as the time unit here, but this embodiment is not limited thereto, and the unit may be set according to the implementation. Processed log data is obtained from the ETL layer, such as web logs (http), device logs (device), business system logs (login), file logs (file), and the like. The log data is merged by date to form the behavior sequence features of the user. For example, the behavior data of a certain user on 2020/02/02 is as follows: 07. According to the operation behavior data, at least one behavior feature vector of the user can be constructed comprehensively from multiple angles.
The multiple angles include, for example: account: the account type, the number of logins in one day, the first login time, the last logout time, the total login duration, and the like; device fingerprint: the number of connected devices, the device connection times, and the like; internet log: the number of URL accesses, and the like; database log: the number of database accesses, and the like; file log: the format, size, and number of files uploaded or downloaded by the user, and the like; mail log: the number of mails sent by the user, whether the sending account is a company or private mailbox, whether the recipient account is a company or private mailbox, the size of the sent mail, whether the mail has attachments, the number of attachments, and the like. The operation behavior data of each user within the specified time unit is recorded in numerical form to construct at least one behavior feature vector for each user. For example, (7, 41, 16, 55, 9.36, 10, 2, 1, 4, 0, 9, 0, 6) indicates that a user first logged in at 7:41, last logged out at 16:55, was logged in for a total of 9 hours and 36 minutes, accessed URL addresses 10 times that day, logged in to a database 2 times, accessed specified data 1 time, and downloaded specified data 1 time; the remaining components (4, 0, 9, 0, 6) are mail-related counts, such as the number of mails sent and the number of recipient addresses.
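As an illustration of recording one day of operation behavior in numerical form, the following is a minimal sketch. The `DailyActivity` record, its field names, and the choice of features are assumptions made for this example and are not taken from the embodiment; total login duration is expressed here in minutes.

```python
from dataclasses import dataclass


@dataclass
class DailyActivity:
    """One user's activity for one day (all field names are hypothetical)."""
    first_login: str   # "HH:MM"
    last_logout: str   # "HH:MM"
    url_accesses: int
    db_logins: int
    mails_sent: int


def split_clock(hhmm: str) -> tuple[int, int]:
    """Split "HH:MM" into integer hours and minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours), int(minutes)


def build_feature_vector(day: DailyActivity) -> list[float]:
    """Flatten one day of activity into a numeric behavior feature vector."""
    login_h, login_m = split_clock(day.first_login)
    logout_h, logout_m = split_clock(day.last_logout)
    # Assumed definition: time between first login and last logout, in minutes.
    online_minutes = (logout_h * 60 + logout_m) - (login_h * 60 + login_m)
    return [
        float(login_h), float(login_m),
        float(logout_h), float(logout_m),
        float(online_minutes),
        float(day.url_accesses),
        float(day.db_logins),
        float(day.mails_sent),
    ]


vector = build_feature_vector(DailyActivity("07:41", "16:55", 10, 2, 4))
print(vector)
```

Encoding clock times as separate hour and minute components mirrors the example vector in the text; an implementation could equally use minutes since midnight.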
Step S102: cluster the users according to their behavior feature vectors to obtain user groups of different categories.
In this embodiment, an unsupervised machine learning algorithm (such as K-Means, One-Class SVM, or isolation forest) is used to cluster the users according to their behavior feature vectors, so that users with similar behavior patterns are placed in one dynamic group, yielding user groups of multiple categories. The behavior feature vectors of the users within each user group are similar. For example, the behavior patterns of salespeople, technicians, and managers differ considerably. Specifically, for a given operation behavior, users with different role attributes access the database a different number of times and for different durations, so clustering can divide them into different user groups.
Before clustering, the behavior feature vectors of different users may have different dimensionality. For example, the vector constructed for user A in a specified time unit may have 10 dimensions while that for user B has 15, and this mismatch makes clustering harder. Therefore, the dimensionality of each user's behavior feature vector is counted first to obtain the highest dimensionality; for users A and B, this is 15. Dimension completion is then performed on the behavior feature vectors of all users according to the highest dimensionality, specifically by filling a specified completion value at specified positions of a user's behavior feature vector. For example, filling the end of user A's behavior feature vector with a specified completion value such as 0 brings the vectors of all users to the same dimensionality. The equal-length behavior feature vectors of all users are then clustered with a specified algorithm, such as K-Means, into K categories, that is, K user groups, where K is chosen according to the actual business situation; the behavior feature vectors of the users within each user group are similar.
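The dimension completion and clustering described above can be sketched as follows. This is a simplified illustration rather than the embodiment's implementation: the function names are hypothetical, the completion value 0 is appended at the end of each vector, and the K-Means loop uses a deterministic initialization (the first k points) purely to keep the example reproducible.

```python
import math


def pad_vectors(vectors, fill=0.0):
    """Pad every vector at its end with `fill` up to the highest dimensionality."""
    width = max(len(v) for v in vectors)
    return [list(v) + [fill] * (width - len(v)) for v in vectors]


def kmeans(points, k, iterations=20):
    """Plain Lloyd-style K-Means; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]  # assumed deterministic start
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


raw_vectors = [[1, 2], [1, 3, 0.5], [20, 30], [22, 29, 1]]
padded = pad_vectors(raw_vectors)
labels, centroids = kmeans(padded, k=2)
print(padded)
print(labels)
```

In practice the features would usually be scaled before clustering, since Euclidean distance is dominated by the components with the largest numeric range.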
Step S103: construct a user behavior baseline and a user group behavior baseline.
After the users are clustered into user groups, behavior baselines characterizing normal behavior can be set for individual users and for user groups. A behavior baseline can be constructed from statistical features of the user's behavior feature data, such as the mean and standard deviation. Specifically, for any user, a large number of behavior feature vectors for different specified time units are constructed from the user's historical operation behavior data to form the user's behavior feature matrix; the mean of each feature is then calculated, and the resulting mean behavior feature vector serves as the user behavior baseline, that is, the baseline representing the user's normal behavior. For any user group, the mean of each feature over the user behavior baselines of the users in the group is calculated and used as the user group behavior baseline.
The user behavior baseline is constructed dynamically: it can be adjusted according to factors such as time period or region, with the user group behavior baseline adjusted accordingly, so that the baselines better adapt to changes in user behavior under different conditions.
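The mean-based baselines can be illustrated with a short sketch. The user names and numbers below are invented for the example, and only the mean is computed; a standard deviation per feature could be added in the same way.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


# Hypothetical history: one behavior feature vector per user per day.
history = {
    "alice": [[10, 2], [14, 4]],
    "bob": [[20, 6], [24, 8]],
}

# User behavior baseline: mean of the user's own historical vectors.
user_baselines = {user: mean_vector(days) for user, days in history.items()}

# User group behavior baseline: mean of the baselines of the group's members.
group_baseline = mean_vector(list(user_baselines.values()))

print(user_baselines)
print(group_baseline)
```

Recomputing these means over a sliding window of recent days is one simple way to obtain the dynamic adjustment mentioned above.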
Step S104: calculate deviation values of each user in different deviation dimensions according to the behavior feature vectors of the users, the user behavior baselines and/or the user group behavior baselines.
The deviation value of each user is calculated by an algorithm such as a machine learning or deep learning algorithm, and may be composed of several different deviation dimensions. The deviation values of the different deviation dimensions include, for example: the deviation between the user's behavior feature vector and the user behavior baseline; the deviation between the user's behavior feature vector and the user group behavior baseline; the deviation between the user's behavior feature vector in a specified time period and the behavior feature vector predicted for that period; and the deviation between the behavior baseline of the user's own user group and the behavior baselines of other user groups.
The deviation value of each user can be calculated according to the following formula:
[Formula image BDA0003150579980000071 is not reproduced in this text.]
where error is the deviation value, d is the number of dimensions of the feature vector, and W_i is the weighting factor of the i-th dimension. When calculating the deviation between the user's behavior feature vector and the user behavior baseline, X_i is the i-th component of the user behavior baseline and x_i is the i-th component of the user's behavior feature vector. When calculating the deviation between the user's behavior feature vector and the user group behavior baseline, X_i is the i-th component of the behavior baseline of the user's group and x_i is the i-th component of the user's behavior feature vector. When calculating the deviation between the user's behavior feature vector in a specified time period and the predicted behavior feature vector for that period, X_i is the i-th component predicted for a future time period by a prediction algorithm fitted to the user's behavior feature vectors over a historical period, and x_i is the i-th component actually observed for the user in that same period; the prediction algorithm may be a time-series machine learning algorithm or the like, which is not limited here. Further, when calculating the deviation between the behavior baseline of the user's group and the behavior baselines of other user groups, X_i is the i-th component of the behavior baseline of the user's group and x_i is the i-th component of the behavior baseline of another group to which the user does not belong.
Users in the same group have similar behavior baselines, while the operation behaviors of users in different groups differ considerably, as reflected in their network and terminal behaviors. The deviation value for this deviation dimension can therefore be calculated between two groups, either with the above formula or with the Euclidean distance, over different dimensions such as the number of accesses, the access duration, or the average access duration, and then used in cluster-based identification.
Each user is evaluated with the above formula to obtain that user's deviation values. Each user has multiple deviation values for the different deviation dimensions.
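Because the formula itself is not reproduced in this text, the following sketch only illustrates how a per-dimension weighted deviation could be computed from the symbols defined above. The specific form used here, a weighted sum of squared differences, is an assumption and may differ from the formula in the original figure; the baselines, observation, and weights are invented values.

```python
def weighted_deviation(baseline, observed, weights=None):
    """Assumed form: sum over i of W_i * (X_i - x_i)^2.

    baseline -> X_i, observed -> x_i, weights -> W_i (defaults to all ones).
    """
    if len(baseline) != len(observed):
        raise ValueError("baseline and observed must have the same dimension")
    if weights is None:
        weights = [1.0] * len(baseline)
    return sum(w * (X - x) ** 2 for w, X, x in zip(weights, baseline, observed))


user_baseline = [12.0, 3.0]    # hypothetical user behavior baseline
group_baseline = [17.0, 5.0]   # hypothetical user group behavior baseline
today = [15.0, 3.0]            # hypothetical behavior feature vector
weights = [0.5, 2.0]           # hypothetical W_i

# One deviation value per deviation dimension.
deviation_values = [
    weighted_deviation(user_baseline, today, weights),
    weighted_deviation(group_baseline, today, weights),
]
print(deviation_values)
```

The same function covers the other deviation dimensions by passing a predicted vector, or another group's baseline, in place of one of the arguments.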
Step S105: perform clustering according to the deviation values of each user and identify abnormal operation behaviors of the users.
For any user, the multiple calculated deviation values are concatenated into a deviation value feature matrix. The deviation value feature matrices of the users are clustered with an unsupervised machine learning algorithm to obtain abnormal data. The abnormal data is then sorted in descending order, and the top-ranked items are traced back and investigated to determine whether an abnormal operation behavior actually occurred, thereby confirming the abnormal operation behavior of the corresponding user.
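The ranking step can be sketched as below. The embodiment does not specify how an anomaly score is derived from the clustering result, so this example substitutes a deliberately simple stand-in: each user's deviation vector is scored by its Euclidean distance from the centroid of all deviation vectors, and users are sorted in descending order of that score. The user identifiers and values are invented.

```python
import math


def rank_by_deviation(deviation_matrix):
    """Score each user by distance from the overall centroid; highest first."""
    rows = list(deviation_matrix.values())
    centroid = [sum(col) / len(rows) for col in zip(*rows)]
    scores = {
        user: math.dist(row, centroid)
        for user, row in deviation_matrix.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# One row of deviation values (one per deviation dimension) per user.
deviation_matrix = {
    "u1": [1.0, 1.0, 0.5],
    "u2": [1.2, 0.8, 0.4],
    "u3": [0.9, 1.1, 0.6],
    "u4": [9.0, 12.0, 7.0],
}

ranking = rank_by_deviation(deviation_matrix)
for user, score in ranking:
    print(user, round(score, 2))
```

The users at the head of `ranking` are the candidates for backtracking and investigation; an isolation forest or One-Class SVM score could replace the centroid distance without changing the ranking logic.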
Optionally, this embodiment further includes the following steps:
Step S106: construct an abnormal behavior database according to the identified abnormal operation behaviors of the users.
Step S107: train on the abnormal behavior database to generate an abnormal behavior judgment model for identifying abnormal operation behaviors of users.
In the initial stage of implementing this embodiment, user labels cannot be established directly, so backtracking and investigation must be carried out in the order of the abnormal data ranking to determine whether an abnormal operation behavior actually occurred. After data has accumulated for a period of time, the identified abnormal operation behaviors are used to build an abnormal behavior database. The database serves as the sample set, with each user labeled according to whether an abnormal operation behavior exists. A machine learning algorithm is then trained on these samples to model and analyze the characteristics of abnormal operation behaviors and generate an abnormal behavior judgment model, so that abnormal operation behaviors can be identified accurately and the identification accuracy is further improved.
According to the method for identifying abnormal operation behaviors of users provided by the invention, a behavior feature vector is constructed for each user from the behavior data of the user within a specified time. Different behavior feature vectors can be constructed for different times, so the method adapts to user behavior data at different times and identifies abnormal operation behaviors accurately and flexibly. The deviation values of a user are calculated over different deviation dimensions and then clustered to obtain the abnormal operation behaviors.
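As a sketch of training a judgment model on labeled samples, the example below uses a nearest-centroid classifier. The embodiment does not name a specific machine learning algorithm, so this choice, the function names, and the sample values are all assumptions made for illustration.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train_centroid_model(samples):
    """samples: list of (deviation_vector, label). Returns label -> centroid."""
    by_label = {}
    for vector, label in samples:
        by_label.setdefault(label, []).append(vector)
    return {label: mean_vector(rows) for label, rows in by_label.items()}


def predict(model, vector):
    """Return the label whose centroid is closest to `vector`."""
    return min(model, key=lambda label: math.dist(vector, model[label]))


# Hypothetical labeled samples accumulated after manual investigation.
labeled_samples = [
    ([0.5, 1.0], "normal"),
    ([0.8, 0.6], "normal"),
    ([9.0, 12.0], "abnormal"),
    ([11.0, 10.0], "abnormal"),
]

model = train_centroid_model(labeled_samples)
prediction = predict(model, [10.0, 9.0])
print(prediction)
```

Any supervised classifier could take the place of the nearest-centroid rule; the point of the sketch is only the flow from investigated, labeled samples to a model that scores new deviation vectors.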
Fig. 3 shows a functional block diagram of a user abnormal operation behavior recognition apparatus according to an embodiment of the present invention. As shown in fig. 3, the apparatus for identifying abnormal operation behavior of a user includes the following modules:
the first construction module 310 is adapted to construct behavior feature vectors of each user according to behavior data of the user in a specified time;
the clustering module 320 is adapted to perform clustering processing on each user according to the behavior feature vector of each user to obtain user groups of different categories;
a second construction module 330, adapted to construct user behavior baselines, and to construct user group behavior baselines;
the calculating module 340 is adapted to calculate deviation values of the users with different deviation dimensions according to the behavior feature vectors of the users, the user behavior baselines and/or the user group behavior baselines;
and the identification module 350 is adapted to perform clustering processing according to the deviation value of each user and identify abnormal operation behaviors of the user.
Optionally, the first construction module 310 is further adapted to:
acquiring operation behavior data generated by each user in a specified time unit; the operation behavior comprises an account login behavior, a connection equipment behavior, an internet surfing behavior, a database access behavior, a file transmission behavior and/or a mail receiving and sending behavior;
and recording the operation behavior data of each user in a specified time unit in a numerical form, and constructing to obtain at least one behavior characteristic vector of each user.
Optionally, the clustering module 320 is further adapted to:
counting the characteristic dimension of the behavior characteristic vector of each user to obtain the highest characteristic dimension;
according to the highest characteristic dimension, performing characteristic dimension completion processing on the behavior characteristic vectors of all the users; wherein the completion processing specifically comprises: filling the specified positions of the behavior feature vectors of the users with specified completion values;
clustering the behavior characteristic vectors of all users by using a specified algorithm to obtain a plurality of categories of user groups; and the behavior characteristic vectors of the users in each user group have similarity.
Optionally, the second construction module 330 is further adapted to:
aiming at any user, calculating to obtain a behavior feature vector mean value according to a plurality of behavior feature vectors of the user, and taking the behavior feature vector mean value as a user behavior baseline;
and aiming at any user group, calculating to obtain an average value as a user group behavior baseline according to the user behavior baseline of each user in the group.
Optionally, the calculation module 340 is further adapted to:
respectively calculating deviation values of the users according to different deviation dimensions according to the behavior characteristic vectors of the users, the user behavior baselines and/or the user group behavior baselines; wherein each user has a plurality of deviation values of different deviation dimensions; the deviation values for the different deviation dimensions include: the deviation value of the behavior characteristic vector of the user and the behavior baseline of the user, the deviation value of the behavior characteristic vector of the user and the behavior baseline of the user group, the deviation value of the behavior characteristic vector of the user in a specified time period and the predicted behavior characteristic vector of the user in the time period, and/or the deviation value of the behavior baseline of the user in the user group where the user is located and the behavior baselines of other user groups.
Optionally, the identification module 350 is further adapted to:
splicing the multiple calculated deviation values of the user into a deviation value characteristic matrix aiming at any user;
clustering the deviation value characteristic matrix of each user to obtain abnormal data;
and sorting the abnormal data in descending order, and tracing back and investigating the top-ranked abnormal data so as to confirm the abnormal operation behavior of the corresponding user.
Optionally, the apparatus further comprises: the training module 360 is suitable for constructing an abnormal behavior database according to the identified abnormal operation behaviors of the user; and training according to the abnormal behavior database to generate an abnormal behavior judgment model so as to identify the abnormal operation behavior of the user.
The descriptions of the modules refer to the corresponding descriptions in the method embodiments, and are not repeated herein.
The application also provides a nonvolatile computer storage medium, wherein the computer storage medium stores at least one executable instruction, and the computer executable instruction can execute the method for identifying the abnormal operation behavior of the user in any method embodiment.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the electronic device.
As shown in fig. 4, the electronic device may include: a processor (processor) 402, a communication Interface 404, a memory 406, and a communication bus 408.
Wherein:
the processor 402, communication interface 404, and memory 406 communicate with each other via a communication bus 408.
The communication interface 404 is configured to communicate with network elements of other devices, such as clients or other servers.
The processor 402 is configured to execute the program 410, and may specifically perform relevant steps in the above-described method for identifying an abnormal operation behavior of a user.
In particular, program 410 may include program code comprising computer operating instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention. The electronic device comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 406 is configured to store the program 410. The memory 406 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 410 may be specifically configured to cause the processor 402 to execute the method for identifying the abnormal operation behavior of the user in any method embodiment described above. For the specific implementation of each step in the program 410, reference may be made to the corresponding steps and the descriptions of the corresponding units in the above embodiments of identifying abnormal operation behaviors of the user, which are not repeated here. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above may refer to the corresponding process descriptions in the foregoing method embodiments, and are not repeated here.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the devices in an embodiment may be adaptively changed and arranged in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Moreover, those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be appreciated by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in a user abnormal operation behavior recognition apparatus according to an embodiment of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on a computer readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering. These words may be interpreted as names.

Claims (10)

1. A method for identifying abnormal operation behaviors of a user is characterized by comprising the following steps:
constructing the behavior feature vector of each user according to the behavior data of the users within a specified time;
clustering each user according to the behavior feature vector of each user to obtain user groups of different categories;
establishing a user behavior baseline, and establishing a user group behavior baseline;
calculating deviation values of all users with different deviation dimensions according to the behavior feature vectors of all users, the user behavior baselines and/or the user group behavior baselines;
and clustering according to the deviation value of each user, and identifying abnormal operation behaviors of the users.
2. The method of claim 1, wherein constructing the behavior feature vector of each user according to the behavior data of the user in a specified time further comprises:
acquiring operation behavior data generated by each user in a specified time unit; the operation behavior comprises an account login behavior, a connection equipment behavior, an internet surfing behavior, a database access behavior, a file transmission behavior and/or a mail receiving and sending behavior;
and recording the operation behavior data of each user in a specified time unit in a numerical value form, and constructing and obtaining at least one behavior feature vector of each user.
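As an illustration of recording operation behavior data in numerical form, the sketch below counts each behavior type per user and per time unit and emits one behavior feature vector per time unit. The event tuple format, the behavior labels and the use of simple counts are all assumptions; the claim only requires a numerical representation.

```python
from collections import Counter, defaultdict

# Hypothetical labels for the six operation behaviors listed in the claim.
BEHAVIORS = ("login", "device", "web", "database", "file", "mail")


def build_feature_vectors(events):
    """events: iterable of (user, time_unit, behavior) tuples.

    Returns {user: [vector per time unit]}, each vector holding the
    count of every behavior type in BEHAVIORS order.
    """
    counts = defaultdict(Counter)
    for user, time_unit, behavior in events:
        if behavior not in BEHAVIORS:
            raise ValueError(f"unknown behavior: {behavior}")
        counts[(user, time_unit)][behavior] += 1
    vectors = defaultdict(list)
    for user, time_unit in sorted(counts):
        vectors[user].append([counts[(user, time_unit)][b] for b in BEHAVIORS])
    return dict(vectors)


# Hypothetical event log with one day as the time unit.
events = [
    ("alice", "2021-07-01", "login"),
    ("alice", "2021-07-01", "web"),
    ("alice", "2021-07-01", "web"),
    ("alice", "2021-07-02", "mail"),
    ("bob", "2021-07-01", "database"),
]
feature_vectors = build_feature_vectors(events)
```

Here `alice` gets two vectors, one per day, and `bob` gets one.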
3. The method of claim 1, wherein the clustering each user according to the behavior feature vector of each user to obtain user groups of different categories further comprises:
counting the dimensionality of the behavior feature vector of each user to obtain the highest dimensionality;
performing dimension completion processing on the behavior feature vectors of all the users according to the highest dimensionality; wherein the completion processing specifically comprises: filling specified positions of the behavior feature vector of a user with a specified completion value;
clustering the behavior feature vectors of all users by using a specified algorithm to obtain user groups of a plurality of categories; wherein the behavior feature vectors of the users in each user group are similar to one another.
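The dimension completion step can be sketched as padding. The claim leaves both the fill positions and the completion value open, so appending zeros at the end of each shorter vector is an assumption, and the function name is hypothetical.

```python
def pad_vectors(vectors, fill=0.0):
    """Pad every vector to the highest dimensionality found.

    vectors: {user: feature vector}. Shorter vectors are extended at the
    end with `fill` (assumed position and assumed completion value).
    """
    if not vectors:
        return {}
    highest = max(len(v) for v in vectors.values())
    return {user: list(v) + [fill] * (highest - len(v))
            for user, v in vectors.items()}


# Hypothetical vectors of unequal dimensionality.
padded = pad_vectors({"a": [1.0, 2.0, 3.0], "b": [4.0]})
```

After padding, every vector has the same length, so any distance-based clustering algorithm can be applied to them.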
4. The method of claim 1, wherein constructing the user behavior baseline and constructing the user group behavior baseline further comprises:
for any user, calculating the mean of a plurality of behavior feature vectors of the user, and taking the mean behavior feature vector as the user behavior baseline;
and, for any user group, calculating the average of the user behavior baselines of the users in the group and taking that average as the user group behavior baseline.
5. The method of claim 1, wherein calculating deviation values for each user for different deviation dimensions based on the behavior feature vector for each user, the user behavior baseline, and/or the user group behavior baseline further comprises:
calculating, for each user, deviation values in different deviation dimensions according to the behavior feature vector of the user, the user behavior baseline and/or the user group behavior baseline; wherein each user has a plurality of deviation values in different deviation dimensions; the deviation values in the different deviation dimensions include: the deviation between the behavior feature vector of the user and the user behavior baseline, the deviation between the behavior feature vector of the user and the user group behavior baseline, the deviation between the behavior feature vector of the user in a specified time period and the predicted behavior feature vector of the user for that time period, and/or the deviation between the behavior baseline of the user within the user group to which the user belongs and the behavior baselines of other user groups.
6. The method of claim 5, wherein performing clustering processing according to the deviation values of each user and identifying abnormal operation behaviors of the users further comprises:
for any user, concatenating the plurality of calculated deviation values of the user into a deviation value feature matrix;
clustering the deviation value feature matrices of the users to obtain abnormal data;
and sorting the abnormal data in descending order and taking the top-ranked abnormal data for backtracking and investigation, so as to confirm the abnormal operation behavior of the corresponding user.
7. The method according to any one of claims 1-6, further comprising:
constructing an abnormal behavior database according to the identified abnormal operation behavior of the user;
and training according to the abnormal behavior database to generate an abnormal behavior judgment model so as to identify the abnormal operation behavior of the user.
8. An apparatus for recognizing abnormal operation behavior of a user, the apparatus comprising:
the first construction module is suitable for constructing the behavior feature vector of each user according to the behavior data of the users within a specified time;
the clustering module is suitable for clustering each user according to the behavior characteristic vector of each user to obtain user groups of different categories;
the second construction module is suitable for constructing a user behavior baseline and constructing a user group behavior baseline;
the calculation module is suitable for calculating deviation values of the users with different deviation dimensions according to the behavior feature vectors of the users, the user behavior baselines and/or the user group behavior baselines;
and the identification module is suitable for carrying out clustering processing according to the deviation value of each user and identifying abnormal operation behaviors of the users.
9. An electronic device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the user abnormal operation behavior identification method according to any one of claims 1-7.
10. A computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform an operation corresponding to the method for identifying abnormal operation behavior of a user according to any one of claims 1 to 7.
CN202110764472.4A 2021-07-06 2021-07-06 User abnormal operation behavior identification method and device Pending CN115577348A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110764472.4A CN115577348A (en) 2021-07-06 2021-07-06 User abnormal operation behavior identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110764472.4A CN115577348A (en) 2021-07-06 2021-07-06 User abnormal operation behavior identification method and device

Publications (1)

Publication Number Publication Date
CN115577348A (en) 2023-01-06

Family

ID=84579234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110764472.4A Pending CN115577348A (en) 2021-07-06 2021-07-06 User abnormal operation behavior identification method and device

Country Status (1)

Country Link
CN (1) CN115577348A (en)

Similar Documents

Publication Publication Date Title
CN110399925B (en) Account risk identification method, device and storage medium
CN107423613B (en) Method and device for determining device fingerprint according to similarity and server
US20100211551A1 (en) Method, system, and computer readable recording medium for filtering obscene contents
CN111614690A (en) Abnormal behavior detection method and device
CN109831459B (en) Method, device, storage medium and terminal equipment for secure access
CN110855648B (en) Early warning control method and device for network attack
CN112837069A (en) Block chain and big data based secure payment method and cloud platform system
CN111090807A (en) Knowledge graph-based user identification method and device
CN112839014A (en) Method, system, device and medium for establishing model for identifying abnormal visitor
CN113132311A (en) Abnormal access detection method, device and equipment
CN113313479A (en) Payment service big data processing method and system based on artificial intelligence
CN111611519A (en) Method and device for detecting personal abnormal behaviors
CN113282920B (en) Log abnormality detection method, device, computer equipment and storage medium
CN112437034A (en) False terminal detection method and device, storage medium and electronic device
CN113434857A (en) User behavior safety analysis method and system applying deep learning
CN116599743A (en) 4A abnormal detour detection method and device, electronic equipment and storage medium
CN106982147A (en) The communication monitoring method and device of a kind of Web communication applications
CN112347457A (en) Abnormal account detection method and device, computer equipment and storage medium
CN115577348A (en) User abnormal operation behavior identification method and device
CN115801309A (en) Big data-based computer terminal access security verification method and system
CN111049839B (en) Abnormity detection method and device, storage medium and electronic equipment
CN111429110B (en) Store standardized auditing method, store standardized auditing device, store standardized auditing equipment and store medium
CN110990810B (en) User operation data processing method, device, equipment and storage medium
CN114528908A (en) Network request data classification model training method, classification method and storage medium
CN114492576A (en) Abnormal user detection method, system, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination