CN114338248A

CN114338248A - User abnormal behavior detection method and device based on machine learning

Info

Publication number: CN114338248A
Application number: CN202210249805.4A
Authority: CN
Inventors: 孙基男; 刘学洋; 胡文蕙; 李天赐; 郑超凡
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2022-03-15
Filing date: 2022-03-15
Publication date: 2022-04-12
Anticipated expiration: 2042-03-15
Also published as: CN114338248B

Abstract

The invention provides a method and a device for detecting abnormal user behaviors based on machine learning, wherein the method comprises the following steps: acquiring operation log information of a user; inputting the operation log information into a behavior detection model to obtain a user behavior attribute output by the behavior detection model; the behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute. According to the method and the device for detecting the abnormal user behaviors based on the machine learning, the accuracy of obtaining the user behavior attributes is higher, the efficiency of detecting the abnormal user behaviors can be improved, and the abnormal user behaviors can be found in time conveniently.

Description

User abnormal behavior detection method and device based on machine learning

Technical Field

The invention relates to the technical field of computers, in particular to a user abnormal behavior detection method and device based on machine learning.

Background

With the development of internet technology, communities of many kinds have become active in work and life, and electronic devices can reflect the operation behaviors of their users. In the process of managing such communities, user behaviors need to be monitored and abnormal user behaviors need to be detected in a timely manner.

Existing methods for detecting abnormal user behaviors suffer from low detection accuracy, low efficiency, and untimely discovery of abnormal behaviors.

Disclosure of Invention

The invention provides a user abnormal behavior detection method and device based on machine learning, which are used for solving the defects of low detection accuracy and efficiency and untimely abnormal behavior discovery in the detection of abnormal behaviors of users in the prior art, realizing higher accuracy in obtaining user behavior attributes, improving the efficiency of detecting the abnormal behaviors of the users and facilitating the timely discovery of the abnormal behaviors of the users.

The invention provides a user abnormal behavior detection method based on machine learning, which comprises the following steps: acquiring operation log information of a user; inputting the operation log information into a behavior detection model to obtain a user behavior attribute output by the behavior detection model; the behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.

According to the user abnormal behavior detection method based on machine learning provided by the invention, the sample log information comprises an abnormal behavior sample and a normal behavior sample, and the acquisition process of the sample log information comprises the following steps: acquiring a sample log sequence of a target user in a target time period; extracting the abnormal behavior sample from the sample log sequence; based on the abnormal behavior samples, sampling at intervals of target duration in the sample log sequence, and extracting the normal behavior samples, wherein the duration interval between the normal behavior samples and any abnormal behavior sample is greater than the target duration.

According to the method for detecting the abnormal behaviors of the user based on the machine learning, provided by the invention, the training process of the behavior detection model comprises the following steps: extracting log dimensional features from the sample log information; performing time domain aggregation processing on the log dimension characteristics to obtain time domain dimension characteristics; and training a behavior detection model based on the time domain dimension characteristics.

According to the method for detecting the abnormal behavior of the user based on the machine learning, provided by the invention, the time domain aggregation processing is carried out on the log dimensional characteristics to obtain the time domain dimensional characteristics, and the method comprises the following steps: and performing time domain aggregation processing on the log dimensional features based on the feature frequency information and the feature category information to obtain the time domain dimensional features.

According to the method for detecting the abnormal behaviors of the user based on the machine learning, provided by the invention, the training process of the behavior detection model comprises the following steps: respectively inputting the sample log information into a plurality of neural network models of the behavior detection model to obtain a behavior attribute detection result output by each neural network model; determining a weight parameter of each neural network model based on each behavior attribute detection result and user behavior attribute sample data; and establishing the behavior detection model based on a plurality of neural network models and the corresponding weight parameters.

According to the method for detecting the abnormal user behavior based on the machine learning, provided by the invention, the plurality of neural network models comprise: at least two of the XGB model, the LightGBM model, the RF model, the MLP model, and the LSTM model.

The invention also provides a user abnormal behavior detection device based on machine learning, which comprises: the acquisition module is used for acquiring operation log information of a user; the detection module is used for inputting the operation log information into a behavior detection model to obtain the user behavior attribute output by the behavior detection model; the behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.

The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein when the processor executes the program, the processor realizes the method for detecting the abnormal user behavior based on the machine learning.

The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for detecting abnormal behavior of a user based on machine learning as described in any one of the above.

The present invention also provides a computer program product comprising a computer program, which when executed by a processor implements the method for detecting abnormal user behavior based on machine learning as described in any one of the above.

According to the method and the device for detecting the abnormal behaviors of the user based on the machine learning, provided by the invention, the operation log information of the user is processed through the behavior detection model formed by combining a plurality of neural network models, and weight information is given to various neural network models in the processing process, so that the accuracy rate of obtaining the behavior attributes of the user is higher, the efficiency of detecting the abnormal behaviors of the user can be improved, and the abnormal behaviors of the user can be found in time conveniently.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.

FIG. 1 is a schematic flow chart of a method for detecting abnormal user behavior based on machine learning according to the present invention;

FIG. 2 is a schematic structural diagram of a device for detecting abnormal user behavior based on machine learning according to the present invention;

FIG. 3 is a schematic structural diagram of an electronic device provided by the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The following describes a method and an apparatus for detecting abnormal user behavior based on machine learning according to the present invention with reference to fig. 1 to 3.

As shown in fig. 1, the present invention provides a method for detecting abnormal behaviors of a user based on machine learning, which includes the following steps 110 to 120.

In step 110, operation log information of the user is obtained.

It can be understood that a device cluster may include a plurality of electronic devices that communicate with each other. For example, a plurality of office hosts may be interconnected in an enterprise LAN, each office host being used by one employee, and each employee may perform corresponding operations on the office host, such as sending and receiving work mail. When a user operates an electronic device, the operation log information may be stored locally on the electronic device or on a cloud server, and the operation log information of the user can reflect the operation behavior of the user to some extent.

The operation log information of the user is obtained here, for example, the operation log information of the user may be obtained in real time, or the operation log information generated by the user in the history operation process may be obtained.

In step 120, the operation log information is input into the behavior detection model to obtain the user behavior attribute output by the behavior detection model.

The behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label.

It can be understood that the behavior detection model is a neural network model. The behavior detection model may be a convolutional neural network model, a fully convolutional neural network model, a residual neural network model, or another type of neural network, and it may also be formed by aggregating a plurality of neural network models. The specific type of the behavior detection model is not limited herein.

A neural network model is an Artificial Neural Network (ANN), also called a Neural Network (NN) for short, or a connectionist model. It is an algorithmic mathematical model that simulates the behavior characteristics of animal neural networks and performs distributed parallel information processing. Depending on the complexity of the system, such a network processes information by adjusting the interconnections among a large number of internal nodes.

The behavior detection model can be trained through a large amount of sample log information and pre-marked user behavior attribute sample data, and the training accuracy of the behavior detection model can be ensured in a supervised learning mode.

The deep learning neural network used by the behavior detection model can pick out features from the input sample log information. Each feature is used to obtain an output result, and each output result is compared with the sample label. Features that meet the requirements after comparison are retained, and features that do not are suppressed through the loss parameter. Through continuous iterative training on a large amount of input sample log information, the network finally learns the core features that need to be memorized and classifies different core features, so that newly input operation log information can be distinguished according to these core features.

Before the behavior detection model is trained, the filters of the convolution layers of the deep learning neural network are completely random and are not activated by any feature, that is, they cannot detect any feature. During training, the weights of these blank filters are modified so that they can detect specific patterns, which is a supervised learning mode. Based on supervised learning, the deep learning neural network can learn the core features it needs and then judge newly input operation log information according to these core features.

And the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.

It can be understood that the behavior detection model may be constructed by combining a plurality of neural network models, for example, the behavior detection model may be constructed by four different neural network models, and the user behavior attribute output by the behavior detection model may be a result obtained by the plurality of neural network models together.

Specifically, in the application and test process of the behavior detection model, each neural network model in the behavior detection model can process the input operation log information to obtain a reference behavior attribute. The behavior detection model then combines the reference behavior attributes according to the weight parameter of each neural network model; for example, the reference behavior attributes output by the neural network models can be weighted and summed to obtain the user behavior attribute. Because the user behavior attribute obtained in this way results from the joint participation of several neural networks, the error rate is low and the accuracy is high.
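The weighted summation described above can be illustrated with a minimal sketch. It assumes that each sub-model outputs an abnormality score between 0 and 1 and that the fused score is compared with a threshold of 0.5; the function name `fuse_reference_attributes`, the threshold, and the example numbers are hypothetical and are not taken from the text.

```python
from typing import Sequence, Tuple


def fuse_reference_attributes(
    reference_scores: Sequence[float],
    weights: Sequence[float],
    threshold: float = 0.5,  # assumed decision threshold, not from the source
) -> Tuple[float, str]:
    """Weighted sum of per-model scores, followed by a thresholded label."""
    if len(reference_scores) != len(weights):
        raise ValueError("one weight is required per model output")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise so the fused score stays in the same range as the inputs.
    fused = sum(s * w for s, w in zip(reference_scores, weights)) / total_weight
    label = "abnormal" if fused >= threshold else "normal"
    return fused, label


# Four hypothetical sub-models scoring one operation log (made-up numbers).
score, label = fuse_reference_attributes([0.9, 0.2, 0.7, 0.6], [0.3, 0.2, 0.3, 0.2])
print(score, label)
```

Dividing by the total weight keeps the sketch usable even when the supplied weights are not already normalised.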

Correspondingly, in the training process of the behavior detection model, each neural network model processes the input sample log information to obtain a behavior attribute detection result, and the behavior attribute detection result output by each neural network model can be compared with the user behavior attribute sample data to adjust the weight parameters. When there are enough training samples, the weight parameter of each neural network model gradually becomes accurate. The finally trained weight parameters can be used to construct the behavior detection model, which can then process operation log information to obtain user behavior attributes accurately and efficiently.

It is worth mentioning that the machine learning algorithm has a good detection effect on abnormal attack patterns, and relatively speaking, the machine learning model is light, and for large-scale log data, the overall model performance is superior to most deep learning models, so that abnormal behavior detection based on machine learning also becomes a hotspot in recent years.

According to the user abnormal behavior detection method based on machine learning, the operation log information of the user is processed through the behavior detection model formed by combining the plurality of neural network models, and weight information is given to the neural network models in the processing process, so that the accuracy rate of obtaining the user behavior attributes is higher, the efficiency of detecting the user abnormal behaviors can be improved, and the abnormal behaviors of the user can be found in time conveniently.

In some embodiments, the sample log information includes abnormal behavior samples and normal behavior samples, and the obtaining of the sample log information includes: acquiring a sample log sequence of a target user in a target time period; extracting an abnormal behavior sample from the sample log sequence; based on the abnormal behavior samples, sampling is carried out in the sample log sequence at intervals of target duration, normal behavior samples are extracted, and the duration interval between the normal behavior samples and any abnormal behavior sample is larger than the target duration.

It can be understood that, in a fixed community, normal behaviors make up the vast majority of a user's daily operations while abnormal behaviors are few, so abnormal behavior samples are not easy to extract and their proportion in the sample log information needs to be increased. A sample log sequence of a target user in a target time period can be obtained. The target time period may be the whole period from the target user logging in to the device to logging out of it, for example from 9 a.m. to 5 p.m., where the target user starts the office host at 9 a.m. and shuts it down at 5 p.m.

The abnormal behavior samples can be extracted from the sample log sequence, which amounts to labeling the sample log sequence. The normal behavior samples can then be selected from the sample log sequence by taking the time points of the abnormal behavior samples as references and a certain target duration as the sampling interval, so that the time interval between each selected normal behavior sample and any abnormal behavior sample is longer than the target duration. In this way the proportion of normal behavior samples is effectively reduced, the proportion of abnormal behavior samples is increased, and the two proportions become relatively balanced.

For example, the target time period may be 1 week. Users having abnormal behavior samples may first be selected from all users, and then 50 users may be taken from each department of the community as candidate users. For the behavior logs of the candidate users, if a log is abnormal, it is put into the training set and a random number between 3 and 7 is generated; if a log is normal, it is checked whether the interval between this log and the previous sampling is greater than the number of days given by the random number. If so, the log is adopted; if not, the random number of the user is regenerated.
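The sampling procedure above leaves some details open, so the following is only a sketch under stated assumptions: each log is reduced to a hypothetical `(day, is_abnormal)` pair, "the last time" is read as the day of the most recently kept log, and the random gap is redrawn whenever a normal log is skipped. The function name `undersample` is illustrative.

```python
import random
from typing import List, Optional, Tuple

Log = Tuple[int, bool]  # (day index, is_abnormal); simplified, hypothetical record


def undersample(logs: List[Log], rng: random.Random, low: int = 3, high: int = 7) -> List[Log]:
    """Keep every abnormal log and only sparsely spaced normal logs."""
    kept: List[Log] = []
    gap = rng.randint(low, high)  # random gap in days, inclusive bounds
    last_kept_day: Optional[int] = None
    for day, is_abnormal in sorted(logs):
        if is_abnormal:
            # Abnormal logs always enter the training set; a new gap is drawn.
            kept.append((day, True))
            last_kept_day = day
            gap = rng.randint(low, high)
        elif last_kept_day is None or day - last_kept_day > gap:
            # Normal log far enough from the previous kept log: adopt it.
            kept.append((day, False))
            last_kept_day = day
        else:
            # Normal log too close: skip it and redraw the gap (assumed reading).
            gap = rng.randint(low, high)
    return kept


# Thirty days of logs for one user, with a single abnormal log on day 10.
example_logs = [(d, d == 10) for d in range(30)]
sampled = undersample(example_logs, random.Random(0))
print(sampled)
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which is convenient when comparing training runs.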

Specifically, the ratio of positive to negative samples of user log information in the CERT data set can be counted first, where positive samples are abnormal behavior samples and negative samples are normal behavior samples. The initial data turn out to be highly imbalanced: the ratio of positive to negative samples can reach 1:2000, which has a great influence on the training of the behavior detection model.

In order to maintain the integrity of data, considering that the sample log sequence of a normal user has high repeatability, the data is processed in an undersampling mode, so that the proportion of abnormal behavior samples in the sample log information can be improved by reducing the number of normal behavior samples in the sample log information, and the accuracy of training a behavior detection model can be improved.

In some embodiments, the training process of the behavior detection model comprises: extracting log dimension features from the sample log information; performing time domain aggregation processing on the log dimension features to obtain time domain dimension features; and training the behavior detection model based on the time domain dimension features.

It can be understood that log dimension features, also called node dimension features, may first be extracted from the sample log information. In this step, the features contained in each individual log are extracted from the original sample log information, so the features obtained at this stage describe single logs.

Secondly, performing time domain aggregation processing on the log dimension features, for example, performing feature aggregation of a day dimension, a week dimension or a session dimension to obtain time domain dimension features.

For example, the events that occur may be aggregated per user_id along the time axis, so that user behavior is aggregated by certain time periods. The time unit of aggregation may be Day, Week, or Session, which is the basic unit of the time domain aggregation process, and the expected input may be expressed as: user_id, user_context_information, user_action_feature.
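As an illustration of the grouping described above, the sketch below aggregates events per user and per day or week. The event layout `(user_id, day, activity)` and the function name `aggregate` are assumptions made for the example; session-level aggregation, which would need login and logoff boundaries, is not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, int, str]  # (user_id, day index, activity name); hypothetical layout


def aggregate(events: Iterable[Event], unit: str = "day") -> Dict[Tuple[str, int], List[str]]:
    """Group each user's activities by day or by 7-day week."""
    if unit not in ("day", "week"):
        raise ValueError("unit must be 'day' or 'week'")
    buckets: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    # Sort by user and day; the sort is stable, so same-day order is preserved.
    for user_id, day, activity in sorted(events, key=lambda e: (e[0], e[1])):
        period = day if unit == "day" else day // 7
        buckets[(user_id, period)].append(activity)
    return dict(buckets)


events = [
    ("u1", 0, "login"),
    ("u1", 0, "email"),
    ("u1", 8, "http"),
    ("u2", 0, "file"),
]
by_day = aggregate(events, "day")
by_week = aggregate(events, "week")
print(by_day)
print(by_week)
```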

In some embodiments, performing time domain aggregation processing on the log dimension feature to obtain a time domain dimension feature includes: and performing time domain aggregation processing on the log dimension characteristics based on the characteristic frequency information and the characteristic category information to obtain time domain dimension characteristics.

It can be understood that, in the process of performing time domain aggregation processing on the log dimension feature, time domain aggregation processing may be performed on the log dimension feature based on the feature frequency information and the feature category information to obtain the time domain dimension feature.

The time domain dimension features obtained by time domain aggregation based on the feature frequency information and the feature category information fall roughly into two types: frequency features and single-item size features. Frequency features mainly describe how often the user performs different types of behaviors within a specified period. For example, when the detection object is the abnormal operation behavior of company employees, the time domain dimension features may include the number of emails sent and the number of file accesses or web page accesses after working hours. Single-item size features mainly describe counts or sizes, such as the number or size of email attachments sent by the user, the number of characters in an email, or the number of words in a web page accessed over http.

In a detection scenario for abnormal behaviors of company employees, 96 session dimension features can be extracted from the node dimension features, for example, 6 features related to the user, 26 features related to http websites, 15 features related to files, 24 features related to email, 16 features related to devices, 3 features related to the department, and 6 features related to the supervisor.
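The two feature types can be sketched for a single aggregated period as follows. The dictionary keys (`act`, `size`, `n_atts`, `after_hours`) and the output feature names are hypothetical and chosen only for this example; they are not the feature set of the described method.

```python
from collections import Counter
from typing import Dict, List


def period_features(events: List[dict]) -> Dict[str, float]:
    """Derive frequency and size features from the events of one period."""
    counts = Counter(e["act"] for e in events)
    emails = [e for e in events if e["act"] == "email"]
    n_emails = len(emails)
    return {
        # Frequency features: how often each behaviour occurred.
        "n_email": float(counts["email"]),
        "n_file": float(counts["file"]),
        "n_http": float(counts["http"]),
        "n_after_hours": float(sum(1 for e in events if e.get("after_hours"))),
        # Size features: averages over the emails in the period.
        "mean_email_size": sum(e.get("size", 0) for e in emails) / n_emails if n_emails else 0.0,
        "mean_n_atts": sum(e.get("n_atts", 0) for e in emails) / n_emails if n_emails else 0.0,
    }


session = [
    {"act": "email", "size": 2000, "n_atts": 1},
    {"act": "email", "size": 4000, "n_atts": 3, "after_hours": True},
    {"act": "file", "after_hours": True},
    {"act": "http"},
]
features = period_features(session)
print(features)
```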

The method can first extract the features of a single log dimension (node dimension) and then aggregate the node dimension features into session dimension features, which mainly comprise statistical frequencies and categories. The specific meanings of the features are as follows.

user represents the serial number code of the user id; day represents the number of days counted from the starting time, with the minimum time among all users taken as the starting point; act represents the serial number code of the activity ('login': 1, 'logoff': 2, 'connect': 3, 'disconnect': 4, 'http': 5, 'email': 6, 'file': 7); pc represents the pc type (0: own pc; 1: shared pc; 2: other's pc; 3: super's pc); time represents the time stamp in ms, with the first activity of the user taken as the starting time.

usb_dur represents the duration for which a usb device is inserted; file_tree_len represents the path length of a file; file_type represents the type of the file; file_len represents the memory size of the file; file_nwords represents the number of words in the file; disk represents the disk on which the file operation is performed (system disk C: 1; disk R: 2; other disks: 0); file_depth represents the file tree depth; file_act represents the file operation type (open: 1; copy: 2; write: 3; delete: 4); to_usb represents writing to usb (copying a file to usb); from_usb represents reading from usb (copying a file from usb to the pc).

http_type represents the type of website visited (other: 1, socnet: 2, closed: 3, job: 4, leak: 5, hack: 6); url_len represents the length of the accessed url; url_len represents the domain name tree depth of the url; http_c_len represents the length of the domain; http_c_nwords represents the number of words in the domain.

send_mail indicates whether a mail is sent (send is 1); receive_mail indicates whether a mail is received (receive or view is 1); n_des represents the total number of recipients of the mail, including cc and bcc recipients; n_atts represents the number of attachments in the mail; Xemail indicates whether the received mail was sent directly rather than as a copy; n_exdes represents the number of direct (non-copy) recipients at the destination mailbox; n_bccdes represents the number of blind-copy recipients at the destination mailbox; exbcmail indicates whether the received mail is a blind-copy mail; email_size represents the mail size; email_text_SLen represents the mail text length; email_text_words represents the number of words in the body of the mail.

e_att_other indicates that there are other file types in the mail attachment; e_att_comp indicates that there is a compressed file in the mail attachment; e_att_pho indicates that there is a picture file in the mail attachment; e_att_doc indicates that there is a doc document in the mail attachment; e_att_txt indicates that there is a txt document in the mail attachment; e_att_exe indicates that there is an executable file in the mail attachment; e_att_gather represents the number of other files in the mail attachment; e_att_scomp represents the number of compressed files in the mail attachment; e_att_sdoc represents the number of doc documents in the mail attachment; e_att_stxt represents the number of txt documents in the mail attachment; e_att_sexe represents the number of executable files in the mail attachment.
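A few of the node dimension features listed above (user, day, act, time) can be encoded as in the following sketch. The activity codes are those given in the text; the raw record layout (`user_id`, `activity`, `timestamp_ms`) and the function name `encode_log` are assumptions made for illustration.

```python
from typing import Dict

# Activity codes as listed in the feature description above.
ACT_CODES = {"login": 1, "logoff": 2, "connect": 3, "disconnect": 4,
             "http": 5, "email": 6, "file": 7}
MS_PER_DAY = 24 * 60 * 60 * 1000


def encode_log(
    record: Dict[str, object],     # hypothetical raw log record
    user_codes: Dict[str, int],    # user id -> serial number code
    global_start_ms: int,          # minimum timestamp among all users
    user_start_ms: int,            # timestamp of this user's first activity
) -> Dict[str, int]:
    """Encode one raw log into the user, day, act and time features."""
    ts = int(record["timestamp_ms"])
    return {
        "user": user_codes[str(record["user_id"])],
        "day": (ts - global_start_ms) // MS_PER_DAY,
        "act": ACT_CODES[str(record["activity"])],
        "time": ts - user_start_ms,
    }


raw = {"user_id": "u42", "activity": "email", "timestamp_ms": 2 * MS_PER_DAY + 5000}
encoded = encode_log(raw, {"u42": 7}, global_start_ms=0, user_start_ms=1000)
print(encoded)
```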

In some embodiments, the training process of the behavior detection model comprises: respectively inputting the sample log information into a plurality of neural network models of the behavior detection model to obtain a behavior attribute detection result output by each neural network model; determining a weight parameter of each neural network model based on each behavior attribute detection result and user behavior attribute sample data; and establishing a behavior detection model based on the plurality of neural network models and the corresponding weight parameters.

It can be understood that, in the course of training the behavior detection model, the sample log information may be respectively input into a plurality of neural network models corresponding to the behavior detection model, each neural network model may output a behavior attribute detection result, the behavior attribute detection result output by each neural network model is compared with the user behavior attribute sample data, and the weight parameter of each neural network model is determined according to the comparison result, and when constructing the behavior detection model, the final behavior detection model may be determined by using each neural network model and the corresponding weight parameter, for example, the action proportion exerted by each neural network model in the behavior detection process may be determined according to the weight parameter, so as to determine the user behavior attribute output by the behavior detection model.
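One way to turn the comparison between model outputs and labels into weight parameters is sketched below. It assumes binary labels (1 for abnormal, 0 for normal), scores each model by its AUC computed through pairwise comparison of positive and negative samples, and normalises the AUC values into weights; the function names and the toy scores are hypothetical.

```python
from typing import Dict, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def model_weights(labels: Sequence[int], outputs: Dict[str, List[float]]) -> Dict[str, float]:
    """Weight each model by its AUC divided by the sum of all AUCs."""
    aucs = {name: auc(labels, scores) for name, scores in outputs.items()}
    total = sum(aucs.values())
    return {name: value / total for name, value in aucs.items()}


labels = [1, 0, 1, 0]
outputs = {
    "model_a": [0.9, 0.1, 0.8, 0.3],  # ranks every abnormal sample above every normal one
    "model_b": [0.6, 0.7, 0.8, 0.2],  # makes one ranking mistake
}
weights = model_weights(labels, outputs)
print(weights)
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a small illustration but would usually be replaced by a rank-based computation on large log data.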

In some embodiments, the plurality of neural network models includes: at least two of the XGB model, the LightGBM model, the RF model, the MLP model, and the LSTM model.

For example, bagging model fusion may be performed based on the XGB, LightGBM, RF, MLP, and LSTM model results. The extracted features can be fed into each of the five single models to obtain the corresponding accuracy (AUC) and recall; the weights used for weighted voting during model fusion are then calculated from the AUC values of the five single models, and the final AUC and recall are obtained.

When the features are fed into the five single models, the resulting AUC values of the five models are 0.768, 0.859, 0.713, 0.783 and 0.836, respectively, and their sum is 3.959.

The voting weights of the five models, each calculated as the model's AUC divided by this sum, are RF: 0.194, MLP: 0.217, XGB: 0.180, LightGBM: 0.198, and LSTM: 0.211. After model fusion, namely weighted voting, is carried out according to these weights, the obtained AUC is 0.869 and the recall is 0.824. Performing bagging model fusion based on the XGB, LightGBM, RF, MLP and LSTM model results therefore yields a behavior detection model with higher accuracy.
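The weight calculation can be checked with a few lines of arithmetic. The pairing of each AUC value with a model name below is inferred by matching the listed AUC values to the listed weights; it is not stated explicitly in the text.

```python
# AUC of each single model; the model-to-value pairing is an inference
# from the weights reported in the text.
aucs = {"RF": 0.768, "MLP": 0.859, "XGB": 0.713, "LightGBM": 0.783, "LSTM": 0.836}

total = sum(aucs.values())  # approximately 3.959
weights = {name: value / total for name, value in aucs.items()}
rounded = {name: round(w, 3) for name, w in weights.items()}

for name, w in rounded.items():
    print(f"{name}: {w:.3f}")
```

Because each weight is an AUC divided by the sum of all AUCs, the unrounded weights add up to 1, so the weighted vote stays on the same scale as the individual model outputs.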

For example, in the process of detecting abnormal behaviors of company employees, four kinds of abnormal behavior can be detected. In the first, a user uses a mobile device after working hours and uploads data to a data website that is difficult to regulate legally. In the second, a user browses recruitment websites and copies data to a mobile device before leaving the job. In the third, a system administrator who is dissatisfied with the work installs a keylogger on the computer of a designated person, uses the recorded password on the next day to log in to the mailbox as that person, sends mass mails that damage the interests of the company, and then leaves the company quickly. In the fourth, a user logs in to other people's devices, searches for relevant files, sends them to a home mailbox through the corporate mailbox, and continues this activity for some time afterwards.

The device for detecting abnormal user behavior based on machine learning provided by the present invention is described below. The device described below and the method for detecting abnormal user behavior based on machine learning described above may be referred to in correspondence with each other.

As shown in fig. 2, the present invention further provides a device for detecting abnormal user behavior based on machine learning, which includes an acquisition module 210 and a detection module 220.

The acquisition module 210 is configured to obtain operation log information of a user.

The detection module 220 is configured to input the operation log information into the behavior detection model to obtain the user behavior attribute output by the behavior detection model.

The behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.
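The two-module structure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the in-memory `log_store`, the toy sub-models `after_hours_ratio` and `usb_flag`, the equal weights, and the 0.5 threshold are all hypothetical stand-ins for trained models and a real log source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# A sub-model maps one user's log records to an anomaly score in [0, 1].
SubModel = Callable[[Sequence[str]], float]


@dataclass
class AcquisitionModule:
    """Stands in for module 210: returns the operation log records of a user."""
    log_store: Dict[str, List[str]]  # hypothetical in-memory log source

    def acquire(self, user_id: str) -> List[str]:
        return self.log_store.get(user_id, [])


@dataclass
class DetectionModule:
    """Stands in for module 220: weighted fusion of the sub-model outputs."""
    sub_models: Dict[str, SubModel]
    weights: Dict[str, float]
    threshold: float = 0.5  # assumed decision threshold

    def detect(self, records: Sequence[str]) -> str:
        fused = sum(self.weights[name] * model(records)
                    for name, model in self.sub_models.items())
        return "abnormal" if fused >= self.threshold else "normal"


# Toy sub-models used as placeholders for trained models.
def after_hours_ratio(records: Sequence[str]) -> float:
    if not records:
        return 0.0
    return sum("after_hours" in r for r in records) / len(records)


def usb_flag(records: Sequence[str]) -> float:
    return 1.0 if any("usb" in r for r in records) else 0.0


acquisition = AcquisitionModule({
    "u1": ["logon after_hours", "usb connect after_hours", "http upload"],
    "u2": ["logon", "email send"],
})
detection = DetectionModule(
    sub_models={"hours": after_hours_ratio, "usb": usb_flag},
    weights={"hours": 0.5, "usb": 0.5},
)
result_u1 = detection.detect(acquisition.acquire("u1"))
result_u2 = detection.detect(acquisition.acquire("u2"))
print(result_u1, result_u2)
```

Keeping acquisition and detection separate mirrors the module split in fig. 2, so the log source or the set of sub-models can be replaced without changing the other module.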

Fig. 3 illustrates the physical structure of an electronic device. As shown in fig. 3, the electronic device may include a processor 310, a communication interface 320, a memory 330, and a communication bus 340, wherein the processor 310, the communication interface 320, and the memory 330 communicate with each other via the communication bus 340. The processor 310 may invoke logic instructions in the memory 330 to perform a method for detecting abnormal user behavior based on machine learning, the method comprising: acquiring operation log information of a user; inputting the operation log information into a behavior detection model to obtain a user behavior attribute output by the behavior detection model; the behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.

In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

In another aspect, the present invention further provides a computer program product. The computer program product includes a computer program that can be stored on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, a computer can execute the method for detecting abnormal user behavior based on machine learning provided above, the method including: acquiring operation log information of a user; inputting the operation log information into a behavior detection model to obtain a user behavior attribute output by the behavior detection model; the behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.

In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the method for detecting abnormal user behavior based on machine learning provided above, the method including: acquiring operation log information of a user; inputting the operation log information into a behavior detection model to obtain a user behavior attribute output by the behavior detection model; the behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A user abnormal behavior detection method based on machine learning is characterized by comprising the following steps:

acquiring operation log information of a user;

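inputting the operation log information into a behavior detection model to obtain a user behavior attribute output by the behavior detection model;

the behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.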

2. The method for detecting abnormal behaviors of users based on machine learning according to claim 1, wherein the sample log information includes abnormal behavior samples and normal behavior samples, and the obtaining process of the sample log information includes:

acquiring a sample log sequence of a target user in a target time period;

extracting the abnormal behavior sample from the sample log sequence;

based on the abnormal behavior samples, sampling at intervals of target duration in the sample log sequence, and extracting the normal behavior samples, wherein the duration interval between the normal behavior samples and any abnormal behavior sample is greater than the target duration.

3. The method for detecting abnormal user behaviors based on machine learning according to claim 1, wherein the training process of the behavior detection model comprises:

extracting log dimensional features from the sample log information;

performing time domain aggregation processing on the log dimension characteristics to obtain time domain dimension characteristics;

and training a behavior detection model based on the time domain dimension characteristics.

4. The method for detecting abnormal user behaviors based on machine learning according to claim 3, wherein the time domain aggregation processing is performed on the log dimensional features to obtain time domain dimensional features, and the method comprises the following steps:

and performing time domain aggregation processing on the log dimensional features based on the feature frequency information and the feature category information to obtain the time domain dimensional features.

5. The method for detecting abnormal behaviors of users based on machine learning according to any one of claims 1 to 4, wherein the training process of the behavior detection model comprises:

respectively inputting the sample log information into a plurality of neural network models of the behavior detection model to obtain a behavior attribute detection result output by each neural network model;

determining a weight parameter of each neural network model based on each behavior attribute detection result and user behavior attribute sample data;

and establishing the behavior detection model based on a plurality of neural network models and the corresponding weight parameters.

6. The method of claim 5, wherein the neural network models comprise: at least two of the XGB model, the LightGBM model, the RF model, the MLP model, and the LSTM model.

7. A device for detecting abnormal user behavior based on machine learning, comprising:

the acquisition module is used for acquiring operation log information of a user;

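the detection module is used for inputting the operation log information into a behavior detection model to obtain the user behavior attribute output by the behavior detection model;

the behavior detection model is obtained by training by taking sample log information as a sample and taking user behavior attribute sample data corresponding to the sample log information as a label; the behavior detection model internally comprises a plurality of neural network models, each neural network model outputs a reference behavior attribute based on the operation log information, and the user behavior attribute is determined based on the weight parameters of the plurality of neural network models and the corresponding output reference behavior attribute.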

8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for detecting abnormal behavior of a user based on machine learning according to any one of claims 1 to 6 when executing the program.

9. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method for detecting abnormal behavior of a user based on machine learning according to any one of claims 1 to 6.

10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method for machine learning based abnormal behavior detection of a user according to any one of claims 1 to 6.