CN115659326A - User behavior baseline prediction method and user behavior baseline prediction model training method - Google Patents

Info

Publication number
CN115659326A
Authority
CN
China
Prior art keywords
behavior
user
user behavior
baseline
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211236985.9A
Other languages
Chinese (zh)
Inventor
李云龙
谭学士
陈祚松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qax Technology Group Inc
Secworld Information Technology Beijing Co Ltd
Original Assignee
Qax Technology Group Inc
Secworld Information Technology Beijing Co Ltd
Application filed by Qax Technology Group Inc, Secworld Information Technology Beijing Co Ltd filed Critical Qax Technology Group Inc
Priority to CN202211236985.9A priority Critical patent/CN115659326A/en
Publication of CN115659326A publication Critical patent/CN115659326A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the invention provides a user behavior baseline prediction method and a user behavior baseline prediction model training method, and relates to the technical field of network security, wherein the user behavior baseline prediction method comprises the following steps: acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; determining a predicted behavior baseline of a target entity according to a user behavior baseline prediction model; an alarm event is generated when the true behavior deviates from the predicted behavior baseline. By the method, the user behavior baseline prediction model can learn the historical real behavior rule of the target entity, so that the influence of sporadic user behavior abnormal conditions on the user behavior baseline prediction model is reduced, and the predicted behavior baseline is more consistent with the historical real behavior of the target entity; and under the condition that the real behavior deviates from the predicted behavior baseline, generating an alarm event, and reducing the false alarm rate of the abnormal behavior of the user.

Description

User behavior baseline prediction method and user behavior baseline prediction model training method
Technical Field
The invention relates to the technical field of network security, in particular to a user behavior baseline prediction method and a user behavior baseline prediction model training method.
Background
At present, for the discovery of abnormal behaviors of users, a customized modeling method is generally used for modeling a certain behavior to construct a user portrait.
In the prior art, the behavior baseline is usually predicted with a traditional statistical method: an average value and a threshold are calculated, and an alarm is given when the behavior deviates from the threshold. However, user behavior sometimes changes suddenly, for example when a user goes on a business trip or changes workstation. A threshold calculated by the statistical method is too sensitive to such situations, and abnormal values are difficult to eliminate, so the resulting threshold is distorted and the false alarm rate for abnormal behavior is high.
Disclosure of Invention
Aiming at the problems in the prior art, the embodiment of the invention provides a user behavior baseline prediction method and a user behavior baseline prediction model training method.
Specifically, the embodiment of the invention provides the following technical scheme:
in a first aspect, an embodiment of the present invention provides a user behavior baseline prediction method, including:
acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; wherein the target entity is a user group consisting of at least one user;
determining a predicted behavior baseline of the target entity according to a user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behaviors;
generating an alarm event when the real behavior deviates from the predicted behavior baseline.
Optionally, the user behavior extraction rule is used to indicate a mapping relationship between a preset field in the log and an entity attribute in the real behavior.
Optionally, the method further comprises:
constructing a personal baseline for a first user within the target entity; the personal baseline comprises a first group threshold and/or a first personal threshold, the first group threshold is used for measuring whether the group in which the first user is located is an outlier, and the first personal threshold is used for measuring whether the first user is an outlier;
issuing a first alarm when the real behavior of the first user exceeds the first group threshold; the first alarm is used for indicating that the group in which the first user is located is an outlier;
issuing a second alarm when the real behavior of the first user exceeds the first personal threshold; the second alarm is used for indicating that the first user is an outlier.
Optionally, the constructing a personal baseline for a first user within the target entity comprises:
under the condition that the real behavior of the target entity conforms to a normal distribution, determining a first group threshold of the target entity according to historical data of the real behavior of the target entity;
determining a first personal threshold of the first user according to historical data of the real behavior of the first user;
constructing the personal baseline according to the real behavior of the target entity, the first group threshold, and the first personal threshold.
In a second aspect, an embodiment of the present invention further provides a method for training a user behavior baseline prediction model, including:
acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior;
performing machine learning feature extraction on the historical data of the real behaviors to obtain a training data set and a verification data set;
training a target machine learning model according to the training data set and the verification data set to generate a user behavior baseline prediction model; the user behavior baseline prediction model is applied to the user behavior baseline prediction method according to the first aspect.
In a third aspect, an embodiment of the present invention further provides a device for predicting a user behavior baseline, including:
the first acquisition module is used for acquiring the real behavior of the target entity from the log according to the user behavior extraction rule;
the determining module is used for determining a predicted behavior baseline of the target entity according to the user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behaviors;
a generating module for generating an alarm event when the real behavior deviates from the predicted behavior baseline.
In a fourth aspect, an embodiment of the present invention further provides a device for training a user behavior baseline predictive model, including:
the second acquisition module is used for acquiring the real behavior of the target entity from the log according to the user behavior extraction rule; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior;
the extraction module is used for extracting machine learning characteristics of the historical data of the real behaviors to obtain a training data set and a verification data set;
the training module is used for training a target machine learning model according to the training data set and the verification data set to generate a user behavior baseline prediction model; the user behavior baseline prediction model is applied to the user behavior baseline prediction method according to the first aspect.
In a fifth aspect, an embodiment of the present invention further provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the user behavior baseline prediction method according to the first aspect, or implements the user behavior baseline prediction model training method according to the second aspect when executing the program.
In a sixth aspect, the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the user behavior baseline prediction method according to the first aspect, or implements the user behavior baseline prediction model training method according to the second aspect.
In a seventh aspect, an embodiment of the present invention further provides a computer program product, where executable instructions are stored on the computer program product, and when executed by a processor, the instructions cause the processor to implement the user behavior baseline prediction method according to the first aspect, or implement the user behavior baseline prediction model training method according to the second aspect.
According to the user behavior baseline prediction method provided by the embodiment of the invention, the real behavior of the target entity is collected from the log through the user behavior extraction rule; then, a predicted behavior baseline of the target entity is determined according to the user behavior baseline prediction model, and the user behavior baseline prediction model is obtained after training the target machine learning model according to a training data set and a verification data set extracted from historical data of real behaviors, so that the user behavior baseline prediction model can fully learn the historical real behavior rules of the target entity, thereby reducing the influence of sporadic user behavior abnormal conditions on the user behavior baseline prediction model, enabling the robustness of the user behavior baseline prediction model to be stronger, further enabling the predicted behavior baseline to be more accurate and better conforming to the historical real behaviors of the target entity; and under the condition that the real behavior deviates from the predicted behavior baseline, generating an alarm event, thereby reducing the false alarm rate of the abnormal behavior of the user.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a user behavior baseline prediction method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a login behavior of a real user according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a user login behavior baseline provided by an embodiment of the present invention;
FIG. 4 is a schematic illustration of a personal baseline provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a normal distribution provided by an embodiment of the present invention;
FIG. 6 is a second flowchart illustrating a method for predicting a user behavior baseline according to an embodiment of the invention;
FIG. 7 is a flowchart illustrating a method for training a user behavior baseline predictive model according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a user behavior baseline prediction apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a training apparatus for a user behavior baseline predictive model according to an embodiment of the present invention;
FIG. 10 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the prior art, threshold prediction for a behavior baseline usually adopts a traditional statistical method: an average value is calculated, a threshold is derived from it, and an alarm is given when the behavior deviates from the threshold. However, user behavior sometimes changes suddenly, for example when a user goes on a business trip or changes job post. The threshold calculated by the statistical method is too sensitive to such situations, and abnormal values are difficult to eliminate, which distorts the threshold and causes a high false alarm rate for abnormal behavior. In addition, detecting anomalies from changes in the user behavior baseline alone is not sufficient, and customized modeling of the behavior of each user incurs high labor and time costs.
In summary, in order to solve the above technical problems, the present invention provides a user behavior baseline prediction method and a user behavior baseline prediction model training method.
Fig. 1 is a schematic flow chart of a user behavior baseline prediction method provided in an embodiment of the present invention, and as shown in fig. 1, the user behavior baseline prediction method includes the following steps:
Step 101, collecting the real behavior of a target entity from a log according to a user behavior extraction rule; the target entity is a user group formed by at least one user.
It should be noted that the user behavior baseline prediction method provided by the embodiment of the present invention is applicable to a user behavior anomaly detection scenario. The execution subject of the method may be a user behavior baseline prediction apparatus, such as an electronic device, or a control module in the user behavior baseline prediction apparatus for executing the user behavior baseline prediction method. The electronic device may include a mobile phone, a tablet computer, a desktop computer, or the like.
In this embodiment, first, a target entity is defined according to a log type, where the target entity is a user group formed by at least one user, and the log type is, for example, a log of a terminal; and then setting a user behavior extraction rule, namely extracting the real behavior of the target entity from the log according to the user behavior extraction rule.
Optionally, the user behavior extraction rule is used to indicate a mapping relationship between a preset field in the log and an entity attribute in the real behavior.
In practical application, taking the case where the real behavior to be collected from the log is the "behavior of the user logging in to the terminal" as an example, the collection of the real behavior of the target entity is explained as follows:
firstly, the log type is determined to be a terminal login log according to the real behavior (namely, the behavior of a user logging in to the terminal); then the target entity is defined, the target entity being a user group consisting of at least one user; for example, if each user in the target entity belongs to a security domain (also called an AD domain), the target entity is defined as the AD domain and refers to all users in that AD domain.
Because the user behavior extraction rule is used for indicating the mapping relation between the preset field in the log and the entity attribute in the real behavior, after the target entity is determined, the preset field in the log of the terminal can be mapped to the entity attribute in the real behavior through the user behavior extraction rule;
for example, by using the behavior extraction rule, preset fields such as job number, name, department, behavior event, behavior result (e.g., whether login is successful) and the like can be extracted, and corresponding filtering can be performed to map the extracted preset fields into entity attributes of the real behavior, so that the real behavior of the target entity is acquired.
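The mapping described above can be illustrated with a minimal sketch. The raw field names (`emp_id`, `dept`, and so on), the attribute names, and the `extract_behavior` helper are hypothetical and chosen only for illustration; the source does not specify a concrete rule format.

```python
# Hypothetical extraction rule: raw log field name -> entity attribute name.
# Field and attribute names are illustrative assumptions, not from the source.
EXTRACTION_RULE = {
    "emp_id": "job_number",
    "user_name": "name",
    "dept": "department",
    "event": "behavior_event",
    "result": "behavior_result",
}


def extract_behavior(log_record, rule=EXTRACTION_RULE, event_filter="terminal_login"):
    """Map one raw log record to a real-behavior record, or return None if filtered out."""
    # Drop records that lack any of the preset fields named by the rule.
    if any(field not in log_record for field in rule):
        return None
    behavior = {attr: log_record[field] for field, attr in rule.items()}
    # Keep only the behavior event of interest (here: terminal login).
    if behavior["behavior_event"] != event_filter:
        return None
    return behavior


logs = [
    {"emp_id": "1001", "user_name": "alice", "dept": "R&D",
     "event": "terminal_login", "result": "failure", "src_ip": "10.0.0.5"},
    {"emp_id": "1002", "user_name": "bob", "dept": "Sales",
     "event": "file_copy", "result": "success"},
]
behaviors = [b for b in map(extract_behavior, logs) if b is not None]
```

In this sketch, fields not named in the rule (such as `src_ip`) are dropped, and records for other behavior events are filtered out before mapping is returned.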
FIG. 2 is a diagram illustrating a login behavior of a real user according to an embodiment of the present invention;
in FIG. 2, the trend of user behavior is plotted as one point every five minutes; that is, FIG. 2 shows the statistics of the real user login failure behavior of the target entity.
Wherein the abscissa represents a preset time period; the ordinate represents the statistic of the target entity terminal login failure behavior within a preset time period.
It should be noted that, since the behavior of the user logging in the terminal may have three different results, i.e., success, failure, and attempt, different real user logging behaviors may also be constructed according to different behavior results.
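A series like the one plotted in FIG. 2 could be produced by counting behavior records per five-minute bucket, one series per behavior result. The following is a sketch under assumptions: the `(timestamp, result)` tuple layout, the sample timestamps, and the helper names are hypothetical, and the bucket width is assumed to divide 60 evenly.

```python
from collections import Counter
from datetime import datetime, timedelta


def bucket_start(ts, minutes=5):
    """Round a timestamp down to the start of its fixed-width bucket."""
    return ts - timedelta(minutes=ts.minute % minutes,
                          seconds=ts.second,
                          microseconds=ts.microsecond)


def count_by_bucket(events, result="failure", minutes=5):
    """Count events with the given behavior result per time bucket."""
    counts = Counter()
    for ts, behavior_result in events:
        if behavior_result == result:
            counts[bucket_start(ts, minutes)] += 1
    return dict(sorted(counts.items()))


# Hypothetical login events: (timestamp, behavior result).
events = [
    (datetime(2022, 8, 14, 9, 1, 30), "failure"),
    (datetime(2022, 8, 14, 9, 3, 0), "failure"),
    (datetime(2022, 8, 14, 9, 4, 59), "success"),
    (datetime(2022, 8, 14, 9, 7, 10), "failure"),
]
failure_series = count_by_bucket(events)
```

Passing `result="success"` or `result="attempt"` would yield the corresponding series for the other behavior results.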
Step 102, determining a predicted behavior baseline of the target entity according to a user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behaviors.
In this embodiment, after the real behavior of the target entity is collected from the log, the predicted behavior baseline of the target entity is determined according to the user behavior baseline prediction model; the predicted behavior baseline predicts the behavior of the target entity in a future preset time period.
Specifically, the user behavior baseline prediction model is trained in the following way:
firstly, behavior features are extracted from the historical data of the real behavior; for terminal login failure behavior, the features may be, for example, the day of the week, whether the day is a holiday, the result at the same moment 1 day earlier, the result at the same moment 2 days earlier, the result at the same moment n days earlier, and the result at the same moment in the previous week;
then, feature calculation is performed on the features, such as calculating the mean, variance, maximum, median and minimum, and feature crossing is then performed to generate a feature-calculated data set. The feature-calculated data set is split into 10 parts at a ratio of 9:1, with 9 parts used as the training data set for training and 1 part used as the verification data set for verification;
and finally, the target machine learning model is trained with the training data set and the verification data set until the model converges, to obtain the user behavior baseline prediction model, where the target machine learning model may be an eXtreme Gradient Boosting (XGBoost) model.
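The feature extraction and the 9:1 split can be sketched as follows. This is an illustration under stated assumptions: the holiday calendar, the synthetic series, the choice of lags (1, 2 and 7 days), and the chronological ordering of the split are all hypothetical; the source gives only the feature types and the ratio.

```python
from datetime import date, datetime, timedelta

HOLIDAYS = {date(2022, 10, 1)}  # hypothetical holiday calendar


def build_rows(series, lags_days=(1, 2, 7)):
    """Turn {timestamp: count} into (features, target) rows using same-moment lags."""
    rows = []
    for ts in sorted(series):
        lagged = [series.get(ts - timedelta(days=d)) for d in lags_days]
        if any(v is None for v in lagged):
            continue  # not enough history for this timestamp
        features = [ts.weekday(), int(ts.date() in HOLIDAYS), *lagged]
        rows.append((features, series[ts]))
    return rows


def split_9_to_1(rows):
    """Chronological 9:1 split (an assumption; the source only gives the ratio)."""
    cut = len(rows) * 9 // 10
    return rows[:cut], rows[cut:]


# Synthetic daily series: 27 days of counts at 09:00, purely illustrative.
start = datetime(2022, 9, 1, 9, 0)
series = {start + timedelta(days=i): 10 + (i % 7) for i in range(27)}
rows = build_rows(series)
train_rows, valid_rows = split_9_to_1(rows)
```

The resulting training and verification rows could then be fed to a gradient-boosted regression model such as the XGBoost model named above, with the verification rows used to monitor convergence.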
Step 103, generating an alarm event when the real behavior deviates from the predicted behavior baseline.
After the real behavior of the target entity and the predicted behavior baseline of the target entity are obtained, a deviation of the real behavior from the predicted behavior baseline indicates that the behavior of the target entity is abnormal, and an alarm event is generated.
As shown in fig. 3, fig. 3 is a schematic diagram of a user login behavior baseline provided by the embodiment of the present invention;
in fig. 3, the solid line represents the real user login behavior of the target entity; the dotted line represents the baseline of predicted user login behavior of the target entity; the abscissa represents a preset time period; the ordinate represents the statistic of the target entity terminal login failure behavior within a preset time period.
As can be seen from fig. 3, in the time period from 14/8/00 to 15/8/00, the real user login behavior deviates from the predicted user login behavior baseline, which indicates that the terminal login failure behavior is abnormal in this time period, and an alarm event is generated.
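The comparison between the real series and the predicted baseline can be sketched as below. The source does not define how deviation is measured, so the tolerance band used here (the larger of an absolute and a relative tolerance), its default values, and the slot labels are illustrative assumptions.

```python
def find_deviations(real, predicted, rel_tol=0.5, abs_tol=3.0):
    """Return alarm events for time slots where real counts leave the tolerance band.

    The band |real - predicted| <= max(abs_tol, rel_tol * predicted) is an
    illustrative assumption; the source does not define the deviation measure.
    """
    alarms = []
    for slot, real_value in real.items():
        if slot not in predicted:
            continue  # no baseline for this slot
        band = max(abs_tol, rel_tol * predicted[slot])
        if abs(real_value - predicted[slot]) > band:
            alarms.append({"slot": slot, "real": real_value,
                           "predicted": predicted[slot]})
    return alarms


# Hypothetical five-minute slots with real counts and predicted baseline values.
real = {"09:00": 4, "09:05": 5, "09:10": 31}
predicted = {"09:00": 5.0, "09:05": 4.5, "09:10": 6.0}
alarms = find_deviations(real, predicted)
```

Only the slot whose real count leaves the band produces an alarm event; small fluctuations around the baseline are tolerated.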
According to the user behavior baseline prediction method provided by the embodiment of the invention, the real behavior of the target entity is collected from the log through the user behavior extraction rule; then, a predicted behavior baseline of the target entity is determined according to the user behavior baseline prediction model, and the user behavior baseline prediction model is obtained after training the target machine learning model according to a training data set and a verification data set extracted from historical data of real behaviors, so that the user behavior baseline prediction model can fully learn the historical real behavior rules of the target entity, thereby reducing the influence of sporadic user behavior abnormal conditions on the user behavior baseline prediction model, enabling the robustness of the user behavior baseline prediction model to be stronger, further enabling the predicted behavior baseline to be more accurate and better conforming to the historical real behaviors of the target entity; and generating an alarm event under the condition that the real behavior deviates from the predicted behavior baseline, so that the false alarm rate of the abnormal behavior of the user can be reduced.
Optionally, while the predicted behavior baseline of the target entity is constructed, a personal baseline for a first user within the target entity may also be constructed; the personal baseline may be used to detect whether the behavior of the first user is an outlier and whether the group in which the first user is located is an outlier.
Specifically, detecting whether a first user or a group in which the first user is located in a target entity is outlier can be achieved through the following method, which specifically includes steps [1] to [3]:
step [1], constructing a personal baseline for a first user within the target entity; the personal baseline comprises a first group threshold and/or a first personal threshold, the first group threshold is used for measuring whether the group in which the first user is located is an outlier, and the first personal threshold is used for measuring whether the first user is an outlier;
step [2], issuing a first alarm when the real behavior of the first user exceeds the first group threshold; the first alarm is used for indicating that the group in which the first user is located is an outlier;
step [3], issuing a second alarm when the real behavior of the first user exceeds the first personal threshold; the second alarm is used for indicating that the first user is an outlier.
As shown in fig. 4, fig. 4 is a schematic diagram of a personal baseline provided by an embodiment of the present invention, in fig. 4, the abscissa represents the real behaviors of the first user, such as "abnormal data volume of core code of the git repository", "abnormal data volume of CRM sensitive data operations", "abnormal number of AD authentication (ESG) failures"; the ordinate represents the true behavior statistics; the dots represent real behavior statistics corresponding to the first user; the histogram represents a population threshold corresponding to the population in which the first user is located.
As can be seen from fig. 4, the statistic of the true behavior "AD authentication (ESG) failure number is abnormal" of the first user is higher than the group threshold corresponding to the group in which the first user is located.
When the real behavior of the first user exceeds the first group threshold, a first alarm is issued to indicate that the group in which the first user is located is an outlier; when the real behavior of the first user exceeds the first personal threshold, a second alarm is issued to indicate that the first user is an outlier.
In the above embodiment, by constructing a personal baseline for a first user in a target entity, it is possible to detect whether the behavior of the first user is outlier, and to detect whether the group in which the first user is located is outlier; by utilizing the personal baseline, the prediction of the behavior baseline of the target entity by the user behavior baseline prediction model can be assisted, so that the abnormal behavior of the target entity can be better discovered.
Optionally, the constructing of the personal baseline for the first user in the target entity may be implemented by the following steps, specifically including step 1) to step 3):
step 1), under the condition that the real behavior of the target entity conforms to normal distribution, determining a first group threshold of the target entity according to historical data of the real behavior of the target entity;
step 2), determining a first personal threshold of the first user according to the historical data of the real behaviors of the first user;
step 3), constructing the personal baseline according to the real behavior, the first group threshold and the first personal threshold.
In this embodiment, it is first determined whether the real behavior of the target entity conforms to a normal distribution. When it does, the first group threshold and the first personal threshold are determined using the 3-sigma principle of the normal distribution, and whether the real behavior of the first user and of the group in which the first user is located is abnormal is determined based on the first group threshold and the first personal threshold.
As shown in fig. 5, which is a schematic diagram of a normal distribution provided by an embodiment of the present invention, the normal distribution has two parameters: the expectation (mean) μ and the standard deviation σ, with σ² being the variance. The 3-sigma principle is as follows: the probability that a value falls within (μ - 3σ, μ + 3σ) is approximately 0.9973, so behavior statistics falling outside this interval are determined to be abnormal; that is, μ - 3σ and μ + 3σ are taken as the outlier thresholds.
That is, the first group threshold of the target entity is determined from the historical data of the real behavior of the target entity, the first group threshold being the outlier thresholds (μ - 3σ and μ + 3σ) computed from that historical data; when the real behavior of the first user exceeds the first group threshold, a first alarm is issued indicating that the group in which the first user is located is an outlier.
Similarly, the first personal threshold of the first user is determined from the historical data of the real behavior of the first user in the target entity, the first personal threshold being the outlier thresholds (μ - 3σ and μ + 3σ) computed from the historical data of the real behavior of the first user; when the real behavior of the first user exceeds the first personal threshold, a second alarm is issued indicating that the first user is an outlier.
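The 3-sigma thresholds and the two alarms can be sketched as follows. The sample histories, the alarm labels, and the use of the population standard deviation (`statistics.pstdev`) are assumptions made for illustration; the source does not state which estimator is used.

```python
from statistics import mean, pstdev


def three_sigma_bounds(history):
    """Return (mu - 3*sigma, mu + 3*sigma) for a list of historical statistics."""
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation (an assumption)
    return mu - 3 * sigma, mu + 3 * sigma


def check_user(value, group_history, personal_history):
    """Return the alarms raised for one observed value of the first user's behavior."""
    alarms = []
    g_low, g_high = three_sigma_bounds(group_history)
    p_low, p_high = three_sigma_bounds(personal_history)
    if not g_low <= value <= g_high:
        alarms.append("first_alarm_group_outlier")
    if not p_low <= value <= p_high:
        alarms.append("second_alarm_personal_outlier")
    return alarms


group_history = [2, 3, 4, 3, 2, 4, 3, 3]      # hypothetical group statistics
personal_history = [1, 2, 1, 2, 1, 2, 1, 2]   # hypothetical statistics for the first user
normal = check_user(2, group_history, personal_history)
suspicious = check_user(12, group_history, personal_history)
```

The two checks are independent, so a single observation can raise the first alarm, the second alarm, both, or neither.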
When the real behavior of the target entity does not conform to a normal distribution, behavior outliers are determined using the box plot method.
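For the non-normal case, a box plot rule can be sketched with quartiles and the interquartile range (IQR). The fence multiplier `k=1.5` is the conventional box plot choice and is an assumption here, as are the sample data and the quartile interpolation method; the source does not give these details.

```python
from statistics import quantiles


def boxplot_fences(history, k=1.5):
    """Return (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(history, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def boxplot_outliers(history, k=1.5):
    """Return the values in history that fall outside the box plot fences."""
    low, high = boxplot_fences(history, k)
    return [v for v in history if v < low or v > high]


history = [1, 2, 2, 3, 3, 3, 4, 4, 50]  # hypothetical skewed counts
outliers = boxplot_outliers(history)
```

Because the fences depend on quartiles rather than on the mean and standard deviation, a single extreme value has little influence on the thresholds themselves.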
Finally, the personal baseline is constructed according to the real behavior, the first group threshold and the first personal threshold.
In the above embodiment, by constructing a personal baseline for a first user in a target entity, it is possible to detect whether the behavior of the first user is outlier, and to detect whether the group in which the first user is located is outlier; by utilizing the personal baseline, the prediction of the behavior baseline of the target entity by the user behavior baseline prediction model can be assisted, so that the abnormal behavior of the target entity can be better discovered.
Optionally, the user behavior baseline prediction model is trained in a preset time period (for example, in the morning of each day) to predict the user behavior baseline; the user can customize the real behavior and then enable the baseline prediction and the individual/group outlier analysis functions, so that an alarm is generated in real time when an outlier user appears or behavior deviates from the baseline.
Fig. 6 is a second schematic flowchart of the user behavior baseline prediction method according to the embodiment of the present invention, as shown in fig. 6, the user behavior baseline prediction method includes the following steps:
Step 601, collecting the real behavior of a target entity from the log according to the user behavior extraction rule, wherein the target entity is a user group consisting of at least one user.
Step 602, determining a predicted behavior baseline of the target entity according to the user behavior baseline prediction model.
Specifically, the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from historical data of real behaviors.
Step 603, generating an alarm event when the true performance deviates from the baseline of the predicted behavior.
And step 604, determining a first group threshold of the target entity according to the historical data of the real behavior of the target entity under the condition that the real behavior of the target entity conforms to the normal distribution.
Step 605, determining a first personal threshold of the first user according to the historical data of the real behavior of the first user.
Step 606, constructing a personal baseline according to the real behavior, the first group threshold and the first personal threshold.
Step 607, when the real behavior of the first user exceeds the first group threshold, issuing a first alarm, wherein the first alarm is used for indicating that the group in which the first user is located is an outlier.
Step 608, when the real behavior of the first user exceeds the first personal threshold, issuing a second alarm, wherein the second alarm is used for indicating that the first user is an outlier.
It should be noted that step 607 and step 608 need not be executed in any particular order.
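Steps 604 to 608 can be illustrated with a minimal sketch. The text does not state how the thresholds are derived from the historical data, so the sketch assumes a mean plus k standard deviations rule (k = 3 by default), which is a common choice for normally distributed data. The function names, the parameter `k` and the sample values are hypothetical.

```python
from statistics import mean, stdev


def sigma_threshold(history, k=3.0):
    """Assumed rule: threshold = mean + k * sample standard deviation.

    `history` needs at least two values, otherwise stdev() raises an error.
    """
    return mean(history) + k * stdev(history)


def build_personal_baseline(group_history, user_history, k=3.0):
    """Combine a group threshold and a personal threshold (steps 604 to 606)."""
    return {
        "group_threshold": sigma_threshold(group_history, k),
        "personal_threshold": sigma_threshold(user_history, k),
    }


def check_alarms(real_behavior, baseline):
    """Return the alarms raised for one observed value (steps 607 and 608).

    The two checks are independent, so their order does not matter.
    """
    alarms = []
    if real_behavior > baseline["group_threshold"]:
        alarms.append("first_alarm")   # exceeds the first group threshold
    if real_behavior > baseline["personal_threshold"]:
        alarms.append("second_alarm")  # exceeds the first personal threshold
    return alarms


# Hypothetical daily login-failure counts.
group_history = [2, 3, 4, 3, 2, 4, 3, 3]
user_history = [1, 1, 2, 1, 2, 1, 1, 1]
baseline = build_personal_baseline(group_history, user_history)
```

With these sample values, an observation of 4 exceeds only the personal threshold, while an observation of 6 exceeds both thresholds.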
Fig. 7 is a schematic flowchart of a user behavior baseline prediction model training method according to an embodiment of the present invention. As shown in Fig. 7, the user behavior baseline prediction model training method includes the following steps:
Step 701, collecting the real behavior of a target entity from a log according to a user behavior extraction rule; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior;
Step 702, performing machine learning feature extraction on the historical data of the real behavior to obtain a training data set and a verification data set;
In this embodiment, behavior features are first extracted from the historical data of the real behavior. Taking the terminal login failure behavior as an example, the behavior features may include the day of the week, whether the day is a holiday, the result of the previous 1 day, the result of the previous 2 days, the result of the previous n days, and the result of the previous week;
Feature calculation is then performed on these features, such as calculation of the mean, variance, maximum value, median and minimum value, and feature crossing is then performed on the features to generate a feature-calculated data set. The feature-calculated data set is split into 10 parts at a ratio of 9:1; 9 parts are used as the training data set for training, and 1 part is used as the verification data set for verification.
Step 703, training a target machine learning model according to the training data set and the verification data set to generate a user behavior baseline prediction model.
It should be noted that the user behavior baseline prediction model may be an XGBoost model, and may be applied in the scenario of the user behavior baseline prediction method.
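The feature extraction and the 9:1 split described in step 702 can be sketched as follows. This is only an illustration under several assumptions: the input is a list of daily counts, "the result of the previous week" is read as the value 7 days earlier, the statistics are computed over the previous 7 days, the split is chronological, and feature crossing is omitted. All function and field names are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean, median, pvariance


def build_features(daily_counts, start, n_lags=3, holidays=frozenset()):
    """Build one feature row per day; daily_counts[i] is the count on start + i days.

    Rows begin at index 7 so that a full previous week is always available
    (n_lags is assumed to be at most 7).
    """
    rows = []
    for i in range(7, len(daily_counts)):
        day = start + timedelta(days=i)
        window = daily_counts[i - 7:i]  # the previous 7 days
        row = {
            "weekday": day.weekday(),          # 0 = Monday
            "is_holiday": int(day in holidays),
            "lag_week": daily_counts[i - 7],   # same weekday, one week earlier
            "mean_7": mean(window),
            "var_7": pvariance(window),
            "max_7": max(window),
            "median_7": median(window),
            "min_7": min(window),
            "target": daily_counts[i],         # value the model should predict
        }
        for k in range(1, n_lags + 1):
            row[f"lag_{k}"] = daily_counts[i - k]  # result of the previous k-th day
        rows.append(row)
    return rows


def split_9_to_1(rows):
    """Split rows 9:1 into a training set and a verification set (chronological)."""
    cut = len(rows) * 9 // 10
    return rows[:cut], rows[cut:]


# Hypothetical history: 27 days of counts starting on Monday, 2022-10-03.
counts = list(range(27))
rows = build_features(counts, start=date(2022, 10, 3))
train_rows, verify_rows = split_9_to_1(rows)
```

The resulting rows could then be passed to a regression model such as XGBoost, with `target` as the label and the remaining fields as inputs.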
According to the user behavior baseline prediction model training method provided by the embodiment of the invention, the real behavior corresponding to the target entity is collected from the log through the user behavior extraction rule, and machine learning feature extraction is performed on the historical data of the real behavior to obtain a training data set and a verification data set; the target machine learning model is then trained according to the training data set and the verification data set to generate a user behavior baseline prediction model. The user behavior baseline prediction model can therefore fully learn the historical real behavior patterns of the target entity, which reduces the influence of sporadic abnormal user behavior on the model, makes the model more robust, and makes the predicted behavior baseline more accurate and more consistent with the historical real behavior of the target entity.
The user behavior baseline prediction device provided by the invention is described below, and the user behavior baseline prediction device described below and the user behavior baseline prediction method described above can be referred to correspondingly.
Fig. 8 is a schematic structural diagram of a user behavior baseline prediction apparatus according to an embodiment of the present invention. As shown in Fig. 8, the user behavior baseline prediction apparatus 800 includes a first acquisition module 801, a determination module 802 and a generation module 803, wherein:
the first acquisition module 801 is configured to collect, according to a user behavior extraction rule, a real behavior of a target entity from a log; wherein the target entity is a user group consisting of at least one user;
the determination module 802 is configured to determine a predicted behavior baseline of the target entity according to a user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behavior;
the generation module 803 is configured to generate an alarm event when the real behavior deviates from the predicted behavior baseline.
According to the user behavior baseline prediction device provided by the embodiment of the invention, the real behavior of the target entity is collected from the log through the user behavior extraction rule, and a predicted behavior baseline of the target entity is then determined according to the user behavior baseline prediction model. Because the user behavior baseline prediction model is obtained by training the target machine learning model according to a training data set and a verification data set extracted from historical data of the real behavior, the model can fully learn the historical real behavior patterns of the target entity. This reduces the influence of sporadic abnormal user behavior on the model, makes the model more robust, and makes the predicted behavior baseline more accurate and more consistent with the historical real behavior of the target entity. An alarm event is generated when the real behavior deviates from the predicted behavior baseline, so that the false alarm rate for abnormal user behavior can be reduced.
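The role of the generation module can be sketched as a comparison between observed values and the predicted baseline. The text does not define when a real behavior counts as deviating from the baseline, so the relative tolerance band below is purely an assumption, as are the function name, the `tolerance` parameter and the sample data.

```python
def generate_alarm_events(real_behavior, predicted_baseline, tolerance=0.5):
    """Return one alarm event per key whose real value leaves the tolerance band.

    Assumed criterion: |real - predicted| > tolerance * max(|predicted|, 1.0).
    The floor of 1.0 avoids a zero-width band when the prediction is near zero.
    """
    events = []
    for key, real in real_behavior.items():
        predicted = predicted_baseline.get(key)
        if predicted is None:
            continue  # no baseline for this period, nothing to compare against
        allowed = tolerance * max(abs(predicted), 1.0)
        if abs(real - predicted) > allowed:
            events.append({"key": key, "real": real, "predicted": predicted})
    return events


# Hypothetical daily login-failure counts for one user group.
real = {"2022-10-10": 12, "2022-10-11": 40}
predicted = {"2022-10-10": 10.0, "2022-10-11": 11.0}
events = generate_alarm_events(real, predicted)
```

Here only the second day produces an alarm event, because 40 lies far outside the band around the predicted value of 11.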
Optionally, the user behavior extraction rule is used to indicate a mapping relationship between a preset field in the log and an entity attribute in the real behavior.
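As an illustration of such a mapping, the sketch below represents an extraction rule as a dictionary that maps log fields to entity attributes. The rule layout, the field names (`src_user`, `dept`, `ts`) and the matching condition are hypothetical and are not taken from the original text.

```python
# Hypothetical extraction rule for the "terminal login failure" behavior.
EXTRACTION_RULE = {
    "behavior": "terminal_login_failure",
    # Conditions a log record must satisfy to count as this behavior.
    "match": {"event_type": "login", "result": "fail"},
    # Mapping from preset log fields to entity attributes of the real behavior.
    "field_to_attribute": {"src_user": "user", "dept": "group", "ts": "timestamp"},
}


def extract_behavior(log_record, rule):
    """Return the real behavior described by a log record, or None if it does not match."""
    for field, expected in rule["match"].items():
        if log_record.get(field) != expected:
            return None
    event = {"behavior": rule["behavior"]}
    for field, attribute in rule["field_to_attribute"].items():
        event[attribute] = log_record.get(field)
    return event


record = {
    "event_type": "login",
    "result": "fail",
    "src_user": "alice",
    "dept": "finance",
    "ts": "2022-10-10T08:30:00",
}
event = extract_behavior(record, EXTRACTION_RULE)
```

Keeping the rule as data rather than code would let a user define a new real behavior by editing the mapping only.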
Optionally, the apparatus further comprises:
a construction module, configured to construct a personal baseline for a first user within the target entity; the personal baseline comprises a first group threshold and/or a first personal threshold, the first group threshold is used for measuring whether the group in which the first user is located is an outlier, and the first personal threshold is used for measuring whether the first user is an outlier;
a first alarm module, configured to issue a first alarm when the real behavior of the first user exceeds the first group threshold; the first alarm is used for indicating that the group in which the first user is located is an outlier;
a second alarm module, configured to issue a second alarm when the real behavior of the first user exceeds the first personal threshold; the second alarm is used for indicating that the first user is an outlier.
Optionally, the construction module is further configured to:
under the condition that the real behavior of the target entity conforms to a normal distribution, determine a first group threshold of the target entity according to historical data of the real behavior of the target entity;
determine a first personal threshold for the first user based on historical data of the first user's real behavior;
construct the personal baseline according to the real behavior of the target entity, the first group threshold, and the first personal threshold.
The user behavior baseline prediction model training device provided by the invention is described below, and the user behavior baseline prediction model training device described below and the user behavior baseline prediction model training method described above can be referred to correspondingly.
Fig. 9 is a schematic structural diagram of a user behavior baseline prediction model training apparatus according to an embodiment of the present invention. As shown in Fig. 9, the user behavior baseline prediction model training apparatus 900 includes a second acquisition module 901, an extraction module 902 and a training module 903, wherein:
the second acquisition module 901 is configured to collect, according to the user behavior extraction rule, the real behavior of the target entity from the log; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior;
an extracting module 902, configured to perform machine learning feature extraction on the historical data of the real behavior to obtain a training data set and a verification data set;
a training module 903, configured to train a target machine learning model according to the training data set and the verification data set, and generate a user behavior baseline prediction model; the user behavior baseline prediction model is applied to the user behavior baseline prediction method.
According to the user behavior baseline prediction model training device provided by the embodiment of the invention, the real behavior corresponding to the target entity is collected from the log through the user behavior extraction rule, and machine learning feature extraction is performed on the historical data of the real behavior to obtain a training data set and a verification data set; the target machine learning model is then trained according to the training data set and the verification data set to generate a user behavior baseline prediction model. The user behavior baseline prediction model can therefore fully learn the historical real behavior patterns of the target entity, which reduces the influence of sporadic abnormal user behavior on the model, makes the model more robust, and makes the predicted behavior baseline more accurate and more consistent with the historical real behavior of the target entity.
Fig. 10 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 10, the electronic device may include: a processor 1010, a communication interface 1020, a memory 1030, and a communication bus 1040, wherein the processor 1010, the communication interface 1020, and the memory 1030 communicate with each other via the communication bus 1040. The processor 1010 may call logic instructions in the memory 1030 to perform the following method: acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; wherein the target entity is a user group consisting of at least one user; determining a predicted behavior baseline of the target entity according to a user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behavior; generating an alarm event when the real behavior deviates from the predicted behavior baseline;
or the following method is executed: acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior; performing machine learning feature extraction on the historical data of the real behaviors to obtain a training data set and a verification data set; and training a target machine learning model according to the training data set and the verification data set to generate a user behavior baseline prediction model.
Furthermore, when sold or used as an independent product, the logic instructions in the memory 1030 may be implemented in the form of software functional units and stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the following method: acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; wherein the target entity is a user group consisting of at least one user; determining a predicted behavior baseline of the target entity according to a user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behaviors; generating an alarm event when the real behavior deviates from the predicted behavior baseline;
or the following method is executed: acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior; performing machine learning feature extraction on the historical data of the real behaviors to obtain a training data set and a verification data set; and training a target machine learning model according to the training data set and the verification data set to generate a user behavior baseline prediction model.
In yet another aspect, an embodiment of the present invention further provides a computer program product, which includes a computer program stored on a non-transitory computer readable storage medium, the computer program including program instructions, which when executed by a computer, implement the following method: acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; wherein the target entity is a user group consisting of at least one user; determining a predicted behavior baseline of the target entity according to a user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behaviors; generating an alarm event when the real behavior deviates from the predicted behavior baseline;
or the following method is executed: acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior; performing machine learning feature extraction on the historical data of the real behaviors to obtain a training data set and a verification data set; and training a target machine learning model according to the training data set and the verification data set to generate a user behavior baseline prediction model.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A user behavior baseline prediction method is characterized by comprising the following steps:
acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; wherein the target entity is a user group consisting of at least one user;
determining a predicted behavior baseline of the target entity according to a user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behaviors;
generating an alarm event when the real behavior deviates from the predicted behavior baseline.
2. The method of claim 1, wherein the user behavior extraction rule is used to indicate a mapping relationship between a preset field in the log and an entity attribute in the real behavior.
3. The method for baseline prediction of user behavior according to claim 1 or 2, further comprising:
constructing a personal baseline for a first user within the target entity; the personal baseline comprises a first group threshold and/or a first personal threshold, the first group threshold is used for measuring whether the group in which the first user is located is an outlier, and the first personal threshold is used for measuring whether the first user is an outlier;
issuing a first alarm when the first user's real behavior exceeds the first group threshold; the first alarm is used for indicating that the group in which the first user is located is an outlier;
issuing a second alarm when the first user's real behavior exceeds the first personal threshold; the second alarm is used for indicating that the first user is an outlier.
4. The method of claim 3, wherein the constructing a personal baseline for the first user within the target entity comprises:
under the condition that the real behavior of the target entity conforms to normal distribution, determining a first group threshold of the target entity according to historical data of the real behavior of the target entity;
determining a first personal threshold for the first user based on historical data of the first user's true behavior;
constructing the personal baseline according to the real behavior of the target entity, the first group threshold, and the first personal threshold.
5. A user behavior baseline prediction model training method is characterized by comprising the following steps:
acquiring the real behavior of a target entity from a log according to a user behavior extraction rule; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior;
performing machine learning feature extraction on the historical data of the real behaviors to obtain a training data set and a verification data set;
training a target machine learning model according to the training data set and the verification data set to generate a user behavior baseline prediction model; the user behavior baseline prediction model is applied to the user behavior baseline prediction method according to any one of claims 1 to 4.
6. A user behavior baseline prediction apparatus, comprising:
the first acquisition module is used for acquiring the real behavior of the target entity from the log according to the user behavior extraction rule;
the determining module is used for determining a predicted behavior baseline of the target entity according to the user behavior baseline prediction model; the user behavior baseline prediction model is obtained by training a target machine learning model according to a training data set and a verification data set extracted from the historical data of the real behaviors;
a generating module for generating an alarm event when the real behavior deviates from the predicted behavior baseline.
7. A device for training a baseline predictive model of user behavior, comprising:
the second acquisition module is used for acquiring the real behavior of the target entity from the log according to the user behavior extraction rule; the target entity is a user group consisting of at least one user; the user behavior extraction rule is used for indicating the mapping relation between a specific field in the log and the entity attribute in the real behavior;
the extraction module is used for performing machine learning characteristic extraction on the historical data of the real behaviors to obtain a training data set and a verification data set;
the training module is used for training a target machine learning model according to the training data set and the verification data set to generate a user behavior baseline prediction model; the user behavior baseline prediction model is applied to the user behavior baseline prediction method according to any one of claims 1 to 4.
8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the user behavior baseline prediction method of any one of claims 1 to 4, or implements the user behavior baseline prediction model training method of claim 5 when executing the program.
9. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the user behavior baseline prediction method according to any one of claims 1 to 4, or implements the user behavior baseline prediction model training method according to claim 5.
10. A computer program product having executable instructions stored thereon, which instructions, when executed by a processor, cause the processor to carry out a method of baseline prediction of user behavior according to any one of claims 1 to 4, or a method of training a model of baseline prediction of user behavior according to claim 5.
CN202211236985.9A 2022-10-10 2022-10-10 User behavior baseline prediction method and user behavior baseline prediction model training method Pending CN115659326A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211236985.9A CN115659326A (en) 2022-10-10 2022-10-10 User behavior baseline prediction method and user behavior baseline prediction model training method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211236985.9A CN115659326A (en) 2022-10-10 2022-10-10 User behavior baseline prediction method and user behavior baseline prediction model training method

Publications (1)

Publication Number Publication Date
CN115659326A true CN115659326A (en) 2023-01-31

Family

ID=84988000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211236985.9A Pending CN115659326A (en) 2022-10-10 2022-10-10 User behavior baseline prediction method and user behavior baseline prediction model training method

Country Status (1)

Country Link
CN (1) CN115659326A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination