CN117633787A - Security analysis method and system based on user behavior data - Google Patents

Security analysis method and system based on user behavior data

Info

Publication number
CN117633787A
CN117633787A (application CN202410103051.0A)
Authority
CN
China
Prior art keywords
behavior
state
user
sequence
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410103051.0A
Other languages
Chinese (zh)
Inventor
肖波
林森
毕岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Anling Trusted Network Technology Co ltd
Original Assignee
Beijing Anling Trusted Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Anling Trusted Network Technology Co ltd filed Critical Beijing Anling Trusted Network Technology Co ltd
Priority to CN202410103051.0A priority Critical patent/CN117633787A/en
Publication of CN117633787A publication Critical patent/CN117633787A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/552Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2433Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks


Abstract

The invention discloses a security analysis method and system based on user behavior data, relating to the technical field of security analysis. The method comprises: collecting user operation log data from all channels and preprocessing it to obtain a user behavior sequence; constructing an abnormal behavior detection model based on the user behavior sequence to identify and detect abnormal behavior; predicting new user behavior sequences with the trained optimal abnormal behavior detection model and determining the final state of the user behavior; issuing early warnings according to the final state of the user behavior and taking corresponding response measures; and establishing a security monitoring mechanism to continuously monitor and evaluate user behavior data and discover potential anomalies or security risks in time. By slicing the user behavior sequence into fixed-length behavior fragments, the invention achieves an accurate understanding of user behavior patterns and behavior changes, improving the accuracy of the analysis results.

Description

Security analysis method and system based on user behavior data
Technical Field
The invention relates to the technical field of security analysis, in particular to a security analysis method and system based on user behavior data.
Background
With the vigorous development of the internet, most enterprises have established varying numbers of information systems. Besides traditional network security problems, these systems also face the problem of illegitimate operations carried out by legitimate users. However, existing technical methods for analyzing user behavior have several shortcomings. First, existing methods rely primarily on a rule engine for pattern matching, which requires manually maintained rule bases. Because user behavior is complex and variable, static rules can hardly cover all abnormal situations; once new attack techniques appear, the rule engine fails and abnormal behavior cannot be detected effectively.
Second, some methods employ simple statistical analysis and identify anomalies by setting thresholds. However, it is difficult to balance sensitivity against the false-alarm rate, so missed detections and false alarms occur easily, affecting the accuracy of the analysis results. In addition, most methods focus on detecting a single account, making global monitoring of a user population difficult, so that coordinated abnormal patterns in the network cannot be discovered. Finally, existing methods lack continuous optimization and feedback mechanisms for the detection strategy, making it hard to keep up with evolving attack techniques.
Disclosure of Invention
The invention is proposed in view of the problems of existing user behavior analysis methods regarding rule-engine dependence, the limitations of statistical analysis, single-account detection, and the lack of strategy optimization.
Therefore, the problem to be solved by the present invention is how to detect and identify anomalies or risk patterns in user behavior data for effective security monitoring and early warning.
In order to solve the technical problems, the invention provides the following technical scheme:
in a first aspect, an embodiment of the present invention provides a security analysis method based on user behavior data, which comprises: collecting user operation log data from each channel and preprocessing it to obtain a user behavior sequence; constructing an abnormal behavior detection model based on the user behavior sequence to identify and detect abnormal behavior; predicting new user behavior sequences with the trained optimal abnormal behavior detection model and determining the final state of the user behavior; issuing early warnings according to the final state of the user behavior and taking corresponding response measures; and establishing a security monitoring mechanism to continuously monitor and evaluate user behavior data and discover potential anomalies or security risks in time. Constructing the abnormal behavior detection model based on the user behavior sequence comprises the following steps: slicing the user behavior sequence into fixed-length behavior fragments in time order; establishing the abnormal behavior detection model with a hidden Markov model and initializing it; training the abnormal behavior detection model with the Baum-Welch algorithm, iteratively calculating the forward probabilities, backward probabilities and state output probabilities, and re-estimating and updating the model parameters; and saving the updated model as the optimal abnormal behavior detection model. Training the abnormal behavior detection model with the Baum-Welch algorithm comprises the following steps: calculating the forward probability and backward probability at each time $t$ with the forward and backward algorithms, respectively; and calculating the output probability of state $i$ at each time $t$ from the forward and backward probabilities, while re-estimating and updating the model parameters from the output probabilities. The specific formulas for updating the model parameters are as follows:

$$\hat{\pi}_i = \gamma_1(i), \qquad \hat{a}_{ij} = \frac{\sum_{t=1}^{T-1}\xi_t(i,j)}{\sum_{t=1}^{T-1}\gamma_t(i)}, \qquad \hat{b}_j(k) = \frac{\sum_{t=1,\;o_t=v_k}^{T}\gamma_t(j)}{\sum_{t=1}^{T}\gamma_t(j)};$$

where $\hat{\pi}_i$ denotes the updated state probability vector, $\hat{a}_{ij}$ the updated state transition probability, $\hat{b}_j(k)$ the updated probability that state $j$ emits observation symbol $v_k$, $\gamma_1(j)$ the output probability that the state is $j$ at the initial time, $t$ the time, $T$ the length of the behavior-fragment state sequence, $\gamma_t(i)$ the output probability that the state is $i$ at time $t$, $\xi_t(i,j)$ the output probability of a transition from state $i$ to state $j$ at time $t$, $\gamma_t(j)$ the output probability that the state is $j$ at time $t$, $o_t$ the observation symbol at time $t$, $v_k$ a symbol in the observation set $V$, $k$ the index of the observation symbol, $i$ and $j$ hidden states, $N$ the number of possible states, and $M$ the number of possible observations.
As a preferred embodiment of the security analysis method based on user behavior data according to the present invention, the method further comprises: the forward probability is calculated as follows:

$$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N}\alpha_t(i)\,a_{ij}\right] b_j(o_{t+1}), \qquad 1 \le t \le T-1;$$

where $\alpha_{t+1}(j)$ denotes the forward probability that the state is $j$ at time $t+1$, $\alpha_t(i)$ the forward probability that the state is $i$ at time $t$, $a_{ij}$ the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ the probability of generating observation $o_{t+1}$ in state $j$, $N$ the number of possible states, $T$ the length of the behavior-fragment state sequence, and $i$ and $j$ hidden states.
The backward probability is calculated as follows:

$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \qquad t = T-1, T-2, \dots, 1;$$

where $\beta_t(i)$ denotes the backward probability that the state is $i$ at time $t$, $\beta_{t+1}(j)$ the backward probability that the state is $j$ at time $t+1$, $a_{ij}$ the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ the probability of generating observation $o_{t+1}$ in state $j$, $N$ the number of possible states, $T$ the length of the behavior-fragment state sequence, and $i$ and $j$ hidden states.
The output probability is calculated as follows:

$$\xi_t(i,j) = \frac{\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)}{\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)};$$

where $a_{ij}$ denotes the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ the probability of generating observation $o_{t+1}$ in state $j$, $\alpha_t(i)$ the forward probability that the state is $i$ at time $t$, $\beta_{t+1}(j)$ the backward probability that the state is $j$ at time $t+1$, $N$ the number of possible states, and $i$ and $j$ hidden states.
As a preferred embodiment of the security analysis method based on user behavior data according to the present invention, the method further comprises:
determining the final state of the user behavior comprises the following steps: collecting new user behavior data and performing the preprocessing and segmentation steps to obtain fixed-length behavior fragments; for each behavior fragment, predicting the user's behavior state with the optimal abnormal behavior detection model and outputting the most likely state sequence $I$; matching the state sequence $I$ output by the model against the set abnormality judgment rules to determine the final state of the behavior fragment; and updating the final state of the behavior fragment as its state label to obtain the user behavior state sequence $S$. The abnormality judgment rules include the following: if the state at every moment is normal behavior, the state of the whole behavior fragment is judged normal; if the state at any moment is abnormal behavior, the state of the whole behavior fragment is directly judged abnormal; otherwise, the proportion $p$ of suspicious behavior in the fragment is calculated: if $p$ is below the lower threshold, the state of the whole fragment is judged normal; if $p$ lies between the lower and upper thresholds, the state of the whole fragment is judged suspicious; if $p$ exceeds the upper threshold, the state of the whole fragment is judged abnormal; where $T$ denotes the length of the behavior-fragment state sequence.
As a preferred embodiment of the security analysis method based on user behavior data according to the present invention, the method further comprises: judging the user's risk level according to the final states of the user behavior fragments, comprising the following steps: constructing a fragment risk matrix $W$; calculating the risk value $R$ of the user behavior sequence and the total number of behavior fragments $n$; and matching the risk value $R$ of the user behavior sequence against preset risk rules to judge the user's risk level and take the corresponding risk-response measures. Calculating the risk value $R$ of the user behavior sequence comprises: for each behavior fragment in the user behavior state sequence $S$, determining its category $c$ and state $s$; looking up the corresponding risk weight value $w$ in the fragment risk matrix $W$; and summing the risk weight values of all behavior fragments in the behavior sequence as the risk value $R$ of the user behavior sequence.
As a preferred embodiment of the security analysis method based on user behavior data according to the present invention, the method further comprises: the risk rules include the following: if the risk value falls in the lowest interval, the user's risk level is judged low; no early warning is needed, user behavior is monitored continuously, and normal service is maintained. If it falls in the second interval, the risk level is judged low-medium; first-level measures are taken, namely suspending non-key account functions, warning the user of potential security risks, adding secondary verification to key business operations, introducing manual review, restoring functions in batches once no damage is confirmed, and re-evaluating periodically. If it falls in the third interval, the risk level is judged medium-high; second-level measures are taken, namely suspending sensitive operation permissions, prompting the user to actively investigate the risk, requiring the user to submit a problem-improvement report, enabling enhanced multi-factor authentication, and, after expert review and evaluation, restoring part of the business with periodic re-evaluation. If it falls in the highest interval, the risk level is judged high; third-level measures are taken, namely immediately freezing the account, notifying the supervision department to perform an internal system inspection to trace the evidence chain, requiring the user to perform a comprehensive self-inspection and corrective actions, and, after an expert performs a comprehensive inspection and evaluation confirming the risk is eliminated, restoring the business in batches.
In a second aspect, an embodiment of the present invention provides a security analysis system based on user behavior data, which includes a data preprocessing module, configured to collect and preprocess user operation log data of each channel to obtain a user behavior sequence; the model construction module is used for establishing an abnormal behavior detection model by adopting a hidden Markov model and training the abnormal behavior detection model on each behavior segment by using a Baum-Welch algorithm; the state determining module is used for matching the state sequence output by the model with a set abnormality judging rule so as to judge the final state of the behavior segment; and the response measure module is used for judging the risk level of the user according to the final state of the user behavior segment and taking corresponding response measures.
The invention has the following beneficial effects: by slicing the user behavior sequence into fixed-length behavior fragments, the invention achieves an accurate understanding of user behavior patterns and behavior changes, improving the accuracy of the analysis results; by identifying abnormal patterns through sequence modeling, it markedly improves risk-detection efficiency and realizes intelligent, dynamic security monitoring; by subdividing several risk levels and deciding with multiple judgment rules such as fixed thresholds and confidence intervals, it effectively improves the robustness of the system; and by adding optimization means such as monitoring feedback and matrix adjustment, it allows system performance to be improved continuously.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a method flow diagram of a security analysis method based on user behavior data.
Fig. 2 is a diagram of a computer device for a security analysis method based on user behavior data.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
Example 1
Referring to fig. 1 and 2, a first embodiment of the present invention provides a security analysis method based on user behavior data, including,
s1: and collecting operation log data of users of all channels and preprocessing the operation log data to acquire a user behavior sequence.
Preferably, the original operation logs of users in different channels are collected, classified by user ID, and merged into behavior sequences in time order; the logs are cleaned, including filtering irrelevant records and deleting duplicates; the logs are parsed to extract key behavior fields, which are aggregated into standardized records each describing a single behavior; the behavior records of each user are connected in time order to form a complete user behavior sequence; the behavior sequences of different users are combined into a unified behavior-sequence data set; and the data set is security-desensitized by deleting identity information to obtain a usable set of behavior sequences.
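The preprocessing pipeline of S1 (classify by user ID, de-duplicate, extract key fields, connect in time order) can be sketched as follows. This is a minimal illustration, not the patented implementation; the field names `user_id`, `ts` and `action` are assumptions standing in for whatever key fields the logs actually carry.

```python
from collections import defaultdict

def build_behavior_sequences(raw_logs):
    """Group raw operation-log records by user and order them in time.

    Each record is a dict; `user_id`, `ts` (timestamp) and `action` are
    illustrative field names. Records missing a key field are filtered
    out and exact duplicates are dropped, mirroring the cleaning step.
    """
    per_user = defaultdict(list)
    seen = set()
    for rec in raw_logs:
        if not all(k in rec for k in ("user_id", "ts", "action")):
            continue  # filter irrelevant / incomplete records
        key = (rec["user_id"], rec["ts"], rec["action"])
        if key in seen:
            continue  # delete repeated records
        seen.add(key)
        per_user[rec["user_id"]].append(rec)
    # connect each user's records in time order into a behavior sequence
    return {uid: [r["action"] for r in sorted(recs, key=lambda r: r["ts"])]
            for uid, recs in per_user.items()}
```

Desensitization (dropping identity information) would be a separate pass over the resulting sequences.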
S2: an abnormal behavior detection model is constructed based on the user behavior sequence to identify and detect abnormal behavior.
Specifically, the method comprises the following steps:
s2.1: the user behavior sequence is segmented into behavior segments of fixed length in time sequence.
Preferably, define $Q$ as the set of all possible behavior states, $V$ as the set of all possible observations, $I$ as a behavior-fragment state sequence of length $T$, and $O$ as the fragment observation sequence corresponding to that state sequence.
Specifically, in this embodiment the user behavior sequence is divided into fixed-length behavior fragments of 10 minutes each. The user behavior states comprise normal, suspicious and abnormal behavior, with the state set $Q = \{0, 1, 2\}$, where 0 represents normal behavior, 1 represents suspicious behavior, and 2 represents abnormal behavior.
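The fixed-length slicing step of S2.1 can be sketched as below. The fragment length is left as a parameter (the embodiment uses a 10-minute window), and trailing records that do not fill a whole fragment are dropped, which is an assumed convention; the patent does not say how a partial tail is handled.

```python
def segment_sequence(sequence, fragment_len):
    """Slice a user behavior sequence into fixed-length fragments in time order.

    Trailing elements that do not fill a complete fragment are discarded
    (an assumed convention; padding would be an equally valid choice).
    """
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    return [sequence[i:i + fragment_len]
            for i in range(0, len(sequence) - fragment_len + 1, fragment_len)]
```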
S2.2: and establishing an abnormal behavior detection model by using the hidden Markov model, and initializing the model.
Specifically, the model is initialized as $\lambda = (\pi, A, B)$, where $\pi$ is the initial state probability vector, $A$ the initial state transition probability matrix, and $B$ the initial observation probability matrix; $N$ denotes the number of possible states and $M$ the number of possible observations, with $\pi = (\pi_i)$, $A = (a_{ij})_{N \times N}$ and $B = (b_j(k))_{N \times M}$, where $a_{ij}$ denotes the probability of being in state $q_i$ at time $t$ and transiting to state $q_j$ at time $t+1$, and $b_j(k)$ denotes the probability that state $q_j$ at time $t$ generates observation $v_k$.
Further, the abnormal behavior detection model takes as input an observation sequence of length $T$; the output at each time $t$ is a state probability vector, from which the state with the highest probability is selected as the state recognition result at time $t$, so that over the whole fragment of length $T$ the final output is a state result sequence $I$.
It should be noted that the abnormal behavior detection model is built by learning historical normal user operation sequences. For example, in an ERP system, common operations such as creating documents, auditing, warehousing in and out, adding suppliers, adding and querying clients, logging into an account, and modifying personal information are all normal user behaviors; operations that are excessively frequent or involve abnormally large amounts are judged suspicious, including batch querying and exporting of client information, batch querying and exporting of warehousing records, short-interval logins from different locations, and repeated consecutive login failures.
Specifically, the abnormal behavior detection model establishes a normal behavior pattern by learning the user's historical normal operation sequences; when it detects that certain operation characteristics deviate significantly from the normal pattern or violate preset rules, those operations are judged to be suspicious or abnormal.
S2.3: training an abnormal behavior detection model by using a Baum-Welch algorithm, iteratively calculating forward probability, backward probability and state output probability, and re-estimating and updating model parameters.
Specifically, the method comprises the following steps:
s2.3.1: the forward and backward probabilities for each time t are calculated using forward and backward algorithms, respectively.
Specifically, the forward probability is calculated as follows:

$$\alpha_1(i) = \pi_i\, b_i(o_1), \qquad 1 \le i \le N;$$

$$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N}\alpha_t(i)\,a_{ij}\right] b_j(o_{t+1}), \qquad 1 \le t \le T-1;$$

where $\alpha_1(i)$ denotes the forward probability that the state is $i$ at the initial time $t=1$, $\pi_i$ the initial state probability vector, $b_i(o_1)$ the probability of generating observation $o_1$ in state $i$, $\alpha_{t+1}(j)$ the forward probability that the state is $j$ at time $t+1$, $\alpha_t(i)$ the forward probability that the state is $i$ at time $t$, $a_{ij}$ the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ the probability of generating observation $o_{t+1}$ in state $j$, $N$ the number of possible states, $T$ the length of the behavior-fragment state sequence, and $i$ and $j$ hidden states.
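The forward recursion above translates directly into code. This is a plain-Python sketch (a production version would use log probabilities or scaling to avoid underflow on long fragments); `pi`, `A`, `B` follow the $\lambda = (\pi, A, B)$ notation, and `obs` is a fragment observation sequence of integer symbol indices.

```python
def forward(pi, A, B, obs):
    """Forward algorithm: alpha[t][i] = P(o_1..o_{t+1}, state = i | model).

    pi : list of N initial state probabilities
    A  : N x N transition matrix, A[i][j] = a_ij
    B  : N x M emission matrix, B[i][k] = b_i(v_k)
    obs: observation sequence as symbol indices (0-based time in code)
    """
    N = len(pi)
    # base case: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    # recursion: alpha_{t+1}(j) = (sum_i alpha_t(i) a_ij) * b_j(o_{t+1})
    for t in range(1, len(obs)):
        alpha.append([
            sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
            for j in range(N)
        ])
    return alpha
```

Summing the last row of `alpha` gives the likelihood of the whole fragment under the model.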
Preferably, the backward probability is calculated as follows:

$$\beta_T(i) = 1, \qquad 1 \le i \le N;$$

$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \qquad t = T-1, T-2, \dots, 1;$$

where $\beta_T(i)$ denotes the backward probability that the state is $i$ at the last time step, which equals 1, $\beta_t(i)$ the backward probability that the state is $i$ at time $t$, $\beta_{t+1}(j)$ the backward probability that the state is $j$ at time $t+1$, $a_{ij}$ the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ the probability of generating observation $o_{t+1}$ in state $j$, $N$ the number of possible states, $T$ the length of the behavior-fragment state sequence, and $i$ and $j$ hidden states.
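The backward recursion is the mirror image of the forward one, sketched below under the same plain-probability assumption (no scaling):

```python
def backward(A, B, obs):
    """Backward algorithm: beta[t][i] = P(o_{t+2}..o_T | state = i at t+1).

    A  : N x N transition matrix, B : N x M emission matrix,
    obs: observation sequence as symbol indices.
    """
    N = len(A)
    T = len(obs)
    beta = [[1.0] * N]  # base case: beta_T(i) = 1
    # recursion backwards: beta_t(i) = sum_j a_ij b_j(o_{t+1}) beta_{t+1}(j)
    for t in range(T - 2, -1, -1):
        beta.insert(0, [
            sum(A[i][j] * B[j][obs[t + 1]] * beta[0][j] for j in range(N))
            for i in range(N)
        ])
    return beta
```

A useful sanity check is that $\sum_i \pi_i\, b_i(o_1)\, \beta_1(i)$ equals the fragment likelihood obtained from the forward pass.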
S2.3.2: and calculating the output probability of the state i at each t moment according to the forward probability and the backward probability, and simultaneously re-estimating and updating the model parameters according to the output probability.
Specifically, the output probability $\gamma_t(i)$ of state $i$ at time $t$ is given by:

$$\gamma_t(i) = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{N}\alpha_t(j)\,\beta_t(j)};$$

where $\alpha_t(i)$ denotes the forward probability that the state is $i$ at time $t$, $\beta_t(i)$ the backward probability that the state is $i$ at time $t$, $N$ the number of possible states, and $j$ a hidden state.
Further, the output probability $\xi_t(i,j)$ of a transition from state $i$ to state $j$ at time $t$ is given by:

$$\xi_t(i,j) = \frac{\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)}{\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)};$$

where $a_{ij}$ denotes the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ the probability of generating observation $o_{t+1}$ in state $j$, $\alpha_t(i)$ the forward probability that the state is $i$ at time $t$, $\beta_{t+1}(j)$ the backward probability that the state is $j$ at time $t+1$, $N$ the number of possible states, and $i$ and $j$ hidden states.
Further, the specific formula for re-estimating and updating the model parameters according to the output probability is as follows:
$$\hat{\pi}_i = \gamma_1(i), \qquad \hat{a}_{ij} = \frac{\sum_{t=1}^{T-1}\xi_t(i,j)}{\sum_{t=1}^{T-1}\gamma_t(i)}, \qquad \hat{b}_j(k) = \frac{\sum_{t=1,\;o_t=v_k}^{T}\gamma_t(j)}{\sum_{t=1}^{T}\gamma_t(j)};$$

where $\hat{\pi}_i$ denotes the updated state probability vector, $\hat{a}_{ij}$ the updated state transition probability, $\hat{b}_j(k)$ the updated probability that state $j$ emits observation symbol $v_k$, $\gamma_1(j)$ the output probability that the state is $j$ at the initial time, $t$ the time, $T$ the length of the behavior-fragment state sequence, $\gamma_t(i)$ the output probability that the state is $i$ at time $t$, $\xi_t(i,j)$ the output probability of a transition from state $i$ to state $j$ at time $t$, $\gamma_t(j)$ the output probability that the state is $j$ at time $t$, $o_t$ the observation symbol at time $t$, $v_k$ a symbol in the observation set $V$, $k$ the index of the observation symbol, $i$ and $j$ hidden states, $N$ the number of possible states, and $M$ the number of possible observations.
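The whole of step S2.3 can be summarized as one re-estimation pass. The sketch below transcribes the forward, backward, $\gamma$, $\xi$ and parameter-update formulas for a single fragment; it is a minimal illustration, not the patented implementation, since a real Baum-Welch run iterates this step to convergence over many fragments and works in log space or with scaling to avoid underflow.

```python
def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch re-estimation pass on a single observation fragment."""
    N, M, T = len(pi), len(B[0]), len(obs)
    # forward pass: alpha_t(i)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, T):
        alpha.append([sum(alpha[-1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    # backward pass: beta_t(i)
    beta = [[1.0] * N]
    for t in range(T - 2, -1, -1):
        beta.insert(0, [sum(A[i][j] * B[j][obs[t + 1]] * beta[0][j]
                            for j in range(N)) for i in range(N)])
    # gamma_t(i): probability of being in state i at time t
    gamma = []
    for t in range(T):
        denom = sum(alpha[t][j] * beta[t][j] for j in range(N))
        gamma.append([alpha[t][i] * beta[t][i] / denom for i in range(N)])
    # xi_t(i, j): probability of a transition i -> j at time t
    xi = []
    for t in range(T - 1):
        denom = sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                    for i in range(N) for j in range(N))
        xi.append([[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / denom
                    for j in range(N)] for i in range(N)])
    # parameter updates: pi_hat, a_hat_ij, b_hat_j(k)
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_pi, new_A, new_B
```

By construction each updated quantity remains a probability distribution, which gives a quick correctness check on the transcription.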
S2.4: and saving the updated model as an optimal abnormal behavior detection model.
S3: Performing anomaly prediction on new user behavior sequences with the optimal abnormal behavior detection model and determining the final state of the user behavior.
Preferably, the method comprises the following steps:
s3.1: collecting new user behavior data, executingIs a pretreatment step of (a)Obtain a sequence of user actions and execute +.>The step of slicing the user behavior sequence into behavior segments of a fixed length in time sequence.
S3.2: For each behavior fragment, predicting the user's behavior state with the optimal abnormal behavior detection model and outputting the most likely state sequence $I$.
S3.3: Matching the state sequence $I$ output by the model against the set abnormality judgment rules to determine the final state of the behavior fragment.
Preferably, the abnormality judgment rules include the following: if the state at every moment is normal behavior 0, the state of the whole behavior fragment is judged normal; if the state at any moment is abnormal behavior 2, the state of the whole behavior fragment is directly judged abnormal; otherwise, the proportion $p$ of suspicious behavior in the fragment is calculated: if $p$ is below the lower threshold, the state of the whole fragment is judged normal; if $p$ lies between the lower and upper thresholds, the state of the whole fragment is judged suspicious; if $p$ exceeds the upper threshold, the state of the whole fragment is judged abnormal.
In addition, the reasons for choosing these threshold dividing lines are as follows: based on typical distribution considerations, a suspicious proportion below the lower threshold is acceptable for an overall decision of normal, while a proportion above the upper threshold can be taken as evidence of abnormality; using the two values as threshold boundaries gives the final fragment abnormality judgment an intuitive interpretation, ensures a degree of robustness, and leaves room for adjustment and improvement.
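The fragment-level judgment rules can be sketched as a small decision function. The concrete proportion thresholds are not recoverable from the text, so they are parameters here; the defaults of 1/3 and 2/3 are illustrative assumptions only.

```python
def judge_fragment(states, lower=1/3, upper=2/3):
    """Apply the abnormality judgment rules to one fragment state sequence.

    states: per-moment states, 0 = normal, 1 = suspicious, 2 = abnormal.
    lower/upper: suspicious-proportion thresholds; the defaults are
    illustrative assumptions, not values fixed by the method.
    """
    if any(s == 2 for s in states):
        return "abnormal"      # any abnormal moment -> whole fragment abnormal
    if all(s == 0 for s in states):
        return "normal"        # all moments normal -> whole fragment normal
    p = sum(1 for s in states if s == 1) / len(states)
    if p < lower:
        return "normal"
    if p <= upper:
        return "suspicious"
    return "abnormal"
```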
S3.4: Updating the final state of each behavior fragment as its state label to obtain the user behavior state sequence $S$.
S4: Judging the risk level of the user according to the user behavior state sequence and taking corresponding response measures.
Specifically, the method comprises the following steps:
s4.2: construction of segment risk matrix
Preferably, the segment risk matrixIs +.>,/>Representing the number of fragment status categories (normal, suspicious and abnormal in this embodiment), +.>Representing the number of fragment types (including login, billing, transfer, etc.), -, and the like>Representation type->Fragments in State->And the risk weight is generated based on knowledge graph analysis, and the range is 0-1.
S4.2: Calculating the risk value $R$ of the user behavior sequence.
Preferably, for each behavior fragment in the user behavior state sequence $S$, its category $c$ and state $s$ are determined, and the corresponding risk weight value $w$ is looked up in the fragment risk matrix $W$; the risk weight values of all behavior fragments in the sequence are then summed as the risk value $R$ of the user behavior sequence.

Specifically, the user behavior sequence risk value $R$ is calculated as:

$$R = \sum_{k=1}^{n} w_k;$$

where $R$ denotes the risk value of the user behavior sequence, $n$ the total number of behavior fragments, and $w_k$ the risk weight value of the $k$-th behavior fragment in the sequence.
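The lookup-and-sum computation of $R$ is simple enough to sketch directly; a dict keyed by (category, state) stands in here for the knowledge-graph-derived matrix $W$, which is an assumed representation.

```python
def sequence_risk(fragments, W):
    """Sum fragment risk weights to obtain the sequence risk value R.

    fragments: list of (category, state) pairs, one per behavior fragment.
    W: fragment risk matrix as a dict {(category, state): weight in [0, 1]};
       a dict stands in for the knowledge-graph-derived matrix here.
    """
    return sum(W[(cat, st)] for cat, st in fragments)
```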
S4.3: Matching the risk value of the user behavior sequence against the preset risk rules to judge the user's risk level and taking corresponding risk-response measures.
If the risk value of the user behavior sequence is below the first threshold, the user risk level is judged to be low: no early warning is needed, user behavior continues to be monitored, and normal service is maintained;

if the risk value lies between the first and second thresholds, the user risk level is judged to be low-medium: first-level measures are taken, non-key account functions are suspended, the user is prompted about potential safety hazards, secondary verification is added to key business operations, manual verification is introduced, functions are restored in batches once confirmed harmless, and the risk is periodically re-evaluated;

if the risk value lies between the second and third thresholds, the user risk level is judged to be medium-high: second-level measures are taken, sensitive operation permissions are suspended, the user is prompted to actively check for risks and required to submit a problem improvement report, enhanced multi-factor authentication is enabled, and part of the business is restored after expert review and evaluation, with periodic re-evaluation;

if the risk value exceeds the third threshold, the user risk level is judged to be high: third-level measures are taken, the account is immediately frozen, the supervision department is notified and an internal system inspection is carried out to trace the evidence chain, the user is required to conduct a comprehensive self-inspection and corrective operation, and experts carry out a comprehensive inspection and evaluation before business is restored in batches once the risk is eliminated;
S5: a safety monitoring mechanism is established to continuously monitor and evaluate user behavior data, discovering potential abnormalities or security risks in time.
Further, this embodiment also provides a security analysis system based on user behavior data, comprising: a data preprocessing module for collecting and preprocessing user operation log data from each channel to obtain a user behavior sequence; a model construction module for establishing an abnormal behavior detection model using a hidden Markov model and training it on each behavior segment with the Baum-Welch algorithm; a state determination module for matching the state sequence output by the model against the set abnormality judgment rules to determine the final state of each behavior segment; and a response measure module for judging the user's risk level from the final states of the user's behavior segments and taking corresponding response measures.
In conclusion, by dividing the user behavior sequence into fixed-length behavior segments, the invention achieves an accurate understanding of user behavior patterns and behavior changes, improving the accuracy of the analysis results; identifying abnormal patterns through sequence modeling significantly improves risk detection efficiency and realizes intelligent, dynamic security monitoring; subdividing multiple risk levels and deciding with several judgment rules, such as fixed thresholds and confidence intervals, effectively improves the robustness of the system; finally, optimization means such as monitoring feedback and matrix adjustment allow system performance to be improved continuously.
Example 2
Referring to fig. 1 and 2, in order to verify the advantageous effects of the present invention, scientific demonstration is performed through economic benefit calculation and simulation experiments for the second embodiment of the present invention.
Specifically, taking a certain ERP system as an example, the operation logs of 100 users on the system in 6 months of 2022 were extracted from the ERP system background and classified by user ID to generate 100 complete user behavior sequences, including operations such as bill creation, supplier inquiry and customer inquiry; each user behavior sequence was divided into behavior segments of 10 minutes, part of the data being shown in Table 1.
TABLE 1 partial behavior segments for users
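The 10-minute segmentation described above can be sketched as follows (function and field names are hypothetical, not from the patent):

```python
from datetime import datetime, timedelta

def split_segments(events, window_minutes=10):
    """Split one user's time-ordered (timestamp, action) events into
    fixed-length behavior segments of `window_minutes` minutes each."""
    if not events:
        return []
    window = timedelta(minutes=window_minutes)
    segments, current, start = [], [events[0]], events[0][0]
    for ts, action in events[1:]:
        if ts - start < window:
            current.append((ts, action))
        else:                                  # event falls outside the window
            segments.append(current)
            current, start = [(ts, action)], ts
    segments.append(current)
    return segments

t0 = datetime(2022, 6, 1, 9, 0)
events = [(t0, "login"),
          (t0 + timedelta(minutes=3), "query_supplier"),
          (t0 + timedelta(minutes=12), "make_bill"),
          (t0 + timedelta(minutes=15), "query_client")]
print(len(split_segments(events)))  # 2 segments: first 10 minutes, then the rest
```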
Preferably, a set of user behavior states $S = \{0, 1, 2\}$ is defined, where 0 represents normal behavior, 1 represents suspicious behavior and 2 represents abnormal behavior; the model parameters of the abnormal behavior detection model are initialized and the model is trained with the Baum-Welch algorithm, the parameters converging after 10 iterations to yield the optimal abnormal behavior detection model.
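This training step can be sketched as a discrete-HMM Baum-Welch trainer in NumPy (toy observation sequence and random initialization; not the patent's data or converged parameters):

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, n_iter=10, seed=0):
    """Train a discrete HMM (pi, A, B) on one observation sequence
    with Baum-Welch (EM) re-estimation."""
    obs = np.asarray(obs)
    T = len(obs)
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(n_states))                 # initial state probs
    A = rng.dirichlet(np.ones(n_states), size=n_states)   # transitions a_ij
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)  # emissions b_j(k)
    for _ in range(n_iter):
        # E-step: forward and backward passes
        alpha = np.zeros((T, n_states))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t-1] @ A) * B[:, obs[t]]
        beta = np.ones((T, n_states))
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t+1]] * beta[t+1])
        gamma = alpha * beta                          # state posteriors
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = (alpha[:-1, :, None] * A[None] *
              (B[:, obs[1:]].T * beta[1:])[:, None, :])   # pair posteriors
        xi /= xi.sum(axis=(1, 2), keepdims=True)
        # M-step: standard Baum-Welch re-estimation of pi, A, B
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
    return pi, A, B

# Toy 15-step sequence over {0: normal, 1: suspicious, 2: abnormal}
obs = [0, 1, 0, 2, 1, 0, 0, 2, 2, 1, 0, 1, 2, 0, 1]
pi, A, B = baum_welch(obs, n_states=3, n_symbols=3, n_iter=10)
print(np.round(A.sum(axis=1), 6))  # each row of A remains a distribution
```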
Further, new user logs are collected and 12 behavior segments of 10 minutes each are generated, each behavior segment state sequence having length 15. The state sequence of each segment is predicted with the optimal abnormal behavior detection model, and the state with the highest probability is selected as the state recognition result for each of segments 1 through 12. Each state sequence is then matched against the set abnormality judgment rules to determine the user behavior state sequence.
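The per-moment state prediction ("select the state with the highest probability") and the abnormality judgment rule can be sketched as follows; the model parameters and the suspicious-proportion cut-offs (0.2 and 0.5) are hypothetical, since the patent does not state them:

```python
import numpy as np

def posterior_states(obs, pi, A, B):
    """Most probable state at each moment via forward-backward posteriors."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, obs[t]]
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t+1]] * beta[t+1])
    return (alpha * beta).argmax(axis=1)   # 0=normal, 1=suspicious, 2=abnormal

def segment_state(states, low=0.2, high=0.5):
    """Abnormality judgment rule; the 0.2/0.5 cut-offs are hypothetical."""
    states = np.asarray(states)
    if (states == 2).any():
        return "abnormal"            # any abnormal moment -> abnormal segment
    p = (states == 1).mean()         # proportion of suspicious moments
    if p < low:
        return "normal"
    return "suspicious" if p < high else "abnormal"

# Toy parameters: near-identity emissions, uniform transitions
pi = np.array([0.8, 0.15, 0.05])
A = np.full((3, 3), 1 / 3)
B = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
states = posterior_states([0, 0, 1, 0, 2], pi, A, B)
print(states.tolist())          # [0, 0, 1, 0, 2]
print(segment_state(states))    # contains a 2 -> "abnormal"
```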
Further, the segment risk matrix is constructed and the risk value of the user behavior sequence is calculated; matching this risk value against the risk rules, the user's risk level is judged to be medium-high and second-level measures are taken: sensitive operation permissions are suspended, the user is prompted to actively check for risks and required to submit a problem improvement report, enhanced multi-factor authentication is enabled, and part of the business is restored after expert review and evaluation; the risk level is periodically re-assessed until it falls to low risk.
Preferably, a comparison of indices between the method of the present invention and the conventional method is shown in Table 2.
TABLE 2 index comparison Table of the inventive method and the conventional method
Preferably, as can be seen from Table 2, compared with the conventional method, the method of the present invention has significant advantages in accuracy, sensitivity, misjudgment rate, real-time performance, scalability, system compatibility, etc.
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered in the scope of the claims of the present invention.

Claims (6)

1. A security analysis method based on user behavior data, characterized in that it comprises the following steps:
collecting user operation log data of each channel and preprocessing the user operation log data to obtain a user behavior sequence;
constructing an abnormal behavior detection model based on the user behavior sequence to identify and detect abnormal behaviors;
predicting a new user behavior sequence by using the trained optimal abnormal behavior detection model, and determining the final state of the user behavior;
early warning is carried out according to the final state of the user behavior, and corresponding response measures are adopted;
establishing a safety monitoring mechanism to continuously monitor and evaluate user behavior data, and timely discovering potential abnormality or safety risk;
the construction of the abnormal behavior detection model based on the user behavior sequence comprises the following steps:
dividing a user behavior sequence into behavior fragments with fixed lengths according to time sequence;
establishing an abnormal behavior detection model by using a hidden Markov model, and initializing the model;
training an abnormal behavior detection model by using a Baum-Welch algorithm, iteratively calculating forward probability, backward probability and state output probability, and re-estimating and updating model parameters;
storing the updated model as an optimal abnormal behavior detection model;
the training of the abnormal behavior detection model by using the Baum-Welch algorithm comprises the following steps:
respectively calculating the forward probability and the backward probability of each moment t by using a forward algorithm and a backward algorithm;
calculating the output probability of the state i at each t moment according to the forward probability and the backward probability, and simultaneously re-estimating and updating the model parameters according to the output probability;
the specific formula for updating the model parameters is as follows:
,/>,/>,/>;
wherein,representing updated state probability vectors +.>Representing the updated state transition probability, +.>Indicating that the sign +.>Is>Output probability indicating that the state is j at the initial time, t indicates time,/>Representing the length of the behavior fragment state sequence, +.>Output probability indicating that the state is i at time t, < >>Output probability indicating transition from state i to state j at time t, < >>Output probability of j representing state at time t, +.>Observation symbol indicating time t,/->Representing the observation set +.>Symbols in->Index representing observation symbol->And->All of which represent a hidden state and,representing the number of possible states,/->Representing the number of possible observations.
2. The security analysis method based on user behavior data according to claim 1, wherein: the forward probability is calculated as follows:
$$\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] b_j(o_{t+1}), \quad 1 \le t \le T-1;$$

wherein $\alpha_{t+1}(j)$ represents the forward probability that the state is $j$ at time $t+1$, $\alpha_t(i)$ represents the forward probability that the state is $i$ at time $t$, $a_{ij}$ represents the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ represents the emission probability of generating observation $o_{t+1}$ in state $j$, $N$ represents the number of possible states, $T$ represents the length of the behavior fragment state sequence, and $i$ and $j$ both represent hidden states;
the calculation formula of the backward probability is as follows:
$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \quad t = T-1, \ldots, 1;$$

wherein $\beta_t(i)$ represents the backward probability that the state is $i$ at time $t$, $\beta_{t+1}(j)$ represents the backward probability that the state is $j$ at time $t+1$, $a_{ij}$ represents the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ represents the emission probability of generating observation $o_{t+1}$ in state $j$, $N$ represents the number of possible states, $T$ represents the length of the behavior fragment state sequence, and $i$ and $j$ both represent hidden states;
the calculation formula of the output probability is as follows:
$$\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)};$$

wherein $\xi_t(i,j)$ represents the output probability of a transition from state $i$ to state $j$ at time $t$, $a_{ij}$ represents the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ represents the emission probability of generating observation $o_{t+1}$ in state $j$, $\alpha_t(i)$ represents the forward probability that the state is $i$ at time $t$, $\beta_{t+1}(j)$ represents the backward probability that the state is $j$ at time $t+1$, $N$ represents the number of possible states, and $i$ and $j$ both represent hidden states.
3. The security analysis method based on user behavior data according to claim 2, wherein: said determining the final state of the user behavior comprises the steps of:
collecting new user behavior data, and performing preprocessing and segmentation steps to obtain behavior fragments with fixed lengths;
for each behavior segment, predicting the behavior state of the user by using the optimal abnormal behavior detection model, and outputting the most likely state sequence
State sequence for outputting modelMatching with a set abnormality judgment rule to judge the final state of the behavior segment;
updating the final state of the behavior segment to the state label of the behavior segment to obtain the user behavior state sequence
The abnormality judgment rule includes the following:
if the state at all moments is normal behavior, judging that the state of the whole behavior segment is normal;
if the state at any moment is abnormal behavior, directly judging that the state of the whole behavior segment is abnormal;
otherwise, calculating the proportion $p$ of suspicious behaviors in the behavior segment, i.e. the number of moments in the suspicious state divided by $T$;

if $p$ is below the lower threshold, judging that the state of the whole behavior segment is normal;

if $p$ lies between the lower and upper thresholds, judging that the state of the whole behavior segment is suspicious;

if $p$ is above the upper threshold, judging that the state of the whole behavior segment is abnormal;

wherein $T$ represents the length of the behavior fragment state sequence.
4. The security analysis method based on user behavior data according to claim 3, wherein: the step of judging the risk level of the user according to the final states of the user behavior segments comprises the following steps:

constructing a segment risk matrix;

calculating the risk value of the user behavior sequence over its total number of behavior segments;

matching the risk value of the user behavior sequence against preset risk rules to judge the user's risk level and take corresponding risk response measures;

the calculating of the risk value of the user behavior sequence comprises the following steps:

for each behavior segment in the user behavior state sequence, confirming its category and state;

looking up the corresponding risk weight value in the segment risk matrix;

summarizing the risk weight values of all behavior segments in the behavior sequence as the risk value of the user behavior sequence.
5. The security analysis method based on user behavior data according to claim 4, wherein: the risk rules include the following:

if the risk value of the user behavior sequence is below the first threshold, the user risk level is judged to be low: no early warning is needed, user behavior continues to be monitored, and normal service is maintained;

if the risk value lies between the first and second thresholds, the user risk level is judged to be low-medium: first-level measures are taken, non-key account functions are suspended, the user is prompted about potential safety hazards, secondary verification is added to key business operations, manual verification is introduced, functions are restored in batches once confirmed harmless, and the risk is periodically re-evaluated;

if the risk value lies between the second and third thresholds, the user risk level is judged to be medium-high: second-level measures are taken, sensitive operation permissions are suspended, the user is prompted to actively check for risks and required to submit a problem improvement report, enhanced multi-factor authentication is enabled, and part of the business is restored after expert review and evaluation, with periodic re-evaluation;

if the risk value exceeds the third threshold, the user risk level is judged to be high: third-level measures are taken, the account is immediately frozen, the supervision department is notified and an internal system inspection is carried out to trace the evidence chain, the user is required to conduct a comprehensive self-inspection and corrective operation, and experts carry out a comprehensive inspection and evaluation before business is restored in batches once the risk is eliminated.
6. A security analysis system employing the security analysis method based on user behavior data according to any one of claims 1 to 5, characterized in that it comprises:
the data preprocessing module is used for collecting user operation log data of each channel and preprocessing the data so as to acquire a user behavior sequence;
the model construction module is used for establishing an abnormal behavior detection model by adopting a hidden Markov model and training the abnormal behavior detection model on each behavior segment by using a Baum-Welch algorithm;
the state determining module is used for matching the state sequence output by the model with a set abnormality judging rule so as to judge the final state of the behavior segment;
and the response measure module is used for judging the risk level of the user according to the final state of the user behavior segment and taking corresponding response measures.
CN202410103051.0A 2024-01-25 2024-01-25 Security analysis method and system based on user behavior data Pending CN117633787A (en)

Publications (1)

Publication Number Publication Date
CN117633787A 2024-03-01


