CN117633787A - Security analysis method and system based on user behavior data - Google Patents
- Publication number: CN117633787A
- Application number: CN202410103051.0A
- Authority: CN (China)
- Prior art keywords: behavior, state, user, sequence, probability
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F21/552 — Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
- G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/2433 — Single-class perspective, e.g. one-against-all classification; novelty detection; outlier detection
- G06N7/01 — Probabilistic graphical models, e.g. probabilistic networks
Abstract
The invention discloses a security analysis method and system based on user behavior data, relating to the technical field of security analysis. The method comprises: collecting user operation log data from all channels and preprocessing it to obtain a user behavior sequence; constructing an abnormal behavior detection model based on the user behavior sequence to identify and detect abnormal behavior; predicting a new user behavior sequence with the trained optimal abnormal behavior detection model and determining the final state of the user behavior; issuing early warnings according to the final state of the user behavior and taking corresponding response measures; and establishing a security monitoring mechanism to continuously monitor and evaluate user behavior data and discover potential anomalies or security risks in time. By segmenting the user behavior sequence into fixed-length behavior segments, the invention achieves an accurate understanding of user behavior patterns and behavior changes and improves the accuracy of the analysis results.
Description
Technical Field
The invention relates to the technical field of security analysis, and in particular to a security analysis method and system based on user behavior data.
Background
With the vigorous development of the internet, most enterprises have established varying numbers of information systems. Besides traditional network security problems, these systems face the problem of illegitimate operations performed by legitimate users. However, existing technical methods for analyzing user behavior have several shortcomings. First, existing methods rely primarily on a rule engine for pattern matching, which requires a manually maintained rule base. Because user behavior is complex and variable, static rules can hardly cover all abnormal situations; once a new attack technique appears, the rule engine fails and abnormal behavior cannot be detected effectively.
Second, some methods use simple statistical analysis and identify anomalies by setting thresholds. It is, however, difficult to balance sensitivity against the false alarm rate, so missed detections and false alarms easily occur, affecting the accuracy of the analysis results. In addition, most methods focus on detecting a single account and can hardly monitor a user population globally, so coordinated abnormal patterns in the network go undetected. Finally, existing methods lack a mechanism for continuously optimizing the detection strategy and obtaining feedback, making it difficult to keep up with evolving attack techniques.
Disclosure of Invention
The invention is proposed in view of the problems of existing user behavior analysis methods in terms of rule engine dependency, the limitations of statistical analysis, single-account detection and policy optimization.
Therefore, the problem to be solved by the present invention is how to detect and identify anomalies or risk patterns in user behavior data so as to achieve effective security monitoring and early warning.
In order to solve the technical problems, the invention provides the following technical scheme:
In a first aspect, an embodiment of the present invention provides a security analysis method based on user behavior data, which includes: collecting user operation log data from each channel and preprocessing it to obtain a user behavior sequence; constructing an abnormal behavior detection model based on the user behavior sequence to identify and detect abnormal behavior; predicting a new user behavior sequence with the trained optimal abnormal behavior detection model and determining the final state of the user behavior; issuing early warnings according to the final state of the user behavior and taking corresponding response measures; and establishing a security monitoring mechanism to continuously monitor and evaluate user behavior data and discover potential anomalies or security risks in time.
Constructing the abnormal behavior detection model based on the user behavior sequence comprises the following steps: segmenting the user behavior sequence into fixed-length behavior segments in time order; establishing an abnormal behavior detection model with a hidden Markov model and initializing the model; training the abnormal behavior detection model with the Baum-Welch algorithm, iteratively calculating the forward probability, the backward probability and the state output probability, and re-estimating and updating the model parameters; and saving the updated model as the optimal abnormal behavior detection model.
Training the abnormal behavior detection model with the Baum-Welch algorithm includes the following steps: calculating the forward probability and the backward probability at each time t with the forward algorithm and the backward algorithm respectively; calculating the output probability of state i at each time t from the forward and backward probabilities, and re-estimating and updating the model parameters from the output probabilities. The specific formulas for updating the model parameters are as follows:

$\bar{\pi}_j=\gamma_1(j)$, $\quad\bar{a}_{ij}=\dfrac{\sum_{t=1}^{T-1}\xi_t(i,j)}{\sum_{t=1}^{T-1}\gamma_t(i)}$, $\quad\bar{b}_j(k)=\dfrac{\sum_{t=1,\,o_t=v_k}^{T}\gamma_t(j)}{\sum_{t=1}^{T}\gamma_t(j)}$;

wherein $\bar{\pi}_j$ represents the updated state probability vector, $\bar{a}_{ij}$ represents the updated state transition probability, $\bar{b}_j(k)$ represents the updated probability of emitting symbol $v_k$ in state j, $\gamma_1(j)$ represents the output probability that the state is j at the initial time, t represents time, T represents the length of the behavior segment state sequence, $\gamma_t(i)$ represents the output probability that the state is i at time t, $\xi_t(i,j)$ represents the output probability of transitioning from state i to state j at time t, $\gamma_t(j)$ represents the output probability that the state is j at time t, $o_t$ represents the observation symbol at time t, $v_k$ represents a symbol in the observation set, k represents the index of the observation symbol, i and j both represent hidden states, N represents the number of possible states, and M represents the number of possible observations.
As a preferred embodiment of the security analysis method based on user behavior data according to the present invention, the method further comprises: the forward probability is calculated as follows:

$\alpha_{t+1}(j)=\left[\sum_{i=1}^{N}\alpha_t(i)\,a_{ij}\right]b_j(o_{t+1})$;

wherein $\alpha_{t+1}(j)$ represents the forward probability that the state is j at time t+1, $\alpha_t(i)$ represents the forward probability that the state is i at time t, $a_{ij}$ represents the transition probability from state i to state j, $b_j(o_{t+1})$ represents the probability of generating observation $o_{t+1}$ in state j, N represents the number of possible states, T represents the length of the behavior segment state sequence, and i and j both represent hidden states.
The calculation formula of the backward probability is as follows:

$\beta_t(i)=\sum_{j=1}^{N}a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)$;

wherein $\beta_t(i)$ represents the backward probability that the state is i at time t, $\beta_{t+1}(j)$ represents the backward probability that the state is j at time t+1, $a_{ij}$ represents the transition probability from state i to state j, $b_j(o_{t+1})$ represents the probability of generating observation $o_{t+1}$ in state j, N represents the number of possible states, T represents the length of the behavior segment state sequence, and i and j both represent hidden states.
The calculation formula of the output probability is as follows:

$\xi_t(i,j)=\dfrac{\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)}{\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)}$;

wherein $a_{ij}$ represents the transition probability from state i to state j, $b_j(o_{t+1})$ represents the probability of generating observation $o_{t+1}$ in state j, $\alpha_t(i)$ represents the forward probability that the state is i at time t, $\beta_{t+1}(j)$ represents the backward probability that the state is j at time t+1, N represents the number of possible states, and i and j both represent hidden states.
As a preferred embodiment of the security analysis method based on user behavior data according to the present invention, the method further comprises:
determining the final state of the user behavior comprises the steps of: collecting new user behavior data, and performing preprocessing and segmentation steps to obtain behavior fragments with fixed lengths; for each behavior segment, predicting the behavior state of the user by using the optimal abnormal behavior detection model, and outputting the most likely state sequenceThe method comprises the steps of carrying out a first treatment on the surface of the State sequence of outputting model->Matching with a set abnormality judgment rule to judge the final state of the behavior segment; updating the final state of the behavior segment to the state label of the behavior segment to obtain the user behavior state sequence +.>The method comprises the steps of carrying out a first treatment on the surface of the The abnormality judgment rule includes the following: if the state at all times is normal behavior, the state of the whole behavior segment is judged to beNormal; if the state at any moment is abnormal behavior, directly judging that the state of the whole behavior segment is abnormal; otherwise, calculating the proportion of suspicious behaviors in the behavior segment +.>: if->Judging that the state of the whole behavior segment is normal; if->Judging the state of the whole behavior fragment to be suspicious; if it isJudging the state of the whole behavior segment to be abnormal; wherein (1)>Representing the length of the behavior fragment state sequence.
As a preferred embodiment of the security analysis method based on user behavior data according to the present invention, the method further comprises: judging the risk level of the user according to the final states of the user's behavior segments, which comprises the following steps: constructing a segment risk matrix; calculating the risk value of the user behavior sequence over the total number of behavior segments; and matching the risk value of the user behavior sequence against preset risk rules to judge the risk level of the user and take corresponding risk response measures. Calculating the risk value of the user behavior sequence comprises: for each behavior segment in the user behavior state sequence, determining its category and state; looking up the corresponding risk weight value in the segment risk matrix; and summing the risk weight values of all behavior segments in the behavior sequence as the risk value of the user behavior sequence.
As a preferred embodiment of the security analysis method based on user behavior data according to the present invention, the method further comprises: the risk rules include the following: if the risk value falls in the lowest band, the risk level of the user is judged to be low; no early warning is needed, the user's behavior is continuously monitored, and normal service is maintained. If the risk value falls in the second band, the risk level of the user is judged to be medium-low; in response, first-level measures are taken: non-critical functions of the account are suspended, the user is prompted about the potential security risk, secondary verification is added to critical business operations, manual review is introduced, functions are restored in batches after no damage is confirmed, and the situation is periodically re-evaluated. If the risk value falls in the third band, the risk level of the user is judged to be medium-high; in response, second-level measures are taken: sensitive operation permissions are suspended, the user is prompted to actively investigate the risk and required to submit a problem improvement report, enhanced multi-factor identity verification is enabled, and part of the business is restored after expert inspection and evaluation, with periodic re-evaluation. If the risk value falls in the highest band, the risk level of the user is judged to be high; in response, third-level measures are taken: the account is frozen immediately, the supervision department is notified to perform an internal system inspection so as to trace the evidence chain, the user is required to conduct a comprehensive self-inspection and corrective actions, and experts perform a comprehensive inspection and evaluation, restoring business in batches after the risk is confirmed to be eliminated.
In a second aspect, an embodiment of the present invention provides a security analysis system based on user behavior data, which includes: a data preprocessing module, configured to collect user operation log data from each channel and preprocess it to obtain a user behavior sequence; a model construction module, configured to establish an abnormal behavior detection model with a hidden Markov model and train it on each behavior segment with the Baum-Welch algorithm; a state determination module, configured to match the state sequence output by the model against the set anomaly judgment rules to determine the final state of each behavior segment; and a response measure module, configured to judge the risk level of the user according to the final states of the user's behavior segments and take corresponding response measures.
The invention has the following beneficial effects: by segmenting the user behavior sequence into fixed-length behavior segments, the invention achieves an accurate understanding of user behavior patterns and behavior changes and improves the accuracy of the analysis results; meanwhile, abnormal patterns are identified through sequence modeling, which significantly improves the efficiency of risk detection and realizes intelligent, dynamic security monitoring; in addition, several risk levels are distinguished and decisions combine multiple judgment rules such as fixed thresholds and confidence intervals, which effectively improves the robustness of the system; finally, optimization means such as monitoring feedback and matrix adjustment are added so that system performance can be improved continuously.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person skilled in the art may derive other drawings from them without inventive effort.
Fig. 1 is a method flow diagram of a security analysis method based on user behavior data.
Fig. 2 is a diagram of a computer device for a security analysis method based on user behavior data.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention, but the present invention may also be practiced in ways other than those described herein; persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
Example 1
Referring to fig. 1 and 2, a first embodiment of the present invention provides a security analysis method based on user behavior data, including,
s1: and collecting operation log data of users of all channels and preprocessing the operation log data to acquire a user behavior sequence.
Preferably, this comprises: collecting the raw operation logs of users from different channels, classifying the logs by user ID, and merging them into user behavior sequences in time order; cleaning the logs, including filtering irrelevant records and deleting duplicate records; parsing the logs, extracting the key behavior fields, and aggregating the extracted fields to generate a standardized record describing a single behavior; connecting each user's behavior records in time order to form a complete user behavior sequence; combining the behavior sequences of different users into a unified behavior sequence data set; and desensitizing the data set by deleting identity information to obtain a usable set of behavior sequences.
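A minimal sketch of this preprocessing pipeline in Python; the field names `user_id`, `timestamp` and `action` are illustrative assumptions, since the patent does not fix a log schema:

```python
from collections import defaultdict

def build_behavior_sequences(raw_logs):
    """Group raw log records by user ID, drop irrelevant and duplicate
    records, and connect each user's actions in time order.

    Each record is assumed to be a dict with hypothetical keys
    'user_id', 'timestamp', and 'action'.
    """
    per_user = defaultdict(set)
    for rec in raw_logs:
        if rec.get("action") is None:  # filter irrelevant records
            continue
        # storing (timestamp, action) in a set removes exact duplicates
        per_user[rec["user_id"]].add((rec["timestamp"], rec["action"]))
    # sort each user's events by timestamp and keep only the action labels
    return {uid: [a for _, a in sorted(events)]
            for uid, events in per_user.items()}
```

Desensitization would then replace `user_id` keys with opaque identifiers before the data set leaves this stage.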
S2: an abnormal behavior detection model is constructed based on the user behavior sequence to identify and detect abnormal behavior.
Specifically, the method comprises the following steps:
s2.1: the user behavior sequence is segmented into behavior segments of fixed length in time sequence.
Preferably, define $S=\{s_1,s_2,\dots,s_N\}$ as the set of all possible behavior states, $V=\{v_1,v_2,\dots,v_M\}$ as the set of all possible observations, $Q=(q_1,q_2,\dots,q_T)$ as a behavior segment state sequence of length $T$, and $O=(o_1,o_2,\dots,o_T)$ as the observation sequence corresponding to the states.
Specifically, in this embodiment, the user behavior sequence is divided into fixed-length behavior segments covering 10 minutes each. The user behavior states include normal behavior, suspicious behavior and abnormal behavior, giving the behavior state set $\{0,1,2\}$, where 0 represents normal behavior, 1 represents suspicious behavior, and 2 represents abnormal behavior.
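The fixed-length slicing of S2.1 can be sketched as follows; dropping a trailing remainder shorter than the window is an assumption, since the patent does not specify how partial segments are handled:

```python
def segment_sequence(seq, window):
    """Slice a behavior sequence into consecutive fixed-length segments
    in time order. A trailing remainder shorter than `window` is dropped."""
    return [seq[i:i + window]
            for i in range(0, len(seq) - window + 1, window)]
```

For time-based 10-minute windows, `seq` would first be bucketed by timestamp; the function above shows the count-based variant.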
S2.2: and establishing an abnormal behavior detection model by using the hidden Markov model, and initializing the model.
Specifically, the model is initialized as $\lambda=(\pi,A,B)$, with initial state probability vector $\pi=(\pi_1,\pi_2,\dots,\pi_N)$, initial state transition probability matrix $A=[a_{ij}]_{N\times N}$ and initial observation probability matrix $B=[b_i(k)]_{N\times M}$, where $N$ represents the number of possible states, $M$ is the number of possible observations, $a_{ij}$ denotes the probability of being in state $s_i$ at time t and transitioning to state $s_j$ at time t+1, and $b_i(k)$ denotes the probability of generating observation $v_k$ while in state $s_i$ at time t.
Further, the abnormal behavior detection model takes as input an observation sequence of length $T$ and outputs, for each time t, a state probability vector; the state with the highest probability is selected as the state recognition result at time t, so that the whole length-$T$ segment finally yields a state result sequence.
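A sketch of initializing $\lambda=(\pi,A,B)$; random row-stochastic initialization is an assumption here, since the text only requires that the model be initialized:

```python
import numpy as np

def init_hmm(n_states, n_obs, seed=0):
    """Initialize lambda = (pi, A, B) with random row-stochastic values.

    pi: initial state probability vector (length n_states)
    A:  state transition matrix (n_states x n_states), rows sum to 1
    B:  observation matrix (n_states x n_obs), rows sum to 1
    """
    rng = np.random.default_rng(seed)
    pi = rng.random(n_states)
    pi /= pi.sum()
    A = rng.random((n_states, n_states))
    A /= A.sum(axis=1, keepdims=True)
    B = rng.random((n_states, n_obs))
    B /= B.sum(axis=1, keepdims=True)
    return pi, A, B
```

For the embodiment's three behavior states, `init_hmm(3, n_obs)` gives a valid starting point for Baum-Welch training.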
It should be noted that the abnormal behavior detection model is established by learning historical operation sequences of normal users. For example, in an ERP system, common operations such as creating documents, auditing, warehousing in, warehousing out, adding a supplier, adding a customer, querying a customer, logging into an account and modifying personal information are all normal user behaviors; other operations are judged suspicious if they are too frequent or involve abnormally large amounts, and such suspicious behaviors include batch querying and exporting of customer information, batch querying and exporting of warehousing records, an account logging in from different locations within a short time, repeated consecutive login failures, and the like.
Specifically, the abnormal behavior detection model establishes a normal behavior pattern by learning the user's historical normal operation sequences; when some of a user's operation characteristics deviate obviously from the normal behavior pattern or violate preset rules, those operations are judged to be abnormal or suspicious behavior.
S2.3: training an abnormal behavior detection model by using a Baum-Welch algorithm, iteratively calculating forward probability, backward probability and state output probability, and re-estimating and updating model parameters.
Specifically, the method comprises the following steps:
s2.3.1: the forward and backward probabilities for each time t are calculated using forward and backward algorithms, respectively.
Specifically, the forward probability is calculated as follows:

$\alpha_1(i)=\pi_i\,b_i(o_1)$;

$\alpha_{t+1}(j)=\left[\sum_{i=1}^{N}\alpha_t(i)\,a_{ij}\right]b_j(o_{t+1})$;

wherein $\alpha_1(i)$ represents the forward probability that the state is i at the initial time t=1, $\pi_i$ represents the initial state probability, $b_i(o_1)$ represents the probability of generating observation $o_1$ in state i, $\alpha_{t+1}(j)$ represents the forward probability that the state is j at time t+1, $\alpha_t(i)$ represents the forward probability that the state is i at time t, $a_{ij}$ represents the transition probability from state i to state j, $b_j(o_{t+1})$ represents the probability of generating observation $o_{t+1}$ in state j, N represents the number of possible states, T represents the length of the behavior segment state sequence, and i and j both represent hidden states.
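The two recursions above translate directly into code; this sketch stores the whole $\alpha$ table and assumes `numpy` is available (no scaling is applied, so very long sequences would underflow):

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward pass: alpha[t, i] = P(o_1..o_{t+1}, state i at step t | lambda),
    with 0-based indexing of times."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # alpha_1(i) = pi_i b_i(o_1)
    for t in range(1, T):
        # alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(o_{t+1})
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha
```

Summing the last row of `alpha` gives the likelihood of the whole observation sequence under the model.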
Preferably, the calculation formula of the backward probability is as follows:

$\beta_T(i)=1$;

$\beta_t(i)=\sum_{j=1}^{N}a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)$;

wherein $\beta_T(i)$, the backward probability that the state is i at the last time T, equals 1; $\beta_t(i)$ represents the backward probability that the state is i at time t, $\beta_{t+1}(j)$ represents the backward probability that the state is j at time t+1, $a_{ij}$ represents the transition probability from state i to state j, $b_j(o_{t+1})$ represents the probability of generating observation $o_{t+1}$ in state j, N represents the number of possible states, T represents the length of the behavior segment state sequence, and i and j both represent hidden states.
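A matching sketch of the backward recursion, again storing the full $\beta$ table:

```python
import numpy as np

def backward(A, B, obs):
    """Backward pass: beta[t, i] = P(o_{t+2}..o_T | state i at step t, lambda),
    with 0-based indexing of times."""
    T, N = len(obs), A.shape[0]
    beta = np.ones((T, N))                              # beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        # beta_t(i) = sum_j a_ij b_j(o_{t+1}) beta_{t+1}(j)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta
```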
S2.3.2: and calculating the output probability of the state i at each t moment according to the forward probability and the backward probability, and simultaneously re-estimating and updating the model parameters according to the output probability.
Specifically, the formula for the output probability $\gamma_t(i)$ of state i at time t is as follows:

$\gamma_t(i)=\dfrac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{N}\alpha_t(j)\,\beta_t(j)}$;

wherein $\beta_t(i)$ represents the backward probability that the state is i at time t, $\alpha_t(i)$ represents the forward probability that the state is i at time t, N represents the number of possible states, and j represents a hidden state.
Further, the formula for the output probability $\xi_t(i,j)$ of transitioning from state i to state j at time t is as follows:

$\xi_t(i,j)=\dfrac{\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)}{\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)}$;

wherein $a_{ij}$ represents the transition probability from state i to state j, $b_j(o_{t+1})$ represents the probability of generating observation $o_{t+1}$ in state j, $\alpha_t(i)$ represents the forward probability that the state is i at time t, $\beta_{t+1}(j)$ represents the backward probability that the state is j at time t+1, N represents the number of possible states, and i and j both represent hidden states.
Further, the specific formulas for re-estimating and updating the model parameters from the output probabilities are as follows:

$\bar{\pi}_j=\gamma_1(j)$, $\quad\bar{a}_{ij}=\dfrac{\sum_{t=1}^{T-1}\xi_t(i,j)}{\sum_{t=1}^{T-1}\gamma_t(i)}$, $\quad\bar{b}_j(k)=\dfrac{\sum_{t=1,\,o_t=v_k}^{T}\gamma_t(j)}{\sum_{t=1}^{T}\gamma_t(j)}$;

wherein $\bar{\pi}_j$ represents the updated state probability vector, $\bar{a}_{ij}$ represents the updated state transition probability, $\bar{b}_j(k)$ represents the updated probability of emitting symbol $v_k$ in state j, $\gamma_1(j)$ represents the output probability that the state is j at the initial time, t represents time, T represents the length of the behavior segment state sequence, $\gamma_t(i)$ represents the output probability that the state is i at time t, $\xi_t(i,j)$ represents the output probability of transitioning from state i to state j at time t, $\gamma_t(j)$ represents the output probability that the state is j at time t, $o_t$ represents the observation symbol at time t, $v_k$ represents a symbol in the observation set, k represents the index of the observation symbol, i and j both represent hidden states, N represents the number of possible states, and M represents the number of possible observations.
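Putting the pieces together, one re-estimation step of the Baum-Welch training in S2.3 can be sketched as follows; this is a sketch for a single observation sequence, without the numerical scaling a production implementation would need:

```python
import numpy as np

def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch re-estimation over one observation sequence,
    implementing the gamma, xi, and parameter-update formulas above."""
    T, N = len(obs), len(pi)
    # forward and backward passes
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    # gamma_t(i): state output probabilities
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    # xi_t(i, j): transition output probabilities
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
        xi[t] /= xi[t].sum()
    # parameter updates: pi_bar, a_bar, b_bar
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        mask = np.asarray(obs) == k        # times where o_t = v_k
        new_B[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B
```

Iterating this step until the likelihood stops improving yields the trained model that S2.4 saves as the optimal abnormal behavior detection model.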
S2.4: and saving the updated model as an optimal abnormal behavior detection model.
S3: and carrying out anomaly prediction on the new user behavior sequence by using the optimal anomaly behavior detection model, and determining the final state of the user behavior.
Preferably, the method comprises the following steps:
s3.1: collecting new user behavior data, executingIs a pretreatment step of (a)Obtain a sequence of user actions and execute +.>The step of slicing the user behavior sequence into behavior segments of a fixed length in time sequence.
S3.2: for each behavior segment, predicting the behavior state of the user by using the optimal abnormal behavior detection model, and outputting the most likely state sequence。
S3.3: state sequence for outputting modelMatching with the set abnormality judgment rule to judge the final state of the behavior segment.
Preferably, the abnormality judgment rule includes the following: if the state at all moments is normal behavior 0, judging that the state of the whole behavior segment is normal; if the state at any moment is abnormal behavior 2, directly judging that the state of the whole behavior segment is abnormal; otherwise, calculating the proportion of suspicious behaviors in the behavior segment: if->Judging that the state of the whole behavior segment is normal; if->Judging the state of the whole behavior fragment to be suspicious; if->And judging the state of the whole behavior segment as abnormal.
In addition, the reasons for choosing these two threshold boundaries are as follows: based on typical distribution considerations, the lower boundary means that only a small fraction of the $T$ instants in the segment is suspicious, which is acceptable for an overall judgment of normal, while the upper boundary means that a large fraction is suspicious, which may be taken as evidence of abnormality; using fixed boundaries gives the final segment judgment an intuitive interpretation, ensures a certain robustness, and at the same time leaves room for adjustment and improvement.
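The anomaly judgment rules can be sketched as a small function; the default boundaries `low=0.2` and `high=0.6` are illustrative assumptions, since only the roles of the two thresholds, not their values, are fixed by the text:

```python
def judge_segment(states, low=0.2, high=0.6):
    """Apply the anomaly judgment rules to a predicted state sequence
    (0 = normal, 1 = suspicious, 2 = abnormal)."""
    if any(s == 2 for s in states):
        return "abnormal"                 # any abnormal instant -> whole segment abnormal
    p = states.count(1) / len(states)     # proportion of suspicious behavior
    if p <= low:
        return "normal"
    if p <= high:
        return "suspicious"
    return "abnormal"
```

Note that an all-normal segment yields `p = 0` and is judged normal, matching the first rule as a special case.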
S3.4: updating the final state of the behavior segment to the state label of the behavior segment to obtain the user behavior state sequence。
S4: judging the risk level of the user according to the user behavior state sequence, and taking corresponding response measures.
Specifically, the method comprises the following steps:
S4.1: Constructing the segment risk matrix \(W\).
Preferably, the segment risk matrix \(W\) has dimensions \(m \times n\), where \(n\) represents the number of segment state categories (normal, suspicious and abnormal in this embodiment) and \(m\) represents the number of segment types (including login, billing, transfer, etc.); the entry \(w_{ts}\) represents the risk weight of a segment of type \(t\) in state \(s\), generated on the basis of knowledge-graph analysis and ranging from 0 to 1.
S4.2: Calculating the risk value \(R\) of the user behavior sequence.
Preferably, for each behavior segment \(f_i\) in the user behavior state sequence, its category \(c_i\) and state \(s_i\) are determined, the corresponding risk weight \(w_i\) is looked up in the segment risk matrix \(W\), and the risk weights of all behavior segments in the behavior sequence are summed to give the risk value \(R\) of the user behavior sequence.
Specifically, the risk value \(R\) of the user behavior sequence is calculated as:

\[ R = \sum_{i=1}^{n} w_i \]

wherein \(R\) represents the risk value of the user behavior sequence, \(n\) represents the total number of behavior segments, and \(w_i\) represents the risk weight of the \(i\)-th behavior segment.
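The lookup-and-sum calculation above can be sketched as follows. The dictionary-based risk matrix, its keys, and its weight values are illustrative assumptions, not values from the patent:

```python
def sequence_risk(segments, risk_matrix):
    """R = sum of per-segment risk weights w_i, where risk_matrix maps
    (segment_type, segment_state) -> weight in [0, 1]."""
    return sum(risk_matrix[(seg_type, seg_state)]
               for seg_type, seg_state in segments)

# Illustrative (assumed) weights; the patent derives them from
# knowledge-graph analysis.
risk_matrix = {
    ("login", "normal"): 0.05, ("login", "suspicious"): 0.40,
    ("login", "abnormal"): 0.90,
    ("transfer", "normal"): 0.10, ("transfer", "suspicious"): 0.60,
    ("transfer", "abnormal"): 1.00,
}
segments = [("login", "normal"), ("transfer", "suspicious"),
            ("login", "abnormal")]
```

Each segment contributes one weight, so longer sequences naturally accumulate larger risk values, which the risk rules in S4.3 then bucket into levels.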
S4.3: Matching the risk value of the user behavior sequence against the preset risk rules to judge the user's risk level, and taking the corresponding risk response measures.
If \(R < T_1\), the user's risk level is judged low: no early warning is needed, user behavior continues to be monitored, and normal service is maintained.

If \(T_1 \le R < T_2\), the user's risk level is judged medium-low: tier-1 measures are taken, the non-key functions of the account are suspended, the user is prompted about the potential safety hazard, secondary verification is added to key business operations, manual review is introduced, functions are restored in batches once no damage is confirmed, and the situation is periodically re-evaluated.

If \(T_2 \le R < T_3\), the user's risk level is judged medium-high: tier-2 measures are taken, sensitive operation permissions are suspended, the user is prompted to actively check the risk and required to submit a problem-improvement report, enhanced multi-factor authentication is enabled, and partially restored business is periodically re-evaluated after expert inspection and assessment.

If \(R \ge T_3\), the user's risk level is judged high: tier-3 measures are taken, the account is immediately frozen, the supervision department is notified to carry out an internal system inspection to trace the evidence chain, the user is required to perform comprehensive self-inspection and corrective operation, and business is restored in batches only after an expert's comprehensive inspection and assessment confirms the risk has been eliminated; here \(T_1 < T_2 < T_3\) are preset risk thresholds.
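The four-level rule can be sketched as follows. The threshold values `t1`–`t3` and the tier labels are illustrative assumptions, since the concrete boundaries are not recoverable from the text:

```python
def risk_level(r, t1=1.0, t2=2.0, t3=3.0):
    """Map a sequence risk value R onto the four risk levels and their
    response tiers.  t1 < t2 < t3 are assumed, configurable thresholds."""
    if r < t1:
        return "low", "monitor only"
    if r < t2:
        # suspend non-key functions, add secondary verification
        return "medium-low", "tier-1 measures"
    if r < t3:
        # suspend sensitive operations, enable enhanced MFA
        return "medium-high", "tier-2 measures"
    # freeze account, notify the supervision department
    return "high", "tier-3 measures"
```

Because the buckets are half-open intervals, every risk value maps to exactly one level, which keeps the response logic deterministic.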
S5: Establishing a safety monitoring mechanism to continuously monitor and evaluate user behavior data, and discovering potential abnormalities or safety risks in time.
Further, this embodiment also provides a safety analysis system based on user behavior data, comprising: a data preprocessing module, which collects and preprocesses the user operation log data of each channel to obtain a user behavior sequence; a model construction module, which establishes an abnormal behavior detection model using a hidden Markov model and trains it on each behavior segment with the Baum-Welch algorithm; a state determination module, which matches the state sequence output by the model against the set abnormality judgment rules to determine the final state of each behavior segment; and a response measure module, which judges the user's risk level from the final states of the user's behavior segments and takes the corresponding response measures.
In conclusion, by dividing the user behavior sequence into fixed-length behavior segments, the invention achieves an accurate understanding of user behavior patterns and behavior changes, improving the accuracy of the analysis results; by identifying abnormal patterns through sequence modeling, it markedly improves risk-detection efficiency and realizes intelligent, dynamic safety monitoring; by subdividing multiple risk levels and deciding with a combination of judgment rules such as fixed thresholds and confidence intervals, it effectively improves the robustness of the system; and by adding optimization means such as monitoring feedback and matrix adjustment, system performance can be continuously improved.
Example 2
Referring to figs. 1 and 2, to verify the beneficial effects of the invention, the second embodiment provides scientific demonstration through economic-benefit calculation and simulation experiments.
Specifically, taking a certain ERP system as an example, the operation logs of 100 users on the system in 6 months of 2022 are extracted through the ERP system background and classified by user ID to generate the 100 users' complete behavior sequences, including operations such as creating documents, querying suppliers and querying customers; each user's behavior sequence is divided into behavior segments 10 minutes in length, and part of the data is shown in Table 1.
TABLE 1 partial behavior segments for users
Preferably, a set of user behavior states \(S=\{0,1,2\}\) is defined, where 0 represents normal behavior, 1 represents suspicious behavior and 2 represents abnormal behavior; the model parameters of the abnormal behavior detection model are initialized, the model is trained with the Baum-Welch algorithm, and the parameters converge after 10 iterations, giving the optimal abnormal behavior detection model.
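One Baum-Welch re-estimation step can be sketched with NumPy, following the standard forward/backward and update formulas referenced in the claims. This unscaled version is a sketch suitable only for short segments, not a production implementation; the parameter shapes are the usual discrete-HMM conventions:

```python
import numpy as np

def baum_welch_step(pi, A, B, obs):
    """One unscaled Baum-Welch E+M step for a discrete HMM.
    pi: (N,) initial probs; A: (N,N) transitions; B: (N,M) emissions;
    obs: list of observation indices of length T."""
    N, T = A.shape[0], len(obs)
    # forward probabilities alpha[t, i]
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    # backward probabilities beta[t, i]
    beta = np.zeros((T, N))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    evidence = alpha[-1].sum()          # P(O | model)
    # state posteriors gamma_t(i) and pairwise posteriors xi_t(i, j)
    gamma = alpha * beta / evidence
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = (alpha[t][:, None] * A *
                 (B[:, obs[t + 1]] * beta[t + 1])[None, :]) / evidence
    # re-estimation: pi_hat = gamma_1, a_hat = sum xi / sum gamma,
    # b_hat(j, k) = sum of gamma over instants emitting v_k / sum gamma
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    M = B.shape[1]
    B_new = np.vstack([gamma[np.array(obs) == k].sum(axis=0)
                       for k in range(M)]).T
    B_new /= gamma.sum(axis=0)[:, None]
    return pi_new, A_new, B_new, evidence
```

Iterating this step until the evidence stops increasing is exactly the training loop the embodiment reports converging after 10 iterations; a real implementation would add per-step scaling to avoid underflow on long sequences.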
Further, new user logs are collected and 12 behavior segments of 10-minute length are generated, each segment's state sequence \(Q\) having length \(T=15\); the state sequence of each segment is predicted with the optimal abnormal behavior detection model, the state with the highest probability being selected as the recognition result, which yields the state sequences of segments 1 through 12. Each state sequence \(Q\) is matched against the set abnormality judgment rules to judge the user behavior state sequence.
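The per-instant "state with the highest probability" selection corresponds to posterior decoding from forward-backward quantities; a minimal unscaled sketch, assuming discrete observations and standard discrete-HMM parameter shapes:

```python
import numpy as np

def decode_states(pi, A, B, obs):
    """Posterior decoding: at each instant pick the state maximizing the
    posterior gamma_t(i) = alpha_t(i) * beta_t(i) / P(O).  Unscaled, so
    suitable only for short segments."""
    N, T = A.shape[0], len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta = np.zeros((T, N))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta / alpha[-1].sum()
    return gamma.argmax(axis=1)   # most probable state per instant
```

Posterior decoding maximizes per-instant accuracy; Viterbi decoding, which maximizes the probability of the whole path, would be a drop-in alternative if globally consistent sequences were preferred.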
Further, the segment risk matrix \(W\) is constructed and the risk value \(R\) of the user behavior sequence is calculated; matching \(R\) against the risk rules, the user's risk level is judged medium-high, so tier-2 measures are taken: sensitive operation permissions are suspended, the user is prompted to actively check the risk and required to submit a problem-improvement report, enhanced multi-factor authentication is enabled, and partially restored business is re-evaluated after expert inspection and assessment; the risk level is re-assessed periodically until it falls to low risk.
Preferably, the comparative indexes of the method of the present invention and the conventional method are shown in Table 2.
TABLE 2 index comparison Table of the inventive method and the conventional method
Preferably, as can be seen from table 2, compared with the conventional method, the method of the present invention has significant advantages in terms of accuracy, sensitivity, misjudgment rate, real-time performance, expandability, system compatibility, etc.
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered in the scope of the claims of the present invention.
Claims (6)
1. A security analysis method based on user behavior data, characterized by comprising the following steps:
collecting user operation log data of each channel and preprocessing the user operation log data to obtain a user behavior sequence;
constructing an abnormal behavior detection model based on the user behavior sequence to identify and detect abnormal behaviors;
predicting a new user behavior sequence by using the trained optimal abnormal behavior detection model, and determining the final state of the user behavior;
early warning is carried out according to the final state of the user behavior, and corresponding response measures are adopted;
establishing a safety monitoring mechanism to continuously monitor and evaluate user behavior data, and timely discovering potential abnormality or safety risk;
the construction of the abnormal behavior detection model based on the user behavior sequence comprises the following steps:
dividing a user behavior sequence into behavior fragments with fixed lengths according to time sequence;
establishing an abnormal behavior detection model by using a hidden Markov model, and initializing the model;
training an abnormal behavior detection model by using a Baum-Welch algorithm, iteratively calculating forward probability, backward probability and state output probability, and re-estimating and updating model parameters;
storing the updated model as an optimal abnormal behavior detection model;
the training of the abnormal behavior detection model by using the Baum-Welch algorithm comprises the following steps:
respectively calculating the forward probability and the backward probability of each moment t by using a forward algorithm and a backward algorithm;
calculating the output probability of the state i at each t moment according to the forward probability and the backward probability, and simultaneously re-estimating and updating the model parameters according to the output probability;
the specific formula for updating the model parameters is as follows:
,/>,/>,/>;
wherein,representing updated state probability vectors +.>Representing the updated state transition probability, +.>Indicating that the sign +.>Is>Output probability indicating that the state is j at the initial time, t indicates time,/>Representing the length of the behavior fragment state sequence, +.>Output probability indicating that the state is i at time t, < >>Output probability indicating transition from state i to state j at time t, < >>Output probability of j representing state at time t, +.>Observation symbol indicating time t,/->Representing the observation set +.>Symbols in->Index representing observation symbol->And->All of which represent a hidden state and,representing the number of possible states,/->Representing the number of possible observations.
2. The security analysis method based on user behavior data according to claim 1, wherein the forward probability is calculated as follows:

\[ \alpha_{t+1}(j) = \Big[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\Big]\, b_j(O_{t+1}), \quad 1 \le t \le T-1; \]

wherein \(\alpha_{t+1}(j)\) represents the forward probability that the state is \(j\) at time \(t+1\), \(\alpha_t(i)\) represents the forward probability that the state is \(i\) at time \(t\), \(a_{ij}\) represents the transition probability from state \(i\) to state \(j\), \(b_j(O_{t+1})\) represents the emission probability of generating the observation \(O_{t+1}\) in state \(j\), \(N\) represents the number of possible states, \(T\) represents the length of the behavior-segment state sequence, and \(i\) and \(j\) both represent hidden states;
the calculation formula of the backward probability is as follows:
;
wherein,indicating a state of +.>Backward probability of>Indicated at t->The time status is +.>Backward probability of>Representing slave status +.>To state->Transition probability of->In state->Down-generated observations->Is used to determine the transmission probability of (1),representing the number of possible states,/->Representing the length of the behavior fragment state sequence, +.>And->All represent hidden states;
the calculation formula of the output probability is as follows:
;
wherein,representing slave status +.>To state->Transition probability of->In state->Down-generated observations->Is>Indicating a state of +.>Forward probability of>Indicated at t->The time status is +.>Is used to determine the backward probability of (1),representing the number of possible states,/->And->All represent hidden states.
3. The security analysis method based on user behavior data according to claim 2, wherein: said determining the final state of the user behavior comprises the steps of:
collecting new user behavior data, and performing preprocessing and segmentation steps to obtain behavior fragments with fixed lengths;
for each behavior segment, predicting the behavior state of the user using the optimal abnormal behavior detection model and outputting the most likely state sequence \(Q\);

matching the state sequence \(Q\) output by the model against the set abnormality judgment rules to determine the final state of the behavior segment;
updating the final state of the behavior segment to the segment's state label to obtain the user behavior state sequence;
The abnormality judgment rule includes the following:
if the state at all moments is normal behavior, judging that the state of the whole behavior segment is normal;
if the state at any moment is abnormal behavior, directly judging that the state of the whole behavior segment is abnormal;
otherwise, calculating the proportion \(p\) of suspicious behavior in the behavior segment:

if \(p < \theta_1\), judging the state of the whole behavior segment to be normal;

if \(\theta_1 \le p \le \theta_2\), judging the state of the whole behavior segment to be suspicious;

if \(p > \theta_2\), judging the state of the whole behavior segment to be abnormal;

wherein \(p\) is the number of suspicious-state moments divided by \(T\), \(T\) represents the length of the behavior-segment state sequence, and \(\theta_1 < \theta_2\) are preset thresholds.
4. A security analysis method based on user behavior data according to claim 3, wherein: the step of judging the risk level of the user according to the final state of the user behavior segment comprises the following steps:
construction of segment risk matrix;
Calculating risk values of user behavior sequencesAnd action segment total->;
Sequence risk values of user behaviorsMatching with a preset risk rule to judge the risk level of the user and taking corresponding risk response measures;
the calculation of risk values of a user behavior sequenceThe method comprises the following steps:
for each user behavior state sequenceBehavior segment->Confirm->Category of->And state->;
In-segment risk matrixFind the corresponding risk weight value +.>;
Summarizing risk weight values of all behavior fragments in behavior sequenceRisk value as a sequence of user actions->。
5. The method for security analysis based on user behavior data according to claim 4, wherein: the risk rule includes the following:
if \(R < T_1\), judging the user's risk level to be low: no early warning is needed, user behavior continues to be monitored, and normal service is maintained;

if \(T_1 \le R < T_2\), judging the user's risk level to be medium-low: tier-1 measures are taken, the non-key functions of the account are suspended, the user is prompted about the potential safety hazard, secondary verification is added to key business operations, manual review is introduced, functions are restored in batches once no damage is confirmed, and the situation is periodically re-evaluated;

if \(T_2 \le R < T_3\), judging the user's risk level to be medium-high: tier-2 measures are taken, sensitive operation permissions are suspended, the user is prompted to actively check the risk and required to submit a problem-improvement report, enhanced multi-factor authentication is enabled, and partially restored business is periodically re-evaluated after expert assessment;

if \(R \ge T_3\), judging the user's risk level to be high: tier-3 measures are taken, the account is immediately frozen, the supervision department is notified to carry out an internal system inspection to trace the evidence chain, the user is required to perform comprehensive self-inspection and corrective operation, and business is restored in batches after an expert's comprehensive inspection and assessment confirms the risk has been eliminated; wherein \(T_1 < T_2 < T_3\) are preset risk thresholds.
6. A security analysis system employing the security analysis method based on user behavior data according to any one of claims 1 to 5, characterized by comprising:
the data preprocessing module is used for collecting user operation log data of each channel and preprocessing the data so as to acquire a user behavior sequence;
the model construction module is used for establishing an abnormal behavior detection model by adopting a hidden Markov model and training the abnormal behavior detection model on each behavior segment by using a Baum-Welch algorithm;
the state determining module is used for matching the state sequence output by the model with a set abnormality judging rule so as to judge the final state of the behavior segment;
and the response measure module is used for judging the risk level of the user according to the final state of the user behavior segment and taking corresponding response measures.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410103051.0A CN117633787A (en) | 2024-01-25 | 2024-01-25 | Security analysis method and system based on user behavior data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117633787A true CN117633787A (en) | 2024-03-01 |
Family
ID=90023795
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410103051.0A Pending CN117633787A (en) | 2024-01-25 | 2024-01-25 | Security analysis method and system based on user behavior data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117633787A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20040012285A (en) * | 2002-08-02 | 2004-02-11 | 한국정보보호진흥원 | System And Method For Detecting Intrusion Using Hidden Markov Model |
CN108881194A (en) * | 2018-06-07 | 2018-11-23 | 郑州信大先进技术研究院 | Enterprises user anomaly detection method and device |
CN114218998A (en) * | 2021-11-02 | 2022-03-22 | 国家电网有限公司信息通信分公司 | Power system abnormal behavior analysis method based on hidden Markov model |
CN116680572A (en) * | 2023-06-29 | 2023-09-01 | 厦门她趣信息技术有限公司 | Abnormal user detection method based on time sequence behavior sequence |
Non-Patent Citations (2)
Title |
---|
RABINER L R: "A Tutorial on Hidden Markov Models andSelected Applications in Speech Recognition", PROCEEDINGOF IEEE, vol. 77, no. 2, 28 February 1989 (1989-02-28), pages 257 - 286 * |
WU Shuyue; TIAN Xinguang: "A New Method for User Behavior Anomaly Detection Based on Hidden Markov Models", Journal on Communications, no. 04, 15 April 2007 (2007-04-15) *
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |