CN117633787A - A security analysis method and system based on user behavior data - Google Patents

A security analysis method and system based on user behavior data

Info

Publication number
CN117633787A
Authority
CN
China
Prior art keywords
behavior
state
user
probability
user behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410103051.0A
Other languages
Chinese (zh)
Inventor
肖波
林森
毕岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Anling Trusted Network Technology Co ltd
Original Assignee
Beijing Anling Trusted Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Anling Trusted Network Technology Co ltd
Priority to CN202410103051.0A
Publication of CN117633787A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/552 Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Analysis (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Computing Systems (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a security analysis method and system based on user behavior data, in the technical field of security analysis. The method comprises: collecting user operation log data from all channels and preprocessing it to obtain user behavior sequences; constructing an abnormal behavior detection model from the user behavior sequences to identify and detect abnormal behavior; using the trained optimal abnormal behavior detection model to predict on new user behavior sequences and determine the final state of the user behavior; issuing early warnings according to the final state of the user behavior and taking corresponding response measures; and establishing a security monitoring mechanism that continuously monitors and evaluates user behavior data so that potential anomalies or security risks are discovered promptly. By dividing the user behavior sequence into fixed-length behavior segments, the invention achieves an accurate understanding of user behavior patterns and behavior changes, thereby improving the accuracy of the analysis results.

Description

A security analysis method and system based on user behavior data

Technical field

The present invention relates to the technical field of security analysis, and in particular to a security analysis method and system based on user behavior data.

Background

With the rapid development of the Internet, most enterprises have built information systems in varying numbers. Besides traditional network security problems, these systems also face illegal operations by legitimate users. Existing user behavior analysis methods have several shortcomings. First, they rely mainly on rule engines for pattern matching, which requires a manually maintained rule base. Because user behavior is complex and variable, static rules cannot cover every abnormal situation; once a new attack technique appears, the rule engine fails and the abnormal behavior goes undetected.

Second, some methods use simple statistical analysis and identify anomalies by setting thresholds. It is hard to balance sensitivity against the false positive rate, so false negatives and false positives are common, which reduces the accuracy of the analysis results. In addition, most methods focus on detecting individual accounts, which makes global monitoring of the user population difficult, so abnormal patterns across the network go unnoticed. Finally, existing methods lack continuous optimization of detection strategies and a feedback mechanism for them, and therefore struggle to keep up with changing attack techniques.

Summary of the invention

The present invention is proposed in view of the problems of existing user behavior analysis methods with respect to rule engine dependency, the limitations of statistical analysis, single-account detection, and strategy optimization.

The problem to be solved by the present invention is therefore how to detect and identify abnormal or risky patterns in user behavior data so as to provide effective security monitoring and early warning.

To solve the above technical problem, the present invention provides the following technical solutions:

In a first aspect, an embodiment of the present invention provides a security analysis method based on user behavior data, comprising: collecting user operation log data from all channels and preprocessing it to obtain user behavior sequences; constructing an abnormal behavior detection model based on the user behavior sequences to identify and detect abnormal behavior; using the trained optimal abnormal behavior detection model to predict on new user behavior sequences and determine the final state of the user behavior; issuing early warnings according to the final state of the user behavior and taking corresponding response measures; and establishing a security monitoring mechanism that continuously monitors and evaluates user behavior data to discover potential anomalies or security risks promptly.

Constructing the abnormal behavior detection model based on the user behavior sequences comprises the following steps: dividing the user behavior sequence in chronological order into fixed-length behavior segments; building the abnormal behavior detection model as a hidden Markov model and initializing it; training the model with the Baum-Welch algorithm, iteratively computing the forward probabilities, backward probabilities and state output probabilities and re-estimating and updating the model parameters; and saving the updated model as the optimal abnormal behavior detection model.

Training the abnormal behavior detection model with the Baum-Welch algorithm comprises the following steps: computing the forward probability and backward probability at each time t with the forward algorithm and the backward algorithm respectively; computing the output probability of state i at each time t from the forward and backward probabilities; and re-estimating and updating the model parameters from the output probabilities. The model parameters are updated as follows:

$$\hat{\pi}_j = \gamma_1(j), \qquad \hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad \hat{b}_j(k) = \frac{\sum_{t=1,\, o_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)};$$

where $\hat{\pi}_j$ is the entry for state $j$ of the updated state probability vector, $\hat{a}_{ij}$ is the updated state transition probability, $\hat{b}_j(k)$ is the updated emission probability of observing symbol $v_k$ in state $j$, $\gamma_1(j)$ is the output probability of state $j$ at the initial time, $t$ is the time, $T$ is the length of the behavior segment state sequence, $\gamma_t(i)$ is the output probability of state $i$ at time $t$, $\xi_t(i,j)$ is the output probability of a transition from state $i$ to state $j$ at time $t$, $\gamma_t(j)$ is the output probability of state $j$ at time $t$, $o_t$ is the observation symbol at time $t$, $v_k$ is a symbol in the observation set $V$, $k$ is the index of the observation symbol, $i$ and $j$ both denote hidden states with $1 \le i, j \le N$, $N$ is the number of possible states, and $M$ is the number of possible observations with $1 \le k \le M$.

As a preferred solution of the security analysis method based on user behavior data according to the present invention, the forward probability is computed as follows:

$$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1}), \qquad t = 1, 2, \ldots, T-1;$$

where $\alpha_{t+1}(j)$ is the forward probability of being in state $j$ at time $t+1$, $\alpha_t(i)$ is the forward probability of being in state $i$ at time $t$, $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $N$ is the number of possible states, $T$ is the length of the behavior segment state sequence, and $i$ and $j$ both denote hidden states.

The backward probability is computed as follows:

$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \qquad t = T-1, T-2, \ldots, 1;$$

where $\beta_t(i)$ is the backward probability of being in state $i$ at time $t$, $\beta_{t+1}(j)$ is the backward probability of being in state $j$ at time $t+1$, $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $N$ is the number of possible states, $T$ is the length of the behavior segment state sequence, and $i$ and $j$ both denote hidden states.

The output probability is computed as follows:

$$\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)};$$

where $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $\alpha_t(i)$ is the forward probability of being in state $i$ at time $t$, $\beta_{t+1}(j)$ is the backward probability of being in state $j$ at time $t+1$, $N$ is the number of possible states, and $i$ and $j$ both denote hidden states.

As a preferred solution of the security analysis method based on user behavior data according to the present invention:

Determining the final state of the user behavior comprises the following steps: collecting new user behavior data and performing the preprocessing and segmentation steps to obtain fixed-length behavior segments; for each behavior segment, using the optimal abnormal behavior detection model to predict the user behavior state and output the most likely state sequence; matching the state sequence output by the model against the preset anomaly judgment rules to determine the final state of the behavior segment; and writing the final state of the behavior segment to its state label to obtain the user behavior state sequence.

The anomaly judgment rules are as follows: if the state at every time step is normal behavior, the whole behavior segment is judged normal; if the state at any time step is abnormal behavior, the whole behavior segment is directly judged abnormal; otherwise, the proportion of suspicious behavior in the behavior segment is computed: if [threshold condition missing], the whole behavior segment is judged normal; if [threshold condition missing], the whole behavior segment is judged suspicious; if [threshold condition missing], the whole behavior segment is judged abnormal. Here $T$ denotes the length of the behavior segment state sequence.
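The judgment rules above can be sketched in Python as follows. This is only an illustration: the threshold inequalities did not survive in the text, so `low` and `high` are hypothetical placeholder values, the handling of values exactly on a threshold is an assumption, and the 0/1/2 state encoding is the one given in Embodiment 1.

```python
NORMAL, SUSPICIOUS, ABNORMAL = 0, 1, 2  # state encoding from Embodiment 1


def judge_segment(states, low=0.3, high=0.6):
    """Collapse a per-time-step state sequence into one segment state.

    `low` and `high` are hypothetical thresholds on the proportion of
    suspicious states; the patent's actual values are not recoverable.
    """
    if all(s == NORMAL for s in states):
        return NORMAL
    if any(s == ABNORMAL for s in states):
        return ABNORMAL
    # Only normal and suspicious states remain at this point.
    ratio = sum(s == SUSPICIOUS for s in states) / len(states)
    if ratio < low:
        return NORMAL
    if ratio < high:
        return SUSPICIOUS
    return ABNORMAL
```

For example, `judge_segment([0, 1])` has a suspicious proportion of 0.5 and, with the placeholder thresholds, is labeled suspicious.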

As a preferred solution of the security analysis method based on user behavior data according to the present invention, judging the user's risk level according to the final states of the user's behavior segments comprises the following steps: constructing a segment risk matrix; computing the risk value of the user behavior sequence and the total number of behavior segments; and matching the risk value of the user behavior sequence against the preset risk rules to determine the user's risk level and take the corresponding risk response measures.

Computing the risk value of the user behavior sequence comprises the following steps: for each behavior segment in each user behavior state sequence, determining the category and the state of the segment; looking up the corresponding risk weight in the segment risk matrix; and aggregating the risk weights of all behavior segments in the behavior sequence into the risk value of the user behavior sequence.
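A minimal Python sketch of the risk value computation follows. It assumes that aggregation means a plain sum, and the category names and weights in the usage note are made up for illustration; the text does not give the contents of the risk matrix.

```python
def sequence_risk(segments, risk_matrix):
    """Return (risk value, number of segments) for one user.

    segments:    list of (category, state) pairs, one per behavior segment.
    risk_matrix: mapping category -> list of weights indexed by state
                 (0 normal, 1 suspicious, 2 abnormal); hypothetical layout.
    """
    # Look up one weight per segment and add them up (assumed aggregation).
    total = sum(risk_matrix[category][state] for category, state in segments)
    return total, len(segments)
```

With a hypothetical matrix `{"login": [0, 1, 3], "export": [0, 2, 5]}`, the segments `[("login", 0), ("export", 1), ("export", 2)]` give a risk value of 7 over 3 segments.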

As a preferred solution of the security analysis method based on user behavior data according to the present invention, the risk rules are as follows:

If [threshold condition missing], the user's risk level is judged low: no early warning is issued, user behavior continues to be monitored, and normal service is maintained.

If [threshold condition missing], the user's risk level is judged low-to-medium and level-1 measures are triggered: non-critical account functions are suspended, the user is notified of the security risk, secondary verification is added to key business operations, manual review is introduced, functions are restored in batches once no damage is confirmed, and the user is re-evaluated periodically.

If [threshold condition missing], the user's risk level is judged medium-to-high and level-2 measures are triggered: sensitive operation permissions are suspended, the user is prompted to investigate the risk proactively and required to submit a problem improvement report, strengthened multi-factor authentication is enabled, business is partially restored after expert review and evaluation, and the user is re-evaluated periodically.

If [threshold condition missing], the user's risk level is judged high and level-3 measures are triggered: the account is frozen immediately, the supervisory department is notified to conduct an internal system audit to trace the chain of evidence, the user is required to carry out a comprehensive self-examination and correct the operations, experts carry out a comprehensive review and evaluation, and business is restored in batches once the risk is confirmed to be eliminated.
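The four-level mapping can be sketched as a threshold lookup. The cutoff values below are hypothetical, since the actual threshold conditions are missing from the text; it is also an assumption that the raw risk value is compared directly rather than after normalization by the number of segments, and that a value equal to a cutoff falls into the higher level.

```python
import bisect

LEVELS = ["low", "low-medium", "medium-high", "high"]
MEASURES = {
    "low": "no warning; keep monitoring",
    "low-medium": "level-1 measures",
    "medium-high": "level-2 measures",
    "high": "level-3 measures",
}


def risk_level(risk_value, cutoffs=(5, 10, 20)):
    """Map a sequence risk value to one of four levels.

    `cutoffs` are placeholder boundaries in ascending order.
    """
    # bisect_right counts how many cutoffs are <= risk_value.
    return LEVELS[bisect.bisect_right(cutoffs, risk_value)]
```

`MEASURES[risk_level(12)]` then selects the level-2 measures under these placeholder cutoffs.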

In a second aspect, an embodiment of the present invention provides a security analysis system based on user behavior data, comprising: a data preprocessing module for collecting user operation log data from all channels and preprocessing it to obtain user behavior sequences; a model construction module for building the abnormal behavior detection model as a hidden Markov model and training it on each behavior segment with the Baum-Welch algorithm; a state determination module for matching the state sequence output by the model against the preset anomaly judgment rules to determine the final state of each behavior segment; and a response module for judging the user's risk level from the final states of the user's behavior segments and taking corresponding response measures.

The beneficial effects of the present invention are as follows. By dividing the user behavior sequence into fixed-length behavior segments, the invention achieves an accurate understanding of user behavior patterns and behavior changes, which improves the accuracy of the analysis results. Identifying abnormal patterns through sequence modeling significantly improves the efficiency of risk detection and enables intelligent, dynamic security monitoring. Subdividing risk into multiple levels and combining several judgment rules, such as fixed thresholds and confidence intervals, makes the system more robust. Finally, optimization measures such as monitoring feedback and matrix adjustment allow system performance to be improved continuously.

Description of drawings

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Figure 1 is a flow chart of the security analysis method based on user behavior data.

Figure 2 is a diagram of the computer device for the security analysis method based on user behavior data.

Detailed description

To make the above objects, features and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Many specific details are set forth in the following description to facilitate a full understanding of the present invention. The invention can, however, be implemented in ways other than those described here, and those skilled in the art can make similar generalizations without departing from its spirit; the invention is therefore not limited to the specific embodiments disclosed below.

Furthermore, "one embodiment" or "an embodiment" as used herein refers to a specific feature, structure, or characteristic that may be included in at least one implementation of the present invention. The phrase "in one embodiment" appearing in different places in this specification does not always refer to the same embodiment, nor to a separate or alternative embodiment that is mutually exclusive with other embodiments.

Embodiment 1

Referring to Figures 1 and 2, the first embodiment of the present invention provides a security analysis method based on user behavior data, comprising:

S1: Collect user operation log data from all channels and preprocess it to obtain user behavior sequences.

Preferably, the raw operation logs of users are collected from the different channels, grouped by user ID, and merged in chronological order into user behavior sequences. The logs are cleaned, which includes filtering out irrelevant records and deleting duplicate records. The logs are parsed to extract the key behavior fields, and the extracted fields are aggregated into standardized records that each describe a single behavior. The behavior records of each user are concatenated in chronological order to form a complete user behavior sequence. The behavior sequences of different users are merged into a unified behavior sequence data set. Finally, the data set is desensitized by deleting identifying information, yielding a usable set of behavior sequences.
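The preprocessing in S1 might look roughly like the sketch below. The record layout (`user_id`, `ts`, `action`) is a hypothetical one chosen for illustration, "irrelevant" is simplified to "has no action", and de-identification is reduced to dropping the user ID from the output; a real pipeline would need a proper log parser and desensitization policy.

```python
from collections import defaultdict


def build_behavior_sequences(raw_logs):
    """Group log records by user, clean them, and order them by time.

    raw_logs: iterable of dicts with the hypothetical keys
    'user_id', 'ts' (epoch seconds) and 'action'.
    Returns one chronologically sorted list of records per user,
    with the user ID removed from the output.
    """
    seen = set()
    per_user = defaultdict(list)
    for rec in raw_logs:
        if not rec.get("action"):      # filter irrelevant records
            continue
        key = (rec["user_id"], rec["ts"], rec["action"])
        if key in seen:                # delete duplicate records
            continue
        seen.add(key)
        # Keep only the key behavior fields as a standardized record.
        per_user[rec["user_id"]].append({"ts": rec["ts"], "action": rec["action"]})
    for records in per_user.values():
        records.sort(key=lambda r: r["ts"])
    # Merge all users into one data set; IDs are used only for ordering.
    return [per_user[uid] for uid in sorted(per_user)]
```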

S2: Build an abnormal behavior detection model based on the user behavior sequences to identify and detect abnormal behavior.

Specifically, this comprises the following steps:

S2.1: Divide the user behavior sequence in chronological order into fixed-length behavior segments.

Preferably, let $Q = \{q_1, q_2, \ldots, q_N\}$ denote the set of all possible behavior states, $V = \{v_1, v_2, \ldots, v_M\}$ the set of all possible observations, $I = (i_1, i_2, \ldots, i_T)$ a behavior segment state sequence of length $T$, and $O = (o_1, o_2, \ldots, o_T)$ the segment observation sequence corresponding to the states.

Specifically, in this embodiment the user behavior sequence is divided into fixed-length behavior segments of 10 minutes each, and the user behavior states comprise normal behavior, suspicious behavior and abnormal behavior. The behavior state set is $Q = \{0, 1, 2\}$, where 0 denotes normal behavior, 1 denotes suspicious behavior, and 2 denotes abnormal behavior.
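The 10-minute segmentation can be sketched as below. The `ts` field (epoch seconds) is a hypothetical record key, and aligning windows to multiples of 600 seconds and skipping empty windows are assumptions, since the text does not say how window boundaries are chosen.

```python
def segment_by_window(records, window_seconds=600):
    """Split a chronologically sorted behavior sequence into 10-minute segments.

    records: list of dicts with a hypothetical 'ts' field (epoch seconds).
    Returns the non-empty windows in chronological order.
    """
    segments = {}
    for rec in records:
        index = rec["ts"] // window_seconds  # index of the 10-minute window
        segments.setdefault(index, []).append(rec)
    return [segments[i] for i in sorted(segments)]
```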

S2.2: Build the abnormal behavior detection model as a hidden Markov model and initialize it.

Specifically, the model $\lambda = (A, B, \pi)$ is initialized with the state probability vector $\pi = (\pi_i)$, the initial state transition probability matrix $A = [a_{ij}]_{N \times N}$, and the initial observation probability matrix $B = [b_j(k)]_{N \times M}$, where $N$ is the number of possible states, $M$ is the number of possible observations, and

$$\pi_i = P(i_1 = q_i), \qquad a_{ij} = P(i_{t+1} = q_j \mid i_t = q_i), \qquad b_j(k) = P(o_t = v_k \mid i_t = q_j).$$

Here $a_{ij}$ is the probability of being in state $q_i$ at time $t$ and moving to state $q_j$ at time $t+1$, and $b_j(k)$ is the probability of generating observation $v_k$ while in state $q_j$ at time $t$.
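The text does not say how the initial values are chosen. One common option, shown here purely as an assumption, is to draw random positive numbers and normalize each row so that the state probability vector and every row of the two matrices sum to 1. The default of four observation symbols is a placeholder.

```python
import random


def random_stochastic_row(size, rng):
    """Return `size` positive numbers that sum to 1."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def init_hmm(n_states=3, n_symbols=4, seed=0):
    """Randomly initialize (A, B, pi) with valid probability rows.

    n_states=3 matches normal/suspicious/abnormal; n_symbols is hypothetical.
    """
    rng = random.Random(seed)
    pi = random_stochastic_row(n_states, rng)
    A = [random_stochastic_row(n_states, rng) for _ in range(n_states)]
    B = [random_stochastic_row(n_symbols, rng) for _ in range(n_states)]
    return A, B, pi
```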

Furthermore, the input to the abnormal behavior detection model is a behavior segment of length $T$, and the output is the state probability vector at each time $t$. The state with the highest probability is selected as the state recognition result for time $t$, and once the whole sequence of length $T$ has been processed, the state result sequence is obtained.
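Selecting the highest-probability state at each time step amounts to a per-step argmax, sketched below. The input is assumed to be a list of per-time-step state probability vectors; ties are resolved in favor of the lower state index, which is an arbitrary choice.

```python
STATE_NAMES = {0: "normal", 1: "suspicious", 2: "abnormal"}


def decode_states(state_probs):
    """Pick the most probable state index at every time step."""
    return [max(range(len(p)), key=lambda s: p[s]) for p in state_probs]
```

For instance, `decode_states([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])` yields the state result sequence `[0, 2]`, that is, normal followed by abnormal.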

It should be noted that the abnormal behavior detection model is built by learning from historical sequences of normal user operations. In an ERP system, for example, common operations such as creating documents, reviewing, stocking in, stocking out, adding suppliers, adding customers, querying customers, logging in, and modifying personal information are all normal user behavior. Certain operations, if performed too frequently or involving abnormally large amounts, are judged suspicious; these include querying and exporting customer information in bulk, querying and exporting stock-in information in bulk, logging in to one account from different locations within a short time, and repeated consecutive login failures.

Specifically, the abnormal behavior detection model learns the user's historical normal operation sequences and builds a pattern of normal behavior. When some of the user's operation features deviate markedly from the normal behavior pattern or violate preset rules, those operations are judged to be abnormal or suspicious behavior.

S2.3: Train the abnormal behavior detection model with the Baum-Welch algorithm, iteratively computing the forward probabilities, backward probabilities and state output probabilities, and re-estimating and updating the model parameters.

Specifically, this comprises the following steps:

S2.3.1: Compute the forward probability and backward probability at each time t with the forward algorithm and the backward algorithm respectively.

Specifically, the forward probability is computed as follows:

$$\alpha_1(i) = \pi_i\, b_i(o_1), \qquad i = 1, 2, \ldots, N;$$

$$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1}), \qquad t = 1, 2, \ldots, T-1;$$

where $\alpha_1(i)$ is the forward probability of being in state $i$ at the initial time $t = 1$, $\pi_i$ is the entry for state $i$ of the initial state probability vector, $b_i(o_1)$ is the emission probability of generating observation $o_1$ in state $i$, $\alpha_{t+1}(j)$ is the forward probability of being in state $j$ at time $t+1$, $\alpha_t(i)$ is the forward probability of being in state $i$ at time $t$, $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $N$ is the number of possible states, $T$ is the length of the behavior segment state sequence, and $i$ and $j$ both denote hidden states.
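The forward recursion can be written in a few lines of plain Python. This is an unscaled sketch for short segments; the list-of-lists layout (`A[i][j]`, `B[j][k]`, observations as symbol indices) is an assumption for illustration, and long sequences would need scaling or log-space arithmetic to avoid underflow.

```python
def forward(A, B, pi, obs):
    """Forward algorithm for a discrete HMM.

    A[i][j]: transition probability from state i to state j
    B[j][k]: probability of emitting symbol k in state j
    pi[i]:   initial probability of state i
    obs:     observation sequence as symbol indices
    Returns alpha with alpha[t][i] = P(obs[0..t], state i at index t).
    """
    n = len(pi)
    # Initialization: start in state i and emit the first observation.
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    # Recursion: sum over predecessor states, then emit the next observation.
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
            for j in range(n)
        ])
    return alpha
```

Summing the last row of `alpha` gives the likelihood of the whole observation sequence under the model.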

Preferably, the backward probability is computed as follows:

$$\beta_T(i) = 1, \qquad i = 1, 2, \ldots, N;$$

$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \qquad t = T-1, T-2, \ldots, 1;$$

where $\beta_T(i) = 1$ states that the backward probability of being in state $i$ at the final time is 1, $\beta_t(i)$ is the backward probability of being in state $i$ at time $t$, $\beta_{t+1}(j)$ is the backward probability of being in state $j$ at time $t+1$, $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $N$ is the number of possible states, $T$ is the length of the behavior segment state sequence, and $i$ and $j$ both denote hidden states.
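The backward recursion runs from the last time step towards the first. The sketch below uses the same assumed list-of-lists layout (`A[i][j]`, `B[j][k]`, observations as symbol indices) and is likewise unscaled.

```python
def backward(A, B, obs):
    """Backward algorithm for a discrete HMM.

    Returns beta with beta[t][i] = P(obs[t+1..] | state i at index t);
    the last row is all ones by definition.
    """
    n = len(A)
    T = len(obs)
    beta = [[1.0] * n for _ in range(T)]
    # Walk backwards from the second-to-last time step.
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                for j in range(n)
            )
    return beta
```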

S2.3.2: Compute the output probability of state i at each time t from the forward and backward probabilities, and re-estimate and update the model parameters from the output probabilities.

Specifically, the output probability $\gamma_t(i)$ of being in state $i$ at time $t$ is given by:

$$\gamma_t(i) = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\, \beta_t(j)};$$

where $\beta_t(i)$ is the backward probability of being in state $i$ at time $t$, $\alpha_t(i)$ is the forward probability of being in state $i$ at time $t$, $N$ is the number of possible states, and $i$ and $j$ denote hidden states.
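Given forward and backward probabilities as nested lists, the per-state output probability is an element-wise product followed by normalization at each time step. A minimal sketch, assuming both inputs have one row per time step and one column per state:

```python
def state_posteriors(alpha, beta):
    """Return gamma with gamma[t][i] = P(state i at index t | observations)."""
    gamma = []
    for alpha_t, beta_t in zip(alpha, beta):
        joint = [a * b for a, b in zip(alpha_t, beta_t)]
        total = sum(joint)  # equals the sequence likelihood at every t
        gamma.append([x / total for x in joint])
    return gamma
```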

Furthermore, the output probability $\xi_t(i,j)$ of a transition from state $i$ to state $j$ at time $t$ is given by:

$$\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)};$$

where $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $\alpha_t(i)$ is the forward probability of being in state $i$ at time $t$, $\beta_{t+1}(j)$ is the backward probability of being in state $j$ at time $t+1$, $N$ is the number of possible states, and $i$ and $j$ both denote hidden states.
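The transition output probability can be sketched the same way. The function below assumes precomputed forward and backward tables and the list-of-lists layout `A[i][j]`, `B[j][k]`; it returns one N×N matrix for each of the first T−1 time steps.

```python
def transition_posteriors(A, B, obs, alpha, beta):
    """Return xi with xi[t][i][j] = P(state i at t, state j at t+1 | observations)."""
    n = len(A)
    xi = []
    for t in range(len(obs) - 1):
        joint = [
            [alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
             for j in range(n)]
            for i in range(n)
        ]
        total = sum(sum(row) for row in joint)  # normalizing constant
        xi.append([[x / total for x in row] for row in joint])
    return xi
```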

Further, the model parameters are re-estimated and updated from the output probabilities as follows:

$$\hat{\pi}_j = \gamma_1(j), \quad \hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \quad \hat{b}_j(k) = \frac{\sum_{t=1,\; o_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}, \quad 1 \le i, j \le N,\ 1 \le k \le M;$$

where $\hat{\pi}_j$ is the $j$-th entry of the updated state probability vector, $\hat{a}_{ij}$ is the updated state transition probability, $\hat{b}_j(k)$ is the updated emission probability of observing symbol $v_k$ in state $j$, $\gamma_1(j)$ is the output probability of being in state $j$ at the initial time, $t$ is the time, $T$ is the length of the behavior segment state sequence, $\gamma_t(i)$ is the output probability of being in state $i$ at time $t$, $\xi_t(i, j)$ is the output probability of transitioning from state $i$ to state $j$ at time $t$, $\gamma_t(j)$ is the output probability of being in state $j$ at time $t$, $o_t$ is the observation symbol at time $t$, $v_k$ is a symbol in the observation set $V$, $k$ is the index of the observation symbol, $i$ and $j$ both denote hidden states, $N$ is the number of possible states, and $M$ is the number of possible observations.
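
The following sketch puts one Baum-Welch re-estimation step together for a single observation sequence. It is an illustration under assumptions, not the patent's implementation: the function names, the nested-list parameter layout and the single-sequence setting are chosen here for brevity (the method itself trains on many behavior segments), and no numerical scaling is applied.

```python
def forward(pi, A, B, obs):
    """Forward probabilities alpha[t][i]."""
    N, T = len(pi), len(obs)
    alpha = [[0.0] * N for _ in range(T)]
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
    return alpha


def backward(A, B, obs):
    """Backward probabilities beta[t][i]."""
    N, T = len(A), len(obs)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
    return beta


def baum_welch_step(pi, A, B, obs):
    """One re-estimation step; returns (new_pi, new_A, new_B, likelihood of obs under the old parameters)."""
    N, M, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(pi, A, B, obs), backward(A, B, obs)
    # sum_j alpha_t(j) * beta_t(j) is the same for every t and equals sum_i alpha_T(i).
    likelihood = sum(alpha[T - 1])
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(N)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    new_pi = list(gamma[0])
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) / sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) / sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_pi, new_A, new_B, likelihood
```

Calling `baum_welch_step` repeatedly until the likelihood stops improving gives the iterative training loop; each step does not decrease the likelihood of the training sequence.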

S2.4: Save the updated model as the optimal abnormal behavior detection model.

S3: Use the optimal abnormal behavior detection model to predict anomalies in new user behavior sequences and determine the final state of the user behavior.

Preferably, step S3 includes the following steps:

S3.1: Collect new user behavior data, apply the preprocessing step to obtain the user behavior sequence, and apply the segmentation step to split the user behavior sequence chronologically into fixed-length behavior segments.

S3.2: For each behavior segment, use the optimal abnormal behavior detection model to predict the user behavior states and output the most likely state sequence.
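
The text does not name the decoding procedure used to obtain the most likely state sequence. A common choice for hidden Markov models is the Viterbi algorithm, sketched below as an assumption; the function name `viterbi` and the nested-list parameter layout are likewise illustrative.

```python
def viterbi(pi, A, B, obs):
    """Most likely hidden state sequence for the observations `obs`.

    pi[i]:   initial probability of state i
    A[i][j]: transition probability from state i to state j
    B[j][k]: probability that state j emits observation symbol k
    """
    N, T = len(pi), len(obs)
    delta = [[0.0] * N for _ in range(T)]  # best path probability ending in each state
    psi = [[0] * N for _ in range(T)]      # back-pointers to the best predecessor
    for i in range(N):
        delta[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[t - 1][i] * A[i][j])
            psi[t][j] = best_i
            delta[t][j] = delta[t - 1][best_i] * A[best_i][j] * B[j][obs[t]]
    # Backtrack from the most probable final state.
    state = max(range(N), key=lambda i: delta[T - 1][i])
    path = [state]
    for t in range(T - 1, 0, -1):
        state = psi[t][state]
        path.append(state)
    path.reverse()
    return path
```

With the three behavior states used in this method (0 normal, 1 suspicious, 2 abnormal), the returned list is the per-time-step state sequence that the judgment rules in S3.3 operate on.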

S3.3: Match the state sequence output by the model against the set abnormality judgment rules to determine the final state of the behavior segment.

Preferably, the abnormality judgment rules are as follows: if the state at every time step is normal behavior (0), the whole behavior segment is judged normal; if the state at any time step is abnormal behavior (2), the whole behavior segment is directly judged abnormal; otherwise, the proportion of suspicious behavior in the segment is calculated, that is, the number of time steps in the suspicious state (1) relative to the sequence length $T$, and compared with two thresholds: if the proportion is below the lower threshold, the whole segment is judged normal; if it lies between the two thresholds, the segment is judged suspicious; if it exceeds the upper threshold, the segment is judged abnormal.

It should be noted that the two thresholds are chosen for the following reasons. Based on typical-distribution considerations, the lower threshold corresponds to only a limited share of the time steps in the segment being suspicious, which is acceptable for an overall judgment of normal; the upper threshold corresponds to a share of suspicious time steps that can be taken as a sign of abnormality. Using these two values as the threshold boundaries makes the final segment judgment intuitively interpretable and reasonably robust, while leaving room for adjustment and improvement.
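
The judgment rules can be sketched as a small function. The two thresholds are passed in as parameters (`low`, `high`) because their numeric values are not reproduced in this text; the function name, the parameter names and the treatment of values exactly on a boundary are assumptions of this sketch.

```python
NORMAL, SUSPICIOUS, ABNORMAL = 0, 1, 2


def classify_segment(states, low, high):
    """Final state of a behavior segment from its per-time-step states.

    low, high: hypothetical lower and upper thresholds on the share of
               suspicious time steps (0 <= low <= high <= 1).
    """
    if not states:
        raise ValueError("state sequence must not be empty")
    if all(s == NORMAL for s in states):
        return NORMAL
    if any(s == ABNORMAL for s in states):
        return ABNORMAL
    # Only normal and suspicious states remain at this point.
    ratio = sum(1 for s in states if s == SUSPICIOUS) / len(states)
    if ratio < low:
        return NORMAL
    if ratio <= high:  # boundary handling is an assumption
        return SUSPICIOUS
    return ABNORMAL
```

Keeping the thresholds as arguments rather than constants matches the remark that they leave room for adjustment.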

S3.4: Update the state label of each behavior segment with its final state to obtain the user behavior state sequence.

S4: Determine the user's risk level from the user behavior state sequence and take the corresponding response measures.

Specifically, step S4 includes the following steps:

S4.1: Construct the segment risk matrix.

Preferably, the dimensions of the segment risk matrix are the number of segment state categories (normal, suspicious and abnormal in this embodiment) by the number of segment types (login, order creation, transfer and so on). Each entry is the risk weight of a segment of a given type in a given state; the weights are derived from knowledge graph analysis and range from 0 to 1.

S4.2: Calculate the user behavior sequence risk value.

Preferably, for each behavior segment in the user behavior state sequence, determine its type and state, look up the corresponding risk weight in the segment risk matrix, and sum the risk weights of all behavior segments in the sequence to obtain the user behavior sequence risk value.

Specifically, the user behavior sequence risk value is calculated as follows:

$$R = \sum_{k=1}^{K} w_k;$$

where $R$ is the user behavior sequence risk value, $K$ is the total number of behavior segments, and $w_k$ is the risk weight of the $k$-th behavior segment in the user behavior sequence.
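
A minimal sketch of the lookup-and-sum step follows. The matrix orientation (rows indexed by state, columns by segment type), the function name `sequence_risk` and the representation of a segment as a `(type_index, state_index)` pair are assumptions made for illustration; the weights themselves would come from the knowledge graph analysis.

```python
def sequence_risk(segments, risk_matrix):
    """Sum the risk weights of all behavior segments in a sequence.

    segments:    iterable of (type_index, state_index) pairs, one per segment
    risk_matrix: risk_matrix[state_index][type_index] is a weight in [0, 1]
    """
    total = 0.0
    for type_index, state_index in segments:
        weight = risk_matrix[state_index][type_index]
        if not 0.0 <= weight <= 1.0:
            raise ValueError("risk weights are expected to lie in [0, 1]")
        total += weight
    return total
```

Because the value is a plain sum, it grows with the number of segments, so any cut-offs applied to it need to take the total segment count into account.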

S4.3: Match the user behavior sequence risk value against the preset risk rules to determine the user's risk level, and take the corresponding risk response measures.

If the risk value meets the low-risk condition, the user's risk level is judged to be low: no warning is needed, user behavior continues to be monitored, and normal service is maintained;

If the risk value meets the low-to-medium-risk condition, the user's risk level is judged to be low-to-medium and first-level measures are taken: suspend non-critical account functions, notify the user of potential security risks, add secondary verification to critical business operations, introduce manual review, restore functions in batches once no damage is confirmed, and re-evaluate periodically;

If the risk value meets the medium-to-high-risk condition, the user's risk level is judged to be medium-to-high and second-level measures are taken: suspend sensitive operation permissions, prompt the user to proactively investigate risks, require the user to submit an issue-remediation report, enable strengthened multi-factor authentication, partially restore business after expert review and assessment, and re-evaluate periodically;

If the risk value meets the high-risk condition, the user's risk level is judged to be high and third-level measures are taken: immediately freeze the account, notify the supervisory department to conduct an internal system audit to trace the chain of evidence, require the user to carry out a comprehensive self-inspection and correct their operations, have experts conduct a comprehensive review and assessment, and restore business in batches once the risk is confirmed to be eliminated.
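
The four-level mapping can be sketched as a lookup over three ascending cut-offs. The cut-off values are deliberately left as a parameter because the concrete conditions are not reproduced in this text; the names `risk_level`, `RISK_LEVELS` and `RESPONSES`, the shortened response descriptions, and the choice that a value exactly on a cut-off falls into the higher level are all assumptions.

```python
import bisect

RISK_LEVELS = ("low", "low-medium", "medium-high", "high")

# Abbreviated summaries of the response measures listed above.
RESPONSES = {
    "low": "no warning; keep monitoring and maintain normal service",
    "low-medium": "first-level measures: suspend non-critical functions, add secondary verification",
    "medium-high": "second-level measures: suspend sensitive permissions, strengthen multi-factor authentication",
    "high": "third-level measures: freeze the account and notify the supervisory department",
}


def risk_level(risk_value, cutoffs):
    """Map a sequence risk value to one of four levels.

    cutoffs: three ascending, hypothetical boundaries between the levels.
    """
    cutoffs = list(cutoffs)
    if len(cutoffs) != 3 or cutoffs != sorted(cutoffs):
        raise ValueError("expected three ascending cutoffs")
    return RISK_LEVELS[bisect.bisect_right(cutoffs, risk_value)]
```

For example, with the arbitrary cut-offs `(2.0, 5.0, 8.0)`, `RESPONSES[risk_level(4.2, (2.0, 5.0, 8.0))]` selects the first-level measures.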

S5: Establish a security monitoring mechanism to continuously monitor and evaluate user behavior data and promptly discover potential anomalies or security risks.

Further, this embodiment also provides a security analysis system based on user behavior data, comprising: a data preprocessing module for collecting user operation log data from each channel and preprocessing it to obtain user behavior sequences; a model building module for establishing the abnormal behavior detection model with a hidden Markov model and training it on each behavior segment using the Baum-Welch algorithm; a state determination module for matching the state sequence output by the model against the set abnormality judgment rules to determine the final state of each behavior segment; and a response measure module for determining the user's risk level from the final states of the user's behavior segments and taking the corresponding response measures.

In summary, by splitting user behavior sequences into fixed-length behavior segments, the present invention captures user behavior patterns and behavior changes at a fine granularity, which improves the accuracy of the analysis results. Identifying abnormal patterns through sequence modeling improves the efficiency of risk detection and enables intelligent, dynamic security monitoring. Subdividing the risk into multiple levels and combining several judgment rules, such as fixed thresholds and confidence intervals, improves the robustness of the system. Finally, optimization measures such as monitoring feedback and matrix adjustment allow the system's performance to be improved continuously.

Embodiment 2

Referring to Figures 1 and 2, a second embodiment of the present invention is described. To verify the beneficial effects of the invention, it is evaluated through economic benefit calculations and simulation experiments.

Specifically, taking an ERP system as an example, the operation logs of 100 users in June 2022 were extracted from the ERP system back end and grouped by user ID to produce a complete behavior sequence for each of the 100 users, covering operations such as order creation, supplier queries and customer queries. Each user behavior sequence was split into behavior segments 10 minutes long. Part of the data is shown in Table 1.
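
Grouping logs by user and cutting each user's sequence into 10-minute segments can be sketched as follows. The log record layout `(user_id, timestamp, operation)`, the function name `split_into_segments`, the choice to anchor the windows at each user's first event, and the decision to skip empty windows are assumptions of this sketch rather than details given in the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_into_segments(log_entries, window=timedelta(minutes=10)):
    """Group log entries by user and split each user's operations into time windows.

    log_entries: iterable of (user_id, timestamp, operation) tuples
    Returns {user_id: [[operation, ...], ...]} with segments in time order.
    """
    per_user = defaultdict(list)
    for user_id, timestamp, operation in log_entries:
        per_user[user_id].append((timestamp, operation))

    segments = {}
    for user_id, events in per_user.items():
        events.sort(key=lambda e: e[0])  # chronological order
        start = events[0][0]
        buckets = defaultdict(list)
        for timestamp, operation in events:
            # Integer index of the window this event falls into.
            buckets[(timestamp - start) // window].append(operation)
        segments[user_id] = [buckets[k] for k in sorted(buckets)]
    return segments
```

A short usage example with made-up log rows:

```python
logs = [
    ("u1", datetime(2022, 6, 1, 9, 0), "login"),
    ("u1", datetime(2022, 6, 1, 9, 5), "create_order"),
    ("u1", datetime(2022, 6, 1, 9, 12), "query_supplier"),
]
print(split_into_segments(logs)["u1"])
```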

Table 1: Partial behavior segments of users

Preferably, the set of user behavior states is defined with 0 denoting normal behavior, 1 denoting suspicious behavior and 2 denoting abnormal behavior. The model parameters of the abnormal behavior detection model are initialized and the model is trained with the Baum-Welch algorithm; the parameters converge after 10 iterations, yielding the optimal abnormal behavior detection model.

Further, new user logs are collected to generate 12 behavior segments of 10 minutes each, with a state sequence length $T$ of 15 for each behavior segment. The optimal abnormal behavior detection model predicts the state sequence of each segment, taking the state with the highest probability as the recognition result, which gives one state sequence for each of segments 1 through 12. Each state sequence is then matched against the set abnormality judgment rules to obtain the user behavior state sequence.

Further, the segment risk matrix is constructed and the user behavior sequence risk value is calculated. Matching this value against the risk rules, the user's risk level is judged to be low-to-medium and second-level measures are taken: suspend sensitive operation permissions, prompt the user to proactively investigate risks, require the user to submit an issue-remediation report, enable strengthened multi-factor authentication, partially restore business after expert review and assessment, and re-evaluate periodically. The risk level is re-evaluated until it drops to low.

Preferably, the comparison between the method of the present invention and the traditional method is shown in Table 2.

Table 2: Comparison of indicators between the method of the present invention and the traditional method

As Table 2 shows, the method of the present invention outperforms the traditional method in accuracy, sensitivity, false positive rate, real-time performance, scalability and system compatibility.

It should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution may be modified or replaced with equivalents without departing from its spirit and scope, and all such modifications fall within the scope of the claims of the present invention.

Claims (6)

1. A security analysis method based on user behavior data, characterized by comprising the following steps:
collecting user operation log data from each channel and preprocessing it to obtain a user behavior sequence;
constructing an abnormal behavior detection model based on the user behavior sequence to identify and detect abnormal behavior;
predicting a new user behavior sequence with the trained optimal abnormal behavior detection model and determining the final state of the user behavior;
issuing an early warning according to the final state of the user behavior and taking corresponding response measures;
establishing a security monitoring mechanism to continuously monitor and evaluate user behavior data and promptly discover potential anomalies or security risks;
wherein constructing the abnormal behavior detection model based on the user behavior sequence comprises the following steps:
splitting the user behavior sequence chronologically into fixed-length behavior segments;
establishing the abnormal behavior detection model with a hidden Markov model and initializing the model;
training the abnormal behavior detection model with the Baum-Welch algorithm, iteratively calculating the forward probability, the backward probability and the state output probability, and re-estimating and updating the model parameters;
saving the updated model as the optimal abnormal behavior detection model;
wherein training the abnormal behavior detection model with the Baum-Welch algorithm comprises the following steps:
calculating the forward probability and the backward probability at each time $t$ with the forward algorithm and the backward algorithm respectively;
calculating the output probability of state $i$ at each time $t$ from the forward and backward probabilities, and re-estimating and updating the model parameters from the output probabilities;
wherein the model parameters are updated according to the following formulas:
$$\hat{\pi}_j = \gamma_1(j), \quad \hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \quad \hat{b}_j(k) = \frac{\sum_{t=1,\; o_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}, \quad 1 \le i, j \le N,\ 1 \le k \le M;$$
where $\hat{\pi}_j$ is the $j$-th entry of the updated state probability vector, $\hat{a}_{ij}$ is the updated state transition probability, $\hat{b}_j(k)$ is the updated emission probability of observing symbol $v_k$ in state $j$, $\gamma_1(j)$ is the output probability of being in state $j$ at the initial time, $t$ is the time, $T$ is the length of the behavior segment state sequence, $\gamma_t(i)$ is the output probability of being in state $i$ at time $t$, $\xi_t(i, j)$ is the output probability of transitioning from state $i$ to state $j$ at time $t$, $\gamma_t(j)$ is the output probability of being in state $j$ at time $t$, $o_t$ is the observation symbol at time $t$, $v_k$ is a symbol in the observation set $V$, $k$ is the index of the observation symbol, $i$ and $j$ both denote hidden states, $N$ is the number of possible states, and $M$ is the number of possible observations.
2. The security analysis method based on user behavior data according to claim 1, characterized in that the forward probability is calculated as follows:
$$\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] b_j(o_{t+1}), \quad t = 1, 2, \dots, T-1;$$
where $\alpha_{t+1}(j)$ is the forward probability of being in state $j$ at time $t+1$, $\alpha_t(i)$ is the forward probability of being in state $i$ at time $t$, $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $N$ is the number of possible states, $T$ is the length of the behavior segment state sequence, and $i$ and $j$ both denote hidden states;
the backward probability is calculated as follows:
$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \quad t = T-1, T-2, \dots, 1;$$
where $\beta_t(i)$ is the backward probability of being in state $i$ at time $t$, $\beta_{t+1}(j)$ is the backward probability of being in state $j$ at time $t+1$, $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $N$ is the number of possible states, $T$ is the length of the behavior segment state sequence, and $i$ and $j$ both denote hidden states;
the output probability is calculated as follows:
$$\xi_t(i, j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)};$$
where $a_{ij}$ is the transition probability from state $i$ to state $j$, $b_j(o_{t+1})$ is the emission probability of generating observation $o_{t+1}$ in state $j$, $\alpha_t(i)$ is the forward probability of being in state $i$ at time $t$, $\beta_{t+1}(j)$ is the backward probability of being in state $j$ at time $t+1$, $N$ is the number of possible states, and $i$ and $j$ both denote hidden states.
3. The security analysis method based on user behavior data according to claim 2, characterized in that determining the final state of the user behavior comprises the following steps:
collecting new user behavior data and performing the preprocessing and segmentation steps to obtain fixed-length behavior segments;
for each behavior segment, predicting the user behavior states with the optimal abnormal behavior detection model and outputting the most likely state sequence;
matching the state sequence output by the model against the set abnormality judgment rules to determine the final state of the behavior segment;
updating the state label of the behavior segment with its final state to obtain the user behavior state sequence;
wherein the abnormality judgment rules are as follows:
if the state at every time step is normal behavior, the whole behavior segment is judged normal;
if the state at any time step is abnormal behavior, the whole behavior segment is directly judged abnormal;
otherwise, the proportion of suspicious behavior in the behavior segment is calculated;
if the proportion is below the lower threshold, the whole behavior segment is judged normal;
if the proportion lies between the lower and upper thresholds, the whole behavior segment is judged suspicious;
if the proportion exceeds the upper threshold, the whole behavior segment is judged abnormal;
where $T$ is the length of the behavior segment state sequence.
4. The security analysis method based on user behavior data according to claim 3, characterized in that determining the user's risk level from the final states of the user behavior segments comprises the following steps:
constructing the segment risk matrix;
calculating the user behavior sequence risk value and the total number of behavior segments;
matching the user behavior sequence risk value against the preset risk rules to determine the user's risk level and taking the corresponding risk response measures;
wherein calculating the user behavior sequence risk value comprises the following steps:
for each behavior segment in the user behavior state sequence, determining its type and state;
looking up the corresponding risk weight in the segment risk matrix;
summing the risk weights of all behavior segments in the behavior sequence to obtain the user behavior sequence risk value.
5. The security analysis method based on user behavior data according to claim 4, characterized in that the risk rules are as follows:
if the risk value meets the low-risk condition, the user's risk level is judged to be low, no early warning is needed, user behavior continues to be monitored, and normal service is maintained;
if the risk value meets the low-to-medium-risk condition, the user's risk level is judged to be low-to-medium and first-level measures are taken: suspending non-critical account functions, notifying the user of potential security risks, adding secondary verification to critical business operations, introducing manual review, restoring functions in batches once no damage is confirmed, and re-evaluating periodically;
if the risk value meets the medium-to-high-risk condition, the user's risk level is judged to be medium-to-high and second-level measures are taken: suspending sensitive operation permissions, prompting the user to proactively investigate risks, requiring the user to submit an issue-remediation report, enabling strengthened multi-factor authentication, partially restoring business after expert review and assessment, and re-evaluating periodically;
if the risk value meets the high-risk condition, the user's risk level is judged to be high and third-level measures are taken: immediately freezing the account, notifying the supervisory department to conduct an internal system audit to trace the chain of evidence, requiring the user to carry out a comprehensive self-inspection and correct their operations, having experts conduct a comprehensive review and assessment, and restoring business in batches once the risk is confirmed to be eliminated.
6. A security analysis system employing the security analysis method based on user behavior data according to any one of claims 1 to 5, characterized by comprising:
a data preprocessing module for collecting user operation log data from each channel and preprocessing it to obtain a user behavior sequence;
a model building module for establishing the abnormal behavior detection model with a hidden Markov model and training the abnormal behavior detection model on each behavior segment using the Baum-Welch algorithm;
a state determination module for matching the state sequence output by the model against the set abnormality judgment rules to determine the final state of the behavior segment;
a response measure module for determining the user's risk level from the final states of the user behavior segments and taking the corresponding response measures.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410103051.0A CN117633787A (en) 2024-01-25 2024-01-25 A security analysis method and system based on user behavior data

Publications (1)

Publication Number Publication Date
CN117633787A true CN117633787A (en) 2024-03-01

Family

ID=90023795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410103051.0A Pending CN117633787A (en) 2024-01-25 2024-01-25 A security analysis method and system based on user behavior data

Country Status (1)

Country Link
CN (1) CN117633787A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20040012285A (en) * 2002-08-02 2004-02-11 한국정보보호진흥원 System And Method For Detecting Intrusion Using Hidden Markov Model
CN108881194A (en) * 2018-06-07 2018-11-23 郑州信大先进技术研究院 Enterprises user anomaly detection method and device
CN114218998A (en) * 2021-11-02 2022-03-22 国家电网有限公司信息通信分公司 Power system abnormal behavior analysis method based on hidden Markov model
CN116680572A (en) * 2023-06-29 2023-09-01 厦门她趣信息技术有限公司 Abnormal user detection method based on time sequence behavior sequence

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
RABINER L R: "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", PROCEEDINGS OF THE IEEE, vol. 77, no. 2, 28 February 1989 (1989-02-28), pages 257 - 286 *
WU Shuyue; TIAN Xinguang: "A new method for user behavior anomaly detection based on hidden Markov models", Journal on Communications, no. 04, 15 April 2007 (2007-04-15) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20240301