CN108388969A

CN108388969A - Risk prediction method for insider-threat personnel based on temporal features of personal behavior

Info

Publication number: CN108388969A
Application number: CN201810233655.1A
Authority: CN
Inventors: 罗森林; 陈骋; 潘丽敏; 曲乐炜; 张笈
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2018-03-21
Filing date: 2018-03-21
Publication date: 2018-08-10

Abstract

The invention relates to a risk prediction method for insider-threat personnel based on temporal features, and belongs to the technical field of computer and information science. The method first pre-processes a person's accumulated historical behavior information and extracts features from it; this includes quantization, sampling, pre-emphasis and windowing of the personnel information, and converting heterogeneous feature data from multiple domains into corresponding numerical features. It then carries out composite training of the insider-threat prediction model, building an LSTM-based insider-threat risk prediction model. Finally, the LSTM-based model is used for risk prediction, assessment and early warning. Compared with other common methods, the invention achieves higher accuracy, gives enterprises a basis for quantifying insider threats, and provides a tiered early-warning scheme for insider-threat personnel.

Description

Risk prediction method for insider-threat personnel based on temporal features of personal behavior

Technical field

The invention relates to a risk prediction method for insider-threat personnel based on temporal features of personal behavior, and belongs to the technical field of computer and information science.

Background art

Organizations and enterprises often need to monitor insider-threat personnel in real time and predict their behavior. By making full use of a person's accumulated historical behavior information, the person's behavior can be predicted, an insider-threat assessment can be made, and early warnings can be issued for dangerous personnel. The present invention therefore provides an insider-threat personnel risk prediction method for assessing the risk of, and issuing early warnings about, personnel inside an enterprise or organization.

The basic problem that an insider-threat personnel risk prediction method must solve is: predict a person's behavior from that person's accumulated historical behavior information, detect abnormal behavior, quantify the prediction result, define criteria for dividing insider-threat levels, assess each person's insider threat, and issue early warnings for persons assessed as high risk or medium risk, thereby reducing the information security risk of the organization or enterprise. The methods commonly used fall into two classes:

1. Insider-threat personnel prediction based on graph mining

GBAD, the insider-threat prediction method based on graph mining, was the first to treat insider-threat personnel data as an unbounded data stream. It proposes that the stream can be separated into a series of discrete blocks, for example with each block containing one week of data, and it mainly considers three graph operations: modification, insertion and deletion. Although this method is also based on temporal features, GBAD divides the input stream into blocks, each of which is a subgraph, and considers the conditional probability of the current block given the information in the several preceding blocks. If GBAD used each day's behavioral features as a block, model training would be slow and ineffective. In addition, GBAD cannot take all historical information into account. GBAD therefore generalizes poorly and its prediction accuracy is relatively low.

2. Prediction of insider resource abuse based on hidden Markov models (HMM)

The prediction method for insider resource abuse based on hidden Markov models (HMM) takes the files of an information system as the model states and the transaction operations of insider-threat personnel as the observation symbols, which can raise the hit rate and lower the false-alarm rate. A Markov model is used to represent a person's state, the Markov transition probability matrix is used to count the person's transitions between states, and abnormal changes in the person are predicted on that basis. However, Markov models and HMMs are not well suited to the insider-threat scenario. The Markov assumption states that the current state depends only on the immediately preceding state and is independent of earlier states, so historical information is not fully used. In theory a higher-order Markov model could address this, but at the cost of more computation and lower performance.
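To make the limitation concrete, the following is a minimal sketch of how a first-order transition probability matrix could be estimated by counting transitions in a state sequence. The state names and the sample history are hypothetical and are not taken from the prior-art method; the point is only that each estimate is conditioned on the single preceding state.

```python
from collections import Counter, defaultdict


def transition_matrix(states):
    """Estimate first-order transition probabilities from one state sequence."""
    counts = defaultdict(Counter)
    # Count each (previous state -> current state) pair once.
    for prev, curr in zip(states, states[1:]):
        counts[prev][curr] += 1
    # Normalise each row of counts into probabilities.
    return {
        prev: {curr: n / sum(row.values()) for curr, n in row.items()}
        for prev, row in counts.items()
    }


# Hypothetical daily states for one person (illustrative only).
history = ["normal", "normal", "risky", "normal", "normal", "normal"]
probs = transition_matrix(history)
```

Whatever happened two or more days earlier has no influence on these estimates, which is the loss of historical information described above.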

In summary, existing insider-threat personnel risk prediction methods have difficulty making full use of historical information and cannot assess the risk of insider-threat personnel accurately. The present invention therefore proposes a risk prediction method for insider-threat personnel based on temporal features of personal behavior.

Summary of the invention

The purpose of the invention is to obtain a basis for quantifying insider threats, to provide a tiered early-warning model for insider-threat personnel, and to improve the overall performance of the insider-threat personnel risk prediction model.

The design principle of the invention is as follows. First, a person's accumulated historical behavior information is pre-processed and features are extracted; this includes quantization, sampling, pre-emphasis and windowing of the personnel information, and converting heterogeneous feature data from multiple domains into corresponding numerical features. Next, composite training of the insider-threat prediction model is carried out, building an insider-threat personnel risk prediction model that combines temporal features of personal behavior with LSTM. Finally, this combined model is used for risk prediction and assessment.

The technical solution of the invention is implemented through the following steps:

Step 1: Pre-process the personnel information and extract features.

Step 1.1: Quantize, sample, pre-emphasize and window the personnel information.

Step 1.2: Extract three kinds of features from the data source: the person's attribute features, the person's "count" features, and the person's psychological features.

Step 1.3: Perform further feature extraction and quantization on the extracted features to obtain finer-grained behavioral features.

Step 2: Composite training of the insider-threat risk prediction model.

Step 2.1: Train the LSTM model using 80% of the personnel feature data.

Step 2.2: Build the initial insider-threat personnel risk prediction model from the personnel training data.

Step 3: Risk prediction and assessment with the LSTM model.

Step 3.1: Predict insider-threat personnel risk on the test set using the method based on temporal features of personal behavior.

Step 3.2: Based on the risk prediction results and the division of personnel threat levels, assess each person's insider-threat risk and issue early warnings for persons assessed as high risk or medium risk.

Beneficial effects

Compared with insider-threat personnel prediction based on graph mining, the invention can process data with one day as a single time point, and the LSTM makes use of all preceding historical information, so its results are better.

Compared with the prediction of insider resource abuse based on hidden Markov models (HMM), the invention is better suited to the insider-threat scenario, makes fuller use of historical information, incurs a smaller cost in computation and performance, and generalizes well.

Description of the drawings

Fig. 1 is a schematic framework diagram of insider-threat personnel risk prediction.

Fig. 2 compares the accuracy, recall and F value of the LSTM method and other methods in the test experiments.

Detailed description of the embodiments

To better illustrate the objects and advantages of the invention, an embodiment of the method is described in further detail below with reference to an example.

The detailed process is as follows:

Step 1.1: The audio data is first quantized and sampled. The personnel data is then screened for missing values and outliers; outliers are removed and missing values are filled in.
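The screening in Step 1.1 might be sketched as follows. The source does not say how outliers are detected or how missing values are filled, so everything here is an assumption made for illustration: outliers are flagged by their distance from the median (in units of the median absolute deviation), rejected values are treated as missing so the series keeps one entry per day, and gaps are filled with the median of the remaining values. The function name `clean_series` and the threshold `k` are hypothetical.

```python
from statistics import median


def clean_series(values, k=5.0):
    """Reject outliers and fill missing values (None) in one numeric series.

    Assumed rules, not taken from the source: a value is an outlier when it
    lies more than k median absolute deviations from the median; rejected and
    missing entries are both filled with the median of the kept values.
    """
    present = [v for v in values if v is not None]
    center = median(present)
    mad = median([abs(v - center) for v in present])

    def is_outlier(v):
        # With mad == 0 the spread is degenerate, so nothing is rejected.
        return mad > 0 and abs(v - center) > k * mad

    kept = [None if v is None or is_outlier(v) else v for v in values]
    fill = median([v for v in kept if v is not None])
    return [fill if v is None else v for v in kept]


# Hypothetical daily counts with one gap and one implausible spike.
cleaned = clean_series([10, 12, None, 11, 500, 9])
```

A median-based rule is used in this sketch because a mean and standard deviation computed on a short series are themselves distorted by the spike they are meant to detect.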

Step 1.2: Three kinds of features are extracted from the CERT-IT (v6.2) data source: the person's attribute features, the person's psychological features, and the person's "count" features.

Step 1.3: Further feature extraction and quantization are performed on the extracted features to obtain finer-grained behavioral features. After processing, the data set contains 521 days of personnel data. Weekend data is excluded, because behavior on weekends may not follow the patterns seen on working days.
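One way to picture the "count" features and the removal of weekend data is the sketch below, which turns a list of events into one count vector per working day. The event layout `(date, activity)`, the activity names and the function name `daily_counts` are assumptions for illustration; the source does not describe the record format.

```python
from collections import Counter
from datetime import date


def daily_counts(events, activities):
    """Build one count vector per working day from (date, activity) events."""
    per_day = {}
    for day, activity in events:
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday: skip weekends
            continue
        per_day.setdefault(day, Counter())[activity] += 1
    # Fixed activity order so every day yields a vector of the same length.
    return {
        day: [counts[a] for a in activities]
        for day, counts in sorted(per_day.items())
    }


# Hypothetical events; 2010-01-01 is a Friday and 2010-01-02 a Saturday.
events = [
    (date(2010, 1, 1), "logon"),
    (date(2010, 1, 1), "email"),
    (date(2010, 1, 1), "logon"),
    (date(2010, 1, 2), "logon"),
    (date(2010, 1, 4), "email"),
]
features = daily_counts(events, ["logon", "email", "usb"])
```

Each resulting vector corresponds to one time point, in line with the description's choice of one day as the unit of time.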

Step 2: Composite training of the insider-threat risk prediction model.

Step 2.1: Days 1 to 417 are used as the training set for the LSTM model, and days 418 to 521 as the test set. While training the LSTM, the number of hidden layers is tuned (from 1 to 6), the number of neurons per hidden layer is tuned (from 20 to 500), and the time step is tuned (from 3 to 40). The batch size is set to 260, the learning rate to 0.01, mean squared error is chosen as the loss function, and ADAM (a variant of gradient descent) is used as the optimizer.
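The split and settings in Step 2.1 can be written down as plain data, as in the sketch below. The numbers come from the text; the function and dictionary key names are hypothetical, and no particular deep-learning library is implied.

```python
def chronological_split(days, last_train_day=417):
    """Split day indices in time order: no shuffling, test follows training."""
    train = [d for d in days if d <= last_train_day]
    test = [d for d in days if d > last_train_day]
    return train, test


days = list(range(1, 522))  # days 1..521 after weekends are removed
train_days, test_days = chronological_split(days)

# Ranges explored during training, as stated in the text.
search_space = {
    "hidden_layers": range(1, 7),   # 1 to 6
    "hidden_units": (20, 500),      # lower and upper bound
    "time_steps": range(3, 41),     # 3 to 40
}

# Settings held fixed, as stated in the text.
fixed_settings = {
    "batch_size": 260,
    "learning_rate": 0.01,
    "loss": "mean_squared_error",
    "optimizer": "adam",
}
```

The split puts 417 of 521 days in the training set, which is about 80% and matches the proportion given in the summary of the steps.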

Step 2.2: Given the hidden-layer output $h_{t-1}$ at time $t-1$, the insider-threat score of a person at time $t$ is defined as $T_t = -1000 \log P_\theta(x_t \mid h_{t-1})$. Here $\theta$ is the output of the model and represents the conditional probability distribution of the observation vector $x_t$ of the person's behavior at the next time step. $x_t$ is the observation vector of the person's behavioral features at time $t$, and $h_{t-1}$ is the output vector of the hidden layer at time $t-1$; the person's behavioral history before $t-1$ is implicitly encoded in $h_{t-1}$. The conditional probability $P_\theta(x_t \mid h_{t-1})$ of the observation $x_t$ given $h_{t-1}$ therefore has a concrete meaning. The smaller $T_t$ is, the less the person's behavior has changed abnormally, and the lower the probability that the person poses an insider threat is taken to be. The larger $T_t$ is, the more the behavior deviates from the person's normal behavioral trend, and the higher that probability is taken to be.
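The scoring rule can be illustrated with a few lines of Python. The source does not state the base of the logarithm; the natural logarithm is assumed here, and the function name `threat_score` is hypothetical.

```python
import math


def threat_score(prob):
    """Return T_t = -1000 * log(P) for a conditional probability P in (0, 1].

    Assumption: natural logarithm; the source does not specify the base.
    """
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in the interval (0, 1]")
    return -1000.0 * math.log(prob)


# Behaviour the model considers likely gets a low score,
# behaviour it considers unlikely gets a high score.
expected_behaviour = threat_score(0.9)
surprising_behaviour = threat_score(0.01)
```

Because the logarithm is monotonic, a different base would rescale the scores without changing how persons are ranked.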

Step 3: Risk prediction and assessment with the LSTM model.

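Step 3.1: The conditional probability $P_\theta(x_t \mid h_{t-1})$ in the risk prediction model can be expressed as

[Formula not recoverable from the source text; the only surviving fragment is $P_{\theta^{(Y)}}(V_t \mid h_{t-1})$.]

where $\theta^{(V)}$ and $\theta^{(Y)}$ are outputs of the LSTM hidden layer, computed as follows:

$$\theta^{(V)} = o^{(V)}_t \odot \tanh\left(c^{(V)}_t\right)$$

$$\theta^{(Y)} = o^{(Y)}_t \odot \tanh\left(c^{(Y)}_t\right)$$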

The trained parameters are the weight matrix W and the bias matrix b, and both are shared across all persons. The LSTM yields the conditional probability distribution, from which the conditional probability of the observed behavior can be calculated. This gives the insider-threat risk prediction probability, which is then converted into an insider-threat score for the person.
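The hidden-layer output used above has the standard LSTM form: the output gate multiplied elementwise by the hyperbolic tangent of the cell state. A minimal sketch in plain Python follows; the gate and cell values are made-up numbers, and `hidden_output` is a hypothetical helper name rather than anything from the source.

```python
import math


def hidden_output(o_t, c_t):
    """Elementwise product of the output gate o_t and tanh of the cell state c_t."""
    if len(o_t) != len(c_t):
        raise ValueError("o_t and c_t must have the same length")
    return [o * math.tanh(c) for o, c in zip(o_t, c_t)]


# Made-up output-gate activations (each in [0, 1]) and cell-state values.
o_t = [1.0, 0.5, 0.0]
c_t = [0.0, 1.0, 3.0]
theta = hidden_output(o_t, c_t)
```

A gate value of 0 suppresses a component entirely, while a value of 1 passes the squashed cell state through unchanged.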

Step 3.2: Personnel threat levels can be divided according to the insider-threat risk prediction results and the threat scores. Based on this division, each person's insider-threat risk is assessed, and early warnings are issued for persons assessed as high risk or medium risk.
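The tiered warning in Step 3.2 could look like the sketch below. The source does not give the thresholds that separate the levels, so the two cut-off values here are purely hypothetical placeholders, as are the function names; only the rule that high-risk and medium-risk persons trigger a warning comes from the text.

```python
def threat_level(score, medium_threshold=2000.0, high_threshold=5000.0):
    """Map a threat score to a level. Both thresholds are hypothetical."""
    if score >= high_threshold:
        return "high"
    if score >= medium_threshold:
        return "medium"
    return "low"


def needs_warning(score, **thresholds):
    """Warn for persons assessed as high risk or medium risk, as in the text."""
    return threat_level(score, **thresholds) in {"high", "medium"}


# Hypothetical scores for three persons.
levels = [threat_level(s) for s in (150.0, 2600.0, 9000.0)]
```

In practice the cut-offs would have to be chosen from the score distribution of the organization in question.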

Test results: In the experiment, the insider-threat personnel risk prediction model based on temporal features of personal behavior was used to make predictions on the processed test set and was compared with several other common methods. The results show that the invention outperforms the other models, with an accuracy of 89.65%, a recall of 90.75% and an F value of 90.20%, and that it can be used for risk prediction on personnel. The results are shown in Fig. 2. The method gives enterprises a basis for quantifying insider threats and provides a tiered early-warning scheme for insider-threat personnel.
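As a consistency check on the reported figures, the F value can be recomputed as the harmonic mean of the two rates. This assumes the usual F1 definition, which the source does not spell out. Under that assumption, the figure translated here as "accuracy" plays the role of precision.

```python
def f_value(precision, recall):
    """Harmonic mean of precision and recall (the usual F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Reported rates, in percent.
reported_f = f_value(89.65, 90.75)
```

Rounded to two decimal places, the result agrees with the reported F value of 90.20%.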

The specific description above further details the purpose, technical solution and beneficial effects of the invention. It should be understood that the above is only a specific embodiment of the invention and is not intended to limit its scope of protection; any modification, equivalent substitution, improvement or the like made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A risk prediction method for insider-threat personnel based on temporal features of personal behavior, characterized in that the method comprises the following steps:

Step 1: Pre-process the personnel information and extract features, including: removing abnormal data, filling in missing data, and standardizing the personnel values; then extracting three kinds of features from the data: the person's attribute features, the person's psychological features, and the person's "count" features; and finally performing fine-grained feature extraction and merging the results into a feature vector;

Step 2: Build an insider-threat risk prediction model using the LSTM algorithm and train it on the personnel feature data, finally obtaining the insider-threat risk prediction probability and converting it into an insider-threat score for the person;

Step 3: Divide personnel threat levels according to the insider-threat risk prediction results and the threat scores, assess each person's insider threat according to the threat-level criteria, and issue early warnings for persons assessed as high risk or medium risk.

2. The risk prediction method for insider-threat personnel based on temporal features of personal behavior according to claim 1, characterized in that: the scoring formula of the insider-threat risk prediction model built with the LSTM algorithm in Step 2 is $T_t = -1000 \log P_\theta(x_t \mid h_{t-1})$, where $\theta$ is the output of the model and represents the conditional probability distribution of the observation vector $x_t$ of the person's behavior at the next time step, $x_t$ is the observation vector of the person's behavioral features at time $t$, and $h_{t-1}$ is the output vector of the hidden layer at time $t-1$, in which the person's behavioral history before $t-1$ is encoded.

3. The risk prediction method for insider-threat personnel based on temporal features of personal behavior according to claim 1, characterized in that: the conditional probability $P_\theta(x_t \mid h_{t-1})$ in Step 2 is calculated as follows:

[Formula not recoverable from the source text.]

where $\theta^{(V)}$ and $\theta^{(Y)}$ are outputs of the LSTM hidden layer, computed as follows:

$$\theta^{(V)} = o^{(V)}_t \odot \tanh\left(c^{(V)}_t\right)$$

$$\theta^{(Y)} = o^{(Y)}_t \odot \tanh\left(c^{(Y)}_t\right)$$

The trained parameters are the weight matrix W and the bias matrix b. The LSTM risk assessment model yields the conditional probability distribution, from which the conditional probability of the observed behavior is calculated; this gives the insider-threat risk prediction probability, which is converted into an insider-threat score for the person.