CN112966732B - Multi-factor interactive behavior anomaly detection method with periodic attribute - Google Patents

Publication number
CN112966732B
CN112966732B CN202110228567.4A CN202110228567A CN112966732B CN 112966732 B CN112966732 B CN 112966732B CN 202110228567 A CN202110228567 A CN 202110228567A CN 112966732 B CN112966732 B CN 112966732B
Authority
CN
China
Prior art keywords
user
behavior
time
attribute
interactive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110228567.4A
Other languages
Chinese (zh)
Other versions
CN112966732A (en
Inventor
章昭辉
王鹏伟
刘霄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Donghua University
Original Assignee
Donghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Donghua University
Priority to CN202110228567.4A
Publication of CN112966732A
Application granted
Publication of CN112966732B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Abstract

The invention relates to a multi-factor interactive behavior anomaly detection method with periodic attributes. Each user is considered independently: the user's historical normal interactive behavior is analyzed, and the user's current interactive behavior is checked against that historical normal pattern. The method considers not only the login time attribute, the working time login attribute, the login interval attribute and the key page dwell time attribute, but also the user interaction duration attribute and the key path trigger attribute, so that the user's interaction with the system is described more fully. The proposed interactive behavior period division algorithm analyzes the periodic characteristics of user behavior. In the abnormal behavior detection model, interactive behaviors are compared using the adjusted cosine similarity, which adds a description of the direction of the behavior vectors without damaging their numerical characteristics. The method thereby provides technical support for judging and detecting abnormal interactive behaviors.

Description

Multi-factor interactive behavior anomaly detection method with periodic attribute
Technical Field
The invention relates to the field of information technology, and in particular to a multi-factor interactive behavior anomaly detection method with periodic attributes.
Background
In recent years China's economy has developed rapidly and computer technology has been applied ever more widely to financial transactions. With the arrival of the "Internet+" era, online payment has become increasingly popular, and internet finance has become the mainstream trend in the development of the financial industry. As network payments and cardless payments (e.g., PayPal and Alipay) spread, transaction fraud has grown rapidly as well.
Most existing identity authentication technologies are based on the user's account name and password. The user's identity is verified once, in a short time, and every subsequent action is then treated as legitimate regardless of who is actually operating the account. To compensate for the weaknesses of authentication by a single user name and password, many researchers have in recent years turned to data feature mining and behavior analysis for identity recognition. For example, user Web logs have been modeled and predicted with association rule mining, hidden Markov processes, semi-Markov processes, Bayesian networks, neural networks, random forests and similar methods. Despite these efforts, user identification remains difficult.
At present, individual behavior portraits are applied mainly in intelligent marketing, click prediction, software system optimization and similar fields. Statistical features such as interaction frequency, interaction latency and browsing path are extracted from a user's historical interaction data, the user's operations are labeled, and advertisement recommendation, marketing and prediction are carried out according to the labels the user belongs to. In interactive behavior anomaly detection, by contrast, the starting point is that each user has unique interaction habits, such as login time, interaction duration and click frequency. A user behavior model is built by analyzing the user's interaction pattern; the model is then used to measure how well the observed interaction features match, and thereby to identify whether an operation was actually performed by the user.
However, under the stimulus of different external scenarios a user rarely produces interactive behavior with a stable period at all times. For example, during the Double Eleven shopping festival or when buying tickets for a popular holiday, a user's interactive behavior often differs greatly from that in ordinary scenarios. Because scenarios are random and discrete and users differ in behavior, the time series of a user's interactions is often periodic only to a certain degree, so methods that quickly compute the period of a time-domain sequence, such as the fast Fourier transform, do not apply well to the analysis of user interaction behavior. Existing research on interactive behavior anomaly detection also tends to ignore the periodic characteristics of behavior, so the model's judgment of interactive behavior in such scenarios is often biased.
Disclosure of Invention
Aiming at the problem of detecting abnormal interactive behavior on the Internet, the invention provides a multi-factor interactive behavior anomaly detection method with periodic attributes. Starting from the individual user, it fully considers the differences between users and the periodic characteristics of interactive behavior, and judges the legitimacy of each user's interactive behavior.
The technical solution of the invention is a multi-factor interactive behavior anomaly detection method with periodic attributes, comprising the following steps:
1) Establishing a normal user interaction behavior portrait: extracting the user's normal transaction data from the user historical transaction database; establishing a login time attribute, a working time login attribute, a login interval attribute, a key page dwell time attribute, a user interaction duration attribute and a key path trigger attribute; and constructing the user's interactive behavior portrait IBC_u with multi-dimensional attributes;
2) Constructing an interactive behavior portrait with a periodic attribute: on the basis of step 1), first generating the user's behavior interval sequence from the user behavior records and calculating the user's period stability threshold; second, following the behavior period division method, checking in turn whether adjacent elements of the behavior interval sequence satisfy the period stability threshold and outputting the user's interactive behavior period sequence; finally, calculating the normal interactive behavior portrait UCP_u with a periodic attribute;
3) Calculating the maximum deviation benchmark of the interactive behavior: repeating steps 1) and 2) on all of the user's transaction data to obtain the interactive behavior portrait UCP'_u with a periodic attribute, which serves as the user's historical interactive behavior; matching each interactive behavior record in UCP'_u against the normal user interactive behavior portrait UCP_u with a periodic attribute, and calculating in turn the similarity between each historical interactive behavior and the normal interactive behavior; determining the maximum similarity Max_sim and the minimum similarity Min_sim; taking candidate values in turn from the range between Min_sim and Max_sim, dividing the user's historical interactive behaviors into normal and abnormal behaviors for each candidate, and calculating the division effect DB; and taking the value with the best division effect as the maximum deviation benchmark of the user's interactive behavior, denoted Benchmark_u;
4) Establishing a multi-factor interactive behavior recognition method: calculating the portrait UCP_now of the user's current interactive behavior according to step 1), and calculating the degree to which the current interactive behavior deviates from the user's normal interactive behavior portrait; if the degree of deviation is within the acceptable range of Benchmark_u from step 3), the interaction is judged to be normal; otherwise the interaction is judged to be abnormal.
Preferably, step 1) is implemented as follows:
1.1) Extracting the user's historical normal interaction behavior records:
collecting the user's historical interaction behavior data, marking each sample as positive or negative according to whether it is a normal or an abnormal interaction, and extracting the user's normal interaction data as positive sample data;
1.2) Calculating the login time attribute:
extracting the user's set of login times from the positive sample data, dividing one day by the hour into n time intervals time_1, time_2, ..., time_n, and calculating the probability that the user logs in within each interval; the login time attribute is calculated with the following formula,
$$time_i = \frac{|lta_i|}{\sum_{k=1}^{n} |lta_k|}, \quad i = 1, \dots, n$$
where time_i is the attribute of the i-th time interval, |lta_i| is the number of logins in the i-th time interval, and the denominator is the total number of logins of user u over all intervals of the day. This yields the login time attribute of user u, LTA_u = (time_1, time_2, ..., time_n);
1.3) Calculating the working time login attribute:
extracting the set of transaction times and calculating the probability that a transaction occurs during working time and during non-working time, respectively, which yields the working time login attribute of user u, WTA_u = (isworktime, noworktime);
1.4) Calculating the login interval attribute:
$$t_i = T^{u}_{i} - T^{u}_{i-1}$$
where t_i is an element of the set of login intervals, T^u_i is the time of the i-th login of user u to the system, and T^u_{i-1} is the time of the (i-1)-th login of user u to the system;
the formula above gives the set of time intervals between adjacent logins of the user; the first, second and third quartiles of this set are obtained by quantile analysis, together with its upper and lower limits, where the first, second and third quartiles are the values at the 25%, 50% and 75% positions after all data in the set are sorted by size; the set is divided accordingly into 5 subsets period_1, period_2, ..., period_5, and the 5 user login interval attributes are calculated with the following formula:
$$period_i = \frac{|lia_i|}{\text{number of logins of user } u}, \quad i = 1, \dots, 5$$
where period_i is the i-th login interval attribute, |lia_i| is the number of login intervals of the user that fall within the i-th subset, and the denominator is the number of logins of user u; this yields the login interval attribute of user u, LIA_u = (period_1, period_2, period_3, period_4, period_5);
1.5) Calculating the user's key page dwell time attribute:
in the normal interaction behavior log of user u, calculating in turn the sum of the dwell times on the user's key pages a_{page_no = key},
[Formula image BDA0002957877240000043; not recoverable from the extracted text]
where
[Formula image BDA0002957877240000044; not recoverable from the extracted text]
and, using quantile analysis with the same calculation as in 1.4), obtaining the key page dwell time attribute of user u, KSA_u = (distance_1, distance_2, distance_3, distance_4, distance_5);
1.6) Calculating the user interaction duration attribute:
in the normal interaction behavior log of user u, calculating the sum of the browsing times of all pages within one interactive operation of user u, which gives the set
[Formula image BDA0002957877240000045; not recoverable from the extracted text]
and, using quantile analysis with the same calculation as in 1.4), obtaining the interaction duration attribute of user u, IDA_u = (duration_1, duration_2, ..., duration_n);
1.7) Calculating the user's key path trigger attribute:
in the normal interaction behavior log of user u, calculating in turn the dwell time on the system's key pages and on its non-key pages within one interactive operation of the user; and, using quantile analysis with the same calculation as in 1.4), obtaining the key path trigger attribute of user u, CTA_u = (ratio_1, ratio_2);
1.8) Constructing the user interaction behavior portrait:
from the attributes of user u obtained in each dimension, constructing the user's interactive behavior portrait IBC_u = (LTA_u, WTA_u, LIA_u, KSA_u, IDA_u, CTA_u).
Preferably, step 2) is implemented as follows:
2.1) Extracting the login interval sequence: the login interval sequence calculated in step 1.4) is lis_u = {t_1, t_2, ..., t_n}, where t_n is the n-th login interval and n + 1 is the number of all interaction behavior records of the user; a subsequence of lis_u is written lis'_u = {t'_1, t'_2, ..., t'_n}, i.e. a sequence formed by any part of the original sequence lis_u;
2.2) Traversing the login interval sequence:
initializing an empty array C; traversing all subsequences of lis_u = {t_1, t_2, ..., t_n} in turn from head to tail and, for each subsequence, calculating the corresponding period stability threshold μ and the subsequence stability state TPF_u of user u, as follows:
μ = 1 / length(list),
[Formula image BDA0002957877240000051 for TPF_u; not recoverable from the extracted text]
where list denotes a subsequence of the login interval sequence and length(list) is its length; in TPF_u, t_i denotes each element of lis'_u, and the remaining symbol
[Formula image BDA0002957877240000052; not recoverable from the extracted text]
is the mean of all elements of lis'_u; μ is the division threshold: the larger μ is, the fewer elements lis'_u contains and the more discrete and sparse the user's behavior period; conversely, the smaller μ is, the more elements lis'_u contains and the more continuous the user's behavior period;
2.3) Dividing the sequence:
according to the period stability threshold μ and the subsequence stability state TPF_u of user u, the login interval sequence is divided according to the following formula,
[Formula image BDA0002957877240000053; not recoverable from the extracted text]
and every subsequence that satisfies the formula is stored in array C; the traversal obeys the following rules: longer subsequences are traversed first, and if the value calculated for a longer subsequence satisfies the period stability threshold μ, none of the subsequences contained in it is judged; likewise, if the current subsequence is a subsequence of any sequence already in the periodic behavior sequence set C, it is not judged;
2.4) Outputting the interactive behavior period sequence:
outputting array C, which is the user's set of periodic behavior sequences;
2.5) Constructing the interactive behavior portrait with a periodic attribute:
from the period sequence output in step 2.4) and the description of interactive behavior in steps 1.2) to 1.8), obtaining the interactive behavior portraits pbc_u of the different periods
[Formula image BDA0002957877240000061; not recoverable from the extracted text]
where
[Formula image BDA0002957877240000062; not recoverable from the extracted text]
denotes the set of interactive behavior portraits corresponding to the j behavior periods of user u; finally, the combined normal interactive behavior portrait with a periodic attribute is defined as
[Formula image BDA0002957877240000063; not recoverable from the extracted text]
where
[Formula images BDA0002957877240000064 and BDA0002957877240000065; not recoverable from the extracted text]
are the interactive behavior portraits corresponding to the latest k periods in the set of the user's normal periodic interactive behavior portraits.
The beneficial effects of the invention are as follows. The multi-factor interactive behavior anomaly detection method with periodic attributes considers not only the login time attribute, the working time login attribute, the login interval attribute and the key page dwell time attribute, but also the user interaction duration attribute and the key path trigger attribute, so that the user's interaction with the system is described more fully. The proposed interactive behavior period division algorithm analyzes the periodic characteristics of user behavior. In the abnormal behavior detection model, interactive behaviors are compared using the adjusted cosine similarity, which adds a description of the direction of the behavior vectors without damaging their numerical characteristics. The method thereby provides technical support for judging and detecting abnormal interactive behaviors.
Drawings
FIG. 1 is a general framework diagram of the interactive behavior multi-factor anomaly detection method with periodic attributes according to the present invention;
FIG. 2 is a flowchart of the implementation of the multi-factor interactive behavior anomaly detection method with periodic attributes according to the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
Under the influence of different scenarios, the interaction behavior of users shows a certain fluctuation and abrupt change, and existing research often ignores the description of such abrupt changes. To better describe the characteristics of user interaction behavior, the invention provides an interactive behavior period division method, which calculates the user's period stability threshold and divides the interactive behaviors that satisfy the threshold into different behavior periods. On this basis, a method for describing the maximum deviation benchmark of user interaction behavior is provided, which strengthens the description of the direction of the behavior benchmark vector without damaging its numerical characteristics. Finally, a multi-factor interactive behavior anomaly detection model with periodic attributes is provided.
FIG. 1 shows the overall framework of the multi-factor interactive behavior anomaly detection method with periodic attributes. The interactive behavior model is constructed in three steps: first, a user interaction behavior portrait with periodic attributes is established on the basis of the user's normal interaction behavior data; second, the maximum deviation benchmark of the interactive behavior is calculated; third, the multi-factor interactive behavior recognition method is established.
1. Establishing a normal user interaction behavior portrait: the user's normal transaction data are extracted from the user historical transaction database. The login time attribute, the working time login attribute, the login interval attribute and the key page dwell time attribute are considered, together with the user interaction duration attribute and the key path trigger attribute; on this basis the user's interactive behavior is described and the user interaction behavior portrait is constructed. As shown in FIG. 2, this is realized mainly through the following steps.
S101: extracting the user's historical normal interaction behavior records:
from the user's historical interactive behavior data set, the user's positive sample data are extracted according to the positive/negative (normal transaction/abnormal transaction) sample label field.
S102: calculating the login time attribute:
from the normal interaction behavior records obtained in S101, the user's set of login times is extracted, one day is divided by the hour into n time intervals time_1, time_2, ..., time_n, and the probability that the user logs in within each interval is calculated. The login time attribute is calculated with the following formula.
$$time_i = \frac{|lta_i|}{\sum_{k=1}^{n} |lta_k|}, \quad i = 1, \dots, n$$
where time_i is the attribute of the i-th time interval, |lta_i| is the number of logins in the i-th time interval, and the denominator is the total number of logins of user u over all intervals of the day. This yields the login time attribute of user u, LTA_u = (time_1, time_2, ..., time_n).
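The calculation in S102 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `login_time_attribute`, the parameter `n_intervals` and the choice of equal-width intervals are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

def login_time_attribute(login_times, n_intervals=24):
    """Return (time_1, ..., time_n): share of logins in each daily interval.

    ASSUMPTION: the day is split into n equal-width intervals; the source
    only says the day is divided "by the hour".
    """
    if 24 % n_intervals != 0:
        raise ValueError("n_intervals must divide 24")
    width = 24 // n_intervals                     # hours per interval
    counts = Counter(t.hour // width for t in login_times)
    total = len(login_times)                      # total number of logins
    if total == 0:
        return [0.0] * n_intervals
    return [counts.get(k, 0) / total for k in range(n_intervals)]

# Hypothetical positive-sample login times for one user.
logins = [
    datetime(2021, 3, 1, 9, 5),
    datetime(2021, 3, 2, 9, 40),
    datetime(2021, 3, 2, 21, 15),
    datetime(2021, 3, 3, 10, 0),
]
lta_u = login_time_attribute(logins, n_intervals=4)  # four 6-hour intervals
```

With four 6-hour intervals, three of the four sample logins fall into the second interval and one into the fourth, so the components of `lta_u` sum to 1.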
S103: calculating the working time login attribute:
the set of transaction times is extracted, and the probability that a transaction occurs during working time and during non-working time (off-duty hours on working days, weekends, and holidays) is calculated, which yields the working time login attribute of user u, WTA_u = (isworktime, noworktime).
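A possible reading of S103 in code is shown below. The working hours (09:00 to 18:00 on weekdays) and the `holidays` parameter are assumptions introduced for illustration; the source does not specify them.

```python
from datetime import datetime

def work_time_attribute(times, start_hour=9, end_hour=18, holidays=frozenset()):
    """Return WTA_u = (isworktime, noworktime) as shares of all transactions.

    ASSUMPTION: working time means Monday to Friday, start_hour <= hour <
    end_hour, excluding dates listed in `holidays` (hypothetical parameter).
    """
    total = len(times)
    if total == 0:
        return (0.0, 0.0)
    work = sum(
        1
        for t in times
        if t.weekday() < 5                     # Monday=0 ... Friday=4
        and t.date() not in holidays
        and start_hour <= t.hour < end_hour
    )
    return (work / total, (total - work) / total)

# Hypothetical transaction times: two in working time, two outside it.
transactions = [
    datetime(2021, 3, 1, 10, 0),   # Monday morning
    datetime(2021, 3, 1, 20, 0),   # Monday evening (off duty)
    datetime(2021, 3, 3, 14, 30),  # Wednesday afternoon
    datetime(2021, 3, 6, 11, 0),   # Saturday
]
wta_u = work_time_attribute(transactions)
```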
S104: calculating the login interval attribute:
$$t_i = T^{u}_{i} - T^{u}_{i-1}$$
where t_i is an element of the set of login intervals, T^u_i is the time of the i-th login of user u to the system, and T^u_{i-1} is the time of the (i-1)-th login of user u to the system.
The formula above gives the set of time intervals between adjacent logins of the user. The first, second and third quartiles of this set are obtained by quantile analysis, together with its upper and lower limits. The first, second and third quartiles are the values at the 25%, 50% and 75% positions after all data in the set are sorted by size. The set is divided accordingly into 5 subsets period_1, period_2, ..., period_5, and these 5 constitute the user login interval attribute, which is calculated with the following formula.
$$period_i = \frac{|lia_i|}{\text{number of logins of user } u}, \quad i = 1, \dots, 5$$
where period_i is the i-th login interval attribute, |lia_i| is the number of login intervals of the user that fall within the i-th subset, and the denominator is the number of logins of user u. This yields the login interval attribute of user u, LIA_u = (period_1, period_2, period_3, period_4, period_5).
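The quantile-based binning of S104 might look like the sketch below. The source does not state how the five subsets are delimited, so the boundaries used here (the three quartiles plus box-plot limits at 1.5 times the interquartile range) are an assumption, as are the function names. The sketch also divides by the number of intervals, rather than the number of logins named in the source, so that the five shares sum to 1.

```python
from statistics import quantiles

def login_intervals(login_timestamps):
    """Differences between consecutive login times (any numeric time unit)."""
    ts = sorted(login_timestamps)
    return [later - earlier for earlier, later in zip(ts, ts[1:])]

def quartile_bin_shares(values):
    """Share of values in each of 5 subsets derived from the quartiles.

    ASSUMPTION: subsets are [lower, Q1), [Q1, Q2), [Q2, Q3), [Q3, upper]
    and "outside the limits", with lower = Q1 - 1.5*IQR and
    upper = Q3 + 1.5*IQR. The patent text does not define the subsets.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    counts = [0] * 5
    for v in values:
        if v < lower or v > upper:
            counts[4] += 1          # outlier interval
        elif v < q1:
            counts[0] += 1
        elif v < q2:
            counts[1] += 1
        elif v < q3:
            counts[2] += 1
        else:
            counts[3] += 1
    total = len(values)             # ASSUMPTION: normalise by interval count
    return [c / total for c in counts]

# Hypothetical login times in hours since the first login.
login_hours = [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
intervals = login_intervals(login_hours)
lia_u = quartile_bin_shares(intervals)
```

The same `quartile_bin_shares` helper could be reused for the dwell time and duration attributes, which the text says follow the same calculation as S104.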
S105: calculating the user's key page dwell time attribute:
in the normal interaction behavior log of user u, the sum of the dwell times on the user's key pages a_{page_no = key} is calculated in turn,
[Formula image BDA0002957877240000091; not recoverable from the extracted text]
where
[Formula image BDA0002957877240000092; not recoverable from the extracted text]
Using quantile analysis with the same calculation as in S104, the key page dwell time attribute of user u is obtained, KSA_u = (distance_1, distance_2, distance_3, distance_4, distance_5).
S106: calculating the user interaction duration attribute:
in the normal interaction behavior log of user u, the sum of the browsing times of all pages within one interactive operation of user u is calculated, which gives the set
[Formula image BDA0002957877240000093; not recoverable from the extracted text]
Using quantile analysis with the same calculation as in S104, the interaction duration attribute of user u is obtained, IDA_u = (duration_1, duration_2, ..., duration_n).
S107: calculating the user's key path trigger attribute:
in the normal interaction behavior log of user u, the dwell time on the system's key pages and the dwell time on its non-key pages within one interactive operation of the user are calculated in turn. Using quantile analysis with the same calculation as in S104, the key path trigger attribute of user u is obtained, CTA_u = (ratio_1, ratio_2).
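The raw per-session quantities used in S105 to S107 can be gathered in one pass over the page-view log, as in the sketch below. The record layout `(session_id, page_no, dwell_seconds, is_key)` and the `key_share` ratio are assumptions made for the example; the source does not give the log format or the exact definition of ratio_1 and ratio_2.

```python
from collections import defaultdict

def session_dwell_features(page_views):
    """Aggregate dwell times per interactive operation (session).

    page_views: iterable of (session_id, page_no, dwell_seconds, is_key).
    ASSUMPTION: this record layout is hypothetical.
    Returns {session_id: {"key_dwell", "duration", "key_share"}}.
    """
    key_dwell = defaultdict(float)   # S105: dwell time on key pages
    duration = defaultdict(float)    # S106: browsing time over all pages
    for session_id, _page_no, dwell, is_key in page_views:
        duration[session_id] += dwell
        if is_key:
            key_dwell[session_id] += dwell
    features = {}
    for sid, total in duration.items():
        key = key_dwell[sid]
        features[sid] = {
            "key_dwell": key,
            "duration": total,
            # S107 (ASSUMPTION): share of the session spent on key pages.
            "key_share": key / total if total else 0.0,
        }
    return features

# Hypothetical log: session s1 visits a key page, session s2 does not.
views = [
    ("s1", "home", 10.0, False),
    ("s1", "pay", 30.0, True),
    ("s2", "home", 5.0, False),
]
features = session_dwell_features(views)
```

The per-session values collected this way would then be binned with the quantile analysis of S104 to obtain KSA_u, IDA_u and CTA_u.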
S108: constructing the user interaction behavior portrait:
from the attributes of user u obtained in each dimension in the preceding steps, the user's interactive behavior portrait is constructed as IBC_u = (LTA_u, WTA_u, LIA_u, KSA_u, IDA_u, CTA_u).
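One way to hold the portrait IBC_u in code is a small record type whose fields mirror the six attributes. The class name `InteractionPortrait` and the `as_vector` method are assumptions for illustration; flattening to a single vector is suggested only because the later similarity step compares vectors of n components.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class InteractionPortrait:
    """IBC_u = (LTA_u, WTA_u, LIA_u, KSA_u, IDA_u, CTA_u)."""
    lta: Tuple[float, ...]   # login time attribute
    wta: Tuple[float, ...]   # working time login attribute
    lia: Tuple[float, ...]   # login interval attribute
    ksa: Tuple[float, ...]   # key page dwell time attribute
    ida: Tuple[float, ...]   # interaction duration attribute
    cta: Tuple[float, ...]   # key path trigger attribute

    def as_vector(self) -> List[float]:
        """Concatenate all attributes into one flat vector (ASSUMPTION)."""
        return [*self.lta, *self.wta, *self.lia, *self.ksa, *self.ida, *self.cta]

# Hypothetical attribute values for one user.
ibc_u = InteractionPortrait(
    lta=(0.0, 0.75, 0.0, 0.25),
    wta=(0.5, 0.5),
    lia=(0.2, 0.2, 0.2, 0.2, 0.2),
    ksa=(0.2, 0.2, 0.2, 0.2, 0.2),
    ida=(0.2, 0.2, 0.2, 0.2, 0.2),
    cta=(0.75, 0.25),
)
vector = ibc_u.as_vector()
```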
2. Constructing an interactive behavior portrait with a periodic attribute: the periodic attribute features of the user are extracted on the basis of the user's interactive behavior, so that the user's interaction with the system is described more fully. First, the user's behavior interval sequence is generated from the user behavior records and the user's period stability threshold is calculated; second, following the behavior period division method, adjacent elements of the behavior interval sequence are checked in turn against the period stability threshold and the user's interactive behavior period sequence is output; finally, the interactive behavior portrait UCP_u with a periodic attribute is calculated. The steps are as follows:
S201: extracting the login interval sequence:
the login interval sequence calculated in S104 is lis_u = {t_1, t_2, ..., t_n}, where t_n is the n-th login interval and n + 1 is the number of all interaction behavior records of the user; a subsequence of lis_u is written lis'_u = {t'_1, t'_2, ..., t'_n}, i.e. a sequence formed by any part of the original sequence lis_u.
S202: traversing the login interval sequence:
An empty array C is initialized. All subsequences of lis_u = {t_1, t_2, ..., t_n} are traversed in turn from head to tail, and for each subsequence the corresponding period stability threshold μ and the subsequence stability state TPF_u of user u are calculated as follows:
μ = 1 / length(list)
[Formula image BDA0002957877240000101 for TPF_u; not recoverable from the extracted text]
where list denotes a subsequence of the login interval sequence and length(list) is its length; in TPF_u, t_i denotes each element of lis'_u, and the remaining symbol
[Formula image BDA0002957877240000102; not recoverable from the extracted text]
is the mean of all elements of lis'_u; μ is the division threshold: the larger μ is, the fewer elements lis'_u contains and the more discrete and sparse the user's behavior period; conversely, the smaller μ is, the more elements lis'_u contains and the more continuous the user's behavior period.
S203: dividing the sequence:
According to the period stability threshold μ and the subsequence stability state TPF_u of user u, the login interval sequence is divided according to the following formula.
[Formula image BDA0002957877240000103; not recoverable from the extracted text]
Every subsequence that satisfies the formula is stored in array C. The traversal obeys the following rules: longer subsequences are traversed first, and if the value calculated for a longer subsequence satisfies the period stability threshold μ, none of the subsequences contained in it is judged; likewise, if the current subsequence is a subsequence of any sequence already in the periodic behavior sequence set C, it is not judged.
S204: outputting the interactive behavior period sequence:
Array C is output; it is the user's set of periodic behavior sequences.
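The traversal in S202 to S204 can be sketched as below. Because the formulas for TPF_u and for the division condition are not available in this text, the sketch substitutes its own stability measure (mean absolute deviation relative to the mean) and its own condition (measure <= μ); both are assumptions, not the patent's formulas. Only μ = 1/length(list), the longest-first traversal and the skipping of subsequences contained in an accepted one come from the text. Subsequences are treated as contiguous slices.

```python
def stability(seq):
    """ASSUMED stand-in for TPF_u: mean absolute deviation / mean."""
    mean = sum(seq) / len(seq)
    if mean == 0:
        return 0.0
    return sum(abs(t - mean) for t in seq) / (len(seq) * mean)

def divide_periods(intervals, min_len=2):
    """Return accepted periods as half-open index ranges (start, end).

    The returned list plays the role of array C in the text.
    """
    n = len(intervals)
    accepted = []
    for length in range(n, min_len - 1, -1):        # longer subsequences first
        for start in range(0, n - length + 1):
            end = start + length
            # Skip subsequences contained in an already accepted period.
            if any(s <= start and end <= e for s, e in accepted):
                continue
            window = intervals[start:end]
            mu = 1 / len(window)                    # period stability threshold
            if stability(window) <= mu:             # ASSUMED condition
                accepted.append((start, end))
    return sorted(accepted)

# Hypothetical login intervals in hours: a daily rhythm, then a burst.
lis_u = [24, 24, 25, 24, 2, 2, 2, 2]
periods = divide_periods(lis_u)
```

On this sample the sketch separates the daily rhythm from the burst. Note that the rule in the text excludes only contained subsequences, so overlapping periods remain possible, and that enumerating all slices is cubic in the sequence length, which is acceptable for an illustration but may need pruning on long histories.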
S205: constructing the interactive behavior portrait with a periodic attribute:
from the period sequence output in S204 and the description of interactive behavior in S102 to S108, the interactive behavior portraits pbc_u of the different periods are obtained
[Formula image BDA0002957877240000111; not recoverable from the extracted text]
where
[Formula image BDA0002957877240000112; not recoverable from the extracted text]
denotes the set of interactive behavior portraits corresponding to the j behavior periods of user u. Finally, the combined normal interactive behavior portrait with a periodic attribute is defined as
[Formula image BDA0002957877240000113; not recoverable from the extracted text]
where
[Formula images BDA0002957877240000114 and BDA0002957877240000115; not recoverable from the extracted text]
are the interactive behavior portraits corresponding to the latest k periods in the set of the user's normal periodic interactive behavior portraits. Because the extracted data contain only the normal interaction behavior of user u, the resulting UCP_u describes only the user's normal interactive behavior.
3. Calculating the maximum deviation benchmark of the interactive behavior: the maximum deviation benchmark of each user's interactive behavior is determined from the user's interactive behavior portrait and the user's interaction behavior records, in the following steps.
S301: extracting the historical interaction records:
the complete historical interactive behavior data set of the user is extracted, including all positive samples and all negative samples.
S302: generating the user interaction behavior portrait:
using the data from S301 and following the step "constructing an interactive behavior portrait with a periodic attribute", the user's interactive behavior portrait UCP'_u with a periodic attribute is obtained.
S303: calculating the similarity between the normal interaction behaviors of the user and the user interaction behavior portrait:
recording UCP 'for each interaction behavior' u It will be associated with a normal user interaction behavior representation UCP with periodic properties u And matching, and sequentially calculating the similarity between each piece of historical interactive behavior of the user and the normal interactive behavior portrait according to the following calculation method:
Figure BDA0002957877240000116
in the formula, A_i and B_i denote the i-th components of the normal interaction behavior vector UCP_u and of the historical interactive behavior portrait UCP'_u, each consisting of n components;
Figure BDA0002957877240000121
and
Figure BDA0002957877240000122
respectively denote the means of the components of the two vectors. This is the adjusted cosine similarity: in every dimension, the mean of the vector's components is subtracted from the component value. Using the formula, the set of similarities between the user's normal interactive behavior and the historical interactive behavior portraits can be calculated in sequence:
Figure BDA0002957877240000123
From this, the maximum similarity Max_sim and the minimum similarity Min_sim in the set S_u can be obtained.
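The similarity formula itself survives only as a figure reference here, so the following Python sketch is based solely on the textual description of S303 (subtract each vector's component mean, then take the cosine). The function names and the handling of a zero denominator are assumptions made for illustration, not part of the patent.

```python
import math
from typing import List, Sequence


def adjusted_cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors after subtracting each vector's own mean."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Centre every component on the mean of its own vector.
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da)) * math.sqrt(sum(y * y for y in db))
    if denominator == 0.0:
        # Assumption: a constant vector carries no shape information, so return 0.
        return 0.0
    return numerator / denominator


def similarity_set(normal_portrait: Sequence[float],
                   historical_records: Sequence[Sequence[float]]) -> List[float]:
    """Similarity of every historical record to the normal portrait (the set S_u)."""
    return [adjusted_cosine_similarity(normal_portrait, r) for r in historical_records]
```

Max_sim and Min_sim are then simply `max(...)` and `min(...)` over the list returned by `similarity_set`.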
S304: calculating the partitioning effect according to the historical transaction of the user:
taking values in turn from the range bounded by the minimum similarity Min_sim and the maximum similarity Max_sim, dividing the user's historical interactive behaviors into normal behaviors and abnormal behaviors with each value, and calculating the partitioning effect DB.
Figure BDA0002957877240000124
Figure BDA0002957877240000125
DB=λ*PP+(1-λ)*NN
In the formulas, PP is the proportion of actually normal behaviors among the behaviors that the model result marks as normal; NN is the proportion of actually abnormal behaviors among the behaviors that the model result marks as abnormal; DB is the partitioning effect, the weighted sum of PP and NN, with λ as the weight. The larger λ is, the more attention the model pays to normal behavior; conversely, the smaller λ is, the more attention it pays to abnormal behavior.
S305: calculating the maximum deviation benchmark:
taking the value with the best partitioning effect as the maximum deviation benchmark of the user's interactive behavior, denoted Benchmark_u.
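Steps S304 and S305 amount to a one-dimensional threshold search. The sketch below is a hedged illustration under several assumptions: the PP and NN formulas are only available as figure references, so they are implemented from their verbal definitions; a record is marked normal when its score is at or below the candidate value, mirroring the f(u) ≤ 0 rule of S402; and the number of candidate values (`steps`) and all function names are hypothetical.

```python
from typing import Sequence, Tuple


def partitioning_effect(scores: Sequence[float], is_normal: Sequence[bool],
                        threshold: float, lam: float) -> float:
    """DB = lam * PP + (1 - lam) * NN for one candidate threshold."""
    # Assumption: score <= threshold is marked normal (mirrors f(u) <= 0 in S402).
    marked_normal = [s <= threshold for s in scores]
    n_marked_normal = sum(marked_normal)
    n_marked_abnormal = len(scores) - n_marked_normal
    hit_normal = sum(1 for m, y in zip(marked_normal, is_normal) if m and y)
    hit_abnormal = sum(1 for m, y in zip(marked_normal, is_normal) if not m and not y)
    # PP: actually normal share of the records marked normal.
    pp = hit_normal / n_marked_normal if n_marked_normal else 0.0
    # NN: actually abnormal share of the records marked abnormal.
    nn = hit_abnormal / n_marked_abnormal if n_marked_abnormal else 0.0
    return lam * pp + (1.0 - lam) * nn


def find_benchmark(scores: Sequence[float], is_normal: Sequence[bool],
                   lam: float = 0.5, steps: int = 100) -> Tuple[float, float]:
    """Sweep [Min_sim, Max_sim] and return (best threshold, best DB)."""
    lo, hi = min(scores), max(scores)
    best_threshold, best_db = lo, -1.0
    for k in range(steps + 1):
        candidate = lo + (hi - lo) * k / steps
        db = partitioning_effect(scores, is_normal, candidate, lam)
        if db > best_db:  # keep the first candidate that reaches the best DB
            best_threshold, best_db = candidate, db
    return best_threshold, best_db
```

With `lam` close to 1 the search favours thresholds that keep the "normal" group clean; with `lam` close to 0 it favours a clean "abnormal" group, which matches the role of λ described for DB.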
4. Establishing a multi-factor interactive behavior recognition method: the steps above yield the normal interaction behavior vector UCP_u of user u and the maximum deviation benchmark Benchmark_u of user u's interactive behavior. The user's maximum deviation benchmark is the optimal parameter for dividing normal from abnormal behavior in the user's historical interactive behavior, so the degree of deviation between the current interactive behavior and the user's historical interactive behavior portrait can be calculated and checked against the acceptable range of Benchmark_u.
S401: calculating the current interaction behavior portrait:
calculating the interactive behavior portrait UCP_now of the user's current behavior to be judged, according to step 1.
S402: judging a model:
calculating the degree of deviation between the current interactive behavior and the user's normal interactive behavior portrait, and judging whether it is within the acceptable range of Benchmark_u. The calculation is as follows:
f(u) = similarity[UCP_u, UCP_now] - Benchmark_u
The model f(u) divides the interactive behavior space into two parts, f(u) > 0 and f(u) ≤ 0. The space where f(u) ≤ 0 is regarded as the user's normal behavior space, and the space where f(u) > 0 as the user's abnormal behavior space. Therefore, if f(u) ≤ 0, the current interactive behavior of user u is normal; otherwise, if f(u) > 0, it is abnormal.
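The decision rule of S402 is small enough to state directly in code. This is a minimal sketch: the similarity value is assumed to have been computed beforehand between UCP_u and UCP_now, and the function names are hypothetical.

```python
def decision_value(similarity: float, benchmark: float) -> float:
    """f(u) = similarity[UCP_u, UCP_now] - Benchmark_u."""
    return similarity - benchmark


def is_normal_interaction(similarity: float, benchmark: float) -> bool:
    """True when f(u) <= 0 (normal), False when f(u) > 0 (abnormal)."""
    return decision_value(similarity, benchmark) <= 0.0
```

Note that the boundary case f(u) = 0 is classified as normal, as stated in the text.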

Claims (1)

1. A multi-factor interactive behavior anomaly detection method with periodic attributes is characterized by specifically comprising the following steps:
1) Establishing a normal user interaction behavior portrait: extracting normal transaction data of the user from a user historical transaction database, establishing a login time attribute, a working time login attribute, a login interval attribute, a key page dwell time attribute, a user interaction duration attribute and a key path trigger attribute, and constructing an interactive behavior portrait IBC_u of the user comprising multi-dimensional attributes;
2) On the basis of step 1), first generating a behavior interval sequence of the user according to the user behavior records and calculating a period stability threshold of the user; secondly, according to the behavior period division method, checking in turn whether adjacent elements in the behavior interval sequence satisfy the period stability threshold and outputting the interactive behavior period sequence of the user; and finally calculating the normal interactive behavior portrait with periodic attribute UCP_u;
3) Calculating the maximum deviation benchmark of the interactive behavior: extracting all historical interactive behavior data of the detected user, and obtaining the user's interactive behavior portrait with periodic attribute UCP'_u according to steps 1) and 2) as the user's historical interactive behavior; for each interactive behavior record in UCP'_u, matching it against the normal user interactive behavior portrait with periodic attribute UCP_u, and calculating in turn the similarity between each historical interactive behavior portrait of the user and the normal user interactive behavior portrait; taking values in turn from the range bounded by the minimum similarity Min_sim and the maximum similarity Max_sim, dividing the user's historical interactive behavior into normal behavior and abnormal behavior with each value, and calculating the partitioning effect DB;
taking the value with the best partitioning effect as the maximum deviation benchmark of the user's interactive behavior, denoted Benchmark_u;
4) Establishing a multi-factor interactive behavior recognition method: calculating the user's current interactive behavior portrait UCP_now according to step 1), and calculating the degree of deviation between the current interactive behavior and the user's normal interactive behavior portrait UCP_u; if the degree of deviation is within the acceptable range of Benchmark_u, the interaction is judged to be normal, and if the degree of deviation is not within the acceptable range of Benchmark_u, the interaction is judged to be abnormal; the specific implementation method of step 1) is as follows:
1.1 Extract user historical normal interaction behavior records:
collecting historical interaction behavior data of a user, marking positive and negative fields for a sample according to normal interaction and abnormal interaction of the user, and extracting normal interaction data of the user as positive sample data;
1.2 Calculate login time attribute:
extracting the set of the user's login times from the positive sample data, dividing one day into a number of time intervals time_1, time_2, ..., time_n by hour, calculating the probability that the user logs in within each interval, and calculating the user's login time attribute by the following formula,
Figure FDA0003769015360000021
wherein time_n is the attribute of the n-th time interval, |lta_n| is the number of logins in the n-th time interval,
Figure FDA0003769015360000022
is the total number of logins of user u in one day; this gives the login time attribute of user u, LTA_u = (time_1, time_2, ..., time_n);
1.3 Compute work time login attribute:
extracting the set of transaction times, calculating the probability of a transaction occurring in working time and in non-working time respectively, and obtaining the working time login attribute of user u, WTA_u = (isworktime, noworktime);
1.4 Calculate login interval attribute:
Figure FDA0003769015360000023
wherein
Figure FDA0003769015360000024
is an element of the set of login intervals;
Figure FDA0003769015360000025
is the time of the i-th login of user u to the system;
Figure FDA0003769015360000026
is the time of the (i-1)-th login of user u to the system;
using the above formula, obtaining the set of changes in the time interval between two adjacent logins of the user; extracting the user's set of login time intervals and, by quantile analysis, obtaining the first, second and third quartiles of the set together with its upper and lower limits, the first, second and third quartiles being the values located at the 25%, 50% and 75% positions after all data of the set are sorted by size; and dividing the set into 5 subsets period_1, period_2, ..., period_5, which constitute the user's login interval attribute and are calculated by the following formula:
Figure FDA0003769015360000027
wherein period_n is an entry of the login interval attribute, |lia_n| is the number of times the user's login interval falls within the n-th subset,
Figure FDA0003769015360000031
is the total over the logins of user u; this gives the login interval attribute of user u,
LIA_u = (period_1, period_2, period_3, period_4, period_5);
1.5 Calculate user key page dwell time attribute:
in the normal interaction behavior log of user u, sequentially calculating the sum of the user's dwell times on the key pages a_{page_no = keys}, to obtain the set
Figure FDA0003769015360000032
Wherein
Figure FDA0003769015360000033
then, using the quantile analysis method, calculating the key page dwell time attribute of user u, KSA_u = (distance_1, distance_2, distance_3, distance_4, distance_5), by the same calculation method as in step 1.4);
1.6 Calculate user interaction duration attribute:
in the normal interaction behavior log of the user u, the sum of the browsing time of each page in one interaction operation of the user u is calculated to obtain a set
Figure FDA0003769015360000034
then, using the quantile analysis method, calculating the interaction duration attribute of user u, IDA_u = (duration_1, duration_2, ..., duration_5), by the same calculation method as in 1.4);
1.7 Computing user critical path trigger attributes:
in the normal interaction behavior log of user u, sequentially calculating the dwell time on the system's key pages and the dwell time on non-key pages within a single interaction operation of the user, collecting them into a set, and calculating the user's key path trigger attribute CTA_u = (ratio_1, ratio_2);
1.8 Build a user interaction behavior profile:
obtaining the attributes of each dimension of user u and constructing the user's interactive behavior portrait IBC_u, IBC_u = (LTA_u, WTA_u, LIA_u, KSA_u, IDA_u, CTA_u); step 2) is implemented as follows:
2.1 Extract the login interval sequence: the login interval sequence calculated in step 1.4) is lis_u = {t_1, t_2, ..., t_n}, where t_n is the n-th login interval and n+1 is the number of all interaction behavior records of the user; a subsequence of the login interval sequence lis_u is denoted lis'_u = {t'_1, t'_2, ..., t'_n}, a subsequence lis'_u being a sequence formed by any part of the original sequence lis_u;
2.2 Sequentially traverse the login interval sequence:
initializing an empty array C; starting from the first position of lis_u = {t_1, t_2, ..., t_n}, traversing all subsequences in turn and, for each subsequence, calculating the corresponding period stability threshold μ and the subsequence stability state TPF_u of user u; the period stability threshold μ and the subsequence stability state TPF_u of user u are calculated as follows:
μ=1/length(list),
Figure FDA0003769015360000041
wherein list denotes a subsequence of the login interval sequence and length(list) denotes the length of that subsequence; in TPF_u, t_i denotes each element of lis'_u,
Figure FDA0003769015360000042
is the mean of all elements in lis'_u; μ is the division threshold: the larger μ is, the fewer elements lis'_u contains and the more discrete and sparse the user's behavior period is; conversely, the smaller μ is, the more elements lis'_u contains and the more continuous the user's behavior period is;
2.3 Dividing sequence:
dividing the login interval sequence according to the period stability threshold and the subsequence stability state TPF_u of user u by the following formula,
Figure FDA0003769015360000043
storing each subsequence that satisfies the formula in array C, subject to the following requirements during traversal: longer subsequences are traversed first, and if the value calculated for a longer subsequence satisfies the period stability threshold μ, none of its subsequences is evaluated; likewise, if the current subsequence is a subsequence of any sequence in the periodic behavior sequence set C, it is not evaluated;
2.4 Output a periodic sequence of interactive behaviors:
outputting array C, that is, the user's set of periodic behavior sequences;
2.5 Construct an interactive behavioral portrait with periodic attributes:
obtaining the interactive behavior portraits pbc_u in the different periods according to the period sequence output in 2.4) and the interactive behavior description method of 1.2)-1.8) above:
Figure FDA0003769015360000051
Let
Figure FDA0003769015360000052
denote the set of interactive behavior portraits corresponding to the j behavior periods of user u; finally, the combined normal interactive behavior portrait with periodic attribute is defined as
Figure FDA0003769015360000053
Wherein
Figure FDA0003769015360000054
Figure FDA0003769015360000055
is the interactive behavior portrait corresponding to the latest k periods in the user's set of normal periodic interactive behavior portraits.
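As an illustration of steps 1.2) and 1.3) of the claim, the following Python sketch computes a login time attribute and a working time login attribute from a list of timestamps. It is a sketch under stated assumptions only: the claim's formulas are available only as figure references, so the per-interval probability is taken as the interval's login count divided by the total number of logins in the sample; the equal-width hourly intervals, the definition of working time (weekdays, 09:00 to 17:00) and all function names are hypothetical.

```python
from datetime import datetime
from typing import List, Sequence, Tuple


def login_time_attribute(logins: Sequence[datetime], n_intervals: int = 24) -> List[float]:
    """LTA_u: share of logins falling into each of n equal intervals of the day."""
    if n_intervals <= 0 or 24 % n_intervals != 0:
        raise ValueError("n_intervals must be a positive divisor of 24")
    width = 24 // n_intervals  # hours per interval
    counts = [0] * n_intervals
    for ts in logins:
        counts[ts.hour // width] += 1
    total = len(logins)
    if total == 0:
        return [0.0] * n_intervals
    return [c / total for c in counts]


def work_time_attribute(times: Sequence[datetime],
                        work_start: int = 9, work_end: int = 17) -> Tuple[float, float]:
    """WTA_u = (isworktime, noworktime) as probabilities."""
    if not times:
        return (0.0, 0.0)
    # Assumption: working time means Monday to Friday, work_start <= hour < work_end.
    in_work = sum(1 for ts in times
                  if ts.weekday() < 5 and work_start <= ts.hour < work_end)
    p = in_work / len(times)
    return (p, 1.0 - p)
```

For example, four logins at 09:05 and 09:40 on a Monday, 14:00 on a Tuesday and 23:30 on a Saturday give, with four six-hour intervals, a login time attribute of (0, 0.5, 0.25, 0.25) and a working time login attribute of (0.75, 0.25).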
CN202110228567.4A 2021-03-02 2021-03-02 Multi-factor interactive behavior anomaly detection method with periodic attribute Active CN112966732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110228567.4A CN112966732B (en) 2021-03-02 2021-03-02 Multi-factor interactive behavior anomaly detection method with periodic attribute

Publications (2)

Publication Number Publication Date
CN112966732A CN112966732A (en) 2021-06-15
CN112966732B true CN112966732B (en) 2022-11-18

Family

ID=76276385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110228567.4A Active CN112966732B (en) 2021-03-02 2021-03-02 Multi-factor interactive behavior anomaly detection method with periodic attribute

Country Status (1)

Country Link
CN (1) CN112966732B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant